cs.LG (2024-09-04)

📊 14 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (RL & Architecture) (9) · Pillar 9: Embodied Foundation Models (3) · Pillar 7: Motion Retargeting (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (9 papers)

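| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | Large Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement Learning | Proposes ERFSL, which uses large language models to efficiently search for reward functions in custom-environment multi-objective reinforcement learning. | reinforcement learning, large language model |
| 2 | Building Math Agents with Multi-Turn Iterative Preference Learning | Proposes a multi-turn iterative preference learning framework that improves the tool-integrated reasoning of math agents. | preference learning, DPO, large language model |
| 3 | Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations | Proposes the Contribution-Aware Multimodal User Embedding (CAMUE) framework for personalized, explainable predictions on social networks. | representation learning, multimodal |
| 4 | Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal | Proposes Continual Diffuser (CoD) to address continual learning in offline reinforcement learning. | reinforcement learning, offline reinforcement learning |
| 5 | An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning | Introduces centralized-training, decentralized-execution methods for cooperative multi-agent reinforcement learning. | reinforcement learning |
| 6 | Unifying Causal Representation Learning with the Invariance Principle | Unifies causal representation learning with the invariance principle, improving causal inference on high-dimensional data. | representation learning |
| 7 | Tractable Offline Learning of Regular Decision Processes | Proposes new methods to overcome the limitations of offline reinforcement learning in RDPs. | reinforcement learning, offline RL, offline reinforcement learning |
| 8 | Independence Constrained Disentangled Representation Learning from Epistemological Perspective | Proposes a disentangled representation learning method based on an epistemological perspective and a two-level latent space, improving interpretability and the quality of controllable generation. | representation learning |
| 9 | Learning Privacy-Preserving Student Networks via Discriminative-Generative Distillation | Proposes a discriminative-generative distillation method for learning privacy-preserving student networks. | distillation |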

🔬 Pillar 9: Embodied Foundation Models (3 papers)

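| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 10 | Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models | Uses large multimodal models to predict eGFR trajectories and kidney function decline. | large language model, foundation model, multimodal |
| 11 | Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA | Proposes the RoLoRA framework, which uses alternating minimization of LoRA for robust federated finetuning of foundation models. | foundation model |
| 12 | Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-Tuned Models | Proposes fast, memory-efficient fine-tuned models for detecting hallucinations in large language models. | large language model |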

🔬 Pillar 7: Motion Retargeting (1 paper)

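| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 13 | Reservoir Static Property Estimation Using Nearest-Neighbor Neural Network | Proposes a nearest-neighbor neural network method for estimating reservoir static properties, improving spatial interpolation accuracy. | spatial relationship |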

🔬 Pillar 4: Generative Motion (1 paper)

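| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 14 | Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling | Shows that masked diffusion models are in essence time-agnostic masked models, and points out precision issues in their categorical sampling. | MDM |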
