cs.LG (2025-03-25)
📊 14 papers in total | 🔗 3 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (8 🔗3)
Pillar 2: RL Algorithms & Architecture (5)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (8 papers)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback | ExCoT: uses execution feedback to optimize reasoning for Text-to-SQL. | DPO, direct preference optimization, large language model | | |
| 10 | Beyond Verifiable Rewards: Scaling Reinforcement Learning for Language Models to Unverifiable Data | Proposes the JEPO algorithm, extending reinforcement learning for language model training to unverifiable data. | reinforcement learning, chain-of-thought | | |
| 11 | LERO: LLM-driven Evolutionary framework with Hybrid Rewards and Enhanced Observation for Multi-Agent Reinforcement Learning | LERO: an LLM-driven evolutionary framework that improves multi-agent reinforcement learning performance through hybrid rewards and enhanced observations. | reinforcement learning, large language model | | |
| 12 | Abstracting Geo-specific Terrains to Scale Up Reinforcement Learning | Proposes a multi-agent reinforcement learning method based on abstracted terrain to speed up training in military simulations. | reinforcement learning | | |
| 13 | Flow to Learn: Flow Matching on Neural Network Parameters | Proposes FLoWN, which uses flow matching to learn to generate neural network parameters, improving meta-learning on image tasks. | flow matching | | |
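🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 14 | Tensor Generalized Approximate Message Passing | Proposes the Tensor Generalized Approximate Message Passing algorithm (TeG-AMP) for low-rank tensor inference, addressing tensor completion and decomposition problems. | AMP | | |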