cs.AI (2025-09-29)
📊 28 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (19 🔗2)
Pillar 2: RL & Architecture (8)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 9: Embodied Foundation Models (19 papers)
🔬 Pillar 2: RL & Architecture (8 papers)
| # | Title | Key Takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | Training Agents Inside of Scalable World Models | Dreamer 4: offline diamond acquisition in Minecraft via a scalable world model | reinforcement learning, world model, dreamer | | |
| 21 | Hybrid Reward Normalization for Process-supervised Non-verifiable Agentic Tasks | Proposes PPR, which uses hybrid reward normalization to improve agent performance on non-verifiable tasks | reinforcement learning, large language model | | |
| 22 | Modeling Others' Minds as Code | ROTE: uses program synthesis to efficiently predict human and AI behavior, improving human-AI collaboration | behavior cloning, large language model | | |
| 23 | DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search | DeepSearch: overcomes the reinforcement learning bottleneck with Monte Carlo Tree Search and verifiable rewards | reinforcement learning | | |
| 24 | The Era of Real-World Human Interaction: RL from User Conversations | Proposes reinforcement learning from user conversations (RLHI) for personalized alignment and continual model improvement | reinforcement learning, instruction following | | |
| 25 | Pushing LLMs to Their Logical Reasoning Bound: The Role of Data Reasoning Intensity | Proposes the Data Reasoning Intensity (DRI) metric to optimize training data and improve LLM logical reasoning | reinforcement learning, large language model | | |
| 26 | Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention | Proposes Intervened Preference Optimization to improve the safety of large reasoning models | preference learning, chain-of-thought | | |
| 27 | Learning to Interact in World Latent for Team Coordination | Proposes the Interactive World Latent (IWoL) framework to promote team coordination in multi-agent reinforcement learning | reinforcement learning, representation learning | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | Key Takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 28 | Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs | Proposes a vision-and-language navigation method based on analogical textual descriptions in LLMs, improving navigation performance | scene understanding, embodied AI, VLN | | |