cs.CL (2025-05-29)
📊 29 papers total | 🔗 3 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (19 🔗2)
Pillar 2: RL & Architecture (9 🔗1)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (19 papers)
🔬 Pillar 2: RL & Architecture (9 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation | Proposes Active Layer-Contrastive Decoding (ActLCD) to reduce hallucination in large language model generation | reinforcement learning large language model | | |
| 21 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | DeepTheorem: advancing LLM theorem proving via natural language and reinforcement learning | reinforcement learning IMoS large language model | | |
| 22 | Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation | Proposes LoVeC: using reinforcement learning to improve the quality of verbalized confidence in long-form generation | reinforcement learning DPO large language model | | |
| 23 | The Surprising Soupability of Documents in State Space Models | Proposes Document Souping to improve state space models' performance on long-document reasoning | Mamba SSM state space model | | |
| 24 | ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Proposes ML-Agent, an RL-based LLM agent for autonomous machine learning engineering | reinforcement learning large language model | | |
| 25 | LoLA: Low-Rank Linear Attention With Sparse Caching | LoLA: low-rank linear attention with sparse caching, improving associative memory in lifelong learning | linear attention | | |
| 26 | Are Reasoning Models More Prone to Hallucination? | Finds that reasoning models can be more prone to hallucination on factuality tasks, which specific training pipelines can mitigate | distillation chain-of-thought | | |
| 27 | Act-Adaptive Margin: Dynamically Calibrating Reward Models for Subjective Ambiguity | Proposes Act-Adaptive Margin (AAM) to dynamically calibrate reward models, improving reward modeling on subjective tasks | reinforcement learning preference learning | ✅ | |
| 28 | Table-R1: Inference-Time Scaling for Table Reasoning | Table-R1: explores inference-time scaling for table reasoning, boosting small-model performance | reinforcement learning distillation | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 29 | Hidden Persuasion: Detecting Manipulative Narratives on Social Media During the 2022 Russian Invasion of Ukraine | Proposes a detection approach based on Gemma 2 and XLM-RoBERTa for manipulative narratives on Ukrainian social media | manipulation | | |