cs.CL (2025-07-29)
📊 21 papers total | 🔗 5 with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (14 papers)
🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (7 papers)
| # | Title | One-line Takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | Post-Training Large Language Models via Reinforcement Learning from Self-Feedback | Proposes an LLM post-training method based on reinforcement learning from self-feedback, improving calibration and reasoning ability. | reinforcement learning, RLHF, large language model | | |
| 16 | RL from Teacher-Model Refinement: Gradual Imitation Learning for Machine Translation | Proposes RLfR, reinforcement learning from teacher-model refinement for machine translation, improving semantic quality and entity preservation. | reinforcement learning, imitation learning, preference learning | | |
| 17 | AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning | AutoTIR: autonomous tool-integrated reasoning via reinforcement learning. | reinforcement learning, large language model | ✅ | |
| 18 | DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router | Proposes DeepSieve, an information-sieving framework that uses an LLM as a knowledge router, improving RAG performance on complex question answering. | distillation, large language model | ✅ | |
| 19 | Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning | Proposes the Graph-R1 framework to address the structural and semantic shortcomings of traditional RAG methods. | reinforcement learning | | |
| 20 | Libra: Assessing and Improving Reward Model by Learning to Think | Proposes the Libra framework to assess and improve reward-model performance in complex reasoning scenarios. | reinforcement learning, large language model | | |
| 21 | Multi-Hypothesis Distillation of Multilingual Neural Translation Models for Low-Resource Languages | Proposes multi-hypothesis distillation to improve neural machine translation models for low-resource languages. | distillation | | |