cs.LG (2025-12-23)

📊 3 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (2) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms & Architecture (2 papers)

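| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | Masking Teacher and Reinforcing Student for Distilling Vision-Language Models | Proposes the Masters framework, which distills vision-language models effectively by masking the teacher model and reinforcing the student model. | reinforcement learning, offline RL, distillation |
| 2 | Generalization of RLVR Using Causal Reasoning as a Testbed | Uses causal reasoning as a testbed to study how well RLVR generalizes on complex reasoning tasks. | reinforcement learning, large language model |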

🔬 Pillar 9: Embodied Foundation Models (1 paper)

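| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 3 | Learning to Reason in LLMs by Expectation Maximization | Proposes an expectation-maximization-based framework for learning to reason in LLMs that optimizes the generation of rationales. | large language model |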
