cs.LG (2025-12-23)

📊 3 papers in total

🎯 Interest area navigation

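#	Title	One-sentence summary	Tags	🔗	⭐
1	Masking Teacher and Reinforcing Student for Distilling Vision-Language Models	Proposes the Masters framework, which masks the teacher model and reinforces the student model to distill vision-language models effectively.	reinforcement learning offline RL distillation
2	Generalization of RLVR Using Causal Reasoning as a Testbed	Uses causal reasoning as a testbed to study how well RLVR generalizes on complex reasoning tasks.	reinforcement learning large language model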

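#	Title	One-sentence summary	Tags	🔗	⭐
3	Learning to Reason in LLMs by Expectation Maximization	Proposes an expectation-maximization-based framework for learning to reason in LLMs, optimizing the generation of rationales.	large language model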