cs.CL (2025-01-22)
📊 17 papers in total | 🔗 3 with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (12 papers)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
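| 13 | Quantification of Large Language Model Distillation | Proposes a framework for quantifying distillation in large language models, evaluating the degree of model homogenization and identity cognition bias. | distillation, large language model | ✅ | |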
| 14 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | DeepSeek-R1: incentivizes reasoning capability in LLMs through reinforcement learning, without requiring human annotation. | reinforcement learning, large language model, chain-of-thought | | |
| 15 | OpenGenAlign: A Preference Dataset and Benchmark for Trustworthy Reward Modeling in Open-Ended, Long-Context Generation | Proposes OpenGenAlign, a preference dataset and benchmark for trustworthy reward modeling in open-ended, long-context generation. | reinforcement learning, large language model, instruction following | | |
| 16 | Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression | Trains dialogue systems with AI feedback to improve the overall dialogue impression. | reinforcement learning, large language model | | |
| 17 | Extracting General-use Transformers for Low-resource Languages via Knowledge Distillation | Proposes a knowledge-distillation-based method for extracting general-use Transformers for low-resource languages. | distillation |