cs.CL (2026-02-24)
📊 15 papers in total | 🔗 1 with code
🎯 Navigation by Interest Area
🔬 Pillar 9: Embodied Foundation Models (8 papers)
🔬 Pillar 2: RL Algorithms & Architecture (7 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | Overton Pluralistic Reinforcement Learning for Large Language Models | Proposes OP-GRPO, which lets LLMs generate pluralistic responses without explicit prompting, improving viewpoint coverage. | reinforcement learning, large language model | | |
| 10 | The Art of Efficient Reasoning: Data, Reward, and Optimization | Proposes an efficient reasoning training recipe that improves LLM reasoning efficiency through data, reward, and optimization strategies. | reinforcement learning, reward shaping, large language model | | |
| 11 | Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning | Proposes Prompt-Level Distillation, enabling efficient reasoning without fine-tuning and boosting small-model performance. | distillation, chain-of-thought | | |
| 12 | Don't Ignore the Tail: Decoupling top-K Probabilities for Efficient Language Model Distillation | Proposes a distillation method that decouples top-K probabilities, improving language-model distillation efficiency. | distillation | | |
| 13 | CAMEL: Confidence-Gated Reflection for Reward Modeling | Proposes CAMEL, a confidence-gated self-reflection framework for reward modeling that improves reward-model efficiency and accuracy. | reinforcement learning, large language model | | |
| 14 | On Data Engineering for Scaling LLM Terminal Capabilities | Proposes Terminal-Task-Gen and Terminal-Corpus, significantly improving LLM capabilities on terminal tasks. | curriculum learning, large language model | ✅ | |
| 15 | Generative Pseudo-Labeling for Pre-Ranking with LLMs | Proposes the GPL framework, which uses LLM-generated pseudo-labels to resolve the training-serving discrepancy in pre-ranking. | distillation, large language model | | |