cs.CL (2025-04-04)
📊 33 papers in total | 🔗 3 with code
🎯 Navigation by interest area
Pillar 9: Embodied Foundation Models (23 🔗2)
Pillar 2: RL Algorithms & Architecture (9 🔗1)
Pillar 6: Video Extraction & Matching (1)
🔬 Pillar 9: Embodied Foundation Models (23 papers)
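🔬 Pillar 2: RL Algorithms & Architecture (9 papers)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 24 | Align to Structure: Aligning Large Language Models with Structural Information | Proposes a structural alignment method that improves the coherence and structure of large language models in long-form text generation. | reinforcement learning RLHF large language model | ✅ | |
| 25 | Algorithmic Prompt Generation for Diverse Human-like Teaming and Communication with Large Language Models | Proposes an algorithm that optimizes LLM prompts via quality diversity to generate diverse, human-like teaming behaviors. | reinforcement learning large language model | | |
| 26 | Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning | Proposes a balanced online difficulty filtering method that improves the training efficiency and performance of reasoning-oriented reinforcement learning. | reinforcement learning curriculum learning large language model | | |
| 27 | Learning Natural Language Constraints for Safe Reinforcement Learning of Language Agents | Proposes a safe reinforcement learning framework based on natural language constraints, improving the safety of language agents in real-world settings. | reinforcement learning RLHF large language model | | |
| 28 | Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models | Proposes Nemotron-H, a family of hybrid Mamba-Transformer models that aims to improve inference efficiency while maintaining accuracy. | Mamba distillation | | |
| 29 | Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward | Proposes a curiosity-reward-based method for personalized multi-turn dialogue that improves LLM user modeling. | reinforcement learning RLHF large language model | | |
| 30 | Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking | Proposes a training method for small models that combines knowledge distillation with reinforcement learning for reasoning-based document re-ranking. | reinforcement learning distillation | | |
| 31 | AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset | AIR framework: decouples the annotations, instructions, and response pairs of preference datasets to achieve efficient alignment. | preference learning large language model | | |
| 32 | Sample, Don't Search: Rethinking Test-Time Alignment for Language Models | Proposes QAlign, which aligns language models at test time by sampling rather than searching. | DPO direct preference optimization | | |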
🔬 Pillar 6: Video Extraction & Matching (1 paper)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 33 | CliME: Evaluating Multimodal Climate Discourse on Social Media and the Climate Alignment Quotient (CAQ) | Proposes the CliME multimodal climate dataset and the CAQ evaluation metric for assessing LLM performance in climate discourse. | HuMoR large language model multimodal | | |