cs.CL (2026-02-06)
📊 27 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (16 🔗2)
Pillar 2: RL & Architecture (10 🔗3)
Pillar 6: Video Extraction (1)
🔬 Pillar 9: Embodied Foundation Models (16 papers)
🔬 Pillar 2: RL & Architecture (10 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | Evaluating an evidence-guided reinforcement learning framework in aligning light-parameter large language models with decision-making cognition in psychiatric clinical reasoning | ClinMPO: evidence-guided reinforcement learning improves the decision-making cognition of lightweight LLMs in psychiatric clinical reasoning | reinforcement learning, large language model | | |
| 18 | FMBench: Adaptive Large Language Model Output Formatting | FMBench: evaluating and optimizing adaptive Markdown output formatting for large language models | reinforcement learning, large language model, instruction following | ✅ | |
| 19 | compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data | compar:IA: the French government builds a French-language LLM evaluation platform to collect human prompts and preference data | reinforcement learning, RLHF, DPO | | |
| 20 | R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging | R-Align: enhances generative reward models through rationale-centric meta-judging | reinforcement learning, RLHF, large language model | | |
| 21 | InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning | InftyThink+: achieves efficient infinite-horizon reasoning via reinforcement learning | reinforcement learning, chain-of-thought | | |
| 22 | TrailBlazer: History-Guided Reinforcement Learning for Black-Box LLM Jailbreaking | Proposes a history-guided reinforcement learning framework to improve the efficiency of black-box LLM jailbreaking | reinforcement learning, large language model | | |
| 23 | SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks | Proposes the SEMA framework, which uses self-tuning prefilling and intent-aware reinforcement learning to raise the success rate of multi-turn adversarial attacks | reinforcement learning, DPO, direct preference optimization | ✅ | |
| 24 | Can Post-Training Transform LLMs into Causal Reasoners? | Turns large language models into causal reasoners through post-training | PPO, DPO, large language model | ✅ | |
| 25 | Generating Data-Driven Reasoning Rubrics for Domain-Adaptive Reward Modeling | Proposes data-driven reasoning rubrics to improve domain-adaptive reward modeling | reinforcement learning, large language model | | |
| 26 | Free Energy Mixer | Proposes the Free Energy Mixer (FEM), which improves attention-mechanism performance through value-driven channel selection | SSM, linear attention | | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 27 | On the Wings of Imagination: Conflicting Script-based Multi-role Framework for Humor Caption Generation | Proposes HOMER, a conflicting-script-based multi-role framework for generating humorous image captions | HuMoR, large language model | | |