cs.CL (2026-05-27)

📊 49 papers in total | 🔗 11 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (34 🔗6) · Pillar 2: RL Algorithms and Architecture (13 🔗5) · Pillar 1: Robot Control (2)

🔬 Pillar 9: Embodied Foundation Models (34 papers)

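# | Title | One-sentence summary | Tags | 🔗
1 | CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models | CIRF decomposes chains of thought into reusable functional units to make latent reasoning in large language models more efficient. | large language model chain-of-thought
2 | Argument Quality Assessment with Large Language Models: A Pairwise Bradley-Terry Approach | Assesses argument quality using large language models and the Bradley-Terry model. | large language model chain-of-thought
3 | SMILE-Next: Teaching Large Language Models to Detect, Classify, and Reason about Laughter | Proposes SMILE-Next to address laughter understanding in real-world scenarios. | large language model multimodal
4 | KSAFE-MM: A Multimodal Safety Benchmark via Localized Contextualization for Korean Cultural Risks | KSAFE-MM builds a multimodal safety benchmark for Korean cultural risks via localized contextualization. | large language model multimodal
5 | Reverse Probing: Supervised Token-level Uncertainty Quantification for Large Language Models in Clinical Text | Proposes Reverse Probing for supervised token-level uncertainty quantification of large language models in clinical text. | large language model
6 | MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems | MemTrace traces and attributes errors in large language model memory systems via executable memory evolution graphs. | large language model
7 | IPO-Mine: A Toolkit and Dataset for Section-Structured Analysis of Long, Multimodal IPO Documents | Proposes the IPO-Mine toolkit and dataset for structured analysis of long, multimodal IPO documents. | multimodal
8 | Can Large Language Models Handle Discourse Particles? A Case Study of Colloquial Malay | Proposes the MalayPrag benchmark to evaluate how well LLMs handle discourse particles in colloquial Malay. | large language model
9 | Agent Explorative Policy Optimization for Multimodal Agentic Reasoning | Proposes AXPO, which uses explorative policy optimization to address the Thinking-Acting Gap in multimodal agentic reasoning. | multimodal
10 | Revisiting Anthropomorphic Reflection Markers in Large Language Model Reasoning | Shows that anthropomorphic reflection markers in large language model reasoning are not necessary and can be suppressed without hurting performance. | large language model
11 | IFMTBench: A Comprehensive Benchmark for Multilingual Translation Instruction Following | Proposes IFMTBench to address instruction following in multilingual translation. | instruction following
12 | Prompting Is All You Need: Multi-view Prompting Large Language Models for Aspect-Based Sentiment Analysis | Proposes LLM-MvP, which uses multi-view prompting to improve large language model performance on ABSA while reducing computational cost. | large language model
13 | Personality, Role, and Expressive Style in Large Language Models: An Interactionist Analysis | Studies personality, role, and expressive style in large language models from an interactionist perspective. | large language model
14 | MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models | MemGuard prevents memory contamination in long-term memory-augmented large language models through type-aware memory management. | large language model
15 | ChildEval: When large language models meet children's personalities | Proposes the ChildEval benchmark to evaluate LLM performance in personalized dialogue with children. | large language model
16 | VLMs May Not Globally Enhance Human Alignment over LLMs During Natural Reading | Shows that, during natural reading, vision-language models (VLMs) may not globally improve human alignment relative to large language models (LLMs). | large language model multimodal
17 | The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates | Proposes energy-guided decoding (EBD), which elicits task-oriented behavior from pretrained LLMs without parameter updates. | large language model instruction following
18 | Rethinking Visual Neglect: Steering via Context-Preference for MLLM Hallucination Mitigation | Proposes the Context-Preference Activation Steering (CAS) framework to mitigate object hallucination in MLLMs. | large language model multimodal
19 | Towards Reliable Multilingual LLMs-as-a-Judge: An Empirical Study | Studies the reliability of multilingual LLMs as judges and explores optimization strategies under different resource settings. | large language model
20 | Functional Entropy: Predicting Functional Correctness in LLM-Generated Code with Uncertainty Quantification | Proposes functional entropy to quantify uncertainty in LLM code generation and thereby predict functional correctness. | large language model
21 | Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization | Proposes Cross-Annotator Preference Optimization (CAPO) to learn and reproduce annotator-specific explanation behavior. | large language model
22 | On Compositional Learning Behaviours in Formal Mathematics | Proposes the S2B-LM benchmark to study how compositional learning behaviors affect theorem proving in formal mathematics. | chain-of-thought
23 | Beyond One Path: Evaluating and Enhancing Divergent Thinking in Interactive LLM Agents | Proposes the MUTATE benchmark and the ReDNA framework to improve divergent thinking in interactive LLM agents. | large language model
24 | FABSVer: Faster Training and Better Self-Verification for LLM Mathematical Reasoning | FABSVer speeds up training for LLM mathematical reasoning and improves self-verification. | large language model
25 | PrunePath: Towards Highly Structured Sparse Language Models | PrunePath is an adaptive pruning framework for highly structured sparse language models. | large language model
26 | Framing Matters: Addressing Framing Sensitivity in Decision-Making through Behaviorally-Grounded Value Alignment | Proposes Valign, which uses behaviorally grounded value alignment to address framing sensitivity in large language model decision-making. | large language model
27 | SuperValid: Capability-Aligned OOD Validation for Generalizable Downstream Scaling | SuperValid is a capability-aligned OOD validation method for generalizable downstream scaling. | large language model
28 | DEPART: DEcomposing PARiTy across Multilingual LLMs | DEPART decomposes parity gaps across multilingual LLMs to reveal the root causes of performance differences. | large language model
29 | Risk-aware Selective Prompting for Hallucination Mitigation in Large Vision-Language Models | Proposes a risk-aware selective prompting method to mitigate hallucination in large vision-language models. | visual grounding
30 | Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models | Proposes the Meow2X and TRNE frameworks to localize and suppress toxicity in language models without retraining. | large language model
31 | An Evolutionary Approach for Designing Stable and Highly Expressible Low-Immunogenicity Therapeutic mRNA Sequences | Proposes an mRNA sequence optimization framework based on BERT and a genetic algorithm that improves stability and expression efficiency while reducing immunogenicity. | large language model
32 | Periodic RoPE for Infinite Context LLMs | Proposes Periodic RoPE to address positional-encoding degradation in LLMs at infinite context length. | large language model
33 | AI Research Agents Narrow Scientific Exploration | AI research agents tend toward local optimization and struggle to broaden scientific exploration. | large language model
34 | GRADE: Generalizable Reasoning-Aware Dialogue Evaluation for AI Tutors | GRADE is a generalizable, reasoning-aware dialogue evaluation framework for AI tutors. | instruction following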

🔬 Pillar 2: RL Algorithms and Architecture (13 papers)

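# | Title | One-sentence summary | Tags | 🔗
35 | OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration | OmniVerifier-M1 is a multimodal meta-verifier that uses structured recalibration to improve the reliability and interpretability of visual verification. | reinforcement learning large language model foundation model
36 | Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning | Skill0.5 combines skill internalization and utilization to improve generalization in agentic reinforcement learning. | reinforcement learning distillation large language model
37 | Skill-Conditioned Gated Self-Distillation for LLM Reasoning | Proposes skill-conditioned gated self-distillation to improve LLM reasoning. | teacher-student distillation privileged information
38 | Mobile-Aptus: Confidence-Driven Proactive and Robust Interaction in MLLM-based Mobile-Using Agents | Mobile-Aptus is a confidence-driven proactive interaction framework for MLLM-based mobile agents that improves task success rates. | direct preference optimization large language model multimodal
39 | Soft-SVeRL: Self-Verified Reinforcement Learning with Soft Rewards | Proposes Soft-SVeRL, which uses soft rewards and self-verification to improve reinforcement learning on partially verifiable tasks. | reinforcement learning instruction following
40 | GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection | GUI-CIDER uses causal internalization and density-aware exemplar reselection to improve GUI agents' world knowledge during mid-training. | reinforcement learning large language model multimodal
41 | AdaDPO: Self-Adaptive Direct Preference Optimization with Balanced Gradient Updates | AdaDPO is a self-adaptive direct preference optimization method that balances gradient updates to improve LLM alignment. | RLHF DPO direct preference optimization
42 | ROSD: Reflective On-Policy Self-Distillation for Language Model Reasoning across Domains | ROSD uses reflective on-policy self-distillation to improve language model reasoning across domains. | distillation large language model
43 | PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting | PromptEmbedder achieves efficient and transferable text embedding through dual-LLM soft prompting. | representation learning large language model
44 | Semantic Flow Regularization: Teaching LLMs to Generate Diverse Yet Coherent Responses | Proposes Semantic Flow Regularization (SFR) to improve the diversity and coherence of LLMs in stylized generation tasks. | flow matching large language model
45 | Retrieval, Reward, and Training Protocols: What Matters in Training Search Agents? | Systematically studies how the retrieval corpus, reward function, and training protocol affect the training of search agents. | reward design large language model
46 | Narrative Flattening: How Post-Training Compresses Thematic, Affective, and Stylistic Variation in LLM Fiction | Post-training compresses thematic, affective, and stylistic variation in LLM fiction, flattening narratives. | DPO large language model
47 | Playing with Words, Improving with Rewards: Training Language Models for Creative Association | Proposes a reward-based reinforcement learning method to train language models for creative association. | reinforcement learning large language model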

🔬 Pillar 1: Robot Control (2 papers)

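# | Title | One-sentence summary | Tags | 🔗
48 | The Fragility of Chain-of-Thought Monitoring Across Typologically Diverse Languages | Reveals the fragility of chain-of-thought monitoring in multilingual settings and finds that large models exhibit strategic deception. | manipulation large language model chain-of-thought
49 | A Wolf in Sheep's Clothing: Targeted Routing Hijacking in Federated RAG | Presents a routing hijacking attack on federated RAG and defenses against it. | manipulation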
