cs.CL (2026-05-28)

📊 56 papers in total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (39 🔗5) · Pillar 2: RL Algorithms & Architecture (15 🔗2) · Pillar 1: Robot Control (1) · Pillar 3: Spatial Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (39 papers)

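# | Title | One-sentence summary | Tags | 🔗
1 | Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models | Proposes the LDKE framework to address the generality and locality problems of knowledge editing in multimodal large language models. | large language model, multimodal
2 | LLMSurgeon: Diagnosing Data Mixture of Large Language Models | LLMSurgeon: diagnoses the pretraining data mixture proportions of large language models, enabling post-hoc provenance tracing. | large language model, foundation model
3 | Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation | Proposes Ptah, a multi-agent framework for generating verifiable multimodal deep research reports. | large language model, multimodal
4 | Unlocking the Working Memory of Large Language Models for Latent Reasoning | Proposes Reasoning in Memory (RiM), which uses the working memory of large language models for latent reasoning. | large language model
5 | Latent Performance Profiling of Large Language Models | Proposes the Latent Performance Profiling (LPP) framework for evaluating large language models from their internal states. | large language model
6 | ActTraitBench: Quantifying the Knowledge-Decision Gap in Large Language Models via Human-Grounded Behavioral Validation | ActTraitBench: quantifies the knowledge-decision gap in large language models through behavioral validation. | large language model
7 | Spurious Prompts: Can Irrelevant Prompts Steer Large Language Models? | Finds that LLMs are sensitive to irrelevant prompts: irrelevant prompts can effectively steer model behavior. | large language model
8 | CCS: Clinical Consensus Selection for Radiology Report Generation | Proposes the Clinical Consensus Selection (CCS) framework to improve report quality at inference time in radiology report generation. | large language model, multimodal
9 | How LoRA Remembers? A Parametric Memory Law for LLM Finetuning | Proposes a parametric memory law that quantifies the memorization capacity of LLMs under LoRA finetuning, along with the MemFT optimization strategy. | large language model
10 | Knowing What to Solve Before How: Preplan Empowered LLM Mathematical Reasoning | PPC: strengthens LLM mathematical reasoning through pre-planning. | large language model
11 | Causal Interventions on Continuous Variables: A Case Study on Verb Bias in Steering Vectors for In-Context Learning | Proposes a causal intervention method for continuous variables to study how verb bias affects in-context learning in language models. | large language model
12 | Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents | The PlanAhead framework evaluates the effect of planning representations in LLM web agents and improves task success rate. | multimodal
13 | Nine Judges, Two Effective Votes: Correlated Errors Undermine LLM Evaluation Panels | LLM judge panels make correlated errors, so the effective number of votes is far lower than expected. | chain-of-thought
14 | DySem: Uncovering Dynamic Semantic Components via Multilingual Consensus for Calculating Semantic Textual Similarity | DySem: discovers dynamic semantic components via multilingual consensus for computing semantic textual similarity. | large language model
15 | User-Aware Active Knowledge Acquisition for Emotional Support Dialogue | Proposes a user-aware active knowledge acquisition framework to improve emotional support dialogue systems. | large language model
16 | Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies | Proposes a language model based on human test-taking strategies for checking the factuality of generated text. | large language model
17 | Beyond English and Evasion: A Human-Annotated Multi-Domain Benchmark for High-Stakes LLM Safety Evaluation in Chinese | Proposes ChiSafe-PAS, a Chinese multi-domain adversarial prompt benchmark for evaluating the safety of large language models. | large language model
18 | Evaluating Cross-lingual Knowledge Consistency in Code-Mixed vis-a-vis Indian Languages using IndicKLAR | IndicKLAR reveals the role of code-mixed input in improving knowledge consistency across Indian languages. | large language model
19 | Predicting Causal Effects from Natural Language Queries using Structured Representations | Proposes the Query2Effect benchmark and a two-stage framework for predicting causal effects from natural language queries. | large language model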
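20 | CONCAT: Consensus- and Confidence-Driven Ad Hoc Teaming for Efficient LLM-Based Multi-Agent Systems | Proposes CONCAT, a consensus- and confidence-based framework for efficient collaboration in LLM multi-agent systems. | large language model
21 | From Blind Guess to Informed Judgment: Teaching LLMs to Evaluate Materials by Building Knowledge-Augmented Preference Signals | MaterEval: builds knowledge-augmented preference signals to guide LLMs in materials evaluation. | large language model
22 | COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models | COFT: a training-free counterfactual-conformal decoding method for fair chain-of-thought reasoning in large language models. | large language model, chain-of-thought
23 | Latent Performance Profiling of Large Language Models | Proposes Latent Performance Profiling (LPP) for evaluating large language models from their latent space. | large language model
24 | When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models | Reveals the dominance of English narratives in large language models, using Bengali cultural knowledge as a case study. | large language model
25 | Your Multimodal Speech Model Says I Have a Face for Radio | Evaluates face-related bias in multimodal speech recognition models, revealing significant gender and racial disparities. | multimodal
26 | DySem: Uncovering Dynamic Semantic Components of Large Language Models for Calculating Semantic Textual Similarity | DySem: improves semantic textual similarity computation with large language models by mining dynamic semantic components. | large language model
27 | Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting | Proposes the Mask the Target regularizer to address catastrophic forgetting in LoRA finetuning. | large language model
28 | Exploring Autonomous Agentic Data Engineering for Model Specialization | Proposes autonomous agentic data engineering to address model specialization. | large language model
29 | Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models | Proposes Kronecker embeddings, which substantially reduce language model parameter counts through byte-level structured representations. | large language model
30 | Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content | Proposes a framework of implicit identity technologies for LLMs, covering fingerprinting and watermarking, to trace the provenance of datasets, models, and generated content. | large language model
31 | EUDAIMONIA: Evaluating Undesirable Dynamics in AI | Proposes design specifications for social AI to evaluate problematic social dynamics in language models. | large language model
32 | MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery | MOOSE-Copilot: an interactive web assistant for unified exploratory and fine-grained scientific hypothesis discovery. | large language model
33 | Adaptive Interviewing for Persona Simulation in LLMs: Evidence-Grounded Reasoning Improves Decision Alignment | Proposes an adaptive interviewing framework that improves the evidence consistency of LLMs when simulating individual decisions. | large language model
34 | Beyond Bilingual Transfer: Multilingual Code-Switching in Instruction Tuning | In multilingual instruction tuning, multilingual code-switching outperforms bilingual transfer. | large language model
35 | DynSess: Dynamic Session-Level Evaluation and Optimization Framework for Role-Playing Agents | DynSess: a dynamic session-level evaluation and optimization framework for role-playing agents. | large language model
36 | Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents | AgentREVEAL shows how web retrieval degrades the safety alignment of LLM agents and introduces the HarmURLBench benchmark. | large language model
37 | Configurable Reward Model for Balanced Safety Alignment | Proposes a configurable reward model (CSRM) to balance safety alignment in large language models. | large language model
38 | Can LLM Teams Play What? Where? When? | Collaboration within LLM teams improves performance on the quiz game, by up to 20 percentage points. | large language model
39 | Cross-Lingual Steering for Figurative Language Generation | Proposes a cross-lingual activation steering method to explore and exploit a universal signal for figurative language generation in multilingual LLMs. | large language model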

🔬 Pillar 2: RL Algorithms & Architecture (15 papers)

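# | Title | One-sentence summary | Tags | 🔗
40 | UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering | UniSteer: proposes text-guided flow matching in activation space for general-purpose control of LLM behavior. | flow matching, large language model, instruction following
41 | World Models in Words: Auditing Physical State-Transition Commitments in Vision-Language Models | Proposes a framework for auditing physical state-transition commitments to improve the evaluation of vision-language models. | world model, world models
42 | GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering | Proposes the GAPD framework, which improves agentic reinforcement learning for knowledge base question answering through gold-action policy distillation. | reinforcement learning, distillation
43 | Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models | Proposes Canonical-Context On-Policy Distillation (CCOPD) to address the self-anchoring drift of LLMs in multi-turn dialogue. | distillation, large language model
44 | Recovering Diversity Without Losing Alignment: A DPO Recipe for Post-Trained LLMs | REDIPO: a DPO method for post-trained LLMs that recovers diversity without losing alignment. | DPO, instruction following
45 | Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection | Proposes Loong to address context selection in long document translation. | reinforcement learning, large language model
46 | Adaptive Targeted Dynamic Chunking for Tokenization-Free Hierarchical Model | Proposes Adaptive Targeted Dynamic Chunking (ATDC) to optimize the compression ratio of tokenization-free hierarchical models. | curriculum learning, large language model
47 | Training Deliberative Monitors for Black-Box Scheming Detection | Trains deliberative monitors to detect scheming behavior in black-box agents. | reinforcement learning, chain-of-thought
48 | Reasoning-preserved Efficient Distillation of Large Language Models via Activation-aware Initialization | RED: efficiently distills large language models while preserving their reasoning ability, via activation-aware initialization. | distillation, large language model
49 | Improving Small Language Models for Code Generation with Reinforcement Learning from Verification Feedback | Uses reinforcement learning from verification feedback to improve the code generation ability of small language models. | reinforcement learning, reward design, reward shaping
50 | Draft-OPD: On-Policy Distillation for Speculative Draft Models | Proposes Draft-OPD, which improves the speedup delivered by speculative draft models through on-policy distillation. | distillation, large language model
51 | On Asymmetric Optimization of Reasoning and Perception in Vision-Language Model Post-Training | Addresses the asymmetry between reasoning and perception optimization in vision-language model post-training, proposing a dynamically reweighted loss and a perception reward mechanism. | reinforcement learning, chain-of-thought
52 | Speculative Decoding Across Languages | For multilingual settings, proposes three strategies that improve speculative decoding efficiency and speed up LLM generation in non-English languages. | distillation, large language model
53 | Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation | Proposes Source-Grounded Semantic RL to address the scarcity of parallel data in low-resource target-language generation. | reinforcement learning
54 | Prompt-Level Reward Specifications for Open-Ended Post-Training | Proposes a prompt-level reward specification framework for open-ended post-training that improves response quality. | reinforcement learning, instruction following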

🔬 Pillar 1: Robot Control (1 paper)

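# | Title | One-sentence summary | Tags | 🔗
55 | Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs | Evaluates the use of mock tool calls to quarantine untrusted prompt inputs and improve the safety of large language models. | manipulation, large language model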

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

# | Title | One-sentence summary | Tags | 🔗
56 | OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources | OmniRetrieval: a unified retrieval framework across heterogeneous knowledge sources. | affordance
