cs.CL (2026-05-11)

📊 42 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (30 🔗4) · Pillar 2: RL Algorithms & Architecture (9 🔗1) · Pillar 6: Video Extraction and Matching (2) · Pillar 1: Robot Control (1)

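🔬 Pillar 9: Embodied Foundation Models (30 papers)

# | Title | One-sentence summary | Tags | 🔗
1 | Training-Free Cultural Alignment of Large Language Models via Persona Disagreement | Proposes DISCA, an inference-time alignment method that achieves cross-cultural value alignment of large language models without fine-tuning | large language model
2 | Can Language Models Analyze Data? Evaluating Large Language Models for Question Answering over Datasets | Evaluates the effectiveness of large language models on question answering over datasets, comparing direct reasoning with SQL generation | large language model
3 | ANCHOR: Abductive Network Construction with Hierarchical Orchestration for Reliable Probability Inference in Large Language Models | Proposes the ANCHOR framework, which builds abductive networks through hierarchical orchestration for reliable probability inference in large language models | large language model
4 | FERA: Uncertainty-Aware Federated Reasoning for Large Language Models | Proposes FERA, a training-free federated reasoning framework for large language models that uses uncertainty awareness to optimize collaborative reasoning | large language model
5 | Merlin: Deterministic Byte-Exact Deduplication for Lossless Context Optimization in Large Language Model Inference | Proposes Merlin, a high-throughput context optimization engine based on deterministic byte-level deduplication, aimed at improving LLM inference efficiency | large language model
6 | To Redact, or not to Redact? A Local LLM Approach to Deliberative Process Privilege Classification | Proposes an automatic classification method for deliberative process privilege based on a locally deployed Qwen3.5 model and chain-of-thought prompting | large language model, chain-of-thought
7 | GLiNER-Relex: A Unified Framework for Joint Named Entity Recognition and Relation Extraction | Proposes GLiNER-Relex, a unified framework for zero-shot joint modeling of named entity recognition and relation extraction | large language model
8 | When Can Digital Personas Reliably Approximate Human Survey Findings? | Quantitatively evaluates the reliability and applicability limits of LLM-based digital personas in social surveys | large language model
9 | Intrinsic Guardrails: How Semantic Geometry of Personality Interacts with Emergent Misalignment in LLMs | Proposes an intrinsic guardrail mechanism based on the semantic geometry of personality that effectively suppresses emergent misalignment during LLM fine-tuning | large language model
10 | Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings | Quantifies embedding sensitivity to style in French literary texts, evaluating how well LLM rewritings preserve authorial style features | large language model
11 | NCO: A Versatile Plug-in for Handling Negative Constraints in Decoding | Proposes the NCO decoding strategy, which efficiently handles multiple negative constraints in large language models through online pattern matching | large language model
12 | WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation | Proposes the WildClawBench benchmark to address the difficulty of evaluating long-horizon agents in real-world runtime environments | multimodal
13 | DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization | Proposes the directional groupwise preference optimization (DGPO) framework, which uses multi-candidate comparison to improve the consistency and diversity of LLM reasoning | large language model
14 | RUBEN: Rule-Based Explanations for Retrieval-Augmented LLM Systems | Proposes RUBEN, an interactive tool that uses rule mining for explainability and safety evaluation of retrieval-augmented generation (RAG) systems | large language model
15 | Learning More from Less: Exploiting Counterfactuals for Data-Efficient Chart Understanding | Proposes the ChartCF training framework, which improves the data efficiency of chart understanding through counterfactual learning and multimodal preference optimization | multimodal
16 | Aligning LLM Uncertainty with Human Disagreement in Subjectivity Analysis | Proposes the DPUA framework, which uses uncertainty alignment to address the neglect of human disagreement in subjectivity analysis | large language model
17 | Not All Proofs Are Equal: Evaluating LLM Proof Quality Beyond Correctness | Proposes the ProofRank benchmark to quantitatively evaluate the quality of LLM mathematical proofs beyond mere correctness | large language model
18 | Toward Multi-Database Query Reasoning for Text2Cypher | Proposes a multi-database query reasoning framework that addresses the limitations of Text2Cypher in cross-source graph data scenarios | large language model
19 | An Annotation Scheme and Classifier for Personal Facts in Dialogue | Proposes an extended annotation scheme for personal facts and a multi-head classifier, significantly improving fact extraction and structuring in dialogue systems | large language model
20 | Extending Confidence-Based Text2Cypher with Grammar and Schema Aware Filtering | Proposes a grammar- and schema-aware filtering framework that improves the reliability and execution quality of Text2Cypher generation | large language model
21 | The Impact of Editorial Intervention on Detecting Native Language Traces | Quantifies the impact of editorial intervention on native language identification, revealing the robustness of deep linguistic features in non-native text | large language model
22 | NyayaAI: An AI-Powered Legal Assistant Using Multi-Agent Architecture and Retrieval-Augmented Generation | Proposes NyayaAI, a multi-agent legal assistant that uses a RAG architecture to improve retrieval and analysis efficiency for Indian legal documents | large language model
23 | Synthetic Pre-Pre-Training Improves Language Model Robustness to Noisy Pre-Training Data | Proposes synthetic pre-pre-training (PPT), which significantly improves the robustness of large language models to noisy pre-training data | large language model
24 | SkillRAE: Agent Skill-Based Context Compilation for Retrieval-Augmented Execution | Proposes the SkillRAE framework, which optimizes retrieval-augmented execution (RAE) through skill-based context compilation | large language model
25 | Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework | Proposes the C-BPO framework, which personalizes large language model alignment through preference-calibrated binary feedback | large language model
26 | Speech-based Psychological Crisis Assessment using LLMs | Proposes an LLM-based framework for speech-based psychological crisis assessment that improves classification performance through paralinguistic injection and reasoning enhancement | large language model
27 | Annotations Mitigate Post-Training Mode Collapse | Proposes Annotation-Anchored Training to mitigate semantic mode collapse in post-training | instruction following
28 | FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning | Proposes FocuSFT, which uses bilevel optimization to address attention dilution in long-context fine-tuning | large language model
29 | PruneTIR: Inference-Time Tool Call Pruning for Effective yet Efficient Tool-Integrated Reasoning | Proposes PruneTIR, an inference-time tool-call pruning framework that improves the accuracy and efficiency of tool-integrated reasoning in large language models | large language model
30 | Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions | Proposes the VALDI evaluation framework and the VIVALDI auditing mechanism, revealing and mitigating "pseudo-deliberation" in large language models | large language model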

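🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

# | Title | One-sentence summary | Tags | 🔗
31 | TRACER: Verifiable Generative Provenance for Multimodal Tool-Using Agents | Proposes the TRACER framework, which uses a generative provenance mechanism to address missing evidence in multimodal tool-using agents | reinforcement learning, large language model, multimodal
32 | Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents | Proposes a visual-native agent framework and an on-policy data evolution (ODE) method, significantly improving multimodal deep search capability | reinforcement learning, multimodal
33 | Phoenix-VL 1.5 Medium Technical Report | Presents Phoenix-VL 1.5 Medium, a regionalized multimodal large model built through deep domain adaptation and online DPO | direct preference optimization, foundation model, multimodal
34 | DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning | Proposes the DeepRefine framework, which uses reinforcement learning to refine agent-compiled knowledge bases and improve downstream task performance | reinforcement learning, large language model
35 | Infinite Mask Diffusion for Few-Step Distillation | Proposes the Infinite Mask Diffusion Model (IMDM), which introduces stochastic infinite-state masks to overcome the sampling-step limit of masked diffusion models | distillation, MDM
36 | Route Before Retrieve: Activating Latent Routing Abilities of LLMs for RAG vs. Long-Context Selection | Proposes the Pre-Route framework, which uses the latent routing abilities of LLMs to optimize the choice between RAG and long-context processing | distillation, large language model
37 | Relative Score Policy Optimization for Diffusion Language Models | Proposes Relative Score Policy Optimization (RSPO) to address instability in reinforcement learning training of diffusion language models | reinforcement learning, large language model
38 | PHAGE: Patent Heterogeneous Attention-Guided Graph Encoder for Representation Learning | Proposes the PHAGE model, which uses a heterogeneous attention-guided graph encoder to capture the hierarchical dependency structure among patent claims | representation learning
39 | ELF: Embedded Language Flows | Proposes the Embedded Language Flows (ELF) model, which uses continuous-time flow matching for efficient discrete text generation | flow matching, classifier-free guidance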

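🔬 Pillar 6: Video Extraction and Matching (2 papers)

# | Title | One-sentence summary | Tags | 🔗
40 | VISTA: A Generative Egocentric Video Framework for Daily Assistance | VISTA: a generative egocentric video framework for daily assistance tasks | egocentric
41 | Grounded Satirical Generation with RAG | Proposes a retrieval-augmented generation (RAG) pipeline for satire that produces news-grounded satirical definitions in Finnish | HuMoR, large language model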

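🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-sentence summary | Tags | 🔗
42 | Conformity Generates Collective Misalignment in AI Agents Societies | Reveals a conformity effect in societies of AI agents: individual alignment does not guarantee collective safety | manipulation, large language model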
