cs.CL (2026-04-17)

📊 32 papers in total | 🔗 4 with code

🎯 Interest area navigation

Pillar 9: Embodied Foundation Models (24 🔗3) · Pillar 2: RL & Architecture (6 🔗1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

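# | Title | One-sentence summary | Tags | 🔗
1 | Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures | Survey of design principles and architectures for intrinsic interpretability of large language models. | large language model
2 | Learning Uncertainty from Sequential Internal Dispersion in Large Language Models | Proposes the SIVR framework, which learns uncertainty from LLM internal variance to improve the generalization of hallucination detection. | large language model
3 | How Hypocritical Is Your LLM judge? Listener-Speaker Asymmetries in the Pragmatic Competence of Large Language Models | Reveals hypocrisy in LLM judging: large language models show a listener-speaker asymmetry in pragmatic competence. | large language model
4 | A Systematic Study of Training-Free Methods for Trustworthy Large Language Models | Systematically evaluates the effectiveness and trade-offs of training-free methods for improving LLM trustworthiness. | large language model
5 | Optimizing Korean-Centric LLMs via Token Pruning | Optimizes Korean-centric LLMs via token pruning, improving generation stability and translation performance. | large language model, instruction following
6 | RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration | Proposes RAGognizer, which performs hallucination-aware fine-tuning by integrating a detection head to improve RAG generation quality. | large language model
7 | DiZiNER: Disagreement-guided Instruction Refinement via Pilot Annotation Simulation for Zero-shot Named Entity Recognition | DiZiNER: simulates the pilot annotation process with heterogeneous LLMs to address instruction refinement for zero-shot NER. | large language model
8 | From Benchmarking to Reasoning: A Dual-Aspect, Large-Scale Evaluation of LLMs on Vietnamese Legal Text | Proposes a dual-aspect evaluation framework for large-scale assessment of LLM reasoning on Vietnamese legal text. | large language model
9 | BAGEL: Benchmarking Animal Knowledge Expertise in Language Models | Proposes the BAGEL benchmark to evaluate the animal knowledge of language models. | large language model
10 | Can LLMs Understand the Impact of Trauma? Costs and Benefits of LLMs Coding the Interviews of Firearm Violence Survivors | Evaluates the use of LLMs for coding interviews with firearm violence survivors: a cost-benefit analysis. | large language model
11 | GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows | GTA-2: builds a hierarchical benchmark for general tool agents, evaluating performance from atomic tool use to open-ended workflows. | multimodal
12 | Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies | Studies the impact of human and AI attributes in human-AI interaction and reveals differences between simulated and real user studies. | chain-of-thought
13 | No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus | Uses the PLUM corpus to reveal how politeness affects LLMs: a cross-linguistic, multi-model analysis. | large language model
14 | SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation | Proposes an LLM-based framework for word sense disambiguation in narrative text. | large language model
15 | Stochasticity in Tokenisation Improves Robustness | Introduces stochastic tokenisation to improve LLM robustness to adversarial attacks. | large language model
16 | Disentangling Mathematical Reasoning in LLMs: A Methodological Investigation of Internal Mechanisms | Studies the mathematical reasoning ability of LLMs by disentangling their internal mechanisms. | large language model
17 | Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language | Proposes the Mouse benchmark to explore the capability boundaries of LLMs in understanding Chinese Chouxiang language. | large language model
18 | Qwen3.5-Omni Technical Report | Qwen3.5-Omni: achieves outstanding multimodal understanding and generation based on a mixture-of-experts attention mechanism. | visual grounding
19 | CHOP: Chunkwise Context-Preserving Framework for RAG on Multi Documents | Proposes the CHOP framework, which improves retrieval accuracy in multi-document RAG systems through chunkwise context preservation. | large language model
20 | MemEvoBench: Benchmarking Memory MisEvolution in LLM Agents | MemEvoBench: a benchmark for evaluating memory misevolution in LLM agents. | large language model
21 | Skill-RAG: Failure-State-Aware Retrieval Augmentation via Hidden-State Probing and Skill Routing | Skill-RAG: failure-aware retrieval-augmented generation via hidden-state probing and skill routing. | large language model
22 | Preference Estimation via Opponent Modeling in Multi-Agent Negotiation | Proposes a preference estimation method based on opponent modeling, improving agreement rates and preference estimation accuracy in multi-party negotiation. | large language model
23 | C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment | C-Mining: unsupervised discovery of seeds for cultural data synthesis via geometric misalignment. | large language model
24 | LLMs Corrupt Your Documents When You Delegate | Shows that LLMs readily introduce document errors in delegated tasks and proposes the DELEGATE-52 benchmark for evaluation. | large language model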

🔬 Pillar 2: RL & Architecture (6 papers)

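# | Title | One-sentence summary | Tags | 🔗
25 | Improving Reasoning Capabilities in Small Models through Mixture-of-Layers Distillation with Stepwise Attention on Key Information | Proposes a method for improving small-model reasoning based on mixture-of-layers distillation with stepwise attention. | distillation, large language model, chain-of-thought
26 | LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning | LLMSniffer: detects LLM-generated code using GraphCodeBERT and supervised contrastive learning. | contrastive learning, large language model
27 | CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization | Proposes CiPO, which achieves precise unlearning of knowledge in large reasoning models through iterative preference optimization. | preference learning, large language model, chain-of-thought
28 | GroupDPO: Memory efficient Group-wise Direct Preference Optimization | Proposes GroupDPO, which improves LLM alignment through memory-efficient group-wise direct preference optimization. | direct preference optimization, large language model
29 | Where does output diversity collapse in post-training? | Reveals that output diversity collapse in post-trained language models is rooted in data composition rather than reasoning mode. | DPO, distillation, chain-of-thought
30 | SCHK-HTC: Sibling Contrastive Learning with Hierarchical Knowledge-Aware Prompt Tuning for Hierarchical Text Classification | Proposes SCHK-HTC, which uses hierarchical knowledge-aware prompt tuning and sibling contrastive learning to distinguish semantically similar classes in few-shot hierarchical text classification. | contrastive learning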

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-sentence summary | Tags | 🔗
31 | AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency | AtManRL: faithful reasoning in language models via differentiable attention saliency. | manipulation, reinforcement learning, large language model

🔬 Pillar 3: Perception & Semantics (1 paper)

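# | Title | One-sentence summary | Tags | 🔗
32 | DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation | DALM: a domain-algebraic language model via three-phase structured generation, addressing domain knowledge interference. | open-vocabulary, open vocabulary, large language model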
