cs.CL (2026-04-17)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (24 🔗3) | Pillar 2: RL & Architecture (6 🔗1) | Pillar 1: Robot Control (1) | Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

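#	Title	One-line summary	Tags	🔗
1	Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures	Survey: design principles and architectures for intrinsic interpretability of large language models.	large language model	✅
2	Learning Uncertainty from Sequential Internal Dispersion in Large Language Models	Proposes the SIVR framework, which learns uncertainty from the internal variance of LLMs and improves the generalization of hallucination detection.	large language model	✅
3	How Hypocritical Is Your LLM judge? Listener-Speaker Asymmetries in the Pragmatic Competence of Large Language Models	Reveals hypocrisy in LLM judges: large language models show a listener-speaker asymmetry in pragmatic competence.	large language model
4	A Systematic Study of Training-Free Methods for Trustworthy Large Language Models	Systematically evaluates the effectiveness and trade-offs of training-free methods for improving the trustworthiness of large language models.	large language model
5	Optimizing Korean-Centric LLMs via Token Pruning	Optimizes Korean-centric large language models via token pruning, improving generation stability and translation performance.	large language model, instruction following
6	RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration	Proposes RAGognizer, which performs hallucination-aware fine-tuning by integrating a detection head, improving RAG generation quality.	large language model
7	DiZiNER: Disagreement-guided Instruction Refinement via Pilot Annotation Simulation for Zero-shot Named Entity Recognition	DiZiNER: simulates the pilot annotation process with heterogeneous LLMs to refine instructions for zero-shot NER.	large language model
8	From Benchmarking to Reasoning: A Dual-Aspect, Large-Scale Evaluation of LLMs on Vietnamese Legal Text	Proposes a dual-aspect evaluation framework for large-scale assessment of LLM reasoning on Vietnamese legal text.	large language model
9	BAGEL: Benchmarking Animal Knowledge Expertise in Language Models	Proposes the BAGEL benchmark to evaluate the animal knowledge of language models.	large language model
10	Can LLMs Understand the Impact of Trauma? Costs and Benefits of LLMs Coding the Interviews of Firearm Violence Survivors	Evaluates the use of LLMs for coding interviews with firearm violence survivors: a cost-benefit analysis.	large language model
11	GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows	GTA-2: builds a hierarchical benchmark for general tool agents, evaluating performance from atomic tool use to open-ended workflows.	multimodal	✅
12	Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies	Studies the impact of human and AI attributes in human-AI interaction, revealing differences between simulated and real user studies.	chain-of-thought
13	No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus	The PLUM corpus reveals how politeness affects LLMs: a cross-linguistic, multi-model analysis.	large language model
14	SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation	Proposes an LLM-based framework for word sense disambiguation in narrative text.	large language model
15	Stochasticity in Tokenisation Improves Robustness	Introduces stochastic tokenisation to improve the robustness of large language models against adversarial attacks.	large language model
16	Disentangling Mathematical Reasoning in LLMs: A Methodological Investigation of Internal Mechanisms	Studies the mathematical reasoning ability of LLMs by disentangling their internal mechanisms.	large language model
17	Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language	Proposes the Mouse benchmark to explore the capability boundaries of LLMs in understanding Chinese Chouxiang language.	large language model
18	Qwen3.5-Omni Technical Report	Qwen3.5-Omni: achieves outstanding multimodal understanding and generation based on a mixture-of-experts attention mechanism.	visual grounding
19	CHOP: Chunkwise Context-Preserving Framework for RAG on Multi Documents	Proposes the CHOP framework, which improves retrieval accuracy in multi-document RAG systems through chunkwise context preservation.	large language model
20	MemEvoBench: Benchmarking Memory MisEvolution in LLM Agents	MemEvoBench: a benchmark for evaluating memory misevolution in LLM agents.	large language model
21	Skill-RAG: Failure-State-Aware Retrieval Augmentation via Hidden-State Probing and Skill Routing	Skill-RAG: failure-aware retrieval-augmented generation via hidden-state probing and skill routing.	large language model
22	Preference Estimation via Opponent Modeling in Multi-Agent Negotiation	Proposes a preference estimation method based on opponent modeling that improves agreement rates and preference estimation accuracy in multi-party negotiation.	large language model
23	C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment	C-Mining: unsupervised discovery of seeds for cultural data synthesis via geometric misalignment.	large language model
24	LLMs Corrupt Your Documents When You Delegate	Reveals that LLMs tend to introduce document errors in delegated tasks; proposes the DELEGATE-52 benchmark for evaluation.	large language model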

🔬 Pillar 2: RL & Architecture (6 papers)

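#	Title	One-line summary	Tags	🔗
25	Improving Reasoning Capabilities in Small Models through Mixture-of-Layers Distillation with Stepwise Attention on Key Information	Proposes a method for improving small-model reasoning based on mixture-of-layers distillation with stepwise attention.	distillation, large language model, chain-of-thought
26	LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning	LLMSniffer: detects LLM-generated code using GraphCodeBERT and supervised contrastive learning.	contrastive learning, large language model
27	CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization	Proposes CiPO, which achieves precise unlearning of knowledge in large reasoning models through iterative preference optimization.	preference learning, large language model, chain-of-thought
28	GroupDPO: Memory efficient Group-wise Direct Preference Optimization	Proposes GroupDPO, which improves LLM alignment through memory-efficient group-wise direct preference optimization.	direct preference optimization, large language model
29	Where does output diversity collapse in post-training?	Reveals that the collapse of output diversity in post-trained language models stems from data composition rather than the reasoning mode.	DPO, distillation, chain-of-thought
30	SCHK-HTC: Sibling Contrastive Learning with Hierarchical Knowledge-Aware Prompt Tuning for Hierarchical Text Classification	Proposes SCHK-HTC, which uses hierarchical knowledge-aware prompt tuning and sibling contrastive learning to distinguish semantically similar classes in few-shot hierarchical text classification.	contrastive learning	✅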

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗
31	AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency	AtManRL: faithful reasoning in language models via differentiable attention saliency.	manipulation, reinforcement learning, large language model

🔬 Pillar 3: Perception & Semantics (1 paper)

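#	Title	One-line summary	Tags	🔗
32	DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation	DALM: a domain-algebraic language model built through three-phase structured generation, addressing the problem of domain knowledge interference.	open-vocabulary, open vocabulary, large language model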
