cs.CL (2026-06-01)

📊 36 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (32 🔗2) | Pillar 2: RL Algorithms & Architecture (3 🔗2) | Pillar 1: Robot Control (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (32 papers)

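#	Title	One-line summary	Tags	🔗
1	CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning	CRAM: centroid routing and adaptive MoE for multimodal continual instruction tuning	large language model, multimodal
2	Unveiling the Entropy Dynamics of Chain-of-Thought Reasoning	Reveals the entropy dynamics of CoT reasoning and proposes a training-free, CUSUM-based framework for real-time reasoning control	chain-of-thought
3	Multilinguality of Large Language Models From a Structural Perspective	Reveals the multilingual capabilities of large language models through structural analysis	large language model
4	Unveiling the Limits of Large Language Models in Inferring Pragmatic Meaning from Non-Verbal Responses	Evaluates the limits of large language models in inferring pragmatic meaning from non-verbal responses alone	large language model
5	THRD: A Training-Free Multi-Turn Defense Framework for Jailbreak Attacks on Large Language Models	Proposes THRD, a training-free multi-turn dialogue defense framework against jailbreak attacks on large language models	large language model
6	SentGuard: Sentence-Level Streaming Guardrails for Large Language Models	Proposes SentGuard, a sentence-level streaming guardrail for safe real-time output from large language models	large language model
7	PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning	PaSBench-Video: a streaming video benchmark for proactive safety warning	large language model, multimodal
8	Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity	Shows that LLMs in group decision-making are more easily misled than corrected, so group answers should be treated with caution	large language model, chain-of-thought
9	K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts	Proposes K-BrowseComp, a web browsing agent benchmark grounded in Korean contexts, for evaluating and diagnosing the agent capabilities of LLMs	foundation model, instruction following
10	Geometric Latent Reasoning Induces Shorter Generations in LLMs	Proposes Geometric Latent Reasoning (GLR), which shortens LLM generation length through latent-space path approximation	large language model, chain-of-thought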
11	Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes	EvoNote: an LLM agent that self-evolves from experience to generate evidence-grounded health community notes	large language model, multimodal
12	What to Format and How: A Benchmark and Workflow Approach for Document Formatting	Proposes DocFormBench and DocFormFlow to address content-aware document formatting	large language model, multimodal
13	FigSIM: A Dataset for Fine-grained Suicide Severity and Figurative Language in Suicide Memes	FigSIM: a dataset of suicide memes for fine-grained suicide severity and figurative language	multimodal
14	Investigating and Alleviating Harm Amplification in LLM Interactions	Proposes the HarmAmp benchmark and the TrajSafe proactive defense framework to mitigate harm amplification in LLM interactions	large language model
15	MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?	MMG2Skill: distilling web guides into self-evolving agent skills	multimodal
16	Resonant Context Anchoring: Decoupling Attention Routing and Signal Gain at Inference Time	Proposes Resonant Context Anchoring (RCA), which decouples attention routing and signal gain at inference time to improve the factual consistency of LLMs	large language model
17	Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning	Proposes Chunk-Level Guided Generation, which uses off-the-shelf LLMs as process scorers to improve mathematical reasoning without training	large language model
18	From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression	SubFit: a submodule-granularity LLM compression method that improves compression efficiency and accuracy	large language model	✅
19	SimSD: Simple Speculative Decoding in Diffusion Language Models	Proposes SimSD, an efficient inference decoding algorithm for diffusion language models that significantly speeds up generation	large language model
20	Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents	Evaluates high-confidence social biases of LLMs in conversational tutoring to improve trustworthiness in educational settings	large language model
21	Not What, But How: A Communicative Audit of LLM Response Framing	Proposes the FRANZ framework for evaluating how LLMs communicate when answering subjective questions	large language model
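22	TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation	Proposes TVIR: building deep research agents that generate text-image interleaved reports	multimodal
23	Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization	Proposes the PHF framework to address LLM personalization	large language model
24	Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark	Builds RVDB, a controlled decision benchmark, and reveals the systematic influence of gender cues on LLM value trade-offs	large language model
25	Cross-Environment Neural Reranking for Sample-Efficient Action Selection in Text-Based Agents	Proposes a cross-environment neural reranking method that improves the sample efficiency of text-based agents in multi-task settings	large language model
26	CARTE: A Benchmark for Mapping Language Model Knowledge Across France	CARTE: a benchmark evaluating LLM reasoning over regional knowledge of France	large language model
27	Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning	Proposes State-Adaptive Prompt Optimization (SAPO) to improve the generalization and robustness of fine-tuned LLMs	large language model	✅
28	Mitigating Bias in Locally Constrained Decoding via Tractable Proposals	Proposes globally constrained decoding based on tractable proposals to mitigate bias	large language model
29	Cost-Aware Diffusion Draft Trees for Speculative Decoding	Proposes CaDDTree, which achieves more efficient speculative decoding by optimizing token throughput	instruction following
30	Encoded but Not Routed: Explaining the Table-Chart Gap in Scientific Claim Verification	Reveals the table-chart gap in scientific claim verification: information is encoded but not effectively routed	multimodal
31	When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models	Proposes Varnika, a Hybrid-MoE framework that improves language model performance on multilingual idiom understanding	multimodal
32	Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation	Proposes LongJudgeBench to address the reliability of long-form output evaluation	large language model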

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

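#	Title	One-line summary	Tags	🔗
33	Mechanistic Diagnostics of Spatial Lexical Bias in Multimodal Large Language Model Spatial Reasoning	Reveals lexical bias in the spatial reasoning of multimodal large language models and proposes a lightweight correction	DPO, large language model, multimodal
34	ResMerge: Residual-based Spectral Merging of Large Language Models	ResMerge: a residual-based spectral merging method for large language models that improves the merging of reinforcement learning expert models	reinforcement learning, large language model	✅
35	Scaling Agentic Capabilities via Grounded Interaction Synthesis	Proposes GAIS, which scales agent capabilities through grounded interaction synthesis, improving data efficiency and model performance	world model, world models, large language model	✅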

🔬 Pillar 1: Robot Control (1 paper)

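#	Title	One-line summary	Tags	🔗
36	On the Salience of Low-Probability Tokens for AI-Generated Text Detection: A Multiscale Uncertainty Perspective	Proposes a multiscale-uncertainty method for AI-generated text detection that focuses on the salience of low-probability tokens	manipulation	✅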
