cs.CL (2026-06-01)

📊 36 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (32 🔗2) · Pillar 2: RL & Architecture (3 🔗2) · Pillar 1: Robot Control (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (32 papers)

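| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning | CRAM: centroid routing and adaptive MoE for multimodal continual instruction tuning. | large language model, multimodal | |
| 2 | Unveiling the Entropy Dynamics of Chain-of-Thought Reasoning | Reveals the entropy dynamics of CoT reasoning and proposes a CUSUM-based, training-free framework for real-time reasoning control. | chain-of-thought | |
| 3 | Multilinguality of Large Language Models From a Structural Perspective | Reveals the multilingual capabilities of large language models through structural analysis. | large language model | |
| 4 | Unveiling the Limits of Large Language Models in Inferring Pragmatic Meaning from Non-Verbal Responses | Evaluates the limits of large language models in inferring pragmatic meaning from non-verbal responses alone. | large language model | |
| 5 | THRD: A Training-Free Multi-Turn Defense Framework for Jailbreak Attacks on Large Language Models | Proposes THRD, a training-free multi-turn dialogue defense framework against jailbreak attacks on large language models. | large language model | |
| 6 | SentGuard: Sentence-Level Streaming Guardrails for Large Language Models | Proposes SentGuard, a sentence-level streaming guardrail for safe real-time output from large language models. | large language model | |
| 7 | PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning | PaSBench-Video: a streaming video benchmark for proactive safety warning. | large language model, multimodal | |
| 8 | Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity | Shows that LLMs in group decision-making are more easily misled than corrected, so group answers should be treated with caution. | large language model, chain-of-thought | |
| 9 | K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts | Proposes K-BrowseComp, a web browsing agent benchmark grounded in Korean contexts, for evaluating and diagnosing the agentic capabilities of LLMs. | foundation model, instruction following | |
| 10 | Geometric Latent Reasoning Induces Shorter Generations in LLMs | Proposes Geometric Latent Reasoning (GLR), which shortens LLM generation length through latent-space path approximation. | large language model, chain-of-thought | |
| 11 | Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes | EvoNote: an experience-driven, self-evolving LLM agent for generating evidence-grounded health community notes. | large language model, multimodal | |
| 12 | What to Format and How: A Benchmark and Workflow Approach for Document Formatting | Proposes DocFormBench and DocFormFlow to tackle content-aware document formatting. | large language model, multimodal | |
| 13 | FigSIM: A Dataset for Fine-grained Suicide Severity and Figurative Language in Suicide Memes | FigSIM: a dataset of suicide memes covering fine-grained suicide severity and figurative language. | multimodal | |
| 14 | Investigating and Alleviating Harm Amplification in LLM Interactions | Proposes the HarmAmp benchmark and the TrajSafe proactive defense framework to alleviate harm amplification in LLM interactions. | large language model | |
| 15 | MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills? | MMG2Skill: distilling web guides into self-evolving agent skills. | multimodal | |
| 16 | Resonant Context Anchoring: Decoupling Attention Routing and Signal Gain at Inference Time | Proposes Resonant Context Anchoring (RCA), which decouples attention routing and signal gain at inference time to improve the factual consistency of LLMs. | large language model | |
| 17 | Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning | Proposes Chunk-Level Guided Generation, which uses off-the-shelf LLMs as process scorers to improve mathematical reasoning without training. | large language model | |
| 18 | From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression | SubFit: a submodule-granularity LLM compression method that improves compression efficiency and accuracy. | large language model | |
| 19 | SimSD: Simple Speculative Decoding in Diffusion Language Models | Proposes SimSD, an efficient decoding algorithm for diffusion language model inference that significantly speeds up generation. | large language model | |
| 20 | Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents | Evaluates high-confidence social biases of LLMs in conversational tutoring to improve trustworthiness in educational settings. | large language model | |
| 21 | Not What, But How: A Communicative Audit of LLM Response Framing | Proposes the FRANZ framework for evaluating how LLMs communicate when answering subjective questions. | large language model | |
| 22 | TVIR: Building Deep Research Agents Towards Text-Visual Interleaved Report Generation | Proposes TVIR: building deep research agents that generate reports with interleaved text and images. | multimodal | |
| 23 | Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization | Proposes the PHF framework to address LLM personalization. | large language model | |
| 24 | Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark | Builds RVDB, a controlled decision benchmark, and reveals the systematic effect of gender cues on LLM value trade-offs. | large language model | |
| 25 | Cross-Environment Neural Reranking for Sample-Efficient Action Selection in Text-Based Agents | Proposes a cross-environment neural reranking method that improves the sample efficiency of text-based agents in multi-task settings. | large language model | |
| 26 | CARTE: A Benchmark for Mapping Language Model Knowledge Across France | CARTE: a benchmark for evaluating LLM knowledge and reasoning about the regions of France. | large language model | |
| 27 | Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning | Proposes State-Adaptive Prompt Optimization (SAPO) to improve the generalization and robustness of fine-tuned LLMs. | large language model | |
| 28 | Mitigating Bias in Locally Constrained Decoding via Tractable Proposals | Proposes globally constrained decoding based on tractable proposals to mitigate bias. | large language model | |
| 29 | Cost-Aware Diffusion Draft Trees for Speculative Decoding | Proposes CaDDTree, which achieves more efficient speculative decoding by optimizing token throughput. | instruction following | |
| 30 | Encoded but Not Routed: Explaining the Table-Chart Gap in Scientific Claim Verification | Reveals the table-chart gap in scientific claim verification: information is encoded but not effectively routed. | multimodal | |
| 31 | When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models | Proposes Varnika, a Hybrid-MoE framework that improves language model performance on multilingual idiom understanding. | multimodal | |
| 32 | Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation | Proposes LongJudgeBench to address the reliability of long-form output evaluation. | large language model | |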

🔬 Pillar 2: RL & Architecture (3 papers)

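| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Mechanistic Diagnostics of Spatial Lexical Bias in Multimodal Large Language Model Spatial Reasoning | Reveals lexical bias in the spatial reasoning of multimodal large language models and proposes a lightweight correction. | DPO, large language model, multimodal | |
| 34 | ResMerge: Residual-based Spectral Merging of Large Language Models | ResMerge: a residual-based spectral merging method for large language models that improves the merging of reinforcement learning expert models. | reinforcement learning, large language model | |
| 35 | Scaling Agentic Capabilities via Grounded Interaction Synthesis | Proposes GAIS, which scales agent capabilities through grounded interaction synthesis, improving data efficiency and model performance. | world model, world models, large language model | |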

🔬 Pillar 1: Robot Control (1 paper)

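| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 36 | On the Salience of Low-Probability Tokens for AI-Generated Text Detection: A Multiscale Uncertainty Perspective | Proposes a multiscale-uncertainty-based method for AI-generated text detection that focuses on the salience of low-probability tokens. | manipulation | |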
