cs.CL (2026-01-29)

📊 41 papers total | 🔗 7 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29 🔗6) · Pillar 2: RL & Architecture (9 🔗1) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

# | Title | One-Sentence Summary | Tags | 🔗
1 | Reasoning While Asking: Transforming Reasoning Large Language Models from Passive Solvers to Proactive Inquirers | Proposes proactive interactive reasoning (PIR), transforming LLMs from passive solvers into proactive inquirers to improve reasoning performance. | large language model, chain-of-thought
2 | On the Paradoxical Interference between Instruction-Following and Task Solving | Reveals the paradoxical interference of instruction-following with LLM task-solving ability, and proposes SUSTAINSCORE to quantify it. | large language model, instruction following
3 | Scaling Reasoning Hop Exposes Weaknesses: Demystifying and Improving Hop Generalization in Large Language Models | Proposes a test-time reasoning correction method to improve LLM generalization across reasoning hop counts. | large language model, chain-of-thought
4 | A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine | Proposes the Fed-MedLoRA framework for federated, parameter-efficient training of LLMs in the medical domain. | large language model
5 | $G^2$-Reader: Dual Evolving Graphs for Multimodal Document QA | Proposes G²-Reader, a dual evolving-graph framework that addresses structural fragmentation and retrieval drift in multimodal document QA. | multimodal
6 | Mil-SCORE: Benchmarking Long-Context Geospatial Reasoning and Planning in Large Language Models | Mil-SCORE: a benchmark for long-context geospatial reasoning and planning in military scenarios. | large language model
7 | Temporal Guidance for Large Language Models | Proposes Temporal Guidance (TeGu), improving LLM generation quality while reducing computational overhead. | large language model
8 | SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models | SHARP: social harm analysis via risk profiles to measure inequities in large language models. | large language model
9 | Parametric Knowledge is Not All You Need: Toward Honest Large Language Models via Retrieval of Pretraining Data | Uses retrieval of pretraining data to improve the honesty of LLM question answering. | large language model
10 | When "Better" Prompts Hurt: Evaluation-Driven Iteration for LLM Applications | Proposes an evaluation-driven iteration workflow for LLM applications, addressing the trade-offs of prompt engineering. | large language model, instruction following
11 | CausalEmbed: Auto-Regressive Multi-Vector Generation in Latent Space for Visual Document Embedding | CausalEmbed: auto-regressive multi-vector generation in latent space for visual document embedding. | large language model, multimodal
12 | FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale | FineInstructions: scales synthetic instruction data to pre-training scale to improve LLM performance. | large language model
13 | ECO: Quantized Training without Full-Precision Master Weights | ECO: a quantized training method that requires no full-precision master weights. | large language model
14 | FIT: Defying Catastrophic Forgetting in Continual LLM Unlearning | FIT: a framework that counters catastrophic forgetting in continual LLM unlearning. | large language model
15 | Thinking Out of Order: When Output Order Stops Reflecting Reasoning Order in Diffusion Language Models | Proposes masked diffusion language models to address the reasoning-order limitations of autoregressive models. | chain-of-thought
16 | Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text | Proposes Learn-to-Distance, which adaptively learns text distances to detect LLM-generated content. | large language model
17 | Toward Culturally Aligned LLMs through Ontology-Guided Multi-Agent Reasoning | Proposes OG-MAR, improving the cultural alignment of LLMs via ontology-guided multi-agent reasoning. | large language model
18 | Conversation for Non-verifiable Learning: Self-Evolving LLMs through Meta-Evaluation | Proposes CoNL, self-evolving LLMs through meta-evaluation to tackle training on non-verifiable tasks. | large language model
19 | VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning | Proposes VTC-R1, improving long-context reasoning efficiency via vision-text compression. | large language model
20 | MasalBench: A Benchmark for Contextual and Cross-Cultural Understanding of Persian Proverbs in LLMs | MasalBench: a benchmark of Persian proverbs for evaluating LLMs' contextual and cross-cultural understanding. | large language model
21 | Embodied Task Planning via Graph-Informed Action Generation with Large Lanaguage Model | Proposes GiG, using graph-structured information to improve the long-horizon plan coherence of LLMs in embodied task planning. | large language model
22 | Zonkey: A Hierarchical Diffusion Language Model with Differentiable Tokenization and Probabilistic Attention | Zonkey: a hierarchical diffusion language model with differentiable tokenization and probabilistic attention, enabling end-to-end optimization. | large language model
23 | Why Attention Patterns Exist: A Unifying Temporal Perspective Analysis | Proposes TAPPA, a unifying temporal perspective that explains LLM attention patterns and guides inference acceleration. | large language model
24 | Scale-Dependent Semantic Dynamics Revealed by Allan Deviation | Uses Allan deviation to reveal the scale dependence of semantic dynamics. | large language model
25 | AdaptBPE: From General Purpose to Specialized Tokenizers | AdaptBPE: a post-training tokenizer adaptation method that improves LLM efficiency for specific domains or languages. | large language model
26 | DimStance: Multilingual Datasets for Dimensional Stance Analysis | DimStance: multilingual datasets for dimensional stance analysis, enabling fine-grained emotion-aware stance detection. | large language model
27 | User-Centric Evidence Ranking for Attribution and Fact Verification | Proposes an evidence-ranking task that optimizes users' reading efficiency and accuracy in fact verification. | large language model
28 | MGSM-Pro: A Simple Strategy for Robust Multilingual Mathematical Reasoning Evaluation | MGSM-Pro: a simple strategy for robust multilingual mathematical reasoning evaluation. | large language model
29 | Scaling Embeddings Outperforms Scaling Experts in Language Models | Shows that scaling embedding layers outperforms scaling expert layers in language models, and proposes LongCat-Flash-Lite. | large language model

🔬 Pillar 2: RL & Architecture (9 papers)

# | Title | One-Sentence Summary | Tags | 🔗
30 | SOUP: Token-level Single-sample Mix-policy Reinforcement Learning for Large Language Models | SOUP: token-level, single-sample mix-policy reinforcement learning for LLMs. | reinforcement learning, policy learning, large language model
31 | TACLer: Tailored Curriculum Reinforcement Learning for Efficient Reasoning | Proposes TACLer to improve learning and inference efficiency for long chain-of-thought reasoning. | reinforcement learning, curriculum learning, large language model
32 | Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts | Proposes HALO and HypeNet for low-cost, efficient distillation from Transformers to hybrid models, improving extremely long-context handling. | linear attention, distillation
33 | DynaWeb: Model-Based Reinforcement Learning of Web Agents | DynaWeb: a model-based reinforcement learning framework for training web agents. | reinforcement learning, world model, large language model
34 | Distribution-Aware Reward Estimation for Test-Time Reinforcement Learning | Proposes Distribution-Aware Reward Estimation (DARE) to improve LLM self-improvement in test-time reinforcement learning. | reinforcement learning, large language model
35 | Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding | Token-Guard: token-level hallucination control via self-checking decoding, improving the reliability of LLM generation. | reinforcement learning, RLHF, large language model
36 | OVD: On-policy Verbal Distillation | Proposes On-policy Verbal Distillation to address the memory bottleneck of knowledge distillation in reinforcement learning. | reinforcement learning, distillation
37 | Self-Improving Pretraining: using post-trained models to pretrain better models | Proposes self-improving pretraining, using post-trained models to guide pretraining and improve LLM safety, factuality, and generation quality. | reinforcement learning, large language model
38 | ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas | ASTRA: automated synthesis of agentic trajectories and reinforcement-learning arenas, improving the agent capabilities of tool-augmented language models. | reinforcement learning, large language model

🔬 Pillar 1: Robot Control (2 papers)

# | Title | One-Sentence Summary | Tags | 🔗
39 | The Compliance Paradox: Semantic-Instruction Decoupling in Automated Academic Code Evaluation | Reveals the compliance paradox: semantic-instruction decoupling in automated academic code evaluation. | manipulation, RLHF, large language model
40 | ILRR: Inference-Time Steering Method for Masked Diffusion Language Models | Proposes ILRR, an inference-time steering method for masked diffusion language models. | trajectory optimization

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-Sentence Summary | Tags | 🔗
41 | Do Not Waste Your Rollouts: Recycling Search Experience for Efficient Test-Time Scaling | Proposes RSE, recycling search experience to improve LLM test-time reasoning efficiency. | IMoS, large language model
