cs.CL (2026-01-20)

📊 36 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (28 🔗2) | Pillar 2: RL & Architecture (8)

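🔬 Pillar 9: Embodied Foundation Models (28 papers)

#	Title	One-line summary	Tags	🔗
1	Chain-of-Thought Compression Should Not Be Blind: V-Skip for Efficient Multimodal Reasoning via Dual-Path Anchoring	Proposes V-Skip, which uses dual-path anchoring to address visual amnesia in multimodal CoT reasoning and enable efficient compression.	large language model multimodal chain-of-thought
2	FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs	FutureOmni: the first benchmark for evaluating how well multimodal LLMs forecast the future from omni-modal context.	large language model multimodal	✅
3	Pro-AI Bias in Large Language Models	Reveals a pro-AI bias in large language models that may affect decision-making.	large language model
4	RECAP: A Resource-Efficient Method for Adversarial Prompting in Large Language Models	RECAP: a resource-efficient adversarial prompting method for LLMs that lowers compute cost through retrieval and reuse.	large language model
5	Domain-Adaptation through Synthetic Data: Fine-Tuning Large Language Models for German Law	Fine-tunes large language models on synthetic data to improve question answering in the German legal domain.	large language model
6	Towards robust long-context understanding of large language model via active recap learning	Proposes the active recap learning (ARL) framework to strengthen LLM understanding of long texts.	large language model
7	No Reliable Evidence of Self-Reported Sentience in Small Large Language Models	Small LLMs report that they are not sentient, a finding checked with classifiers on internal activations.	large language model
8	Large Language Models for Large-Scale, Rigorous Qualitative Analysis in Applied Health Services Research	Proposes a human-LLM collaborative framework that uses large language models for efficient, rigorous large-scale qualitative health services research.	large language model
9	BACH-V: Bridging Abstract and Concrete Human-Values in Large Language Models	BACH-V: bridging abstract and concrete human values in large language models.	large language model
10	Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models	Proposes a "Locate, Steer, and Improve" framework for actionable mechanistic interpretability in large language models.	large language model	✅
11	OpenLearnLM Benchmark: A Unified Framework for Evaluating Knowledge, Skill, and Attitude in Educational Large Language Models	OpenLearnLM: a unified benchmark for evaluating the knowledge, skills, and attitudes of educational large language models.	large language model
12	Activation-Space Anchored Access Control for Multi-Class Permission Reasoning in Large Language Models	Proposes the AAAC framework, which uses activation-space anchoring for multi-class permission control in large language models.	large language model
13	Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants	Analyzes ten major challenges for the future of diffusion language models and explores new directions for AI beyond the autoregressive paradigm.	large language model multimodal
14	NewsRECON: News article REtrieval for image CONtextualization	NewsRECON: proposes a news article retrieval method for image contextualization that addresses cases where reverse image search fails.	large language model multimodal
15	Dimension-First Evaluation of Speech-to-Speech Models with Structured Acoustic Cues	Proposes the TRACE framework for efficient, human-aligned speech evaluation.	large language model chain-of-thought
16	CommunityBench: Benchmarking Community-Level Alignment across Diverse Groups and Tasks	Proposes CommunityBench for evaluating community-level value alignment of LLMs across diverse groups and tasks.	large language model foundation model
17	Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning	Proposes GRADFILTERING, which uses uncertainty to guide data selection for instruction tuning and improve LLM efficiency.	large language model
18	XCR-Bench: A Multi-Task Benchmark for Evaluating Cultural Reasoning in LLMs	Proposes the XCR-Bench benchmark for evaluating cultural reasoning in large language models.	large language model
19	OP-Bench: Benchmarking Over-Personalization for Memory-Augmented Personalized Conversational Agents	Proposes the OP-Bench benchmark for evaluating over-personalization in memory-augmented conversational agents.	large language model
20	Simulated Ignorance Fails: A Systematic Study of LLM Behaviors on Forecasting Problems Before Model Knowledge Cutoff	Reveals the limits of "simulated ignorance" in LLM forecasting and advises against using it for retrospective benchmarking.	chain-of-thought
21	Fairness or Fluency? An Investigation into Language Bias of Pairwise LLM-as-a-Judge	Finds significant language bias in pairwise LLM judges and analyzes its relationship to perplexity.	large language model
22	TREX: Tokenizer Regression for Optimal Data Mixture	TREX: optimizes data mixture ratios through tokenizer regression to improve multilingual LLM tokenizer efficiency.	large language model
23	When Wording Steers the Evaluation: Framing Bias in LLM judges	Reveals framing bias in LLM judging: how a prompt is framed affects the LLM's verdict.	large language model
24	Can LLM Reasoning Be Trusted? A Comparative Study: Using Human Benchmarking on Statistical Tasks	Fine-tuning LLMs improves statistical reasoning, with applications in education and automated assessment.	large language model
25	HALT: Hallucination Assessment via Latent Testing	HALT: assesses hallucination in large language models through latent-space testing.	large language model
26	From Quotes to Concepts: Axial Coding of Political Debates with Ensemble LMs	Uses ensemble language models for axial coding of political debates, moving from quotes to concepts.	large language model
27	GerAV: Towards New Heights in German Authorship Verification using Fine-Tuned LLMs on a New Benchmark	Proposes GerAV, a new benchmark for German authorship verification, and reaches new heights with fine-tuned LLMs.	large language model
28	Beyond Known Facts: Generating Unseen Temporal Knowledge to Address Data Contamination in LLM Evaluation	Proposes an evaluation method based on generating future knowledge to address data contamination when evaluating LLMs on temporal knowledge graph extraction.	large language model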

🔬 Pillar 2: RL & Architecture (8 papers)

#	Title	One-line summary	Tags
29	Pedagogical Alignment for Vision-Language-Action Models: A Comprehensive Framework for Data, Architecture, and Evaluation in Education	Proposes the Pedagogical VLA Framework for interpretable VLA models in resource-constrained educational settings.	distillation vision-language-action VLA
30	Temporal-Spatial Decouple before Act: Disentangled Representation Learning for Multimodal Sentiment Analysis	Proposes the TSDA model, which improves multimodal sentiment analysis through temporal-spatial decoupled representation learning.	representation learning spatiotemporal multimodal
31	"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework	Proposes the COMPACT framework, which improves small-model reasoning through compatibility-aware multi-teacher CoT distillation.	teacher-student distillation large language model
32	RM-Distiller: Exploiting Generative LLM for Reward Model Distillation	Proposes RM-Distiller, which uses generative LLMs for reward model distillation to improve alignment.	reinforcement learning distillation large language model
33	Dr. Assistant: Enhancing Clinical Diagnostic Inquiry via Structured Diagnostic Reasoning Data and Reinforcement Learning	Proposes Dr. Assistant, which enhances clinical diagnostic inquiry through structured reasoning data and reinforcement learning.	reinforcement learning large language model
34	ICPO: Illocution-Calibrated Policy Optimization for Multi-Turn Conversation	ICPO: proposes illocution-calibrated policy optimization to handle instruction ambiguity in multi-turn conversation.	reinforcement learning large language model
35	Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment	Proposes the Rank-Surprisal Ratio (RSR) metric for assessing how effectively a reasoning trajectory teaches a student model.	distillation chain-of-thought
36	Knowledge Graph-Assisted LLM Post-Training for Enhanced Legal Reasoning	Proposes a knowledge graph-assisted LLM post-training method that improves reasoning in the legal domain.	DPO direct preference optimization

