cs.CL (2026-03-20)

📊 18 papers in total

🎯 Interest area navigation

Pillar 9: Embodied Foundation Models (14) · Pillar 2: RL Algorithms & Architecture (4)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

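#	Title	One-sentence summary	Tags
1	DataProphet: Demystifying Supervision Data Generalization in Multimodal LLMs	DataProphet: reveals how supervision data generalizes in multimodal LLMs, enabling training-free dataset selection.	large language model, multimodal
2	Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation	CoT faithfulness measurements vary significantly with the choice of classifier; a single metric is unreliable.	chain-of-thought
3	Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models	Proposes Semantic Token Clustering (STC) to efficiently quantify uncertainty in large language models.	large language model
4	When Contextual Inference Fails: Cancelability in Interactive Instruction Following	Proposes BWIM, an interactive benchmark that reveals deficiencies in LLMs' clarification behavior when contextual inference fails.	instruction following
5	PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction	Proposes PoC, a performance-oriented context compression method for large language models that uses performance prediction to guarantee a performance floor.	large language model
6	TextReasoningBench: Does Reasoning Really Improve Text Classification in Large Language Models?	Proposes TextReasoningBench to evaluate whether reasoning strategies are effective for text classification.	large language model
7	Borderless Long Speech Synthesis	Proposes the Borderless long speech synthesis framework for agent-driven, boundary-free speech generation.	instruction following, chain-of-thought
8	Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking	Proposes an evaluation method for multimodal LLMs that accounts for human label variation, improving robustness in content moderation scenarios.	large language model, multimodal
9	Reasoning Gets Harder for LLMs Inside A Dialogue	Shows that LLM reasoning degrades in dialogue settings and proposes BOULDER, a dynamic benchmark.	large language model
10	Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax	Evaluates how well large language models understand grammar modules, using ChatGPT's Arabic translation as a case study.	large language model
11	Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues	Predicts states of understanding in explanatory interactions using cognitive load-related linguistic cues.	multimodal
12	An Agentic Approach to Generating XAI-Narratives	Proposes a multi-agent framework for generating XAI narratives, improving the faithfulness and coherence of explanations.	large language model
13	Overreliance on AI in Information-seeking from Video Content	Shows that overreliance on AI in AI-assisted information seeking from video is a risk that lowers accuracy.	large language model
14	Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach	Proposes a structured prompting framework for Arabic essay scoring that improves the accuracy of linguistic trait evaluation.	large language model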

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

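#	Title	One-sentence summary	Tags
15	An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models	Studies the effects of SFT-DPO interaction and parameterization in small language models.	DPO, direct preference optimization
16	ReViSQL: Achieving Human-Level Text-to-SQL	ReViSQL: reaches human-level performance on Text-to-SQL through high-quality data and reinforcement learning.	reinforcement learning, large language model
17	SAGE: Sustainable Agent-Guided Expert-tuning for Culturally Attuned Translation in Low-Resource Southeast Asia	SAGE: sustainable agent-guided expert tuning for culturally attuned translation in low-resource Southeast Asian languages.	reinforcement learning, large language model
18	LoopRPT: Reinforcement Pre-Training for Looped Language Models	Proposes LoopRPT, reinforcement pre-training for looped language models that improves the efficiency of implicit reasoning.	reinforcement learning, chain-of-thought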
