cs.CL (2026-03-20)

📊 18 papers in total

🎯 Interest area navigation

Pillar 9: Embodied Foundation Models (14) · Pillar 2: RL & Architecture (4)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

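| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | DataProphet: Demystifying Supervision Data Generalization in Multimodal LLMs | DataProphet reveals how supervision data generalizes in multimodal LLMs, enabling training-free dataset selection. | large language model, multimodal |
| 2 | Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation | CoT faithfulness estimates depend strongly on the choice of classifier; a single metric is unreliable. | chain-of-thought |
| 3 | Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models | Proposes Semantic Token Clustering (STC) to quantify uncertainty in large language models efficiently. | large language model |
| 4 | When Contextual Inference Fails: Cancelability in Interactive Instruction Following | Proposes the interactive BWIM benchmark, revealing deficiencies in LLMs' clarification behavior when contextual inference fails. | instruction following |
| 5 | PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction | Proposes PoC, a performance-oriented context compression method for LLMs that uses performance prediction to guarantee a performance floor. | large language model |
| 6 | TextReasoningBench: Does Reasoning Really Improve Text Classification in Large Language Models? | Proposes TextReasoningBench to evaluate the effectiveness of reasoning strategies in text classification. | large language model |
| 7 | Borderless Long Speech Synthesis | Proposes the Borderless long speech synthesis framework, enabling agent-driven, borderless speech generation. | instruction following, chain-of-thought |
| 8 | Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking | Proposes a multimodal LLM evaluation approach that accounts for human label variation, improving robustness in content moderation scenarios. | large language model, multimodal |
| 9 | Reasoning Gets Harder for LLMs Inside A Dialogue | Shows that LLM reasoning ability degrades in dialogue settings and proposes the dynamic BOULDER benchmark. | large language model |
| 10 | Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax | Evaluates how well large language models understand grammar modules, using ChatGPT's Arabic translation as a case study. | large language model |
| 11 | Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues | Uses cognitive load-related linguistic cues to predict states of understanding in explanatory interactions. | multimodal |
| 12 | An Agentic Approach to Generating XAI-Narratives | Proposes a multi-agent framework for generating XAI narratives, improving the faithfulness and coherence of explanations. | large language model |
| 13 | Overreliance on AI in Information-seeking from Video Content | Shows the risk of overreliance on AI in AI-assisted information seeking from video, which leads to lower accuracy. | large language model |
| 14 | Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach | Proposes a structured prompting framework for Arabic essay scoring, improving the accuracy of linguistic trait assessment. | large language model |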

🔬 Pillar 2: RL & Architecture (4 papers)

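| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 15 | An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models | Studies the effects of SFT-DPO interaction and parameterization in small language models. | DPO, direct preference optimization |
| 16 | ReViSQL: Achieving Human-Level Text-to-SQL | ReViSQL reaches human-level performance on Text-to-SQL through high-quality data and reinforcement learning. | reinforcement learning, large language model |
| 17 | SAGE: Sustainable Agent-Guided Expert-tuning for Culturally Attuned Translation in Low-Resource Southeast Asia | SAGE: sustainable, agent-guided expert tuning for culturally attuned translation in low-resource Southeast Asian languages. | reinforcement learning, large language model |
| 18 | LoopRPT: Reinforcement Pre-Training for Looped Language Models | Proposes LoopRPT, a reinforcement pre-training method for looped language models that improves the efficiency of implicit reasoning. | reinforcement learning, chain-of-thought |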
