cs.CL (2025-05-29)

📊 29 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗2) · Pillar 2: RL & Architecture (9 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

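| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine | Proposes TCM-Ladder, a large-scale benchmark dataset for evaluating multimodal question answering on Traditional Chinese Medicine. | large language model, multimodal | |
| 2 | A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models | The first comprehensive study of bias and chain-of-thought faithfulness in large vision-language models. | large language model, chain-of-thought | |
| 3 | SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models | SocialMaze: a new benchmark for evaluating the social reasoning ability of large language models. | large language model, chain-of-thought | |
| 4 | ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs | Proposes the ARC framework, which analyzes how well instruction-tuned LLMs cover argument information in zero-shot long-document summarization. | large language model, instruction following | |
| 5 | Large Language Model Meets Constraint Propagation | GenCP incorporates masked language models for more reliable constraint-aware text generation. | large language model | |
| 6 | FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression | FLAT-LLM: a large language model compression method based on fine-grained low-rank activation space transformation. | large language model | |
| 7 | Retrieval Augmented Generation based Large Language Models for Causality Mining | Proposes a retrieval-augmented-generation-based LLM approach to causality mining that improves causality detection and extraction performance. | large language model | |
| 8 | Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models | Proposes the PCBench benchmark for evaluating how well large language models critique erroneous premises. | large language model | |
| 9 | Gaussian mixture models as a proxy for interacting language models | Proposes interacting Gaussian mixture models as a proxy for interacting language models, for use in social science research. | large language model | |
| 10 | Diversity of Transformer Layers: One Aspect of Parameter Scaling Laws | Transformer layer diversity: one important aspect of parameter scaling laws. | large language model | |
| 11 | Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs | Proposes UCerF, an uncertainty-aware fairness metric for evaluating implicit bias in large language models. | large language model | |
| 12 | SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving | SwingArena: a competitive programming arena for long-context GitHub issue solving. | large language model | |
| 13 | Probing Association Biases in LLM Moderation Over-Sensitivity | Reveals topic-association biases behind over-sensitivity in LLM content moderation and proposes a topic association analysis method. | large language model | |
| 14 | One Task Vector is not Enough: A Large-Scale Study for In-Context Learning | Shows that a single task vector is insufficient to support in-context learning in LLMs; complex tasks require multi-vector representations. | large language model | |
| 15 | Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time | Proposes SITAlign, which aligns LLMs at inference time via a satisficing strategy and improves multi-objective alignment. | large language model | |
| 16 | LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition | DEER: a label-guided in-context learning method that improves LLM performance on named entity recognition. | large language model | |
| 17 | Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | Disentangles abstract formulation from arithmetic computation, revealing the bottleneck in LLM reasoning on math problems. | large language model | |
| 18 | ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions | ToolHaystack: a benchmark for stress-testing tool-augmented language models in realistic long-term interactions. | large language model | |
| 19 | AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora | AutoSchemaKG: autonomous knowledge graph construction from web-scale corpora through dynamic schema induction. | large language model | |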

🔬 Pillar 2: RL & Architecture (9 papers)

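| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation | Proposes Active Layer-Contrastive Decoding (ActLCD) to reduce hallucination in large language model generation. | reinforcement learning, large language model | |
| 21 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | DeepTheorem: uses natural language and reinforcement learning to improve LLM theorem-proving ability. | reinforcement learning, IMoS, large language model | |
| 22 | Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation | Proposes LoVeC, which uses reinforcement learning to improve the quality of verbalized confidence in long-form generation. | reinforcement learning, DPO, large language model | |
| 23 | The Surprising Soupability of Documents in State Space Models | Proposes Document Souping, which improves state space model performance on long-document reasoning. | Mamba, SSM, state space model | |
| 24 | ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Proposes ML-Agent, a reinforcement-learning-based LLM agent for autonomous machine learning engineering. | reinforcement learning, large language model | |
| 25 | LoLA: Low-Rank Linear Attention With Sparse Caching | LoLA: low-rank linear attention with sparse caching, improving associative memory in lifelong learning. | linear attention | |
| 26 | Are Reasoning Models More Prone to Hallucination? | Finds that reasoning models may be more prone to hallucination on factuality tasks, but that specific training pipelines can mitigate this. | distillation, chain-of-thought | |
| 27 | Act-Adaptive Margin: Dynamically Calibrating Reward Models for Subjective Ambiguity | Proposes Act-Adaptive Margin (AAM) to dynamically calibrate reward models, improving reward modeling performance on subjective tasks. | reinforcement learning, preference learning | |
| 28 | Table-R1: Inference-Time Scaling for Table Reasoning | Table-R1: explores inference-time scaling for table reasoning tasks, improving small-model performance. | reinforcement learning, distillation | |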

🔬 Pillar 1: Robot Control (1 paper)

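| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 29 | Hidden Persuasion: Detecting Manipulative Narratives on Social Media During the 2022 Russian Invasion of Ukraine | Proposes a detection approach based on Gemma 2 and XLM-RoBERTa for manipulative narratives on Ukrainian social media. | manipulation | |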
