cs.CL(2025-05-26)

📊 66 papers in total | 🔗 12 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (51 🔗7) · Pillar 2: RL & Architecture (14 🔗5) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (51 papers)

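| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Thinking with Visual Abstract: Enhancing Multimodal Reasoning via Visual Abstraction | Proposes Visual Abstract Thinking (VAT) to improve multimodal large language models on visual reasoning tasks. | large language model, multimodal, chain-of-thought |
| 2 | SEMMA: A Semantic Aware Knowledge Graph Foundation Model | Proposes SEMMA to address the lack of semantic information in knowledge graph reasoning. | large language model, foundation model |
| 3 | WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback | WebCoT: improves web agent reasoning in reflection, branching, and rollback by reconstructing chain-of-thought. | large language model, chain-of-thought |
| 4 | ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs | Proposes ALAS, an automatic metric for evaluating latent speech-text alignment in multimodal LLMs. | large language model, multimodal |
| 5 | MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding | Proposes the MangaVQA benchmark and the MangaLMM model to improve multimodal manga understanding. | multimodal |
| 6 | MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks | MedHELM: a holistic evaluation framework for large language models on medical tasks. | large language model |
| 7 | Beyond Keywords: Evaluating Large Language Model Classification of Nuanced Ableism | Evaluates how well large language models classify nuanced ableist language, revealing their limitations in identifying ableism directed at autistic people. | large language model |
| 8 | Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects | A survey of multimodal emotion recognition in conversations: methods, trends, challenges, and prospects. | multimodal |
| 9 | Large Language Models for IT Automation Tasks: Are We There Yet? | The ITAB benchmark reveals the limitations of large language models on IT automation tasks, particularly Ansible script generation. | large language model |
| 10 | WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models | WXImpactBench: builds a benchmark for understanding disruptive weather impacts to evaluate large language models' capabilities in climate adaptation. | large language model |
| 11 | THiNK: Can Large Language Models Think-aloud? | THiNK: proposes a multi-agent feedback framework based on Bloom's Taxonomy to evaluate and improve LLMs' higher-order thinking skills. | large language model |
| 12 | Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Proposes EXSEARCH, which improves large language models' search ability on knowledge-intensive tasks through iterative self-incentivization. | large language model |
| 13 | ResSVD: Residual Compensated SVD for Large Language Model Compression | ResSVD: a residual-compensated SVD method for compressing large language models. | large language model |
| 14 | Language-Agnostic Suicidal Risk Detection Using Large Language Models | Proposes a language-agnostic suicidal risk detection framework to address the limitations of existing methods. | large language model |
| 15 | Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities | A survey of methods and opportunities in combining large language models with knowledge graphs for question answering. | large language model |
| 16 | MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning | Proposes MA-RAG, a multi-agent framework that tackles complex retrieval-augmented generation tasks through collaborative CoT reasoning. | chain-of-thought |
| 17 | MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models | Proposes MiniLongBench, a low-cost benchmark for evaluating long-context understanding in large language models. | large language model |
| 18 | FoodTaxo: Generating Food Taxonomies with Large Language Models | FoodTaxo: automatically generates food taxonomies with large language models. | large language model |
| 19 | T^2Agent A Tool-augmented Multimodal Misinformation Detection Agent with Monte Carlo Tree Search | Proposes T^2Agent, a tool-augmented multimodal misinformation detection agent based on Monte Carlo tree search. | multimodal |
| 20 | Reasoning LLMs are Wandering Solution Explorers | Shows that reasoning LLMs lack systematic exploration ability and characterizes them as wandering problem solvers. | large language model, chain-of-thought |
| 21 | MemGuide: Intent-Driven Memory Selection for Goal-Oriented Multi-Session LLM Agents | Proposes the MemGuide framework, which improves the task coherence of multi-turn dialogue LLM agents through intent-driven memory selection. | large language model, chain-of-thought |
| 22 | OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction | OmniCharacter: proposes a seamless speech-language personality interaction model for immersive role-playing agents. | large language model |
| 23 | SelfReflect: Can LLMs Communicate Their Internal Answer Distribution? | Proposes the SelfReflect metric to evaluate whether LLMs can effectively communicate the uncertainty of their internal answer distribution. | large language model |
| 24 | Does quantization affect models' performance on long-context tasks? | Systematically evaluates the effect of quantization on long-context LLM performance, revealing its dependence on task, model, and quantization method. | large language model |
| 25 | Paths Not Taken: Understanding and Mending the Multilingual Factual Recall Pipeline | Reveals the factual recall pipeline of multilingual LLMs and proposes vector interventions to improve cross-lingual consistency. | large language model |
| 26 | Gatsby Without the 'E': Crafting Lipograms with LLMs | Generates constrained text with large language models, exploring the writing of a novel without the letter 'e'. | large language model |
| 27 | Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries | Amulet: uses LLM juries to evaluate complex multi-turn conversations, improving judgment accuracy. | large language model |
| 28 | HAMburger: Accelerating LLM Inference via Token Smashing | HAMburger: accelerates LLM inference through token compression, achieving sublinear growth in KV cache and computation. | large language model |
| 29 | Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery | Proposes the ECO-Concept framework, which automatically discovers comprehensible concepts in text explanations without annotations. | large language model |
| 30 | FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | FLAME-MoE: an open-source research platform for Mixture-of-Experts language models that promotes reproducible research. | large language model |
| 31 | Estimating LLM Consistency: A User Baseline vs Surrogate Metrics | Reveals a gap between LLM consistency metrics and human perception, and proposes a logit ensemble method to improve alignment with human judgments. | large language model |
| 32 | Improving the OOD Performance of Closed-Source LLMs on NLI Through Strategic Data Selection | Improves the OOD generalization of closed-source LLMs on NLI tasks through strategic data selection. | large language model |
| 33 | Reasoning Is Not All You Need: Examining LLMs for Multi-Turn Mental Health Conversations | Proposes the MedAgent framework and the MHSD dataset to evaluate LLMs in multi-turn mental health conversations. | large language model |
| 34 | Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs | Pangu Light: accelerates and compresses large language models through weight re-initialization. | large language model |
| 35 | UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models | UORA: a uniform orthogonal reinitialization adaptation method for parameter-efficient fine-tuning of large models. | large language model |
| 36 | Semantic-Preserving Adversarial Attacks on LLMs: An Adaptive Greedy Binary Search Approach | Proposes Adaptive Greedy Binary Search (AGBS) for semantic-preserving adversarial attacks on LLMs. | large language model |
| 37 | TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent | TrojanStego: proposes a privacy-leaking attack based on language model steganography. | large language model |
| 38 | Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi's Zibaldone | Proposes BERT- and LLaMa-based named entity recognition methods for historical Italian. | large language model |
| 39 | Multi-Domain Explainability of Preferences | Proposes a multi-domain preference explainability method to improve LLMs' understanding of and alignment with human preferences. | large language model |
| 40 | Inference-time Alignment in Continuous Space | Proposes the SEA algorithm, which achieves inference-time alignment of large language models through gradient-based optimization in continuous space. | large language model |
| 41 | Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks | Proposes a PCFG-based uncertainty quantification framework for LLMs to improve reliability in automated reasoning tasks. | large language model |
| 42 | Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking | STeP: trains LLM-based agents with synthetic self-reflected trajectories and partial masking. | large language model |
| 43 | Emergent LLM behaviors are observationally equivalent to data leakage | Explains emergent behaviors in large language models as data leakage rather than social norms. | large language model |
| 44 | DeepDialogue: A Multi-Turn Emotionally-Rich Spoken Dialogue Dataset | DeepDialogue: a multi-turn, emotionally rich spoken dialogue dataset that supports research on human-like dialogue systems. | multimodal |
| 45 | APE: Selective Fine-tuning with Acceptance Criteria for Language Model Adaptation | APE: a selective fine-tuning method based on acceptance criteria for language model adaptation. | large language model |
| 46 | Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages | Evaluates large language models on the transliteration of Indian languages. | large language model |
| 47 | Improving Multilingual Math Reasoning for African Languages | Explores methods for improving LLMs' multilingual math reasoning for African languages. | large language model |
| 48 | Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective | Interprets LLM reasoning from a meta-learning perspective, treating reasoning trajectories as pseudo-gradient descent steps for parameter optimization. | large language model |
| 49 | Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks | A systematic survey of consciousness in LLMs: theories, implementations, and frontier risks. | large language model |
| 50 | MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs | MOLE: uses large language models to automatically extract and validate metadata from scientific papers. | large language model |
| 51 | What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs | Reveals the vulnerability of large language models to many-shot attacks in long-context settings, highlighting context length as a key factor. | large language model |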

🔬 Pillar 2: RL & Architecture (14 papers)

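| # | Title | Key point | Tags |
|---|---|---|---|
| 52 | Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models | Proposes MARA, a token-level Accept-Reject alignment method for fine-tuning LLMs that reduces computational cost. | preference learning, RLHF, DPO |
| 53 | Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles | Enigmata: improves the logical reasoning of large language models with synthetic verifiable puzzles. | reinforcement learning, large language model |
| 54 | MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | Proposes the MT³ framework, which improves MLLM performance on text image machine translation through multi-task reinforcement learning. | reinforcement learning, large language model, multimodal |
| 55 | MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability | Proposes the MaskSearch pre-training framework to improve LLM agents' search ability in open-domain multi-hop question answering. | reinforcement learning, curriculum learning, distillation |
| 56 | Adaptive Deep Reasoning: Triggering Deep Thinking When Needed | Proposes an adaptive deep reasoning method that switches dynamically between long and short reasoning chains according to problem complexity, improving LLM reasoning efficiency. | reinforcement learning, large language model, chain-of-thought |
| 57 | Incentivizing Strong Reasoning from Weak Supervision | Proposes a weak-supervision incentive method to improve large language models' reasoning ability at low cost. | reinforcement learning, large language model, chain-of-thought |
| 58 | REARANK: Reasoning Re-ranking Agent via Reinforcement Learning | Proposes REARANK, a reinforcement-learning-based LLM reasoning re-ranking agent that significantly improves information retrieval performance and interpretability. | reinforcement learning, large language model |
| 59 | Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation | Improves mental health detection through selective reasoning distillation, focusing on high-quality rationales. | distillation, large language model |
| 60 | R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning | Proposes R3-RAG, which uses reinforcement learning to drive LLMs to reason and retrieve step by step, improving RAG system performance. | reinforcement learning, large language model |
| 61 | Enhancing Logical Reasoning in Language Models via Symbolically-Guided Monte Carlo Process Supervision | Proposes a symbolically guided Monte Carlo process supervision method to improve the logical reasoning of language models. | DPO, direct preference optimization, large language model |
| 62 | One-shot Entropy Minimization | Proposes one-shot entropy minimization, which significantly improves large language model performance with only a small amount of data. | reinforcement learning, large language model |
| 63 | Efficient Speech Translation through Model Compression and Knowledge Distillation | Proposes an efficient speech translation method that combines model compression with knowledge distillation, improving the deployment efficiency of large models. | distillation |
| 64 | Token Distillation: Attention-aware Input Embeddings For New Tokens | Proposes Token Distillation, which quickly learns high-quality embeddings for new tokens through attention-aware distillation. | distillation |
| 65 | REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models | REA-RL: reflection-aware online reinforcement learning for efficient large reasoning models. | reinforcement learning |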

🔬 Pillar 4: Generative Motion (1 paper)

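| # | Title | Key point | Tags |
|---|---|---|---|
| 66 | Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking | Proposes Adaptive Classifier-Free Guidance (A-CFG), which improves the controllability of generative models through dynamic low-confidence masking. | classifier-free guidance |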
