cs.CL (2025-05-20)

📊 109 papers in total | 🔗 15 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (83 🔗12) · Pillar 2: RL & Architecture (24 🔗3) · Pillar 8: Physics-based Animation (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (83 papers)

# | Title | One-line summary | Tags | 🔗
1 | Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs | Proposes an internal chain-of-thought to improve the transparency of LLM task execution | large language model, chain-of-thought
2 | ABBA-Adapters: Efficient and Expressive Fine-Tuning of Foundation Models | Proposes ABBA for efficiently adapting LLMs to new domains | large language model, foundation model
3 | EfficientLLM: Efficiency in Large Language Models | Proposes EfficientLLM to address LLM efficiency | large language model, foundation model
4 | ModRWKV: Transformer Multimodality in Linear Time | Proposes ModRWKV to address computational complexity in multimodal learning | large language model, multimodal
5 | Enhanced Multimodal Aspect-Based Sentiment Analysis by LLM-Generated Rationales | Proposes the LRSA framework to address information integration in multimodal sentiment analysis | large language model, multimodal
6 | CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring | Proposes the CAFES framework for multimodal automated essay scoring | large language model, multimodal
7 | DecIF: Improving Instruction-Following through Meta-Decomposition | Proposes the DecIF framework to address the inflexibility of instruction-following data generation | large language model, instruction following
8 | Large Language Models Implicitly Learn to See and Hear Just By Reading | Proposes large language models that gain visual and auditory understanding through text-only training | large language model
9 | Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models | Proposes Saten for LLM compression | large language model
10 | Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning | Proposes N-rep consistency to reduce the cost of text-to-SQL conversion | chain-of-thought
11 | Scaling Laws for State Dynamics in Large Language Models | Explores scaling laws for state dynamics in LLMs | large language model
12 | Toward Reliable Scientific Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | Proposes TruthHypo and KnowHD to address truthfulness in scientific hypothesis generation | large language model
13 | Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations | Proposes the SDA framework to address LLM safety under code-mixing | large language model
14 | Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models | Proposes LaTen to address knowledge transfer between large language models | large language model
15 | DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models | Proposes DiagnosisArena to evaluate the diagnostic reasoning of LLMs | large language model
16 | Development and Validation of Engagement and Rapport Scales for Evaluating User Experience in Multimodal Dialogue Systems | Proposes user-experience evaluation scales to improve multimodal dialogue systems | multimodal
17 | Multimodal Cultural Safety: Evaluation Framework and Alignment Strategies | Proposes the CROSS benchmark to evaluate the cultural safety of large vision-language models | multimodal
18 | DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis | Proposes the DECASTE framework to reveal caste bias in LLMs | large language model
19 | Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples | Proposes LISTEN to mitigate hallucinations in audio-aware LLMs | large language model
20 | S2SBench: A Benchmark for Quantifying Intelligence Degradation in Speech-to-Speech Large Language Models | Proposes S2SBench to quantify intelligence degradation in speech-to-speech LLMs | large language model
21 | OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking | Proposes OmniGenBench to address reproducibility in genomic foundation model evaluation | foundation model
22 | QA-prompting: Improving Summarization with Large Language Models using Question-Answering | Proposes QA-prompting to address positional bias in long-text summarization | large language model
23 | Cross-Lingual Optimization for Language Transfer in Large Language Models | Proposes a cross-lingual optimization method for language transfer in LLMs | large language model
24 | Unraveling Interwoven Roles of Large Language Models in Authorship Privacy: Obfuscation, Mimicking, and Verification | Proposes a unified framework for analyzing the roles of LLMs in authorship privacy | large language model
25 | Beyond Chains: Bridging Large Language Models and Knowledge Bases in Complex Question Answering | Proposes the PDRR framework to address knowledge-base integration in complex question answering | large language model
26 | ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs | Proposes ShieldVLM to detect multimodal implicit toxicity | multimodal
27 | AUTOLAW: Enhancing Legal Compliance in Large Language Models via Case Law Generation and Jury-Inspired Deliberation | Proposes AutoLaw to address legal compliance | large language model
28 | Activation-Guided Consensus Merging for Large Language Models | Proposes activation-guided consensus merging to improve LLM efficiency and stability | large language model
29 | Mixed Signals: Understanding Model Disagreement in Multimodal Empathy Detection | Proposes a multimodal model to address the problem of conflicting signals | multimodal
30 | Informatics for Food Processing | Proposes the FoodProX model to address subjectivity in food-processing classification | large language model, multimodal
31 | Amadeus-Verbo Technical Report: The powerful Qwen2.5 family models trained in Portuguese | Proposes the Amadeus Verbo models to advance open-source development for Brazilian Portuguese | large language model, foundation model
32 | PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs | Proposes PersonaTAB to address the lack of personality annotations for personalized dialogue systems | large language model, TAMP
33 | Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst | Proposes self-reasoning language models to improve performance on complex reasoning tasks | large language model, chain-of-thought
34 | Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLM | Proposes a graph-based analysis framework to improve LLM reasoning | large language model, chain-of-thought
35 | Too Long, Didn't Model: Decomposing LLM Long-Context Understanding With Novels | Proposes the TLDM benchmark to evaluate LLM long-context understanding | large language model
36 | EasyMath: A 0-shot Math Benchmark for SLMs | Proposes the EasyMath benchmark to evaluate the mathematical reasoning of small language models | chain-of-thought
37 | Automated Journalistic Questions: A New Method for Extracting 5W1H in French | Proposes an automated method for extracting 5W1H information from French news | large language model
38 | UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models | Proposes UltraEdit for lifelong editing of large language models | large language model
39 | WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications | Proposes WirelessMathBench to evaluate LLMs' mathematical modeling in wireless communications | large language model
40 | Temporal Alignment of Time Sensitive Facts with Activation Engineering | Proposes activation engineering to address time sensitivity in LLMs | large language model
41 | Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability | Studies the impact of quantization on LLM explainability and interpretability | large language model
42 | Mechanistic Interpretability of GPT-like Models on Summarization Tasks | Proposes a mechanistic interpretability framework to analyze GPT models on summarization tasks | large language model
43 | WebNovelBench: Placing LLM Novelists on the Web Novel Distribution | Proposes WebNovelBench to address the evaluation of long-form novel generation | large language model
44 | Creative Preference Optimization | Proposes a creative preference optimization method to improve LLM creativity | large language model
45 | MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language | Proposes the MUG-Eval framework to evaluate multilingual generation capabilities | large language model
46 | GemMaroc: Unlocking Darija Proficiency in LLMs with Minimal Data | Proposes GemMaroc to address Moroccan Arabic processing | large language model
47 | Tokenization Constraints in LLMs: A Study of Symbolic and Arithmetic Reasoning Limits | Proposes Token Awareness to address limits on symbolic reasoning in LLMs | chain-of-thought
48 | A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations | Proposes PersonaConvBench to evaluate personalized conversation generation | large language model
49 | GloSS over Toxicity: Understanding and Mitigating Toxicity in LLMs via Global Toxic Subspace | Proposes GloSS to address toxicity in LLMs | large language model
50 | From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora | Proposes multi-way parallel corpora to improve multilingual LLM performance | large language model
51 | FlashThink: An Early Exit Method For Efficient Reasoning | Proposes FlashThink to address LLM reasoning efficiency | large language model
52 | EEG-to-Text Translation: A Model for Deciphering Human Brain Activity | Proposes R1 Translator to improve EEG-to-text translation performance | large language model
53 | ConspEmoLLM-v2: A robust and stable model to detect sentiment-transformed conspiracy theories | Proposes ConspEmoLLM-v2 to detect sentiment-transformed conspiracy theories | large language model
54 | Concept Incongruence: An Exploration of Time and Death in Role Playing | Proposes concept incongruence to analyze time and death in role playing | large language model
55 | Incorporating Token Usage into Prompting Strategy Evaluation | Proposes the Big-$O_{tok}$ framework to improve efficiency evaluation of prompting strategies | large language model
56 | SEPS: A Separability Measure for Robust Unlearning in LLMs | Proposes the SEPS framework to address unlearning under mixed queries in LLMs | large language model
57 | Tracing Multilingual Factual Knowledge Acquisition in Pretraining | Traces multilingual factual knowledge acquisition to improve the cross-lingual consistency of language models | large language model
58 | Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes | Systematically studies the impact of language mixing on reasoning language models, along with optimization strategies | chain-of-thought
59 | sudoLLM: On Multi-role Alignment of Language Models | Proposes sudoLLM for multi-role alignment of language models | large language model
60 | TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring | Proposes TRATES to address insufficient assessment of individual traits | large language model
61 | Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders | Uses sparse autoencoders to detoxify large language models | large language model
62 | MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance | Proposes the MoMoE framework to address transparency in online community content moderation | large language model
63 | Rank-K: Test-Time Reasoning for Listwise Reranking | Proposes Rank-K for efficient reranking of multilingual queries | large language model
64 | From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning | Studies instruction generalization challenges in spatial reasoning | large language model
65 | Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis | Proposes PhantomCircuit to address knowledge overshadowing | large language model
66 | Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs | Proposes effective prompt injection attacks to evaluate the safety of open-source LLMs | large language model
67 | Dual Decomposition of Weights and Singular Value Low Rank Adaptation | Proposes DuDe to address the training instability and inefficient knowledge transfer of LoRA methods | large language model
68 | OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation | Proposes OSoRA to address the computational cost of fine-tuning large language models | large language model
69 | Teaching Small Language Models to Learn Logic through Meta-Learning | Improves the logical reasoning of small language models through meta-learning | large language model
70 | JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling | Proposes JOLT-SQL to address noisy schemas in text-to-SQL mapping | large language model
71 | Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs | Proposes universal acoustic adversarial attacks for flexible control of speech LLMs | large language model
72 | ThinkSwitcher: When to Think Hard, When to Think Fast | Proposes ThinkSwitcher to address the computational efficiency of large reasoning models | chain-of-thought
73 | SlangDIT: Benchmarking LLMs in Interpretative Slang Translation | Proposes SlangDIT to address context dependence in slang translation | large language model
74 | The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models | Proposes lightweight architectural modifications to address character-level understanding | large language model
75 | Legal Rule Induction: Towards Generalizable Principle Discovery from Analogous Judicial Precedents | Proposes a legal rule induction method for extracting implicit principles from judicial precedents | large language model
76 | MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations | Proposes MultiHal for multilingual, knowledge-graph-grounded evaluation of LLM hallucinations | large language model
77 | BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks | Proposes the BAR agent to address reasoning in complex Minecraft tasks | large language model
78 | Enhancing LLMs via High-Knowledge Data Selection | Proposes a high-knowledge scorer to address knowledge scarcity in LLMs | large language model
79 | Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation | Proposes a method for analyzing privacy vulnerabilities in multimodal retrieval-augmented generation | multimodal
80 | Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology | Explores the role of language families and morphology in cross-lingual transfer for multilingual NLP | zero-shot transfer
81 | Let's Verify Math Questions Step by Step | Proposes MathQ-Verify to address the challenge of verifying math questions | large language model
82 | PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks | Proposes PandaGuard to systematically evaluate LLM safety against jailbreak attacks | large language model
83 | Improve Language Model and Brain Alignment via Associative Memory | Improves alignment between language models and the brain through associative memory | large language model

🔬 Pillar 2: RL & Architecture (24 papers)

# | Title | One-line summary | Tags | 🔗
84 | Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models | Proposes the MathIF benchmark to evaluate instruction following in large reasoning models | reinforcement learning, large language model, instruction following
85 | Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning | Proposes the CoT Thought Leap Bridge task to address missing intermediate steps in mathematical reasoning | reinforcement learning, large language model, chain-of-thought
86 | Toward Effective Reinforcement Learning Fine-Tuning for Medical VQA in Vision-Language Models | Proposes a reinforcement learning fine-tuning method to improve medical visual question answering | reinforcement learning, large language model, multimodal
87 | FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation | Proposes FuxiMT for Chinese-centric multilingual machine translation | curriculum learning, large language model
88 | Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning | Proposes Game-RL to improve the reasoning of vision-language models | reinforcement learning, multimodal
89 | Interpretable Traces, Unexpected Outcomes: Investigating the Disconnect in Trace-Based Knowledge Distillation | Proposes a rule-decomposition-based knowledge distillation method to improve the interpretability of small language models | distillation, chain-of-thought
90 | Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning | Proposes the Mujica-MyGo framework to address long-context problems | reinforcement learning, large language model
91 | Reward Reasoning Model | Proposes reward reasoning models to improve reward model performance | reinforcement learning, large language model, chain-of-thought
92 | General-Reasoner: Advancing LLM Reasoning Across All Domains | Proposes General-Reasoner to address insufficient LLM reasoning ability | reinforcement learning, large language model, chain-of-thought
93 | Context Reasoner: Incentivizing Reasoning Capability for Contextualized Privacy and Safety Compliance via Reinforcement Learning | Proposes Context Reasoner to address privacy and safety compliance in LLMs | reinforcement learning, large language model
94 | Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents | Proposes AgentGhost to address backdoor attacks on MLLM-powered GUI agents | contrastive learning, large language model, multimodal
95 | Improved Methods for Model Pruning and Knowledge Distillation | Proposes the MAMA pruning method to address performance degradation from model pruning | distillation, large language model
96 | InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion | Proposes InfiGFusion to address semantic dependencies in heterogeneous model fusion | distillation, large language model
97 | Think-J: Learning to Think for Generative LLM-as-a-Judge | Proposes Think-J to improve the judging ability of generative LLMs | reinforcement learning, offline RL, large language model
98 | KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation | Proposes KORGym to address shortcomings in LLM reasoning evaluation | reinforcement learning, large language model
99 | Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation | Proposes a log-augmented generation framework to improve model reasoning | distillation, large language model
100 | Semi-Clairvoyant Scheduling of Speculative Decoding Requests to Minimize LLM Inference Latency | Proposes LAPS-SD to address LLM inference latency | SSM, large language model
101 | Think Only When You Need with Large Hybrid-Reasoning Models | Proposes large hybrid-reasoning models to improve reasoning efficiency and accuracy | reinforcement learning, large language model
102 | Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning | Proposes the Prune-on-Logic framework to improve long chain-of-thought reasoning | distillation, chain-of-thought
103 | Adapting Pretrained Language Models for Citation Classification via Self-Supervised Contrastive Learning | Proposes the Citss framework to address data scarcity in academic citation classification | contrastive learning
104 | Not All Correct Answers Are Equal: Why Your Distillation Source Matters | Improves language model reasoning through high-quality distillation data | distillation
105 | FAID: Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning | Proposes the FAID framework for fine-grained detection of AI-generated text | contrastive learning
106 | DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | Proposes DRP to address the efficiency of large reasoning models | distillation, chain-of-thought
107 | Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals | Proposes an optimized model selection method to improve the reliability of LLM counterfactual evaluation | distillation, large language model

🔬 Pillar 8: Physics-based Animation (1 paper)

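# | Title | One-line summary | Tags | 🔗
108 | Chain-of-Thought Driven Adversarial Scenario Extrapolation for Robust Language Models | Proposes an adversarial scenario extrapolation method to strengthen language model robustness | ASE, large language model, chain-of-thought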

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
109 | Not Minds, but Signs: Reframing LLMs through Semiotics | Reframes the understanding of large language models through a semiotic lens | manipulation, large language model
