cs.CL(2025-02-27)

📊 45 papers in total | 🔗 12 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (37 🔗12) · Pillar 2: RL & Architecture (5) · Pillar 6: Video Extraction (1) · Pillar 8: Physics-based Animation (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (37 papers)

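| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Protecting multimodal large language models against misleading visualizations | Proposes six methods to make multimodal large language models more robust to misleading visualizations. | large language model, multimodal | |
| 2 | A Thousand Words or An Image: Studying the Influence of Persona Modality in Multimodal LLMs | Studies how persona modality affects the expressive ability of multimodal LLMs, revealing limitations of the image modality. | large language model, multimodal | |
| 3 | Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge | Proposes Layer-Aware Task Arithmetic (LATA), which disentangles task-specific knowledge from instruction-following knowledge to improve model merging and editing. | large language model, instruction following | |
| 4 | Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking | Studies the state-tracking ability of Transformer+CoT on finite state automata and reveals its internal mechanism. | large language model, chain-of-thought | |
| 5 | Self-Training Elicits Concise Reasoning in Large Language Models | Uses self-training to elicit more concise reasoning from large language models, reducing computational cost. | large language model, chain-of-thought | |
| 6 | MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge | Proposes MMKE-Bench, a comprehensive benchmark for evaluating the visual knowledge editing ability of multimodal models. | large language model, multimodal | |
| 7 | Chitranuvad: Adapting Multi-Lingual LLMs for Multimodal Translation | Chitranuvad: multimodal translation by adapting multilingual LLMs. | multimodal | |
| 8 | NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research | Proposes NANOGPT, a query-driven LLM-RAG system for accelerating nanotechnology research. | large language model | |
| 9 | Re-evaluating Open-ended Evaluation of Large Language Models | Proposes an open-ended LLM evaluation method based on a three-player game that improves robustness to redundant data. | large language model | |
| 10 | Erasing Without Remembering: Implicit Knowledge Forgetting in Large Language Models | Proposes PerMU, which uses probability perturbation to achieve more generalized implicit knowledge forgetting in large language models. | large language model | |
| 11 | Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models | Reveals how abstract reasoning emerges in LLMs: an emergent symbolic mechanism. | large language model | |
| 12 | Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents | Proposes the Collab-Overcooked benchmark for evaluating LLMs' abilities as agents in collaborative environments. | large language model | |
| 13 | Collaborative Stance Detection via Small-Large Language Model Consistency Verification | Proposes the CoVer framework, which improves the efficiency of social media stance detection through consistency verification between small and large language models. | large language model | |
| 14 | KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model | Proposes KEDRec-LM, a knowledge-distilled explainable drug recommendation large language model, and builds the expRxRec dataset. | large language model | |
| 15 | LinguaLens: Towards Interpreting Linguistic Mechanisms of Large Language Models via Sparse Auto-Encoder | LinguaLens: interpreting the linguistic mechanisms of large language models via sparse auto-encoders. | large language model | |
| 16 | ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models | Proposes ChineseEcomQA, a scalable e-commerce concept evaluation benchmark for assessing large language models in the e-commerce domain. | large language model | |
| 17 | Mapping Trustworthiness in Large Language Models: A Bibliometric Analysis Bridging Theory to Practice | Uses bibliometric analysis to reveal the gap between theory and practice in LLM trustworthiness, along with strategies for improvement. | large language model | |
| 18 | GeoEdit: Geometric Knowledge Editing for Large Language Models | Proposes GeoEdit, which edits large language models with geometric knowledge editing to improve knowledge updating while preserving generality. | large language model | |
| 19 | HaLoRA: Hardware-aware Low-Rank Adaptation for Large Language Models Based on Hybrid Compute-in-Memory Architecture | Proposes HaLoRA, a hardware-aware low-rank adaptation method that improves LLM robustness on hybrid compute-in-memory architectures. | large language model | |
| 20 | Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents | ViSA: a visual-centric data selection method based on agent collaboration that improves multimodal large model performance. | large language model, multimodal | |
| 21 | PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation | PolyPrompt: automating knowledge extraction from multilingual models through dynamic prompt generation. | large language model | |
| 22 | LLM as a Broken Telephone: Iterative Generation Distorts Information | Shows that iterative LLM generation distorts information, much like the "telephone game" effect, and that prompt engineering can mitigate it. | large language model | |
| 23 | Deterministic or probabilistic? The psychology of LLMs as random number generators | Reveals a deterministic bias in LLM random number generation that originates from human cognitive biases in the training data. | large language model | |
| 24 | Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation | Proposes an order-centric augmentation method for training LLMs that improves logical reasoning. | large language model | |
| 25 | Foot-In-The-Door: A Multi-turn Jailbreak for LLMs | Proposes FITD, a multi-turn jailbreak method that uses psychological principles to raise the attack success rate against LLMs. | large language model | |
| 26 | KunlunBaize: LLM with Multi-Scale Convolution and Multi-Token Prediction Under TransformerX Framework | KunlunBaize: a large language model with multi-scale convolution and multi-token prediction under the TransformerX framework. | large language model | |
| 27 | Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing | Proposes the Multi2 framework, which uses test-time scaling to improve multi-document summarization quality and explores its limits. | large language model | |
| 28 | OmniRouter: Budget and Performance Controllable Multi-LLM Routing | OmniRouter: a budget- and performance-controllable multi-LLM routing framework that optimizes resource allocation. | large language model | |
| 29 | HuAMR: A Hungarian AMR Parser and Dataset | Proposes HuAMR, the first Hungarian AMR dataset and parser, filling a gap in non-English semantic resources. | large language model | |
| 30 | Supervised Fine-Tuning LLMs to Behave as Pedagogical Agents in Programming Education | Proposes GuideLM: supervised fine-tuning of LLMs to act as teaching assistants in programming education. | large language model | |
| 31 | RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding | RAPID: retrieval-augmented speculative decoding that accelerates long-context LLM inference and improves generation quality. | large language model | |
| 32 | Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets | Proposes DePA, which detects dead code poisoning in code generation datasets through line-level perplexity analysis. | large language model | |
| 33 | FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving | FINEREASON: evaluating and improving LLMs' deliberate reasoning through reflective puzzle solving. | large language model | |
| 34 | LongRoPE2: Near-Lossless LLM Context Window Scaling | LongRoPE2: near-lossless LLM context window extension through evolutionary search and mixed training. | large language model | |
| 35 | The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs | Reveals a limit on LLM arithmetic ability: single-step prediction constrains multi-operand addition. | large language model | |
| 36 | What's Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs | Proposes DBB, a description-based bias benchmark for evaluating social bias in LLMs in subtle contexts. | large language model | |
| 37 | Unsupervised Concept Vector Extraction for Bias Control in LLMs | Proposes an unsupervised concept vector extraction method for controlling bias in large language models. | large language model | |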

🔬 Pillar 2: RL & Architecture (5 papers)

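| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 38 | SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models | SEKI: neural architecture search based on self-evolution and knowledge inspiration via large language models. | distillation, large language model, chain-of-thought | |
| 39 | Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners | Scales compute with distilled reasoners to improve LLM efficiency and performance on mathematical reasoning tasks. | Mamba, distillation, large language model | |
| 40 | R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning | R1-T1: fully incentivizing LLMs' machine translation capability through reasoning learning. | reinforcement learning, large language model, chain-of-thought | |
| 41 | Preference Learning Unlocks LLMs' Psycho-Counseling Skills | Proposes the PsychoCounsel-Preference dataset to improve LLMs' psycho-counseling skills. | preference learning, large language model | |
| 42 | Few-Shot, No Problem: Descriptive Continual Relation Extraction | Proposes a description-based continual relation extraction method that addresses catastrophic forgetting in few-shot settings. | representation learning, large language model | |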

🔬 Pillar 6: Video Extraction (1 paper)

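| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 43 | Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs | Small-scale human alignment lets LLMs achieve excellent performance on expert-level humor ranking. | HuMoR, large language model | |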

🔬 Pillar 8: Physics-based Animation (1 paper)

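| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 44 | TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning | TripCraft: a spatio-temporally fine-grained travel planning benchmark that addresses the limitations of existing benchmarks. | spatiotemporal, large language model | |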

🔬 Pillar 1: Robot Control (1 paper)

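| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 45 | Among Them: A game-based framework for assessing persuasion capabilities of LLMs | Proposes a framework based on the game "Among Us" for evaluating the persuasion capabilities of LLMs. | manipulation, large language model | |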
