cs.CL(2025-10-29)

📊 43 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (34 🔗4) | Pillar 2: RL & Architecture (7 🔗2) | Pillar 1: Robot Control (2)

🔬 Pillar 9: Embodied Foundation Models (34 papers)

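# | Title | One-sentence summary | Tags | 🔗
1 | GAPMAP: Mapping Scientific Knowledge Gaps in Biomedical Literature Using Large Language Models | Proposes GAPMAP to identify knowledge gaps in biomedical literature | large language model |
2 | A Survey on Unlearning in Large Language Models | A comprehensive survey of unlearning in large language models, organized by intervention stage | large language model |
3 | The Collective Turing Test: Large Language Models Can Generate Realistic Multi-User Discussions | LLM-generated multi-user discussions are highly realistic and can be used to simulate online communities | large language model |
4 | FlowMM: Cross-Modal Information Flow Guided KV Cache Merging for Efficient Multimodal Context Inference | FlowMM: KV cache merging guided by cross-modal information flow, improving the efficiency of multimodal context inference | multimodal |
5 | MCP4IFC: IFC-Based Building Design Using Large Language Models | MCP4IFC: an LLM-driven framework for IFC building design | large language model |
6 | SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation | SymCode: a neurosymbolic approach to mathematical reasoning based on verifiable code generation | large language model, chain-of-thought |
7 | DiagramEval: Evaluating LLM-Generated Diagrams via Graphs | DiagramEval: proposes a graph-based method for evaluating LLM-generated diagrams | large language model, multimodal |
8 | TextualVerifier: Verify TextGrad Step-by-Step | Proposes TextualVerifier, an LLM-based textual reasoning verification framework for TextGrad | large language model, chain-of-thought |
9 | A Critical Study of Automatic Evaluation in Sign Language Translation | Critically evaluates the limitations of existing automatic evaluation metrics for sign language translation | large language model, multimodal |
10 | Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning | Parrot: a training pipeline that enhances the mathematical reasoning of both program CoT and natural language CoT | large language model, chain-of-thought |
11 | Testing Cross-Lingual Text Comprehension In LLMs Using Next Sentence Prediction | Uses a next sentence prediction task to evaluate LLMs' cross-lingual text comprehension | large language model, chain-of-thought |
12 | GMTRouter: Personalized LLM Router over Multi-turn User Interactions | GMTRouter: a personalized LLM routing method based on multi-turn user interactions | large language model |
13 | LISTEN to Your Preferences: An LLM Framework for Multi-Objective Selection | LISTEN: an LLM framework for multi-objective selection that addresses the difficulty of formalizing expert preferences | large language model |
14 | TOPol: Capturing and Explaining Multidimensional Semantic Polarity Fields and Vectors | TOPol: proposes a semi-supervised framework for capturing and explaining multidimensional semantic polarity fields and vectors | large language model |
15 | Can LLMs Estimate Cognitive Complexity of Reading Comprehension Items? | Uses large language models to estimate the cognitive complexity of reading comprehension items | large language model |
16 | Revisiting Multilingual Data Mixtures in Language Model Pretraining | Studies the effect of multilingual data mixtures on language model pretraining, challenging common assumptions about multilingual learning | large language model |
17 | RECAP: Reproducing Copyrighted Data from LLMs Training with an Agentic Pipeline | Proposes RECAP, which extracts and verifies memorized copyrighted data from LLMs through agent collaboration | large language model |
18 | The Limits of Obliviate: Evaluating Unlearning in LLMs via Stimulus-Knowledge Entanglement-Behavior Framework | Proposes the SKeB framework to evaluate LLM unlearning under leading prompts, revealing a link between model size and unlearning effectiveness | large language model |
19 | Knowledge Graph Analysis of Legal Understanding and Violations in LLMs | Proposes a knowledge-graph-augmented RAG method to evaluate LLMs' legal understanding and violations | large language model |
20 | Evaluating the Role of Verifiers in Test-Time Scaling for Legal Reasoning Tasks | Studies the role of verifiers in test-time scaling for legal reasoning tasks to improve LLM performance | large language model |
21 | Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry | For LLM agent collaboration under information asymmetry, proposes a communication and verification framework to improve task completion and interpretability | large language model |
22 | TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation | TwinVoice: a multi-dimensional benchmark towards digital twins via LLM persona simulation | large language model |
23 | Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research | Proposes a depth- and autonomy-based framework for evaluating LLM applications, improving the reliability of social science research | large language model |
24 | RLMEval: Evaluating Research-Level Neural Theorem Proving | RLMEval: proposes a new benchmark for evaluating research-level neural theorem proving | large language model |
25 | Implicature in Interaction: Understanding Implicature Improves Alignment in Human-LLM Interaction | Improves LLM alignment in human-LLM interaction through understanding conversational implicature | large language model |
26 | Serve Programs, Not Prompts | Proposes the Symphony system, which improves the efficiency and flexibility of LLM serving by serving LLM inference programs | large language model |
27 | BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains | Proposes BhashaBench V1 for evaluating LLM performance in India-specific domains | large language model |
28 | Hallucinations in Bibliographic Recommendation: Citation Frequency as a Proxy for Training Data Redundancy | Uses citation frequency as a proxy for training data redundancy to study LLM hallucinations in bibliographic recommendation | large language model |
29 | Monitoring Transformative Technological Convergence Through LLM-Extracted Semantic Entity Triple Graphs | Proposes a method for monitoring technological convergence based on LLM-extracted semantic triple graphs | large language model |
30 | Not ready for the bench: LLM legal interpretation is unstable and out of step with human judgments | LLM legal interpretation is unstable and inconsistent with human judgments, making it unsuitable for legal practice | large language model |
31 | Beyond One-Size-Fits-All: Personalized Harmful Content Detection with In-Context Learning | Proposes an in-context-learning-based framework for personalized harmful content detection, improving user customization and privacy protection | foundation model |
32 | ProMediate: A Socio-cognitive framework for evaluating proactive agents in multi-party negotiation | ProMediate: a socio-cognitive framework for evaluating proactive agents in multi-party negotiation | large language model |
33 | Ideology-Based LLMs for Content Moderation | Shows that ideology-based role-play biases LLMs in content moderation | large language model |
34 | Model-Document Protocol for AI Search | Proposes the Model-Document Protocol (MDP) framework to improve LLMs' knowledge utilization efficiency in AI search | large language model |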

🔬 Pillar 2: RL & Architecture (7 papers)

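# | Title | One-sentence summary | Tags | 🔗
35 | Teaching Sarcasm: Few-Shot Multimodal Sarcasm Detection via Distillation to a Parameter-Efficient Student | Proposes the PEKD framework, which uses knowledge distillation to improve parameter-efficient fine-tuning for few-shot multimodal sarcasm detection | distillation, multimodal |
36 | PairUni: Pairwise Training for Unified Multimodal Language Models | PairUni: unifies multimodal language models through pairwise training, balancing understanding and generation tasks | reinforcement learning, policy learning, multimodal |
37 | A Survey on Efficient Large Language Model Training: From Data-centric Perspectives | Survey: efficient large language model training methods from a data-centric perspective | distillation, large language model |
38 | Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning | Proposes Supervised Reinforcement Learning (SRL) to address multi-step reasoning in small models and improve performance on software engineering tasks | reinforcement learning, large language model |
39 | Diverse Preference Learning for Capabilities and Alignment | Proposes Soft Preference Learning to improve LLM capabilities, alignment, and output diversity | preference learning, RLHF, DPO |
40 | PORTool: Tool-Use LLM Training with Rewarded Tree | PORTool: a tool-use LLM training method based on rewarded trees, improving tool-calling performance in dynamic environments | reinforcement learning, large language model |
41 | DEBATE: A Large-Scale Benchmark for Role-Playing LLM Agents in Multi-Agent, Long-Form Debates | DEBATE: a large-scale benchmark for evaluating the behavioral realism of role-playing LLM agents in multi-agent, long-form debates | DPO, direct preference optimization |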

🔬 Pillar 1: Robot Control (2 papers)

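# | Title | One-sentence summary | Tags | 🔗
42 | Scaling Latent Reasoning via Looped Language Models | Ouro: improves LLM performance through looped language models and latent-space reasoning | manipulation, chain-of-thought |
43 | Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs | Proposes an activation-space personality steering method that achieves stable personality trait control in LLMs through hybrid layer selection | manipulation, large language model |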
