cs.AI (2026-05-14)

📊 56 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (37 🔗3) · Pillar 2: RL & Architecture (14 🔗1) · Pillar 1: Robot Control (2) · Pillar 6: Video Extraction (1) · Pillar 5: Interaction & Reaction (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (37 papers)

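# | Title | One-line summary | Tags
1 | To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model | MMGuard: protects multimodal data from unauthorized LVLM fine-tuning via adversarial perturbations | multimodal
2 | SemaTune: Semantic-Aware Online OS Tuning with Large Language Models | SemaTune: a semantic-aware online operating-system tuning framework based on large language models | large language model
3 | KGPFN: Unlocking the Potential of Knowledge Graph Foundation Model via In-Context Learning | Proposes KGPFN, unlocking the potential of knowledge graph foundation models via in-context learning | foundation model
4 | MediaClaw: Multimodal Intelligent-Agent Platform Technical Report | MediaClaw: a multimodal agent platform that addresses fragmentation and disconnected pipelines in AIGC deployment | multimodal
5 | APWA: A Distributed Architecture for Parallelizable Agentic Workflows | APWA: a distributed architecture for parallelizable agent workflows | large language model
6 | Dual-Dimensional Consistency: Balancing Budget and Quality in Adaptive Inference-Time Scaling | Proposes the Dual-Dimensional Consistency (DDC) framework to balance budget and quality in LLM inference acceleration | large language model
7 | SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning | Proposes SpeakerLLM, a speaker-specialized audio LLM for speaker understanding and verification reasoning | large language model
8 | Small, Private Language Models as Teammates for Educational Assessment Design | Uses small, private language models as teammates to assist educational assessment design | large language model
9 | A Deterministic Agentic Workflow for HS Tariff Classification: Multi-Dimensional Rule Reasoning with Interpretable Decisions | Proposes a deterministic agent workflow that tackles multi-dimensional rule reasoning over HS codes, enabling interpretable tariff classification | large language model
10 | AI Outperforms Humans in Personalized Image Aesthetics Assessment via LLM-Based Interviews and Semantic Feature Extraction | Proposes an AI system for personalized image aesthetics assessment, based on LLM interviews and semantic feature extraction, that outperforms humans | large language model
11 | SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades | SWE-Chain: a benchmark for evaluating coding agents on chained release-level package upgrade tasks | large language model
12 | Hypergraph Enterprise Agentic Reasoner over Heterogeneous Business Systems | Proposes HEAR, an enterprise agent based on hierarchical hypergraphs that tackles multi-hop reasoning over complex business systems | large language model
13 | TeachAnything: A Multimodal Crowdsourcing Platform for Training Embodied AI Agents in Symmetrical Reality | Proposes the TeachAnything platform for training embodied agents in symmetrical reality | embodied AI, multimodal
14 | Zero-Shot Goal Recognition with Large Language Models | Uses large language models for zero-shot goal recognition and explores their planning knowledge | large language model
15 | Teaching Large Language Models When Not to Know: Learning Temporal Critique for Ex-Ante Reasoning | Proposes the TCFT framework to improve large language models' awareness of temporal cutoff points in temporal reasoning | large language model
16 | SliceGraph: Mapping Process Isomers in Multi-Run Chain-of-Thought Reasoning | Proposes SliceGraph to analyze process isomers in multi-run CoT reasoning, revealing how intermediate computations are shared, split, and recombined | chain-of-thought
17 | Complacent, Not Sycophantic: Reframing Large Language Models and Designing AI Literacy for Complacent Machines | Reframes large language models from sycophantic to complacent, and designs AI literacy education for complacent machines | large language model
18 | Contestable Multi-Agent Debate with Arena-based Argumentative Computation for Multimedia Verification | Proposes a multi-agent debate framework based on arena-based argumentative computation for multimedia verification | large language model, multimodal
19 | OmniDrop: Layer-wise Token Pruning for Omni-modal LLMs via Query-Guidance | OmniDrop: a query-guided, layer-wise token pruning method for optimizing omni-modal LLMs | large language model, multimodal
20 | Stateful Reasoning via Insight Replay | Proposes InsightReplay to address the forgetting of key information in long-chain CoT reasoning | large language model, chain-of-thought
21 | DVMap: Fine-Grained Pluralistic Value Alignment via High-Consensus Demographic-Value Mapping | DVMap: fine-grained pluralistic value alignment via high-consensus demographic-value mapping | large language model, chain-of-thought
22 | One Step to the Side: Why Defenses Against Malicious Finetuning Fail Under Adaptive Adversaries | Reveals the fragility of defenses against malicious fine-tuning by proposing adaptive attacks that break existing defense mechanisms | foundation model
23 | Uncovering the Representation Geometry of Minimal Cores in Overcomplete Reasoning Traces | Reveals the representation geometry of minimal cores in overcomplete reasoning traces, enabling compression and purification of the reasoning process | chain-of-thought
24 | Runtime-Structured Task Decomposition for Agentic Coding Systems | Proposes runtime-structured task decomposition to improve the efficiency and reliability of agentic coding systems | large language model
25 | Amortized Energy-Based Bayesian Inference | Proposes an energy-based amortized Bayesian inference method to accelerate the solution of nonlinear inverse problems | multimodal
26 | Hidden in Memory: Sleeper Memory Poisoning in LLM Agents | Proposes the "sleeper memory poisoning" attack, revealing security risks in the long-term memory of LLM agents | large language model
27 | $π$-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows | Proposes $π$-Bench for evaluating the proactivity of personal assistant agents in long-horizon workflows | large language model
28 | Agentic Design of Compositional Descriptors via Autoresearch for Materials Science Applications | Proposes the Automat framework, which uses autonomous research to design chemical descriptors and improve the accuracy of material property prediction | large language model
29 | Prompt Segmentation and Annotation Optimisation: Controlling LLM Behaviour via Optimised Segment-Level Annotations | Proposes the PSAO framework, which improves LLM controllability and efficiency by optimizing segment-level prompt annotations | large language model
30 | Cattle Trade: A Multi-Agent Benchmark for LLM Bluffing, Bidding, and Bargaining | Proposes the Cattle Trade multi-agent benchmark for evaluating LLM abilities in strategic reasoning, game playing, and bargaining | large language model
31 | Deepchecks: Evaluating Retrieval-Augmented Generation (RAG) | Deepchecks: a comprehensive framework for evaluating retrieval-augmented generation (RAG) systems | large language model
32 | BEAM: Binary Expert Activation Masking for Dynamic Routing in MoE | Proposes BEAM, a binary expert activation masking method for dynamic routing in MoE that improves inference efficiency | large language model
33 | Correctness-Aware Repository Filtering Under Maximum Effective Context Window Constraints | Proposes a correctness-aware repository filtering framework for LLM-based code tools under effective context window constraints | large language model
34 | Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support | Hydra: efficient, correct code generation via checkpoint-and-rollback support | large language model
35 | Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games | Proposes a game-strategy watermarking method for detecting AI cheating in perfect-information extensive-form games | large language model
36 | Agentic AI Ecosystems in Higher Education: A Perspective on AI Agents to Emerging Inclusive, Agentic Multi-Agent AI Framework for Learning, Teaching and Institutional Intelligence | Proposes an agentic multi-agent AI framework for higher education to support inclusive learning and institutional intelligence | multimodal
37 | Good to Go: The LOOP Skill Engine That Hits 99% Success and Slashes Token Usage by 99% via One-Shot Recording and Deterministic Replay | LOOP Skill Engine: achieves a 99% success rate and cuts token consumption by 99% via one-shot recording and deterministic replay | large language model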

🔬 Pillar 2: RL & Architecture (14 papers)

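# | Title | One-line summary | Tags
38 | Agentifying Patient Dynamics within LLMs through Interacting with Clinical World Model | Proposes SepsisAgent to optimize sepsis management in the intensive care unit | reinforcement learning, behavior cloning, world model
39 | Case-Based Calibration of Adaptive Reasoning and Execution for LLM Tool Use | Proposes the CAST framework, which uses cases to calibrate adaptive reasoning and execution in LLM tool use | reinforcement learning, reward design, large language model
40 | IFPV: An Integrated Multi-Agent Framework for Generative Operational Planning and High-Fidelity Plan Verification | Proposes the IFPV framework to address operational plan generation and verification in complex battlefield environments | world model, world models, large language model
41 | Probabilistic Verification of Recurrent Neural Networks for Single and Multi-Agent Reinforcement Learning | Proposes RNN-ProVe for probabilistic verification of RNN policies in reinforcement learning | reinforcement learning
42 | Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution | Solvita: enhances the competitive programming ability of large language models via agentic evolution | reinforcement learning, large language model
43 | Prompting Policies for Multi-step Reasoning and Tool-Use in Black-box LLMs with Iterative Distillation of Experience | Proposes prompting policies based on iterative distillation of experience, improving black-box LLM performance on complex reasoning and tool-use tasks | reinforcement learning, distillation, large language model
44 | LEMON: Learning Executable Multi-Agent Orchestration via Counterfactual Reinforcement Learning | LEMON: learns executable multi-agent orchestration via counterfactual reinforcement learning | reinforcement learning, large language model
45 | Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning | Proposes Darwin Family, which improves language-model reasoning through training-free evolutionary merging | Mamba, large language model, foundation model
46 | GenCircuit-RL: Reinforcement Learning from Hierarchical Verification for Genetic Circuit Design | Proposes GenCircuit-RL, which applies reinforcement learning from hierarchical verification to genetic circuit design | reinforcement learning, curriculum learning
47 | Coding Agent Is Good As World Simulator | Proposes a physical world modeling framework based on code-generation agents, improving the physical realism of interactive simulation environments | world model, world models, physically plausible
48 | ASH: Agents that Self-Hone via Embodied Learning | ASH: agents that improve themselves through embodied learning, tackling long-horizon tasks | reward shaping, foundation model
49 | A Unified Knowledge Embedded Reinforcement Learning-based Framework for Generalized Capacitated Vehicle Routing Problems | Proposes a knowledge-embedded reinforcement learning framework for generalized capacitated vehicle routing problems | reinforcement learning
50 | MetaAgent-X : Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning | MetaAgent-X: breaks the performance ceiling of automatic multi-agent systems via end-to-end reinforcement learning | reinforcement learning
51 | Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL | For e-commerce search, proposes an efficient generative retrieval framework based on semantic cluster IDs and expert-guided reinforcement learning | reinforcement learning, contrastive learning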

🔬 Pillar 1: Robot Control (2 papers)

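# | Title | One-line summary | Tags
52 | When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution | Proposes the LongAct benchmark and the HoloMind agent for evaluating and improving long-horizon household task execution by robots | manipulation, world model, world models
53 | From LLM-Generated Conjectures to Lean Formalizations: Automated Polynomial Inequality Proving via Sum-of-Squares Certificates | Proposes the NSPI framework, which combines LLMs with symbolic computation for verifiable automated proving of polynomial inequalities | manipulation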

🔬 Pillar 6: Video Extraction (1 paper)

# | Title | One-line summary | Tags
54 | VerbalValue: A Socially Intelligent Virtual Host for Sales-Driven Live Commerce | VerbalValue: a socially intelligent virtual host for sales-driven live commerce | HuMoR, large language model

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-line summary | Tags
55 | Beyond Partner Diversity: An Influence-Based Team Steering Framework for Zero-Shot Human-Machine Teaming | Proposes an influence-based team steering framework for zero-shot human-machine teaming | dyadic interaction

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line summary | Tags
56 | SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents | SimPersona: learns discrete buyer personas from raw clickstreams for grounded e-commerce agents | VQ-VAE
