cs.AI (2026-05-13)

📊 27 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (15) · Pillar 2: RL & Architecture (7 🔗1) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1) · Pillar 5: Interaction & Reaction (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (15 papers)

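# | Title | One-line summary | Tags | 🔗
1 | MMSkills: Towards Multimodal Skills for General Visual Agents | MMSkills: a multimodal skill framework for general visual agents that improves decision-making | multimodal visual grounding
2 | (How) Do Large Language Models Understand High-Level Message Sequence Charts? | Evaluates how well large language models understand the semantics of high-level message sequence charts | large language model
3 | Assessing the Creativity of Large Language Models: Testing, Limits, and New Frontiers | Assesses LLM creativity (tests, limits, and new directions) and proposes DRAT, which effectively predicts scientific ideation ability | large language model
4 | AI Harness Engineering: A Runtime Substrate for Foundation-Model Software Agents | Proposes AI Harness Engineering to improve the reliability of foundation models in software engineering | foundation model
5 | Compact Latent Manifold Translation: A Parameter-Efficient Foundation Model for Cross-Modal and Cross-Frequency Physiological Signal Synthesis | Proposes Compact Latent Manifold Translation (CLMT) for cross-modal and cross-frequency physiological signal synthesis, enabling deployment on edge devices | foundation model
6 | Multimodal Hidden Markov Models for Persistent Emotional State Tracking | Proposes a multimodal hidden Markov model framework for persistent emotional state tracking, aimed at understanding emotional arcs in conversation | multimodal
7 | Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs | Reveals a gap between representation and action in omnimodal LLMs and proposes a probe-guided logit adjustment method | large language model multimodal
8 | ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles | ScioMind: a cognitively grounded social simulation framework that improves the behavioral realism of LLM-driven multi-agent systems | large language model
9 | Identifying AI Web Scrapers Using Canary Tokens | Proposes a canary-token-based method for identifying AI web scrapers, addressing the difficulty of tracing the provenance of LLM training data | large language model
10 | The Readability Spectrum: Patterns, Issues, and Prompt Effects in LLM-Generated Code | Builds a code readability assessment model and reveals readability patterns and influencing factors in LLM-generated code | large language model
11 | It's not the Language Model, it's the Tool: Deterministic Mediation for Scientific Workflows | Proposes a deterministic mediation pattern in which a language model orchestrates deterministic tools, addressing irreproducible results in scientific workflows | foundation model
12 | Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education | Proposes KITE, a retrieval-augmented tutoring system for algorithm tracing and problem-solving | multimodal
13 | When Attention Closes: How LLMs Lose the Thread in Multi-Turn Interaction | Proposes the Goal Accessibility Ratio (GAR) to diagnose how LLMs lose context in multi-turn interaction, revealing the information that remains after the attention mechanism fails | large language model
14 | Beyond Cooperative Simulators: Generating Realistic User Personas for Robust Evaluation of LLM Agents | Proposes Persona Policies (PPol) to generate more realistic user personas, improving the robustness of LLM agents | large language model
15 | Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis | Proposes a survival-analysis-based LLM safety evaluation framework that quantifies safety degradation under repeated attacks | large language model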

🔬 Pillar 2: RL & Architecture (7 papers)

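# | Title | One-line summary | Tags | 🔗
16 | D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models | D-VLA: a high-concurrency distributed asynchronous reinforcement learning framework for vision-language-action models | reinforcement learning embodied AI vision-language-action
17 | An Agentic AI Framework with Large Language Models and Chain-of-Thought for UAV-Assisted Logistics Scheduling with Mobile Edge Computing | Proposes a UAV-assisted logistics scheduling framework based on agentic AI and hierarchical reinforcement learning, addressing the coupling of physical logistics and computing tasks | reinforcement learning deep reinforcement learning PPO
18 | Respecting Self-Uncertainty in On-Policy Self-Distillation for Efficient LLM Reasoning | Proposes EGRSD and CL-EGRSD, which improve LLM reasoning efficiency through entropy-guided self-distillation | teacher-student distillation chain-of-thought
19 | Embodied Multi-Agent Coordination by Aligning World Models Through Dialogue | Aligns world models through dialogue to achieve embodied multi-agent coordination | world model world models
20 | PROMETHEUS: Automating Deep Causal Research Integrating Text, Data and Models | PROMETHEUS: automates deep causal research by integrating text, data, and models | world model world models large language model
21 | ChipMATE: Multi-Agent Training via Reinforcement Learning for Enhanced RTL Generation | ChipMATE: the first self-training multi-agent RTL generation framework, improving code quality through reinforcement learning | reinforcement learning
22 | Improving Code Translation with Syntax-Guided and Semantic-aware Preference Optimization | Proposes CTO, which improves code translation quality through syntax-guided and semantic-aware preference optimization | direct preference optimization contrastive learning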

🔬 Pillar 1: Robot Control (2 papers)

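# | Title | One-line summary | Tags | 🔗
23 | Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning | Ego2World: compiles egocentric cooking videos into executable worlds for belief-state planning | sim-to-real egocentric
24 | Humanwashing -- It Should Leave You Feeling Dirty | Exposes misuse of the "human-in-the-loop" concept and warns that "humanwashing" can mask the risks of AI decision systems | manipulation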

🔬 Pillar 8: Physics-based Animation (1 paper)

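# | Title | One-line summary | Tags | 🔗
25 | Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention | DiffLNS: combines discrete diffusion with a sparse attention mechanism to solve multi-agent path finding in complex, congested environments | spatiotemporal multimodal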

🔬 Pillar 5: Interaction & Reaction (1 paper)

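# | Title | One-line summary | Tags | 🔗
26 | Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling | Proposes a simple, unified scaling method that brings reasoning models to Olympiad gold-medal level | IMoS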

🔬 Pillar 3: Perception & Semantics (1 paper)

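# | Title | One-line summary | Tags | 🔗
27 | MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning | Proposes the MAP framework, addressing insufficient environment understanding in long-horizon interactive agent reasoning | affordance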
