cs.AI (2026-04-16)

📊 51 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (33 🔗2) · Pillar 2: RL & Architecture (11 🔗3) · Pillar 1: Robot Control (3 🔗1) · Pillar 6: Video Extraction (2) · Pillar 3: Perception & Semantics (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (33 papers)

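| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Predicting Power-System Dynamic Trajectories with Foundation Models | Proposes LASS-ODE-Power, which uses large-scale pretraining to predict power-system dynamic trajectories. | foundation model | |
| 2 | Intermediate Layers Encode Optimal Biological Representations in Single-Cell Foundation Models | In single-cell foundation models, intermediate layers encode the optimal biological representations, outperforming conventional last-layer feature extraction. | foundation model | |
| 3 | Dissecting Failure Dynamics in Large Language Model Reasoning | Proposes the GUARD framework, which uses uncertainty signals to detect and correct early errors in large language model reasoning. | large language model | |
| 4 | Rethinking Patient Education as Multi-turn Multi-modal Interaction | Proposes the MedImageEdu benchmark for evaluating multimodal interactive patient-education agents. | multimodal visual grounding | |
| 5 | MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror | MirrorBench: evaluates self-centric intelligence in multimodal large language models by introducing a mirror. | large language model multimodal | |
| 6 | DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation | Proposes DR³-Eval, a realistic and reproducible benchmark for evaluating deep research agents. | multimodal instruction following | |
| 7 | Context Over Content: Exposing Evaluation Faking in Automated Judges | Exposes contextual bias in LLM evaluation: information about downstream consequences distorts evaluation results. | chain-of-thought | |
| 8 | AIPC: Agent-Based Automation for AI Model Deployment with Qualcomm AI Runtime | AIPC: an agent-based framework for automated AI model deployment that accelerates deployment on Qualcomm AI Runtime. | multimodal | |
| 9 | VeriGraphi: A Multi-Agent Framework of Hierarchical RTL Generation for Large Hardware Designs | Proposes the VeriGraphi framework to address the difficulty of having LLMs generate Verilog code for large hierarchical hardware designs. | large language model | |
| 10 | Scepsy: Serving Agentic Workflows Using Aggregate LLM Pipelines | Scepsy: serves agentic workflows using aggregate LLM pipelines, raising throughput and lowering latency. | large language model | |
| 11 | Autonomous Evolution of EDA Tools: Multi-Agent Self-Evolved ABC | Proposes a self-evolving logic synthesis framework to improve EDA tool performance. | large language model | |
| 12 | From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench | Proposes ProVoice-Bench for evaluating proactive voice agents, filling a gap in existing benchmarks. | multimodal | |
| 13 | Dr. RTL: Autonomous Agentic RTL Optimization through Tool-Grounded Self-Improvement | Dr. RTL: continuously optimizes RTL designs through a tool-grounded autonomous agent. | large language model | |
| 14 | Discovering Novel LLM Experts via Task-Capability Coevolution | Proposes the AC/DC framework, which discovers LLMs with novel skills and higher efficiency through task-capability coevolution. | large language model | |
| 15 | Governing Reflective Human-AI Collaboration: A Framework for Epistemic Scaffolding and Traceable Reasoning | Proposes a human-AI collaboration framework that improves AI governance through epistemic scaffolding and traceable reasoning. | large language model | |
| 16 | MemoSight: Unifying Context Compression and Multi Token Prediction for Reasoning Acceleration | MemoSight: unifies context compression and multi-token prediction to accelerate LLM reasoning. | chain-of-thought | |
| 17 | The Missing Knowledge Layer in AI: A Framework for Stable Human-AI Reasoning | Proposes a framework for stable human-AI reasoning that addresses reasoning drift in large language models. | large language model | |
| 18 | The LLM Fallacy: Misattribution in AI-Assisted Cognitive Workflows | Exposes the LLM fallacy: the misattribution of capability in AI-assisted cognitive workflows. | large language model | |
| 19 | Bounded Autonomy for Enterprise AI: Typed Action Contracts and Consumer-Side Execution | Proposes a Bounded Autonomy architecture for enterprise AI that ensures safe LLM execution and improves efficiency. | large language model | |
| 20 | The Agentification of Scientific Research: A Physicist's Perspective | AI agents empowering research: from tool to collaborator, reshaping the paradigm of scientific research. | large language model | |
| 21 | HWE-Bench: Benchmarking LLM Agents on Real-World Hardware Bug Repair Tasks | HWE-Bench: the first large-scale LLM agent benchmark for real-world hardware bug repair tasks. | large language model | |
| 22 | Learning to Draw ASCII Improves Spatial Reasoning in Language Models | Improves the spatial reasoning of language models by teaching them to draw ASCII diagrams. | large language model | |
| 23 | El Agente Forjador: Task-Driven Agent Generation for Quantum Simulation | El Agente Forjador: a task-driven agent generation framework for quantum simulation. | large language model | |
| 24 | GDPR Auto-Formalization with AI Agents and Human Verification | Proposes a GDPR auto-formalization framework based on AI agents and human verification, improving the quality of legal text processing. | large language model | |
| 25 | Quantifying Cross-Query Contradictions in Multi-Query LLM Reasoning | Proposes a solver-augmented multi-query LLM reasoning method that addresses logical contradictions across queries. | large language model | |
| 26 | Analyzing Chain of Thought (CoT) Approaches in Control Flow Code Deobfuscation Tasks | Proposes an LLM code deobfuscation method based on chain-of-thought (CoT) prompting that improves control-flow recovery and semantic preservation. | large language model chain-of-thought | |
| 27 | LLMbench: A Comparative Close Reading Workbench for Large Language Models | LLMbench: a browser-based workbench for comparative close reading of large language models. | large language model | |
| 28 | Ragged Paged Attention: A High-Performance and Flexible LLM Inference Kernel for TPU | Proposes Ragged Paged Attention, a high-performance and flexible kernel for LLM inference on TPU. | large language model | |
| 29 | LACE: Lattice Attention for Cross-thread Exploration | Proposes the LACE framework to address isolated reasoning in large language models. | large language model | |
| 30 | The Semi-Executable Stack: Agentic Software Engineering and the Expanding Scope of SE | Proposes the semi-executable stack model to address the AI-driven expansion of software engineering's scope to semi-executable artifacts. | foundation model | |
| 31 | The Crutch or the Ceiling? How Different Generations of LLMs Shape EFL Student Writings | Studies how different generations of LLMs affect EFL student writing: crutch or ceiling? | large language model | |
| 32 | HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents? | Proposes HarmfulSkillBench to evaluate the safety of LLM agents in environments containing harmful skills. | large language model | |
| 33 | Exploring LLM-based Verilog Code Generation with Data-Efficient Fine-Tuning and Testbench Automation | Proposes an LLM-based Verilog code generation method that improves performance through data-efficient fine-tuning and testbench automation. | large language model | |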

🔬 Pillar 2: RL & Architecture (11 papers)

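| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 34 | CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning | Proposes CoTEvol, which synthesizes mathematical reasoning data through self-evolving chains of thought. | distillation large language model chain-of-thought | |
| 35 | Disentangle-then-Refine: LLM-Guided Decoupling and Structure-Aware Refinement for Graph Contrastive Learning | Proposes the SDM-SCR framework to address the entanglement of signal and noise in graph contrastive learning. | contrastive learning large language model instruction following | |
| 36 | Towards Faster Language Model Inference Using Mixture-of-Experts Flow Matching | Proposes the MoE-FM framework, which accelerates non-autoregressive language model inference and significantly improves efficiency. | flow matching Mamba multimodal | |
| 37 | Targeted Exploration via Unified Entropy Control for Reinforcement Learning | Proposes UEC-RL, which uses unified entropy control to address insufficient exploration and policy collapse in reinforcement learning. | reinforcement learning large language model | |
| 38 | MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation | Proposes MARS$^2$, which scales code generation capability through multi-agent tree search trained with reinforcement learning. | reinforcement learning reward shaping | |
| 39 | IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning | Proposes IG-Search, which uses information-gain rewards to improve search-augmented reasoning. | reinforcement learning large language model | |
| 40 | Temporal Cross-Modal Knowledge-Distillation-Based Transfer-Learning for Gas Turbine Vibration Fault Detection | Proposes a temporal cross-modal knowledge-distillation transfer-learning framework for gas turbine vibration fault detection. | distillation | |
| 41 | Acceptance Dynamics Across Cognitive Domains in Speculative Decoding | Studies how cognitive domain affects acceptance rates in speculative decoding, providing a basis for domain-adaptive optimization. | RLHF large language model | |
| 42 | Asking What Matters: Reward-Driven Clarification for Software Engineering Tasks | Proposes CLARITI, which improves the efficiency of software engineering tasks through reward-driven generation of clarifying questions. | reinforcement learning reward design | |
| 43 | Targeted Exploration via Unified Entropy Control for Reinforcement Learning | Proposes UEC-RL, which uses unified entropy control to address insufficient exploration and policy collapse in reinforcement learning. | reinforcement learning large language model | |
| 44 | Subliminal Transfer of Unsafe Behaviors in AI Agent Distillation | Shows that in AI agent distillation, unsafe behaviors are still transferred subliminally even when the data has been rigorously filtered. | distillation | |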

🔬 Pillar 1: Robot Control (3 papers)

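| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 45 | Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection | Proposes the AudioHijack framework, which mounts imperceptible audio prompt injection attacks on large audio-language models. | manipulation | |
| 46 | CSLE: A Reinforcement Learning Platform for Autonomous Security Management | CSLE: a reinforcement learning platform for autonomous security management that bridges the gap between simulation and real systems. | recovery control reinforcement learning | |
| 47 | SecureRouter: Encrypted Routing for Efficient Secure Inference | SecureRouter: an encrypted routing framework for efficient secure inference. | MPC | |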

🔬 Pillar 6: Video Extraction (2 papers)

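| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 48 | Learning to Think Like a Cartoon Captionist: Incongruity-Resolution Supervision for Multimodal Humor Understanding | Proposes the IRS framework, which improves multimodal humor understanding through incongruity-resolution supervision. | HuMoR multimodal zero-shot transfer | |
| 49 | GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology | GIST: multimodal knowledge extraction and spatial grounding via intelligent semantic topology. | egocentric embodied AI multimodal | |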

🔬 Pillar 3: Perception & Semantics (1 paper)

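| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 50 | ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints | ADAPT: a benchmark for commonsense planning under unspecified affordance constraints. | affordance | |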

🔬 Pillar 8: Physics-based Animation (1 paper)

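| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 51 | Sequence Search: Automated Sequence Design using Neural Architecture Search | Proposes Sequence Search, which uses neural architecture search to automate the design of magnetic resonance sequences. | PULSE | |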
