cs.AI (2026-04-30)

📊 50 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (36 🔗1) · Pillar 2: RL & Architecture (9) · Pillar 1: Robot Control (2) · Pillar 3: Perception & Semantics (2) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 9: Embodied Foundation Models (36 papers)

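# | Title | Key point | Tags | 🔗
1 | From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation | Reveals a "mirage" phenomenon in MLLM circuit-diagram-to-Verilog code generation and proposes the VeriGround model to improve reliability. | large language model, multimodal, visual grounding
2 | Heterogeneous Scientific Foundation Model Collaboration | Eywa: a collaboration framework for heterogeneous scientific foundation models that extends agentic LLMs to scientific domains. | large language model, foundation model
3 | InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation? | InteractWeb-Bench: evaluates whether multimodal agents can avoid blind execution in interactive website generation. | large language model, multimodal
4 | Design Structure Matrix Modularization with Large Language Models | Uses large language models to modularize design structure matrices without specialized optimization code. | large language model
5 | Iterative Multimodal Retrieval-Augmented Generation for Medical Question Answering | Proposes MED-VRAG, an iterative multimodal retrieval-augmented generation framework for medical question answering. | multimodal
6 | Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations | Proposes the Veroic framework for risk-aware inference control in LLM services via verifiable observations. | large language model
7 | SpecVQA: A Benchmark for Spectral Understanding and Visual Question Answering in Scientific Images | Proposes SpecVQA, a specialized benchmark for spectral understanding and visual question answering in scientific images. | large language model, multimodal
8 | The Effects of Visual Priming on Cooperative Behavior in Vision-Language Models | Studies how visual priming affects cooperative behavior in vision-language models, using the iterated prisoner's dilemma as the test setting. | chain-of-thought
9 | Exploring Interaction Paradigms for LLM Agents in Scientific Visualization | Explores interaction paradigms for LLM agents in scientific visualization, weighing performance, robustness, and flexibility. | large language model
10 | Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs | Proposes the MEDS dataset for evaluating LLMs' abilities, biases, and psychological traits in math education. | large language model
11 | What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design | Offers guidance for designing terminal-agent benchmark tasks, emphasizing adversarial design, difficulty, and legibility. | large language model
12 | Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents | CARE: a three-party collaborative engineering methodology for AI agents that improves the efficiency of developing LLM agents for scientific domains. | large language model
13 | LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning | Proposes an LLM+ASP framework that uses self-correction to achieve task-agnostic nonmonotonic reasoning. | large language model
14 | MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection | Proposes MM-StanceDet, a retrieval-augmented multi-agent framework that addresses fusion challenges in multimodal stance detection. | multimodal
15 | MCPHunt: An Evaluation Framework for Cross-Boundary Data Propagation in Multi-Server MCP Agents | MCPHunt: an evaluation framework for cross-boundary data propagation in multi-server MCP agents. | instruction following
16 | Post-Optimization Adaptive Rank Allocation for LoRA | Proposes PARA, a post-optimization adaptive rank allocation method for LoRA that improves parameter efficiency. | foundation model
17 | Test Before You Deploy: Governing Updates in the LLM Supply Chain | Proposes a governance framework for the LLM supply chain that safeguards the compatibility and safety of LLM updates on the deployment side. | large language model
18 | RuC: HDL-Agnostic Rule Completion Benchmark Generation | RuC: an HDL-agnostic, rule-based framework for generating controllable code-completion benchmarks. | large language model
19 | Intent2Tx: Benchmarking LLMs for Translating Natural Language Intents into Ethereum Transactions | Intent2Tx: builds a benchmark that evaluates LLMs' ability to translate natural-language intents into Ethereum transactions. | large language model
20 | Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation | PAD-Rec: position-aware drafting that accelerates inference for LLM-based generative list-wise recommendation. | large language model
21 | AgentEconomist: An End-to-end Agentic System Translating Economic Intuitions into Executable Computational Experiments | AgentEconomist: an end-to-end agentic system that translates economic intuitions into executable computational experiments. | large language model
22 | Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents | Proposes ValuePlanner to address long-term autonomous behavior decision-making for embodied agents. | instruction following
23 | When Agents Evolve, Institutions Follow | Proposes a multi-agent architecture based on historical political institutions to improve the collective intelligence of LLMs. | large language model
24 | HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs | HAVEN: a hybrid automated verification engine that uses LLMs for UVM testbench synthesis. | large language model
25 | Optimization before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading | Exposes an LLM evaluation pitfall: unoptimized prompts can distort model rankings. | large language model
26 | Political Bias Audits of LLMs Capture Sycophancy to the Inferred Auditor | Reveals the relationship between LLMs' political bias and sycophancy toward the auditor. | large language model
27 | In-Context Examples Suppress Scientific Knowledge Recall in LLMs | Shows that in-context examples suppress the recall of scientific knowledge in LLMs, pushing models toward empirical pattern fitting. | large language model
28 | Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study | A layered review of the security risks of autonomous agent frameworks and the corresponding defense strategies. | large language model
29 | Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations | Proposes an end-to-end LLM framework that automates SOC threat detection, query generation, and incident response. | large language model
30 | METASYMBO: Multi-Agent Language-Guided Metamaterial Discovery via Symbolic Latent Evolution | Proposes MetaSymbO, which achieves language-guided multi-agent metamaterial discovery through symbol-driven latent evolution. | large language model
31 | Evaluating Epistemic Guardrails in AI Reading Assistants: A Behavioral Audit of a Minimal Prototype | Proposes a protocol for evaluating epistemic guardrails in AI reading assistants, revealing interaction behavior dynamics and boundary functions. | large language model
32 | ARMOR 2025: A Military-Aligned Benchmark for Evaluating Large Language Model Safety Beyond Civilian Contexts | Proposes ARMOR 2025, an LLM safety evaluation benchmark for military scenarios. | large language model
33 | Minimal, Local, Causal Explanations for Jailbreak Success in Large Language Models | Proposes the LOCA method, which provides minimal, local, causal explanations for jailbreak attacks on large language models. | large language model
34 | DeGenTWeb: A First Look at LLM-dominant Websites | DeGenTWeb: the first systematic identification and analysis of LLM-dominant websites, revealing their prevalence and evolution trends. | large language model
35 | A Survey of Reasoning-Intensive Retrieval: Progress and Challenges | A survey of reasoning-intensive retrieval that systematically analyzes existing methods and outlines future research directions. | large language model
36 | TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data | TADI: tool-augmented drilling intelligence achieved by agentic LLM orchestration over heterogeneous wellsite data. | large language model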

🔬 Pillar 2: RL & Architecture (9 papers)

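# | Title | Key point | Tags | 🔗
37 | Rethinking Agentic Reinforcement Learning In Large Language Models | Agentic reinforcement learning with large language models: rethinking agent autonomy. | reinforcement learning, large language model
38 | WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning | WaferSAGE: wafer defect analysis using synthetic data and rubric-guided reinforcement learning. | reinforcement learning, large language model
39 | Simulating clinical interventions with a generative multimodal model of human physiology | HealthFormer: a Transformer-based generative multimodal model for simulating clinical interventions and predicting physiological trajectories. | world model, world models, multimodal
40 | GUI Agents with Reinforcement Learning: Toward Digital Inhabitants | A survey of GUI agent research that explores applications of reinforcement learning in GUI automation and future directions. | reinforcement learning, offline RL, world model
41 | Graph World Models: Concepts, Taxonomy, and Future Directions | Proposes the concept of graph world models (GWM) to systematically address the limitations of traditional world models in complex environments. | world model, world models
42 | LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis | Proposes an LLM-based method for refining clinical graph structure, improving representation learning for EEG seizure diagnosis. | representation learning, large language model
43 | Towards Neuro-symbolic Causal Rule Synthesis, Verification, and Evaluation Grounded in Legal and Safety Principles | Proposes a neuro-symbolic causal rule synthesis framework for automatically generating and verifying rules in safety-critical scenarios. | reinforcement learning, deep reinforcement learning, large language model
44 | TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization | Proposes TUR-DPO, a topology- and uncertainty-aware direct preference optimization method that improves LLM reasoning. | reinforcement learning, PPO, RLHF
45 | XekRung Technical Report | XekRung: an advanced large language model for cybersecurity. | reinforcement learning, large language model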

🔬 Pillar 1: Robot Control (2 papers)

# | Title | Key point | Tags | 🔗
46 | PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations | PRTS: a robot VLA foundation model for primitive reasoning and tasking via contrastive representations. | manipulation, reachability-aware, reinforcement learning
47 | RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses | RHyVE: proposes competence-aware verification and phase-aware deployment for LLM-generated reward hypotheses. | manipulation, reinforcement learning, reward design

🔬 Pillar 3: Perception & Semantics (2 papers)

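# | Title | Key point | Tags | 🔗
48 | A Pattern Language for Resilient Visual Agents | Proposes an architectural pattern language for visual agents that addresses the challenges of integrating vision-language-action models in enterprise environments. | affordance, vision-language-action, VLA
49 | Knowledge Affordances for Hybrid Human-AI Information Seeking | Proposes the concept of knowledge affordances (KA) for agents' information-seeking decisions in hybrid human-AI environments. | affordance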

🔬 Pillar 7: Motion Retargeting (1 paper)

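# | Title | Key point | Tags | 🔗
50 | SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation | Proposes the SpatialGrammar domain-specific language, which improves the spatial consistency of LLM-generated 3D indoor scenes. | spatial relationship, embodied AI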
