cs.AI(2026-03-31)

📊 35 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (26 🔗5) · Pillar 2: RL & Architecture (6 🔗1) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (26 papers)

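| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Webscraper: Leverage Multimodal Large Language Models for Index-Content Web Scraping | Webscraper: leverages multimodal large language models for index-content web scraping. | large language model, multimodal | |
| 2 | Xuanwu: Evolving General Multimodal Models into an Industrial-Grade Foundation for Content Ecosystems | Xuanwu VL-2B: an industrial-grade general-purpose multimodal foundation model for content ecosystems. | foundation model, multimodal | |
| 3 | AEC-Bench: A Multimodal Benchmark for Agentic Systems in Architecture, Engineering, and Construction | AEC-Bench: a multimodal benchmark for agentic systems in architecture, engineering, and construction. | foundation model, multimodal | |
| 4 | Bethe Ansatz with a Large Language Model | Uses a large language model to solve the coordinate Bethe Ansatz and discovers new integrable spin-chain models. | large language model | |
| 5 | ScoringBench: A Benchmark for Evaluating Tabular Foundation Models with Proper Scoring Rules | Proposes ScoringBench, which evaluates tabular foundation models with proper scoring rules to improve decision quality. | foundation model | |
| 6 | Spontaneous Functional Differentiation in Large Language Models: A Brain-Like Intelligence Economy | Large language models exhibit emergent, spontaneous functional differentiation, forming a brain-like intelligence economy. | large language model | |
| 7 | KEditVis: A Visual Analytics System for Knowledge Editing of Large Language Models | KEditVis: a visual analytics system for knowledge editing of large language models. | large language model | |
| 8 | Knowledge database development by large language models for countermeasures against viruses and marine toxins | Uses large language models to build a knowledge base on viruses and marine toxins, accelerating medical countermeasure development. | large language model | |
| 9 | SciVisAgentBench: A Benchmark for Evaluating Scientific Data Analysis and Visualization Agents | Proposes SciVisAgentBench, a benchmark for evaluating scientific data analysis and visualization agents. | large language model, multimodal | |
| 10 | ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation | Proposes the ATP-Bench benchmark for evaluating the agentic tool planning ability of MLLMs in interleaved generation tasks. | large language model, multimodal | |
| 11 | Software Vulnerability Detection Using a Lightweight Graph Neural Network | Proposes VulGNN, a lightweight graph neural network for efficient software vulnerability detection. | large language model | |
| 12 | Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks | Proposes a system-level defense architecture for AI agents against indirect prompt injection attacks. | large language model | |
| 13 | SISA: A Scale-In Systolic Array for GEMM Acceleration | SISA: a scale-in systolic array for GEMM acceleration. | large language model | |
| 14 | Owl-AuraID 1.0: An Intelligent System for Autonomous Scientific Instrumentation and Scientific Data Analysis | Owl-AuraID: an intelligent system for autonomous scientific instrumentation based on GUI-native operation. | multimodal | |
| 15 | BotVerse: Real-Time Event-Driven Simulation of Social Agents | BotVerse: an LLM-agent-based framework for real-time, event-driven social simulation. | multimodal | |
| 16 | Measuring the metacognition of AI | Proposes evaluating AI metacognition with the meta-d' framework and signal detection theory, to improve the reliability of AI decisions. | large language model | |
| 17 | Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupling and the Limits of the Dunning-Kruger Metaphor | AI-mediated metacognitive decoupling: goes beyond the Dunning-Kruger effect to reveal the cognitive impact of LLM use. | large language model | |
| 18 | View-oriented Conversation Compiler for Agent Trace Analysis | Proposes VCC, which compiles agent conversation logs into structured views to improve in-context learning. | chain-of-thought | |
| 19 | An Empirical Study of Multi-Agent Collaboration for Automated Research | Compares the performance and stability of multi-agent collaboration frameworks for automated research. | large language model | |
| 20 | ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities | ELT-Bench-Verified: shows that benchmark quality issues lead to underestimating AI agent capabilities, and proposes improvements. | large language model | |
| 21 | Sima AIunty: Caste Audit in LLM-Driven Matchmaking | Sima AIunty: an audit of caste-based bias in LLM-driven matchmaking. | large language model | |
| 22 | Route-Induced Density and Stability (RIDE): Controlled Intervention and Mechanism Analysis of Routing-Style Meta Prompts on LLM Internal States | RIDE: intervenes on and analyzes LLM internal states via routing-style meta prompts, revealing the relationship between density and stability. | large language model | |
| 23 | SimMOF: AI agent for Automated MOF Simulations | SimMOF: an LLM-based multi-agent framework that automates metal-organic framework (MOF) simulation workflows. | large language model | |
| 24 | REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour | REFINE: a real-world interactive feedback system for exploring interactive feedback and student behaviour. | large language model | |
| 25 | GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification | GISTBench: proposes an evidence-based user interest verification benchmark for evaluating LLM user understanding in recommender systems. | large language model | |
| 26 | WybeCoder: Verified Imperative Code Generation | Proposes WybeCoder, which performs verification alongside imperative code generation to improve code quality. | large language model | |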

🔬 Pillar 2: RL & Architecture (6 papers)

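| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification | Proposes MMAE, a mean masked autoencoder with flow mixing, for encrypted traffic classification. | masked autoencoder, MAE, teacher-student | |
| 28 | Reinforced Reasoning for End-to-End Retrosynthetic Planning | Proposes ReTriP for end-to-end retrosynthetic planning, improving the robustness of long-horizon planning. | reinforcement learning, distillation, chain-of-thought | |
| 29 | Self-Improving Code Generation via Semantic Entropy and Behavioral Consensus | ConSelf: a self-improving code generation method based on semantic entropy and behavioral consensus. | DPO, direct preference optimization, large language model | |
| 30 | Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries | Proposes PRoSFI, which improves the verifiability of LLM logical reasoning through structured formal intermediate representations. | reinforcement learning, large language model | |
| 31 | ShapE-GRPO: Shapley-Enhanced Reward Allocation for Multi-Candidate LLM Training | ShapE-GRPO: uses Shapley values to optimize reward allocation in multi-candidate LLM training. | reinforcement learning, large language model | |
| 32 | ASI-Evolve: AI Accelerates AI | ASI-Evolve: uses AI to accelerate AI's own development, enabling AI-driven discovery of data, architectures, and algorithms. | reinforcement learning, linear attention | |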

🔬 Pillar 1: Robot Control (2 papers)

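| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Semantic Interaction for Narrative Map Sensemaking: An Insight-based Evaluation | Proposes narrative maps based on semantic interaction to improve insight in narrative sensemaking. | manipulation | |
| 34 | Security in LLM-as-a-Judge: A Comprehensive SoK | The first systematization-of-knowledge study of LLM-as-a-Judge security, revealing potential risks and defense strategies. | manipulation | |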

🔬 Pillar 8: Physics-based Animation (1 paper)

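| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 35 | CausalPulse: An Industrial-Grade Neurosymbolic Multi-Agent Copilot for Causal Diagnostics in Smart Manufacturing | CausalPulse: an industrial-grade neurosymbolic multi-agent copilot for causal diagnostics in smart manufacturing. | PULSE | |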
