cs.AI (2026-03-31)

📊 35 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (26 🔗5) | Pillar 2: RL & Architecture (6 🔗1) | Pillar 1: Robot Control (2) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (26 papers)

#	Title	One-line summary	Tags	🔗
1	Webscraper: Leverage Multimodal Large Language Models for Index-Content Web Scraping	Webscraper: uses multimodal large language models for index-content web scraping.	large language model multimodal
2	Xuanwu: Evolving General Multimodal Models into an Industrial-Grade Foundation for Content Ecosystems	Xuanwu VL-2B: an industrial-grade general-purpose multimodal foundation model for content ecosystems.	foundation model multimodal
3	AEC-Bench: A Multimodal Benchmark for Agentic Systems in Architecture, Engineering, and Construction	AEC-Bench: a multimodal benchmark for agentic systems in architecture, engineering, and construction.	foundation model multimodal	✅
4	Bethe Ansatz with a Large Language Model	Uses a large language model to solve the coordinate Bethe Ansatz and discovers new integrable spin-chain models.	large language model
5	ScoringBench: A Benchmark for Evaluating Tabular Foundation Models with Proper Scoring Rules	Proposes ScoringBench, which evaluates tabular pretrained models with proper scoring rules to improve decision quality.	foundation model	✅
6	Spontaneous Functional Differentiation in Large Language Models: A Brain-Like Intelligence Economy	Spontaneous functional differentiation emerges in large language models, forming a brain-like intelligence economy.	large language model
7	KEditVis: A Visual Analytics System for Knowledge Editing of Large Language Models	KEditVis: a visual analytics system for knowledge editing of large language models.	large language model
8	Knowledge database development by large language models for countermeasures against viruses and marine toxins	Uses large language models to build a knowledge base on viruses and marine toxins, accelerating the development of medical countermeasures.	large language model
9	SciVisAgentBench: A Benchmark for Evaluating Scientific Data Analysis and Visualization Agents	Proposes SciVisAgentBench, a benchmark for evaluating scientific data analysis and visualization agents.	large language model multimodal	✅
10	ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation	Proposes the ATP-Bench benchmark for evaluating the agentic tool planning ability of MLLMs on interleaved generation tasks.	large language model multimodal	✅
11	Software Vulnerability Detection Using a Lightweight Graph Neural Network	Proposes VulGNN, a lightweight graph neural network for efficient software vulnerability detection.	large language model
12	Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks	Proposes a system-level defense architecture for AI agents against indirect prompt injection attacks.	large language model
13	SISA: A Scale-In Systolic Array for GEMM Acceleration	SISA: a scale-in systolic array for GEMM acceleration.	large language model
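14	Owl-AuraID 1.0: An Intelligent System for Autonomous Scientific Instrumentation and Scientific Data Analysis	Owl-AuraID: an intelligent system for autonomous scientific instrumentation based on GUI-native operation.	multimodal	✅
15	BotVerse: Real-Time Event-Driven Simulation of Social Agents	BotVerse: an LLM-agent-based framework for real-time, event-driven social simulation.	multimodal
16	Measuring the metacognition of AI	Proposes using the meta-d' framework and signal detection theory to evaluate AI metacognition, improving the reliability of AI decisions.	large language model
17	Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupling and the Limits of the Dunning-Kruger Metaphor	AI-mediated metacognitive decoupling: goes beyond the Dunning-Kruger effect to reveal the cognitive impact of LLM use.	large language model
18	View-oriented Conversation Compiler for Agent Trace Analysis	Proposes VCC, which compiles agent conversation logs into structured views to improve in-context learning.	chain-of-thought
19	An Empirical Study of Multi-Agent Collaboration for Automated Research	Compares the performance and stability of multi-agent collaboration frameworks for automated research.	large language model
20	ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities	ELT-Bench-Verified: shows that benchmark quality issues underestimate AI agent capabilities and proposes improvements.	large language model
21	Sima AIunty: Caste Audit in LLM-Driven Matchmaking	Sima AIunty: an audit of caste-based bias in LLM-driven matchmaking.	large language model
22	Route-Induced Density and Stability (RIDE): Controlled Intervention and Mechanism Analysis of Routing-Style Meta Prompts on LLM Internal States	RIDE: intervenes on and analyzes LLM internal states via routing-style meta prompts, revealing the relationship between density and stability.	large language model
23	SimMOF: AI agent for Automated MOF Simulations	SimMOF: an LLM-based multi-agent framework that automates metal-organic framework simulation workflows.	large language model
24	REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour	REFINE: an interactive feedback system for studying interactive feedback and student behaviour in real-world settings.	large language model
25	GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification	GISTBench: proposes an evidence-based interest verification benchmark for evaluating LLM user understanding in recommender systems.	large language model
26	WybeCoder: Verified Imperative Code Generation	Proposes WybeCoder, which performs verification in step with imperative code generation to improve code quality.	large language model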

🔬 Pillar 2: RL & Architecture (6 papers)

#	Title	One-line summary	Tags	🔗
27	Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification	Proposes MMAE, a mean masked autoencoder with flow mixing, for encrypted traffic classification.	masked autoencoder MAE teacher-student	✅
28	Reinforced Reasoning for End-to-End Retrosynthetic Planning	Proposes ReTriP for end-to-end retrosynthetic planning, improving the robustness of long-horizon planning.	reinforcement learning distillation chain-of-thought
29	Self-Improving Code Generation via Semantic Entropy and Behavioral Consensus	ConSelf: a self-improving code generation method based on semantic entropy and behavioral consensus.	DPO direct preference optimization large language model
30	Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries	Proposes PRoSFI, which improves the verifiability of LLM logical reasoning through structured formal intermediate representations.	reinforcement learning large language model
31	ShapE-GRPO: Shapley-Enhanced Reward Allocation for Multi-Candidate LLM Training	ShapE-GRPO: uses Shapley values to optimize reward allocation in multi-candidate LLM training.	reinforcement learning large language model
32	ASI-Evolve: AI Accelerates AI	ASI-Evolve: uses AI to accelerate AI development, enabling AI-driven discovery of data, architectures, and algorithms.	reinforcement learning linear attention

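🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
33	Semantic Interaction for Narrative Map Sensemaking: An Insight-based Evaluation	Proposes narrative maps based on semantic interaction to improve insight in narrative understanding.	manipulation
34	Security in LLM-as-a-Judge: A Comprehensive SoK	The first systematization-of-knowledge study of LLM-as-a-Judge security, revealing potential risks and defense strategies.	manipulation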

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
35	CausalPulse: An Industrial-Grade Neurosymbolic Multi-Agent Copilot for Causal Diagnostics in Smart Manufacturing	CausalPulse: an industrial-grade neurosymbolic multi-agent copilot for causal diagnostics in smart manufacturing.	PULSE
