cs.AI (2026-05-07)

📊 94 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (64 🔗7) | Pillar 2: RL & Architecture (27 🔗1) | Pillar 8: Physics-based Animation (1 🔗1) | Pillar 1: Robot Control (1) | Pillar 3: Perception & Semantics (1)

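🔬 Pillar 9: Embodied Foundation Models (64 papers)

#	Title	One-Sentence Summary	Tags	🔗
1	Data Language Models: A New Foundation Model Class for Tabular Data	Proposes Data Language Models (DLMs), which understand tabular data natively without preprocessing.	large language model foundation model
2	Multimodal Deep Generative Model for Semi-Supervised Learning under Class Imbalance	Proposes a multimodal deep generative model for semi-supervised learning under class imbalance.	multimodal
3	Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models	Proposes a knowledge graph foundation model built on a graphlet-based structural vocabulary, improving zero-shot transfer.	foundation model
4	NeuroAgent: LLM Agents for Multimodal Neuroimaging Analysis and Research	NeuroAgent: an LLM-based agent framework for multimodal neuroimaging analysis.	multimodal
5	Debiased Multimodal Personality Understanding through Dual Causal Intervention	Proposes DCAN, a dual causal intervention network that addresses bias in multimodal personality understanding.	multimodal	✅
6	Mind the Gap? A Distributional Comparison of Real and Synthetic Priors for Tabular Foundation Models	Compares the distributions of real and synthetic tabular data priors and evaluates how their differences affect the performance of pretrained tabular models.	foundation model
7	CoupleEvo: Evolving Heuristics for Coupled Optimization Problems Using Large Language Models	CoupleEvo: uses large language models to evolve heuristics for coupled optimization problems.	large language model	✅
8	GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Image Generation	GlazyBench: a benchmark dataset for ceramic glaze property prediction and image generation.	large language model multimodal
9	Super-Level-Set Regression: Conditional Quantiles via Volume Minimization	Proposes super-level-set (SLS) regression, which learns conditional quantiles directly by minimizing volume, for multivariate regression problems.	multimodal
10	Rethinking Adapter Placement: A Dominant Adaptation Module Perspective	Proposes DomLoRA, which achieves parameter-efficient fine-tuning through single-adapter placement and outperforms conventional LoRA.	instruction following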
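11	MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems	MASPO: a joint prompt optimization framework for LLM-based multi-agent systems.	large language model	✅
12	Process Matters more than Output for Distinguishing Humans from Machines	Proposes CogCAPTCHA30, a suite of cognitive tasks that distinguishes humans from machines by process features rather than outputs.	large language model
13	PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors	PrefixGuard: from LLM-agent traces to online failure-warning monitors.	large language model
14	Constraint Decay: The Fragility of LLM Agents in Backend Code Generation	Reveals the fragility of LLM agents under structural constraints in backend code generation and identifies a "constraint decay" phenomenon.	large language model
15	SCRuB: Social Concept Reasoning under Rubric-Based Evaluation	Proposes the SCRuB framework for evaluating large language models on social concept reasoning.	large language model
16	Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification	Proposes a knowledge-graph-based agentic AI approach to formal verification that improves the quality of generated SystemVerilog assertions.	large language model
17	From Agent Loops to Deterministic Graphs: Execution Lineage for Reproducible AI-Native Work	Proposes execution lineage, which uses deterministic graphs to make AI-native workflows reproducible.	large language model
18	Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Systems Perspective	Proposes a dynamical systems model of human-AI co-evolution, showing that reliance on AI may carry a risk of epistemic degradation.	large language model
19	Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis	Fine-tunes small language models for solution-oriented Windows event log analysis.	large language model
20	Improving the Efficiency of Language Agent Teams with Adaptive Task Graphs	The LATTE framework uses adaptive task graphs to make language agent teams more efficient and reduce resource consumption.	large language model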
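21	Measuring Black-Box Confidence via Reasoning Trajectories: Geometry, Coverage, and Verbalization	Proposes a black-box confidence estimation method based on reasoning-trajectory geometry, coverage, and verbalized confidence.	chain-of-thought
22	Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs	Proposes an LLM-based method for annotating PII values in HTTP traffic, addressing labelled data scarcity and fixed taxonomies.	large language model
23	Correct Code, Vulnerable Dependencies: A Large Scale Measurement Study of LLM-Specified Library Versions	A large-scale study revealing security vulnerabilities and compatibility risks in the library versions chosen in LLM-generated code.	large language model	✅
24	OmicsLM: A Multimodal Large Language Model for Multi-Sample Omics Reasoning	Proposes OmicsLM, a multimodal large language model that integrates quantitative transcriptomic data with natural-language biological reasoning.	large language model multimodal instruction following
25	ICU-Bench: Benchmarking Continual Unlearning in Multimodal Large Language Models	Proposes the ICU-Bench benchmark to evaluate the privacy unlearning ability of multimodal large models in continual learning scenarios.	large language model multimodal
26	Causal Probing for Internal Visual Representations in Multimodal Large Language Models	Proposes a causal-intervention-based probing framework that reveals the encoding mechanisms and scaling laws of internal visual representations in multimodal large models.	large language model multimodal
27	AstroAlertBench: Evaluating the Accuracy, Reasoning, and Honesty of Multimodal LLMs in Astronomical Classification	Proposes the AstroAlertBench benchmark to evaluate the accuracy, reasoning, and honesty of multimodal large models in classifying astronomical transient events.	large language model multimodal
28	An Interpretable and Scalable Framework for Evaluating Large Language Models	Proposes a Majorization-Minimization-based IRT evaluation framework for interpretable and scalable assessment of large model capabilities.	large language model
29	Quantum-enhanced Large Language Models on Quantum Hardware via Cayley Unitary Adapters	Proposes quantum-enhanced large language models based on Cayley unitary adapters, achieving performance gains on real quantum hardware.	large language model
30	Saliency-Aware Regularized Quantization Calibration for Large Language Models	Proposes saliency-aware regularized quantization calibration (SARQC) to improve the post-quantization performance of large language models.	large language model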
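31	Systematic Evaluation of Large Language Models for Post-Discharge Clinical Action Extraction	Proposes a two-stage prompting framework to systematically evaluate large language models on post-discharge clinical action extraction.	large language model
32	LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution	Proposes the LCC-LLM framework and the LCCD dataset, which use code-centric retrieval augmentation and multi-task reasoning for accurate malware attribution.	large language model
33	DataDignity: Training Data Attribution for Large Language Models	Proposes the DataDignity framework and the FakeWiki benchmark, which use supervised contrastive learning to attribute training data for large language models.	large language model
34	Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning	Extracts search trees from LLM reasoning traces, revealing that their planning is myopic.	large language model chain-of-thought
35	Beyond Fixed Benchmarks and Worst-Case Attacks: Dynamic Boundary Evaluation for Language Models	Proposes the Dynamic Boundary Evaluation (DBE) framework, which uses adaptive search to address the saturation and bias of static benchmarks for large models.	large language model instruction following
36	CrossCult-KIBench: A Benchmark for Cross-Cultural Knowledge Insertion in MLLMs	Proposes the CrossCult-KIBench benchmark and the MCKI method to address cross-cultural knowledge insertion and alignment in multimodal large models.	large language model multimodal
37	LLM-Driven Design Space Exploration of FPGA-based Accelerators	Proposes the SECDA-DSE framework, which uses large language models to drive automated design space exploration of FPGA accelerators.	large language model chain-of-thought
38	Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning	Proposes a null-space-constrained contrastive visual forgetting method for efficient knowledge removal in multimodal large models.	large language model multimodal
39	LeakDojo: Decoding the Leakage Threats of RAG Systems	Proposes the LeakDojo evaluation framework, which systematically exposes knowledge leakage risks in retrieval-augmented generation (RAG) systems.	large language model instruction following	✅
40	Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs	Proposes an MLLM jailbreak framework that exploits the reconstruction-concealment tradeoff, raising attack success rates through character removal and keyword interference.	large language model multimodal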
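41	SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety	Proposes the SafeHarbor framework, which uses hierarchical memory augmentation and a self-evolving mechanism to address over-refusal in LLM agent safety defenses.	large language model foundation model	✅
42	CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency	Proposes the CITE algorithm for anytime-valid statistical inference and error control in LLM self-consistency sampling.	large language model
43	Active Learning for Communication Structure Optimization in LLM-Based Multi-Agent Systems	Proposes an active learning framework based on ensemble Kalman inversion to optimize the communication structure of LLM-based multi-agent systems.	large language model
44	From Surface Learning to Deep Understanding: A Grounded AI Tutoring System for Moodle	Proposes a retrieval-augmented generation (RAG) based AI teaching assistant for Moodle that offers precise source tracing for educational content and Socratic interaction.	large language model
45	How Well Do LLMs Perform on the Simplest Long-Chain Reasoning Tasks: An Empirical Study on the Equivalence Class Problem	Empirically evaluates large language models on long-chain reasoning tasks using the equivalence class problem (ECP).	large language model
46	LLM-Guided Open Hypothesis Learning from Autonomous Scanning Probe Microscopy Experiments	Proposes an LLM-guided open hypothesis learning framework that enables autonomous scientific discovery with scanning probe microscopy.	large language model
47	When Does Critique Improve AI-Assisted Theoretical Physics? SCALAR: Structured Critic-Actor Loop for Agentic Reasoning	Proposes the SCALAR framework, which uses a structured critic-actor loop to improve AI reasoning in theoretical physics research.	large language model
48	A Self-Healing Framework for Reliable LLM-Based Autonomous Agents	Proposes a self-healing framework for LLM-based autonomous agents that improves robustness through failure detection and dynamic replanning.	large language model
49	Towards Annotation-Free Validation of MLLMs: A Vision-Language Logical Consistency Metric	Proposes a vision-language logical consistency metric (VL-LCM) for annotation-free evaluation of MLLMs.	large language model
50	The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models	Reveals a "granularity axis" in large models' representations of social roles: a micro-to-macro latent causal direction.	large language model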
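51	Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios	Proposes the Event-Causal RAG framework, which uses event causal graphs and a dual storage mechanism for causal reasoning over very long videos.	foundation model
52	Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost	Proposes Post-Reasoning, which improves the performance of non-thinking models through a post-hoc reasoning mechanism at zero inference cost.	large language model
53	Back to the Beginning of Heuristic Design: Bridging Code and Knowledge with LLMs	Proposes a knowledge-first automatic heuristic design framework that uses LLMs to integrate code and knowledge in combinatorial optimization.	large language model
54	Visual Fingerprints for LLM Generation Comparison	Proposes a visual-fingerprint method for comparing the output tendencies of LLMs under different generation conditions.	large language model
55	Safety Anchor: Defending Harmful Fine-tuning via Geometric Bottlenecks	Proposes Safety Bottleneck Regularization (SBR), which defends large models against harmful fine-tuning attacks using geometric anchors.	large language model
56	MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System	Proposes MAS-Algorithm, a multi-agent workflow that improves AI performance on algorithmic programming problems through modular collaboration.	chain-of-thought
57	Taklif.AI: LLM-Powered Platform for Interest-Based Personalized College Assignments	Proposes the Taklif.AI platform, which uses large language models to generate assignments personalized to students' interests and cultural backgrounds.	large language model
58	CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt	Proposes CircuitFormer and the circuit-specific tokenizer CKT for automatic analog circuit topology design from natural language.	large language model	✅
59	ReFlect: An Effective Harness System for Complex Long-Horizon LLM Reasoning	Proposes the ReFlect reasoning framework, which uses a deterministic harness for error detection and automatic recovery in long-horizon tasks.	chain-of-thought
60	An Empirical Study of Proactive Coding Assistants in Real-World Software Development	Reveals the gap between simulation and reality for proactive coding assistants and proposes the ProCodeBench benchmark and a real-world behavior dataset.	large language model
61	Chain of Risk: Safety Failures in Large Reasoning Models and Mitigation via Adaptive Multi-Principle Steering	Proposes the Adaptive Multi-Principle Steering (AMPS) framework to address safety failures in the reasoning chains of large reasoning models (LRMs).	chain-of-thought
62	Text-Graph Synergy: A Bidirectional Verification and Completion Framework for RAG	Proposes the TGS-RAG framework, which uses bidirectional synergy between text and knowledge graphs to address information silos and lost reasoning paths in RAG.	large language model
63	From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms	Proposes a framework for the evolution of LLM agent memory: a three-stage paradigm from storage to reflection to experience.	large language model
64	Prober.ai: Gated Inquiry-Based Feedback via LLM-Constrained Personas for Argumentative Writing Development	Proposes Prober.ai, an argumentative writing support system based on gated inquiry-based feedback and LLM-constrained personas, aiming to reduce the cognitive debt caused by AI-assisted writing.	large language model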

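🔬 Pillar 2: RL & Architecture (27 papers)

#	Title	One-Sentence Summary	Tags	🔗
65	Coordination Matters: Evaluation of Cooperative Multi-Agent Reinforcement Learning	Proposes a coordination-aware evaluation method for cooperative multi-agent reinforcement learning that addresses the limitations of traditional metrics.	reinforcement learning
66	Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key	Proposes the ScaleLogic framework to study how logical expressiveness affects RL training of long-horizon reasoning in LLMs.	reinforcement learning large language model
67	Learning to Cut: Reinforcement Learning for Benders Decomposition	Proposes a reinforcement-learning-based Benders decomposition method that speeds up the solution of two-stage stochastic programs.	reinforcement learning
68	Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence	Safactory: a scalable agent factory for trustworthy autonomous intelligence.	reinforcement learning distillation
69	AGWM: Affordance-Grounded World Models for Environments with Compositional Prerequisites	Proposes AGWM, an affordance-grounded world model for modeling complex environments with compositional prerequisites.	world model world models affordance
70	HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning	Proposes the HaM-World model, which improves long-horizon planning stability through soft-Hamiltonian dynamics and a selective memory mechanism.	world model world models latent dynamics
71	Multi-Objective Constraint Inference using Inverse reinforcement learning	Proposes the Multi-Objective Constraint Inference (MOCI) framework for jointly learning constraints and preferences from heterogeneous expert demonstrations.	reinforcement learning inverse reinforcement learning preference learning
72	BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning	Proposes the BehaviorGuard framework, which provides online backdoor defense for deep reinforcement learning by monitoring shifts in the action distribution.	reinforcement learning deep reinforcement learning DRL
73	Policy-Guided Stepwise Model Routing for Cost-Effective Reasoning	Proposes policy-guided stepwise model routing to balance high performance and low cost in large language model reasoning.	reinforcement learning large language model chain-of-thought
74	Resolving the bias-precision paradox with stochastic causal representation learning for personalized medicine	Proposes sMMD, a method based on stochastic causal representation learning, to resolve the bias-precision paradox in personalized medicine.	representation learning large language model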
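75	Mitigating Cognitive Bias in RLHF by Altering Rationality	Proposes an RLHF method that dynamically adjusts a rationality parameter to mitigate cognitive bias in human feedback.	reinforcement learning RLHF
76	Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning	Skill1: unifies the evolution of skill-augmented agents via reinforcement learning, jointly optimizing skill selection, utilization, and distillation.	reinforcement learning distillation
77	PREFER: Personalized Review Summarization with Online Preference Learning	Proposes PREFER, an online preference learning framework that generates review summaries personalized to users' changing needs.	preference learning
78	Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement	Proves that Transformers can implement in-context reinforcement learning through a parameter construction, with convergence guarantees.	reinforcement learning
79	Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration	Proposes the LoPE framework, which uses prompt space perturbation to address the zero-advantage problem in reinforcement learning for large models.	reinforcement learning large language model
80	Behavior Cue Reasoning: Monitorable Reasoning Improves Efficiency and Safety through Oversight	Proposes the Behavior Cue Reasoning (BCR) framework, which uses explicit markers to make large model reasoning more monitorable and safer.	reinforcement learning large language model	✅
81	Agentick: A Unified Benchmark for General Sequential Decision-Making Agents	Proposes the Agentick benchmark framework for unified evaluation of reinforcement learning and large model agents on sequential decision-making tasks.	PPO foundation model
82	Randomness is sometimes necessary for coordination	Proposes the Diamond Attention mechanism, which introduces structured randomness to solve the role differentiation problem in multi-agent reinforcement learning.	reinforcement learning zero-shot transfer
83	Schedule-and-Calibrate: Utility-Guided Multi-Task Reinforcement Learning for Code LLMs	Proposes the ASTOR framework, which improves code LLMs through utility-guided multi-task reinforcement learning.	reinforcement learning
84	Novelty-based Tree-of-Thought Search for LLM Reasoning and Planning	Proposes a novelty-based Tree-of-Thought search method to improve the efficiency of LLM reasoning and planning.	reinforcement learning chain-of-thought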
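85	AGPO: Asymmetric Group Policy Optimization for Verifiable Reasoning and Search Ads Relevance at JD	Proposes the Asymmetric Group Policy Optimization (AGPO) algorithm to address reasoning boundary shrinkage in reinforcement learning for large models.	reinforcement learning large language model
86	SDFlow: Similarity-Driven Flow Matching for Time Series Generation	Proposes the SDFlow framework, which uses similarity-driven flow matching for efficient generation of long time series.	flow matching
87	SPARK: Self-Play with Asymmetric Reward from Knowledge Graphs	Proposes the SPARK framework, which uses knowledge graphs for self-play with asymmetric rewards, improving multi-hop reasoning over scientific literature.	reinforcement learning multimodal
88	P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference	Proposes the P-Guide framework, which achieves classifier-free guidance (CFG) in a single inference pass by modulating the initial latent space.	flow matching classifier-free guidance
89	X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning	Proposes X-Voice, a 0.4B-parameter multilingual zero-shot voice cloning model trained with two-stage flow matching.	flow matching classifier-free guidance
90	Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence	Proposes the Safactory framework, which builds a scalable agent factory for the closed-loop evolution of trustworthy autonomous intelligence.	reinforcement learning distillation
91	OPSD Compresses What RLVR Teaches: A Post-RL Compaction Stage for Reasoning Models	Proposes OPSD, a post-training compaction stage that distills reasoning models after RLVR to shorten response length.	reinforcement learning distillation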

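🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-Sentence Summary	Tags	🔗
92	SpatialEpiBench: Benchmarking Spatial Information and Epidemic Priors in Forecasting	SpatialEpiBench: builds a spatial epidemic forecasting benchmark and reveals the limitations of existing methods in practical use.	spatiotemporal	✅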

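🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-Sentence Summary	Tags	🔗
93	Uneven Evolution of Cognition Across Generations of Generative AI Models	Proposes the psychometrics-based AIQ benchmark, revealing the uneven evolution of cognitive abilities and the architectural biases of generative AI models.	manipulation multimodal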

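🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-Sentence Summary	Tags	🔗
94	Narrow Secret Loyalty Dodges Black-Box Audits	Proposes a narrow secret loyalty attack model, revealing a covert threat from large language models under black-box audits.	affordance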
