cs.AI (2026-04-20)

📊 31 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗2) | Pillar 2: RL & Architecture (9 🔗2) | Pillar 4: Generative Motion (1) | Pillar 7: Motion Retargeting (1) | Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (19 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models	Proposes WebCompass to address the limitations of existing coding evaluation methods	large language model multimodal
2	Using large language models for embodied planning introduces systematic safety risks	Embodied planning with large language models carries systematic safety risks	large language model
3	MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval	MathNet: a global multimodal benchmark dataset for mathematical reasoning and retrieval	multimodal
4	Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures	Surveys multi-agent systems, from classical paradigms to a future enabled by large foundation models	foundation model
5	Stability Implies Redundancy: Delta Attention Selective Halting for Efficient Long-Context Prefilling	Proposes DASH, which uses Delta Attention Selective Halting to speed up long-context prefilling while preserving hardware efficiency	large language model multimodal	✅
6	Bayesian Active Learning with Gaussian Processes Guided by LLM Relevance Scoring for Dense Passage Retrieval	Proposes BAGEL, which uses LLM guidance for Gaussian process active learning to improve dense retrieval	large language model multimodal
7	Do LLMs Need to See Everything? A Benchmark and Study of Failures in LLM-driven Smartphone Automation using Screentext vs. Screenshots	DailyDroid: compares text and screenshot inputs for LLM-driven smartphone automation and reveals their failure modes	large language model multimodal
8	Benchmarking System Dynamics AI Assistants: Cloud Versus Local LLMs on CLD Extraction and Discussion	Systematically evaluates cloud and local LLMs as system dynamics AI assistants	large language model
9	AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization	Proposes AQPIM to address in-memory activation quantization for large language models	large language model
10	Adversarial Arena: Crowdsourcing Data Generation through Interactive Competition	Proposes Adversarial Arena, which crowdsources high-quality LLM training data through interactive adversarial competition	large language model
11	Document-as-Image Representations Fall Short for Scientific Retrieval	Reveals the limitations of document-as-image representations for scientific document retrieval and proposes a new benchmark based on LaTeX source	multimodal
12	Six Llamas: Comparative Religious Ethics Through LoRA-Adapted Language Models	Six Llamas: a study of comparative religious ethics using LoRA-adapted language models	large language model
13	Evaluating Multi-Hop Reasoning in RAG Systems: A Comparison of LLM-Based Retriever Evaluation Strategies	Proposes Context-Aware Retrieval Evaluation (CARE) to improve the accuracy of multi-hop reasoning evaluation in RAG systems	large language model	✅
14	From Fallback to Frontline: When Can LLMs be Superior Annotators of Human Perspectives?	Challenges conventional wisdom: large language models outperform human annotators on human-perspective annotation tasks	large language model
15	RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs	RAVEN: a retrieval-augmented vulnerability exploration network for memory corruption analysis in user code and binary programs	large language model
16	WebUncertainty: Dual-Level Uncertainty Driven Planning and Reasoning For Autonomous Web Agent	WebUncertainty: dual-level uncertainty-driven planning and reasoning for autonomous web agents	large language model
17	Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective	Reveals secret leakage risks in code LLMs caused by tokenization	large language model
18	Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks	Proposes an LRP-based contrastive attribution method to analyze why LLMs fail on realistic benchmarks	large language model
19	Co-evolving Agent Architectures and Interpretable Reasoning for Automated Optimization	EvoOR-Agent: proposes a co-evolutionary framework for automated optimization of operations research problems	large language model

🔬 Pillar 2: RL & Architecture (9 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
20	SELF-EMO: Emotional Self-Evolution from Recognition to Consistent Expression	Proposes the SELF-EMO framework to address data scarcity and inconsistent emotional expression in conversational emotion recognition	reinforcement learning motion prediction large language model
21	OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning	Proposes OGER, an offline-guided exploration reward that addresses insufficient exploration in hybrid reinforcement learning	reinforcement learning large language model	✅
22	LeGo-Code: Can Modular Curriculum Learning Advance Complex Code Generation? Insights from Text-to-SQL	Proposes LeGo-Code, a modular curriculum learning method that improves LLM performance on complex Text-to-SQL code generation	curriculum learning large language model
23	QuantumQA: Enhancing Scientific Reasoning via Physics-Consistent Dataset and Verification-Aware Reinforcement Learning	Proposes the QuantumQA dataset and the VRM model to improve LLM scientific reasoning in quantum mechanics	reinforcement learning large language model
24	PARM: Pipeline-Adapted Reward Model	Proposes PARM to address the mismatch between reward models and execution outcomes in multi-stage LLM pipelines	RLHF direct preference optimization large language model
25	Training and Agentic Inference Strategies for LLM-based Manim Animation Generation	Proposes ManimTrainer and ManimAgent for training LLMs and running inference to generate Manim animations, improving code quality and visual results	reinforcement learning large language model
26	Learning from Less: Measuring the Effectiveness of RLVR in Low Data and Compute Regimes	Studies the effectiveness of RLVR on small language models under low data and compute budgets	reinforcement learning large language model
27	Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence	Agent-World: improves general agent capabilities through scalable environment synthesis	reinforcement learning large language model
28	AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation	Proposes the AJ-Bench benchmark for assessing Agent-as-a-Judge capabilities in environment-aware evaluation	reinforcement learning large language model	✅

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
29	SPREG: Structured Plan Repair with Entropy-Guided Test-Time Intervention for Large Language Model Reasoning	SPREG: structured plan repair with entropy-guided test-time intervention, improving large language model reasoning	classifier-free guidance large language model

🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
30	Understanding Human Actions through the Lens of Executable Models	Proposes EXACT, a domain-specific language for understanding and modeling human action sequences	human motion

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
31	DSAINet: An Efficient Dual-Scale Attentive Interaction Network for General EEG Decoding	Proposes DSAINet to address poor cross-task generalization in general EEG decoding	spatiotemporal	✅
