cs.AI (2025-05-30)

📊 39 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27 🔗4) | Pillar 2: RL Algorithms & Architecture (8) | Pillar 1: Robot Control (3 🔗1) | Pillar 3: Perception & Semantics (1)

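🔬 Pillar 9: Embodied Foundation Models (27 papers)

#	Title	One-sentence summary	Tags	🔗
1	Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs	Analyzes the reasoning ability of multimodal LLMs in underspecified and misspecified scenarios and proposes improvement strategies.	large language model, multimodal
2	MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge	Proposes MELT, which uses knowledge embedded in LLMs to annotate multimodal emotion data automatically.	large language model, multimodal
3	Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents	Proposes the Open CaptchaWorld platform for evaluating the reasoning and interaction abilities of multimodal LLM agents on CAPTCHA tasks.	multimodal
4	Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models	AdaCVD: adaptable cardiovascular disease risk prediction from heterogeneous data using large language models.	large language model
5	The World As Large Language Models See It: Exploring the reliability of LLMs in representing geographical features	Evaluates how well LLMs represent geographic information: a reliability analysis of GPT-4o and Gemini 2.0 on geospatial tasks.	large language model
6	Gated Multimodal Graph Learning for Personalized Recommendation	Proposes RLMultimodalRec, which performs personalized recommendation through gated multimodal graph learning.	multimodal
7	Towards Scalable Schema Mapping using Large Language Models	Proposes a scalable LLM-based schema mapping method that addresses challenges in data integration.	large language model
8	Generative AI for Urban Design: A Stepwise Approach Integrating Human Expertise with Multimodal Diffusion Models	Proposes a stepwise generative AI framework for urban design that integrates human expertise with multimodal diffusion models.	multimodal
9	FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation	Proposes the FABLE benchmark for evaluating the data-flow reasoning ability of large language models.	large language model
10	Evaluation of LLMs for mathematical problem solving	Evaluates large language models on mathematical problem solving, revealing the strengths and weaknesses of different models.	large language model, chain-of-thought
11	Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success	Proposes Random Rule Forest (RRF), which uses LLM-generated questions for interpretable startup success prediction.	large language model
12	Chances and Challenges of the Model Context Protocol in Digital Forensics and Incident Response	Explores applying the Model Context Protocol to digital forensics and incident response to improve LLM transparency and reproducibility.	large language model
13	MIR: Methodology Inspiration Retrieval for Scientific Research Problems	Proposes MIR, which uses a methodology adjacency graph (MAG) to improve methodology inspiration retrieval for scientific research problems.	large language model
14	Whispers of Many Shores: Cultural Alignment through Collaborative Cultural Expertise	Proposes a cultural alignment framework based on soft prompt fine-tuning that improves the cultural sensitivity and adaptability of LLMs.	large language model
15	Tournament of Prompts: Evolving LLM Instructions Through Structured Debates and Elo Ratings	Proposes DEEVO, a debate-driven evolutionary algorithm that optimizes LLM prompts without predefined metrics.	large language model
16	A survey of using EHR as real-world evidence for discovering and validating new drug indications	Surveys research on using electronic health records as real-world evidence for discovering and validating new drug indications.	large language model
17	Memory OS of AI Agent	Proposes MemoryOS, which provides comprehensive and efficient memory management for AI agents, improving long-term memory and personalized interaction.	large language model	✅
18	Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction	Proposes NextLocMoE, which uses a two-level MoE structure and LLM enhancement for personalized, semantic-aware next location prediction.	large language model
19	Leveraging Knowledge Graphs and LLMs for Structured Generation of Misinformation	Uses knowledge graphs and large language models to generate misinformation in a structured way.	large language model
20	Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning	Optimizes the interface between knowledge graphs and LLMs to improve performance on complex reasoning.	large language model
21	LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs	LPASS: uses linear probes to speed up vulnerability detection with compressed LLMs, improving efficiency and performance.	large language model
22	RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation	RMoA: optimizes Mixture-of-Agents systems through diversity maximization and residual compensation.	large language model	✅
23	GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments	GridRoute: a benchmark for LLM-based route planning in grid environments, with an algorithm-guided prompting method.	large language model	✅
24	TRAPDOC: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents	TRAPDOC: injects imperceptible phantom tokens into documents to deceive LLM users and reduce over-reliance.	large language model	✅
25	Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules	Proposes QuAda, which uses plug-and-play modules to strengthen quotation-aware dialogue in LLMs.	large language model
26	E^2GraphRAG: Streamlining Graph-based RAG for High Efficiency and Effectiveness	E^2GraphRAG: streamlines graph-based RAG for efficient and effective knowledge retrieval.	large language model
27	Learning API Functionality from In-Context Demonstrations for Tool-based Agents	Proposes a method for learning API functionality from in-context demonstrations, improving the task success rate of tool-based agents when documentation is unavailable.	large language model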

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

#	Title	One-sentence summary	Tags
28	MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning	MiCRo: mixture modeling and context-aware routing for personalized preference learning.	reinforcement learning, preference learning, RLHF
29	SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought	Proposes the SCOUT framework, which improves the reasoning ability of pre-trained language models through Flow CoT.	distillation, large language model, chain-of-thought
30	How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning	Studies the interplay of SFT and RL in improving LLM reasoning and examines how effective backtracking strategies are.	reinforcement learning, large language model, chain-of-thought
31	AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models	AXIOM: learns to play games in minutes using expanding object-centric models.	reinforcement learning, deep reinforcement learning, DRL
32	A Reward-driven Automated Webshell Malicious-code Generator for Red-teaming	Proposes RAWG, a reward-driven automated webshell malicious-code generator for red-teaming.	reinforcement learning, PPO, large language model
33	Control-R: Towards controllable test-time scaling	Proposes Control-R, which uses controllable reasoning control to address underthinking and overthinking in the long chain-of-thought reasoning of large language models.	distillation, chain-of-thought
34	ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction	ProofNet++: a neuro-symbolic system with self-correction for formal proof verification.	reinforcement learning, large language model
35	Intrinsic Goals for Autonomous Agents: Model-Based Exploration in Virtual Zebrafish Predicts Ethological Behavior and Whole-Brain Dynamics	Proposes the 3M-Progress model, which uses self-supervised learning to predict zebrafish behavior and whole-brain dynamics and achieves animal-like autonomous exploration.	reinforcement learning, world model

🔬 Pillar 1: Robot Control (3 papers)

#	Title	One-sentence summary	Tags	🔗
36	SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors	SEAR: a multimodal dataset for analyzing AR-LLM-driven social engineering behaviors.	manipulation, large language model, multimodal	✅
37	Adversarial Threat Vectors and Risk Mitigation for Retrieval-Augmented Generation Systems	Analyzes adversarial attack vectors against RAG systems and proposes risk mitigation measures.	manipulation, large language model
38	SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems	Proposes SentinelAgent for graph-based anomaly detection in multi-agent systems.	manipulation, large language model

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-sentence summary	Tags	🔗
39	A Red Teaming Roadmap Towards System-Level Safety	Proposes a new roadmap for LLM red teaming that focuses on system-level safety and realistic threat models.	affordance, large language model

