cs.AI (2025-05-30)

📊 39 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27 🔗4) | Pillar 2: RL & Architecture (8) | Pillar 1: Robot Control (3 🔗1) | Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (27 papers)

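# | Title | One-sentence summary | Tags | 🔗
1 | Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs | Analyzes the reasoning ability of multimodal LLMs in underspecified and misspecified scenarios and proposes improvement strategies. | large language model, multimodal
2 | MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge | Proposes MELT, which uses knowledge embedded in LLMs to annotate multimodal emotion data automatically. | large language model, multimodal
3 | Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Proposes the Open CaptchaWorld platform for evaluating the reasoning and interaction abilities of multimodal LLM agents on CAPTCHA tasks. | multimodal
4 | Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models | AdaCVD: adaptive cardiovascular disease risk prediction from heterogeneous data using large language models. | large language model
5 | The World As Large Language Models See It: Exploring the reliability of LLMs in representing geographical features | Evaluates how well LLMs represent geographic information: a reliability analysis of GPT-4o and Gemini 2.0 on geospatial tasks. | large language model
6 | Gated Multimodal Graph Learning for Personalized Recommendation | Proposes RLMultimodalRec, which achieves personalized recommendation through gated multimodal graph learning. | multimodal
7 | Towards Scalable Schema Mapping using Large Language Models | Proposes a scalable LLM-based schema mapping method to address challenges in data integration. | large language model
8 | Generative AI for Urban Design: A Stepwise Approach Integrating Human Expertise with Multimodal Diffusion Models | Proposes a stepwise generative AI framework for urban design, built on multimodal diffusion models that integrate human expertise. | multimodal
9 | FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation | Proposes the FABLE benchmark to evaluate the data-flow reasoning ability of large language models. | large language model
10 | Evaluation of LLMs for mathematical problem solving | Evaluates large language models on mathematical problem solving, revealing the strengths and weaknesses of different models. | large language model, chain-of-thought
11 | Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success | Proposes Random Rule Forest (RRF), which uses LLM-generated questions for interpretable prediction of startup success. | large language model
12 | Chances and Challenges of the Model Context Protocol in Digital Forensics and Incident Response | Explores applying the Model Context Protocol to digital forensics and incident response to improve LLM transparency and reproducibility. | large language model
13 | MIR: Methodology Inspiration Retrieval for Scientific Research Problems | Proposes MIR, which uses a method adjacency graph (MAG) to improve methodology inspiration retrieval for scientific research problems. | large language model
14 | Whispers of Many Shores: Cultural Alignment through Collaborative Cultural Expertise | Proposes a cultural alignment framework based on soft prompt fine-tuning to improve the cultural sensitivity and adaptability of LLMs. | large language model
15 | Tournament of Prompts: Evolving LLM Instructions Through Structured Debates and Elo Ratings | Proposes DEEVO, a debate-driven evolutionary algorithm that optimizes LLM prompts without predefined metrics. | large language model
16 | A survey of using EHR as real-world evidence for discovering and validating new drug indications | Surveys the use of electronic health records as real-world evidence for discovering and validating new drug indications. | large language model
17 | Memory OS of AI Agent | Proposes MemoryOS, which provides comprehensive and efficient memory management for AI agents, improving long-term memory and personalized interaction. | large language model
18 | Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction | Proposes NextLocMoE, which uses a two-level MoE structure and LLM enhancement for personalized, semantic-aware location prediction. | large language model
19 | Leveraging Knowledge Graphs and LLMs for Structured Generation of Misinformation | Uses knowledge graphs and large language models for the structured generation of misinformation. | large language model
20 | Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning | Optimizes the interface between knowledge graphs and LLMs to improve performance on complex reasoning. | large language model
21 | LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs | LPASS: uses linear probes to accelerate vulnerability detection with compressed LLMs, improving efficiency and performance. | large language model
22 | RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation | RMoA: optimizes Mixture-of-Agents systems through diversity maximization and residual compensation. | large language model
23 | GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments | GridRoute: a benchmark for LLM-based route planning in grid environments, with an algorithm-guided prompting method. | large language model
24 | TRAPDOC: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents | TRAPDOC: deceives LLM users by injecting imperceptible phantom tokens, with the aim of reducing over-reliance. | large language model
25 | Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules | Proposes QuAda, a plug-and-play module that strengthens LLMs in quotation-aware dialogue. | large language model
26 | E^2GraphRAG: Streamlining Graph-based RAG for High Efficiency and Effectiveness | E^2GraphRAG: streamlines graph-based RAG for efficient and effective knowledge retrieval. | large language model
27 | Learning API Functionality from In-Context Demonstrations for Tool-based Agents | Proposes a method for learning API functionality from in-context demonstrations, improving the task success rate of tool-based agents when no documentation is available. | large language model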

🔬 Pillar 2: RL & Architecture (8 papers)

# | Title | One-sentence summary | Tags | 🔗
28 | MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning | MiCRo: mixture modeling and context-aware routing for personalized preference learning. | reinforcement learning, preference learning, RLHF
29 | SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought | Proposes the SCOUT framework, which improves the reasoning ability of pre-trained language models through Flow CoT. | distillation, large language model, chain-of-thought
30 | How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning | Studies the interplay of SFT and RL in improving LLM reasoning and examines the effectiveness of backtracking strategies. | reinforcement learning, large language model, chain-of-thought
31 | AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models | AXIOM: learns to play games within minutes using expanding object-centric models. | reinforcement learning, deep reinforcement learning, DRL
32 | A Reward-driven Automated Webshell Malicious-code Generator for Red-teaming | Proposes RAWG, a reward-driven automated webshell malicious-code generator for red-teaming. | reinforcement learning, PPO, large language model
33 | Control-R: Towards controllable test-time scaling | Proposes Control-R, which uses controllable reasoning to address underthinking and overthinking in the long chain-of-thought reasoning of large language models. | distillation, chain-of-thought
34 | ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction | ProofNet++: a self-correcting neuro-symbolic system for formal proof verification. | reinforcement learning, large language model
35 | Intrinsic Goals for Autonomous Agents: Model-Based Exploration in Virtual Zebrafish Predicts Ethological Behavior and Whole-Brain Dynamics | Proposes the 3M-Progress model, which predicts zebrafish behavior and whole-brain dynamics through self-supervised learning, enabling animal-like autonomous exploration. | reinforcement learning, world model

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-sentence summary | Tags | 🔗
36 | SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors | SEAR: a multimodal dataset for analyzing AR-LLM-driven social engineering behaviors. | manipulation, large language model, multimodal
37 | Adversarial Threat Vectors and Risk Mitigation for Retrieval-Augmented Generation Systems | Analyzes adversarial attack vectors against RAG systems and proposes risk mitigation measures. | manipulation, large language model
38 | SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems | Proposes SentinelAgent for graph-based anomaly detection in multi-agent systems. | manipulation, large language model

🔬 Pillar 3: Perception & Semantics (1 paper)

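# | Title | One-sentence summary | Tags | 🔗
39 | A Red Teaming Roadmap Towards System-Level Safety | Proposes a new roadmap for LLM red teaming that focuses on system-level safety and realistic threat models. | affordance, large language model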
