cs.AI (2025-05-26)

📊 13 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (11 🔗4) · Pillar 2: RL & Architecture (2)

🔬 Pillar 9: Embodied Foundation Models (11 papers)

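| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | On Path to Multimodal Historical Reasoning: HistBench and HistAgent | Proposes the HistBench historical reasoning benchmark and HistAgent to improve AI's multimodal understanding in the history domain. | generalist agent, large language model, multimodal | |
| 2 | ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows | ScienceBoard: builds an evaluation benchmark for multimodal autonomous agents in scientific workflows. | large language model, multimodal | |
| 3 | Project Riley: Multimodal Multi-Agent LLM Collaboration with Emotional Reasoning and Voting | Project Riley: proposes a multimodal multi-agent LLM collaboration framework based on emotional reasoning and voting. | large language model, multimodal | |
| 4 | Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution | Proposes Alita to address the limited adaptability of existing agents. | generalist agent, large language model | |
| 5 | Large Language Models as Autonomous Spacecraft Operators in Kerbal Space Program | Uses large language models as autonomous spacecraft operators in Kerbal Space Program. | large language model | |
| 6 | From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data | BALSa: uses synthetic data to bootstrap audio-language alignment, improving ALLM performance and mitigating hallucination. | large language model, instruction following | |
| 7 | Ten Principles of AI Agent Economics | Proposes ten principles of AI agent economics, aiming to integrate AI agents responsibly into human socioeconomic systems. | multimodal | |
| 8 | Capability-Based Scaling Laws for LLM Red-Teaming | Proposes capability-based scaling laws for LLM red-teaming attack and defense that predict attack success rate. | large language model | |
| 9 | StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs | StructEval: a benchmark that comprehensively evaluates LLMs' ability to generate structured outputs. | large language model | |
| 10 | DCG-SQL: Enhancing In-Context Learning for Text-to-SQL with Deep Contextual Schema Link Graph | Proposes DCG-SQL, which enhances in-context learning for Text-to-SQL through a deep contextual schema link graph. | large language model | |
| 11 | HS-STaR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation | Proposes HS-STaR, which uses hierarchical sampling to improve the learning efficiency of self-taught reasoners on math problems. | large language model | |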

🔬 Pillar 2: RL & Architecture (2 papers)

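| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | SCAR: Shapley Credit Assignment for More Efficient RLHF | SCAR: a Shapley-value-based credit assignment method that improves RLHF training efficiency. | reinforcement learning, RLHF, large language model | |
| 13 | MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection | Proposes MultiPhishGuard, a phishing email detection method based on an LLM multi-agent system. | reinforcement learning, chain-of-thought | |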
