cs.AI (2025-10-06)

📊 33 papers in total | 🔗 1 with code

🎯 Interest Areas

Pillar 9: Embodied Foundation Models (24 🔗1) · Pillar 2: RL & Architecture (7) · Pillar 1: Robot Control (2)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

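# | Title | One-line summary | Tags | 🔗
1 | Think Then Embed: Generative Context Improves Multimodal Embedding | Proposes the Think-Then-Embed framework, which uses generative context to improve universal multimodal embedding performance. | large language model, multimodal, chain-of-thought
2 | ChartAgent: A Multimodal Agent for Visually Grounded Reasoning in Complex Chart Question Answering | Proposes ChartAgent, which uses visual reasoning to tackle the understanding of unannotated charts in complex chart question answering. | multimodal, chain-of-thought
3 | Large Language Models Achieve Gold Medal Performance at the International Olympiad on Astronomy & Astrophysics (IOAA) | Large language models reach gold-medal level at the International Olympiad on Astronomy and Astrophysics. | large language model, multimodal
4 | Efficient Prediction of Pass@k Scaling in Large Language Models | Proposes a Beta-Binomial-based method for predicting Pass@k, making capability and risk assessment of large language models more efficient. | large language model
5 | Exploring Student Choice and the Use of Multimodal Generative AI in Programming Learning | Explores how multimodal generative AI is used in programming learning and which options students prefer. | multimodal
6 | BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions | BIRD-INTERACT: redefines Text-to-SQL evaluation for large language models through the lens of dynamic interaction. | large language model
7 | AtomWorld: A Benchmark for Evaluating Spatial Reasoning in Large Language Models on Crystalline Materials | AtomWorld: a benchmark for evaluating the spatial reasoning of large language models on crystalline materials. | large language model
8 | Improving Multimodal Brain Encoding Model with Dynamic Subject-awareness Routing | Proposes the AFIRE and MIND frameworks to address inter-subject differences in multimodal brain encoding models under naturalistic settings. | multimodal
9 | LEGOMem: Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation | LEGOMem: modular procedural memory for multi-agent LLM systems in workflow automation. | large language model
10 | Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization | Proposes a complexity out-of-distribution generalization framework for evaluating and improving AI reasoning ability. | large language model
11 | BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs | Proposes the BrokenMath benchmark, which evaluates how readily LLMs go along with false statements in theorem proving. | large language model
12 | VAL-Bench: Belief Consistency as a measure for Value Alignment in Language Models | VAL-Bench: a value-alignment benchmark for language models based on belief consistency. | large language model
13 | UnitTenX: Generating Tests for Legacy Packages with AI Agents Powered by Formal Verification | UnitTenX: uses AI agents driven by formal verification to generate unit tests for legacy packages. | large language model
14 | AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems | The AInstein framework assesses whether LLMs can solve AI research problems without external assistance. | large language model
15 | AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling | AutoDAN-Reasoning: strengthens strategy-exploration-based LLM jailbreak attacks through test-time scaling. | large language model
16 | DeepV: A Model-Agnostic Retrieval-Augmented Framework for Verilog Code Generation with a High-Quality Knowledge Base | DeepV: a model-agnostic RAG framework that improves Verilog code generation with a high-quality knowledge base. | large language model
17 | Staircase Streaming for Low-Latency Multi-Agent Inference | Proposes Staircase Streaming to address high latency in multi-agent inference, significantly reducing time to first token (TTFT). | large language model
18 | AutoEmpirical: LLM-Based Automated Research for Empirical Software Fault Analysis | AutoEmpirical: uses large language models to automate empirical studies of software faults. | large language model
19 | LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game | LLM-Hanabi: uses Hanabi to evaluate LLMs' theory-of-mind and rationale inference in imperfect-information collaboration. | large language model
20 | Where Did It All Go Wrong? A Hierarchical Look into Multi-Agent Error Attribution | Proposes the ECHO algorithm, which improves error-attribution accuracy in multi-agent systems through hierarchical context and objective consensus analysis. | large language model
21 | FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration | FreshBrew: a benchmark for evaluating AI agents on Java code migration. | large language model
22 | Natural Language Edge Labelling: Decoupling Intent from Execution in Structured LM Reasoning | Proposes Natural Language Edge Labelling (NLEL), which decouples intent from execution in structured LM reasoning to improve controllability and auditability. | chain-of-thought
23 | Curved Boolean Logic: A Contextual Generalization of Propositional Logic with Algorithmic Consequences | Proposes Curved Boolean Logic, which generalizes propositional logic through local truth assignments and provides algorithmic optimizations. | large language model
24 | P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs | Proposes P2P, a poison-to-poison remedy for reliable backdoor defense in LLMs. | large language model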

🔬 Pillar 2: RL & Architecture (7 papers)

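# | Title | One-line summary | Tags | 🔗
25 | LMM-Incentive: Large Multimodal Model-based Incentive Design for User-Generated Content in Web 3.0 | Proposes LMM-Incentive, an incentive mechanism based on large multimodal models, to address the quality of user-generated content in Web 3.0. | PPO, multimodal
26 | Beyond Monolithic Rewards: A Hybrid and Multi-Aspect Reward Optimization for MLLM Alignment | Proposes a hybrid, multi-aspect reward optimization framework that improves the alignment of multimodal large language models. | reinforcement learning, large language model, multimodal
27 | Teacher-Student Guided Inverse Modeling for Steel Final Hardness Estimation | Proposes a teacher-student inverse modeling method for estimating the final hardness of steel. | reinforcement learning, teacher-student
28 | MARS: Co-evolving Dual-System Deep Research via Multi-Agent Reinforcement Learning | MARS: co-evolves a dual-system deep research approach through multi-agent reinforcement learning. | reinforcement learning
29 | Video Game Level Design as a Multi-Agent Reinforcement Learning Problem | Proposes a multi-agent reinforcement learning method for automatic game level generation, improving generation efficiency and generalization. | reinforcement learning
30 | Provable Speech Attributes Conversion via Latent Independence | Proposes a speech attribute conversion framework based on latent independence, enabling controllable and reliable speech style transfer. | representation learning, multimodal
31 | Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents | Proposes the DeSA framework, which decouples search from answering to improve the question-answering accuracy of LLM agents. | reinforcement learning, large language model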

🔬 Pillar 1: Robot Control (2 papers)

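# | Title | One-line summary | Tags | 🔗
32 | Integrating Bayesian methods with neural network-based model predictive control: a review | Review: integrating Bayesian methods with neural-network-based model predictive control. | MPC, model predictive control
33 | Hybrid-Balance GFlowNet for Solving Vehicle Routing Problems | Proposes Hybrid-Balance GFlowNet, which combines trajectory balance and detailed balance to solve vehicle routing problems. | trajectory optimization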
