cs.AI (2025-12-23)

📊 10 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 2: RL & Architecture (3 🔗1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

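🔬 Pillar 9: Embodied Foundation Models (5 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent | SAGE: a human-in-the-loop reasoning large language model for automated stereotactic radiosurgery planning. | large language model, chain-of-thought | |
| 2 | From Visual Perception to Deep Empathy: An Automated Assessment Framework for House-Tree-Person Drawings Using Multimodal LLMs and Multi-Agent Collaboration | Proposes an automated assessment framework for HTP drawings based on multimodal LLMs and multi-agent collaboration. | large language model, multimodal | |
| 3 | Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI | Proposes a dual-encoder Transformer-based method for ischemic stroke lesion segmentation that improves segmentation accuracy on DWI and ADC images. | multimodal | |
| 4 | Reason2Decide: Rationale-Driven Multi-Task Learning | Reason2Decide: a rationale-driven multi-task learning framework that improves predictive accuracy and explanation consistency in clinical decision support systems. | large language model, foundation model | |
| 5 | Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems | Proposes vision-language simulation models that generate executable digital twins of industrial systems from sketches and text. | multimodal | |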

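🔬 Pillar 2: RL & Architecture (3 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 6 | AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent | AgentMath: improves the mathematical reasoning of large language models through a tool-augmented agent. | reinforcement learning, large language model, chain-of-thought | |
| 7 | LongVideoAgent: Multi-Agent Reasoning with Long Videos | Proposes LongVideoAgent, which uses multi-agent reasoning to address temporal localization and fine-detail capture in long-video question answering. | reinforcement learning, multimodal | |
| 8 | Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information | Uses high-fidelity digital models and reinforcement learning for mission engineering, with aerial firefighting under perfect information as a case study. | reinforcement learning | |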

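🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 9 | ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge | ActionFlow: a pipelined action acceleration framework for vision-language models on edge devices. | manipulation, vision-language-action, VLA | |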

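🔬 Pillar 3: Perception & Semantics (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 10 | Learning Skills from Action-Free Videos | Proposes SOF, an optical-flow-based skill abstraction framework that learns robot skills from action-free videos. | optical flow | |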
