cs.AI (2025-12-23)
📊 10 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (5 🔗1)
Pillar 2: RL & Architecture (3 🔗1)
Pillar 1: Robot Control (1)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 9: Embodied Foundation Models (5 papers)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent | SAGE: a human-in-the-loop reasoning large language model for automated stereotactic radiosurgery planning. | large language model, chain-of-thought | | |
| 2 | From Visual Perception to Deep Empathy: An Automated Assessment Framework for House-Tree-Person Drawings Using Multimodal LLMs and Multi-Agent Collaboration | Proposes an automated assessment framework for HTP drawings based on multimodal LLMs and multi-agent collaboration. | large language model, multimodal | | |
| 3 | Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI | Proposes a dual-encoder Transformer method for ischemic stroke lesion segmentation that improves segmentation accuracy on DWI and ADC images. | multimodal | | |
| 4 | Reason2Decide: Rationale-Driven Multi-Task Learning | Reason2Decide: a rationale-driven multi-task learning framework that improves prediction accuracy and explanation consistency in clinical decision support systems. | large language model, foundation model | | |
| 5 | Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems | Proposes vision-language simulation models that generate executable digital twins of industrial systems from sketches and text. | multimodal | ✅ | |
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent | AgentMath: improves the mathematical reasoning of large language models through a tool-augmented agent. | reinforcement learning, large language model, chain-of-thought | | |
| 7 | LongVideoAgent: Multi-Agent Reasoning with Long Videos | Proposes LongVideoAgent, which uses multi-agent reasoning to address temporal localization and fine-detail capture in long-video question answering. | reinforcement learning, multimodal | ✅ | |
| 8 | Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information | Uses high-fidelity digital models and reinforcement learning for mission engineering, with aerial firefighting under perfect information as a case study. | reinforcement learning | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge | ActionFlow: a pipelined action acceleration framework for vision-language models on edge devices. | manipulation, vision-language-action, VLA | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | Key point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Learning Skills from Action-Free Videos | Proposes SOF, an optical-flow-based skill abstraction framework that learns robot skills from action-free videos. | optical flow | | |