cs.AI (2025-12-23)

📊 10 papers in total | 🔗 2 with code

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 2: RL & Architecture (3 🔗1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (5 papers)

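#	Title	One-line summary	Tags	🔗	⭐
1	Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent	SAGE: a human-in-the-loop reasoning large language model for automated stereotactic radiosurgery planning	large language model chain-of-thought
2	From Visual Perception to Deep Empathy: An Automated Assessment Framework for House-Tree-Person Drawings Using Multimodal LLMs and Multi-Agent Collaboration	Proposes an automated assessment framework for HTP drawings based on multimodal LLMs and multi-agent collaboration	large language model multimodal
3	Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI	Proposes a dual-encoder Transformer-based method for ischemic stroke lesion segmentation that improves segmentation accuracy on DWI and ADC images	multimodal
4	Reason2Decide: Rationale-Driven Multi-Task Learning	Reason2Decide: a rationale-driven multi-task learning framework that improves the predictive accuracy and explanation consistency of clinical decision support systems	large language model foundation model
5	Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems	Proposes vision-language simulation models that generate executable digital twins of industrial systems from sketches and text	multimodal	✅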

🔬 Pillar 2: RL & Architecture (3 papers)

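#	Title	One-line summary	Tags	🔗	⭐
6	AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent	AgentMath: improves the mathematical reasoning of large language models through a tool-augmented agent	reinforcement learning large language model chain-of-thought
7	LongVideoAgent: Multi-Agent Reasoning with Long Videos	Proposes LongVideoAgent, which uses multi-agent reasoning to address temporal localization and fine-detail capture in long-video question answering	reinforcement learning multimodal	✅
8	Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information	Applies high-fidelity digital models and reinforcement learning to mission engineering, using aerial firefighting under perfect information as a case study	reinforcement learning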

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
9	ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge	ActionFlow: a pipelined action acceleration framework for vision-language models on edge devices	manipulation vision-language-action VLA

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
10	Learning Skills from Action-Free Videos	Proposes SOF, an optical-flow-based skill abstraction framework that learns robot skills from action-free videos	optical flow
