cs.CV (2025-12-16)

📊 6 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (3) · Pillar 1: Robot Control (2) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms and Architecture (3 papers)

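| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling | WorldPlay: proposes a real-time interactive world modeling method with long-term geometric consistency. | world model, distillation, geometric consistency |
| 2 | A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning | Proposes A4-Agent, a zero-shot embodied AI framework for reasoning about object interaction regions. | dreamer, affordance, embodied AI |
| 3 | TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs | TimeLens: rethinks video temporal grounding with multimodal LLMs and builds a high-quality baseline. | reinforcement learning, large language model, multimodal |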

🔬 Pillar 1: Robot Control (2 papers)

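| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 4 | CRISP: Contact-Guided Real2Sim from Monocular Video with Planar Scene Primitives | CRISP: a contact-guided Real2Sim method based on monocular video and planar scene primitives. | humanoid, humanoid control, real2sim |
| 5 | DRAW2ACT: Turning Depth-Encoded Trajectories into Robotic Demonstration Videos | DRAW2ACT: proposes a depth-aware, trajectory-conditioned video generation framework for producing robot manipulation demonstration videos. | manipulation, embodied AI, multimodal |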

🔬 Pillar 9: Embodied Foundation Models (1 paper)

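| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 6 | HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices | HyperVL: an efficient and dynamic multimodal large language model for edge devices. | large language model, multimodal |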
