cs.CV (2025-12-22)

📊 5 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 1: Robot Control (2) · Pillar 3: Perception & Semantics (1 🔗1) · Pillar 9: Embodied Foundation Models (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 1: Robot Control (2 papers)

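# | Title | One-sentence summary | Tags | 🔗
1 | Zero-shot Reconstruction of In-Scene Object Manipulation from Video | Proposes the first system for zero-shot reconstruction of in-scene object manipulation from monocular video. | manipulation scene reconstruction physically plausible
2 | VLNVerse: A Benchmark for Vision-Language Navigation with Versatile, Embodied, Realistic Simulation and Evaluation | VLNVerse: a versatile, embodied, realistic simulation and evaluation benchmark for vision-language navigation. | locomotion sim-to-real embodied AI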

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | One-sentence summary | Tags | 🔗
3 | CETCAM: Camera-Controllable Video Generation via Consistent and Extensible Tokenization | Proposes the CETCAM framework to address camera control in video generation. | depth estimation VGGT geometric consistency

🔬 Pillar 9: Embodied Foundation Models (1 paper)

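# | Title | One-sentence summary | Tags | 🔗
4 | Point What You Mean: Visually Grounded Instruction Policy | Proposes Point-VLA, which uses visual guidance to strengthen the target-referring ability of VLA models in complex environments. | vision-language-action VLA visual grounding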

🔬 Pillar 8: Physics-based Animation (1 paper)

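# | Title | One-sentence summary | Tags | 🔗
5 | Towards AI-Guided Open-World Ecological Taxonomic Classification | Proposes TaxoNet to address long-tailed distributions and domain shift in open-world ecological classification. | spatiotemporal foundation model multimodal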
