cs.CV (2024-06-11)
📊 6 papers in total
🎯 Interest Area Navigation
Pillar 3: Spatial Perception & Semantics (3)
Pillar 1: Robot Control (2)
Pillar 9: Embodied Foundation Models (1)
🔬 Pillar 3: Spatial Perception & Semantics (3 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Trim 3D Gaussian Splatting for Accurate Geometry Representation | Proposes TrimGS, which achieves accurate 3D geometry reconstruction by trimming Gaussians | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 2 | ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones | ROADWork: a dataset and benchmark for learning to recognize, observe, analyze, and drive through work zones | open-vocabulary, open vocabulary, foundation model | | |
| 3 | Neural Visibility Field for Uncertainty-Driven Active Mapping | Proposes the Neural Visibility Field (NVF) for uncertainty-driven active mapping | NeRF, neural radiance field, scene reconstruction | | |
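🔬 Pillar 1: Robot Control (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | RACon: Retrieval-Augmented Simulated Character Locomotion Control | Proposes RACon, a retrieval-augmented control method for simulated character locomotion that improves responsiveness to user control | locomotion, manipulation, reinforcement learning | | |
| 5 | Visual Representation Learning with Stochastic Frame Prediction | Proposes a visual representation learning framework based on stochastic frame prediction that improves performance on video understanding and robot learning tasks | locomotion, manipulation, representation learning | | |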
🔬 Pillar 9: Embodied Foundation Models (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | VideoLLaMA 2: enhances video large language models through spatial-temporal modeling and audio understanding | large language model, multimodal | | |