cs.CV (2024-11-15)

📊 6 papers in total

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (3) · Pillar 2: RL & Architecture (1) · Pillar 3: Perception & Semantics (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

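| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Explanation for Trajectory Planning using Multi-modal Large Language Model for Autonomous Driving | Proposes a method based on a multimodal large language model for explaining trajectory planning, improving the transparency of autonomous driving decisions. | large language model |
| 2 | Everything is a Video: Unifying Modalities through Next-Frame Prediction | Proposes a framework that unifies modalities through next-frame prediction, simplifying cross-modal learning tasks. | foundation model, multimodal |
| 3 | Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations | Proposes Llama Guard 3 Vision for safeguarding image understanding in multimodal human-AI conversations. | multimodal |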

🔬 Pillar 2: RL & Architecture (1 paper)

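| # | Title | Key point | Tags |
|---|---|---|---|
| 4 | One Leaf Reveals the Season: Occlusion-Based Contrastive Learning with Semantic-Aware Views for Efficient Visual Representation | Proposes occlusion-based contrastive learning (OCL), which learns visual representations efficiently through semantic-aware views. | contrastive learning |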

🔬 Pillar 3: Perception & Semantics (1 paper)

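| # | Title | Key point | Tags |
|---|---|---|---|
| 5 | The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods | Releases the Oxford Spires Dataset for benchmarking large-scale LiDAR-visual localisation, reconstruction, and radiance field methods. | 3D gaussian splatting, gaussian splatting, splatting |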

🔬 Pillar 1: Robot Control (1 paper)

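| # | Title | Key point | Tags |
|---|---|---|---|
| 6 | Learning Generalizable 3D Manipulation With 10 Demonstrations | Proposes a general 3D manipulation framework that learns from a small number of demonstrations, improving spatial generalization. | manipulation, imitation learning |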
