cs.CV (2024-11-15)

📊 6 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3) · Pillar 2: RL Algorithms & Architecture (1) · Pillar 3: Spatial Perception & Semantics (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

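#	Title	One-line summary	Tags
1	Explanation for Trajectory Planning using Multi-modal Large Language Model for Autonomous Driving	Proposes a method based on a multimodal large language model for explaining trajectory planning, improving the transparency of autonomous driving decisions.	large language model
2	Everything is a Video: Unifying Modalities through Next-Frame Prediction	Proposes a framework that unifies modalities through next-frame prediction, simplifying cross-modal learning tasks.	foundation model, multimodal
3	Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations	Introduces Llama Guard 3 Vision, a safeguard for image understanding in multimodal human-AI conversations.	multimodal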

🔬 Pillar 2: RL Algorithms & Architecture (1 paper)

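#	Title	One-line summary	Tags
4	One Leaf Reveals the Season: Occlusion-Based Contrastive Learning with Semantic-Aware Views for Efficient Visual Representation	Proposes occlusion-based contrastive learning (OCL), which uses semantic-aware views to learn visual representations efficiently.	contrastive learning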

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

#	Title	One-line summary	Tags
5	The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods	Releases the Oxford Spires Dataset for benchmarking large-scale LiDAR-visual localisation, reconstruction, and radiance field methods.	3D gaussian splatting, gaussian splatting, splatting

🔬 Pillar 1: Robot Control (1 paper)

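#	Title	One-line summary	Tags
6	Learning Generalizable 3D Manipulation With 10 Demonstrations	Proposes a framework for learning generalizable 3D manipulation from a small number of demonstrations, improving spatial generalization.	manipulation, imitation learning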
