cs.CV (2025-10-09)
📊 4 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms and Architecture (2 🔗1)
Pillar 3: Spatial Perception and Semantics (1)
Pillar 9: Embodied Foundation Models (1)
🔬 Pillar 2: RL Algorithms and Architecture (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Dream to Recall: Imagination-Guided Experience Retrieval for Memory-Persistent Vision-and-Language Navigation | Memoir: proposes imagination-guided experience retrieval to improve memory-persistent vision-and-language navigation performance. | world model, VLN, language conditioned | ✅ | |
| 2 | Gaze on the Prize: Shaping Visual Attention with Return-Guided Contrastive Learning | Proposes a visual attention mechanism based on return-guided contrastive learning to improve the sample efficiency of reinforcement learning. | reinforcement learning, contrastive learning | | |
🔬 Pillar 3: Spatial Perception and Semantics (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 3 | A Multimodal Depth-Aware Method For Embodied Reference Understanding | Proposes a multimodal depth-aware method for the embodied reference understanding task. | open-vocabulary, multimodal | | |
🔬 Pillar 9: Embodied Foundation Models (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities | BEAR: benchmarks and enhances multimodal language models on atomic embodied capabilities. | large language model, multimodal | | |