cs.CV (2025-03-23)
📊 6 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (2 🔗1)
Pillar 1: Robot Control (2)
Pillar 3: Perception & Semantics (1)
Pillar 2: RL & Architecture (1)
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation | Proposes the rewriting-based RAM framework, which uses foundation models to improve the generalization of vision-language navigation | VLN, large language model, foundation model | ✅ | |
| 2 | Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models | Proposes the VisPRE framework, which improves multimodal large language model performance by strengthening vision prior knowledge | large language model | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 3 | SG-Tailor: Inter-Object Commonsense Relationship Reasoning for Scene Graph Manipulation | SG-Tailor: proposes a scene graph manipulation method based on inter-object commonsense relationship reasoning | manipulation | | |
| 4 | PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos | PhysTwin: physics-informed reconstruction and simulation of deformable objects from videos | motion planning | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | PanopticSplatting: End-to-End Panoptic Gaussian Splatting | PanopticSplatting: proposes an end-to-end panoptic Gaussian splatting reconstruction method for scene understanding and reconstruction | gaussian splatting, splatting, NeRF | | |
🔬 Pillar 2: RL & Architecture (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization | Proposes a noise-aware preference optimization method to address modality bias in multimodal large language models | direct preference optimization, large language model, multimodal | | |