cs.CV(2025-07-12)
📊 16 papers in total | 🔗 9 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (7 🔗4)
Pillar 2: RL & Architecture (4 🔗2)
Pillar 1: Robot Control (1)
Pillar 3: Perception & Semantics (1 🔗1)
Pillar 4: Generative Motion (1 🔗1)
Pillar 5: Interaction & Reaction (1 🔗1)
Pillar 6: Video Extraction (1)
🔬 Pillar 9: Embodied Foundation Models (7 papers)
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models | Proposes Prompt4Trust to address confidence calibration in multimodal large language models | reinforcement learning, large language model, multimodal | ✅ | |
| 9 | Stable Score Distillation | Proposes Stable Score Distillation to improve the stability and alignment of text-guided image and 3D editing | distillation, NeRF, classifier-free guidance | | |
| 10 | Geo-RepNet: Geometry-Aware Representation Learning for Surgical Phase Recognition in Endoscopic Submucosal Dissection | Geo-RepNet: geometry-aware representation learning for surgical phase recognition in endoscopic submucosal dissection | representation learning, spatial relationship | | |
| 11 | Cross Knowledge Distillation between Artificial and Spiking Neural Networks | Proposes Cross Knowledge Distillation (CKD) to improve SNN performance on DVS data | distillation | ✅ | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning | Proposes a Sim2Real transfer learning method based on a multimodal vision Transformer | manipulation, sim2real, domain randomization | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 13 | Fast3D: Accelerating 3D Multi-modal Large Language Models for Efficient 3D Scene Understanding | Fast3D: accelerates 3D multimodal large language models for efficient 3D scene understanding | scene understanding, large language model | ✅ | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 14 | SnapMoGen: Human Motion Generation from Expressive Texts | SnapMoGen: introduces a high-quality dataset for text-driven human motion generation and an improved generative model, MoMask++ | text-to-motion, motion generation, long-term motion generation | ✅ | |
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | RoHOI: Robustness Benchmark for Human-Object Interaction Detection | Proposes the RoHOI benchmark for evaluating and improving the robustness of human-object interaction detection under real-world perturbations | human-object interaction, HOI | ✅ | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 16 | EgoAnimate: Generating Human Animations from Egocentric top-down Views | EgoAnimate: generates animatable human models from egocentric views | egocentric | | |