cs.CV (2024-06-30)
📊 8 papers in total | 🔗 4 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (2 🔗2)
Pillar 7: Motion Retargeting (2 🔗1)
Pillar 6: Video Extraction (1 🔗1)
Pillar 9: Embodied Foundation Models (1)
Pillar 1: Robot Control (1)
Pillar 3: Perception & Semantics (1)
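🔬 Pillar 2: RL & Architecture (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation | Proposes CaFNet, which uses radar confidence to improve the accuracy of radar-camera depth estimation | MAE, depth estimation | ✅ | |
| 2 | Diffusion Models and Representation Learning: A Survey | Surveys research at the intersection of diffusion models and representation learning, exploring its applications and potential in vision tasks | representation learning | ✅ | |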
🔬 Pillar 7: Motion Retargeting (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 3 | HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis | Proposes HATs, a method for hierarchical adaptive segmentation of complex anatomical structures in panoramic pathology images | spatial relationship, foundation model | ✅ | |
| 4 | Engineering an Efficient Object Tracker for Non-Linear Motion | DeepMoveSORT: an efficient multi-object tracker for non-linear motion scenarios | motion prediction | | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery | Proposes HAC, which uses human meshes as a calibration board to achieve accurate human motion estimation in world coordinates | human mesh recovery, human motion, human motion estimation | ✅ | |
🔬 Pillar 9: Embodied Foundation Models (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Proposes the MMHalSnowball framework to reveal and mitigate the multimodal hallucination snowballing effect in large vision-language models | multimodal | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction | DEAR: disentangles environment and agent representations without reconstruction to improve the sample efficiency of reinforcement learning | manipulation, reinforcement learning | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding | Proposes ESGNN, an equivariant scene graph neural network for 3D scene understanding | scene understanding | | |