cs.CV (2024-06-30)

📊 8 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (RL & Architecture) (2 🔗2) · Pillar 7: Motion Retargeting (2 🔗1) · Pillar 6: Video Extraction and Matching (Video Extraction) (1 🔗1) · Pillar 9: Embodied Foundation Models (1) · Pillar 1: Robot Control (1) · Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (1)

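🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (2 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation | Proposes CaFNet, which uses radar confidence to improve the accuracy of radar-camera depth estimation. | MAE, depth estimation | |
| 2 | Diffusion Models and Representation Learning: A Survey | Surveys research at the intersection of diffusion models and representation learning, exploring its applications and potential in vision tasks. | representation learning | |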

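🔬 Pillar 7: Motion Retargeting (2 papers)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 3 | HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis | Proposes HATs, a hierarchical adaptive segmentation method for complex anatomical structures in panoramic pathology images. | spatial relationship, foundation model | |
| 4 | Engineering an Efficient Object Tracker for Non-Linear Motion | DeepMoveSORT: an efficient multi-object tracker for non-linear motion scenarios. | motion prediction | |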

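🔬 Pillar 6: Video Extraction and Matching (Video Extraction) (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 5 | Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery | Proposes HAC, which uses human meshes as a calibration target to achieve accurate human motion estimation in world coordinates. | human mesh recovery, human motion, human motion estimation | |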

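🔬 Pillar 9: Embodied Foundation Models (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 6 | Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Proposes the MMHalSnowball framework to reveal and mitigate the snowballing of multimodal hallucinations in large vision-language models. | multimodal | |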

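🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 7 | DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction | DEAR: disentangles environment and agent representations, without reconstruction, to improve the sample efficiency of reinforcement learning. | manipulation, reinforcement learning | |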

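🔬 Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding | Proposes ESGNN, an equivariant scene graph neural network for 3D scene understanding. | scene understanding | |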
