cs.CV (2024-06-30)

📊 8 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (2 🔗2) · Pillar 7: Motion Retargeting (2 🔗1) · Pillar 6: Video Extraction (1 🔗1) · Pillar 9: Embodied Foundation Models (1) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

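🔬 Pillar 2: RL & Architecture (2 papers)

#	Title	Key point	Tags	🔗	⭐
1	CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation	Proposes CaFNet, which uses radar confidence to improve the accuracy of radar-camera depth estimation	MAE, depth estimation	✅
2	Diffusion Models and Representation Learning: A Survey	Surveys research at the intersection of diffusion models and representation learning, exploring its applications and potential in vision tasks	representation learning	✅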

🔬 Pillar 7: Motion Retargeting (2 papers)

#	Title	Key point	Tags	🔗	⭐
3	HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis	Proposes HATs, a method for hierarchical adaptive segmentation of complex anatomical structures in panoramic pathology images	spatial relationship, foundation model	✅
4	Engineering an Efficient Object Tracker for Non-Linear Motion	DeepMoveSORT: an efficient multi-object tracker for non-linear motion scenarios	motion prediction

🔬 Pillar 6: Video Extraction (1 paper)

#	Title	Key point	Tags	🔗	⭐
5	Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery	Proposes HAC, which uses human meshes as a calibration board to achieve accurate human motion estimation in world coordinates	human mesh recovery, human motion, human motion estimation	✅

🔬 Pillar 9: Embodied Foundation Models (1 paper)

#	Title	Key point	Tags	🔗	⭐
6	Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models	Proposes the MMHalSnowball framework to reveal and mitigate multimodal hallucination snowballing in large vision-language models	multimodal

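🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key point	Tags	🔗	⭐
7	DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction	DEAR: disentangles environment and agent representations without reconstruction to improve sample efficiency in reinforcement learning	manipulation, reinforcement learning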

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	Key point	Tags	🔗	⭐
8	ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding	Proposes ESGNN, an equivariant scene graph neural network for 3D scene understanding	scene understanding
