cs.CV (2024-09-23)

📊 7 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) | Pillar 3: Perception & Semantics (3 🔗1) | Pillar 2: RL & Architecture (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

#	Title	Key Point	Tags	🔗	⭐
1	ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models	ReVLA: reverts the visual domain limitation of robotic foundation models through model merging	vision-language-action, large language model, foundation model
2	Dynamic Realms: 4D Content Analysis, Recovery and Generation with Geometric, Topological and Physical Priors	Uses geometric, topological, and physical priors for efficient, high-quality 4D content analysis, recovery, and generation	embodied AI
3	FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension	Proposes the FineCops-Ref dataset and task for fine-grained compositional referring expression comprehension, challenging multimodal large language models	large language model	✅

🔬 Pillar 3: Perception & Semantics (3 papers)

#	Title	Key Point	Tags	🔗	⭐
4	GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth	GroCo: proposes a ground-constraint-based self-supervised monocular depth estimation method that improves scale recovery and generalization	depth estimation, monocular depth, metric depth
5	FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera	FisheyeDepth: a real-scale self-supervised depth estimation model designed for fisheye cameras	depth estimation	✅
6	Human Hair Reconstruction with Strand-Aligned 3D Gaussians	Proposes Gaussian Haircut, which reconstructs realistic hairstyles using strand-aligned 3D Gaussians	3D gaussian splatting, gaussian splatting, splatting

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	Key Point	Tags	🔗	⭐
7	TextToon: Real-Time Text Toonify Head Avatar from Single Video	TextToon: proposes a real-time, text-driven toonified head avatar generation method from monocular video	contrastive learning, 3D gaussian splatting, gaussian splatting	✅
