cs.CV (2025-12-06)
📊 18 papers in total | 🔗 6 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (7 🔗2)
Pillar 3: Perception & Semantics (4 🔗1)
Pillar 2: RL & Architecture (3 🔗1)
Pillar 7: Motion Retargeting (2 🔗1)
Pillar 4: Generative Motion (1 🔗1)
Pillar 6: Video Extraction (1)
🔬 Pillar 9: Embodied Foundation Models (7 papers)
🔬 Pillar 3: Perception & Semantics (4 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting | TriaGS: 3D Gaussian Splatting with geometric consistency guided by differentiable triangulation | 3D gaussian splatting, gaussian splatting, splatting | | |
| 9 | AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars | AGORA: proposes real-time controllable 3D Gaussian head avatars based on a generative adversarial network | 3D gaussian splatting, 3DGS, gaussian splatting | ✅ | |
| 10 | GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation | GNC-Pose: a geometry-aware GNC-PnP method for accurate 6D pose estimation | 6D pose estimation, feature matching | | |
| 11 | HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos | Proposes HuPrior3R, which incorporates human priors to improve 3D dynamic reconstruction from monocular videos | depth estimation, monocular depth, SMPL | | |
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning | Proposes VG-Refiner, which uses agentic reinforcement learning to refine tool feedback and improve referring grounded reasoning | reinforcement learning, multimodal | | |
| 13 | ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models | ReCAD: reinforcement-learning-enhanced parametric CAD model generation built on vision-language models | reinforcement learning, multimodal | | |
| 14 | S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening | Proposes S2WMamba, which uses spectral-spatial wavelet transforms and Mamba modules for efficient remote sensing image fusion | Mamba | ✅ | |
🔬 Pillar 7: Motion Retargeting (2 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
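| 15 | Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation | Proposes an event-camera human pose estimation method that exploits spatiotemporal properties, improving efficiency and accuracy | human motion, spatiotemporal | | |
| 16 | VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System | VAD-Net: multidimensional facial expression recognition in intelligent education systems, proposing VAD annotation and introducing orthogonal convolution. | motion prediction | ✅ | |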
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | DragMesh: Interactive 3D Generation Made Easy | DragMesh: proposes a decoupled motion generation framework for real-time interactive generation of 3D object articulation | motion generation | ✅ | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
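| 18 | Opinion: Learning Intuitive Physics May Require More than Visual Data | The study indicates that large amounts of visual data, or data from a child-like perspective, are not enough on their own for models to acquire intuitive physics | egocentric | | |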