cs.CV (2025-12-04)

📊 29 papers total | 🔗 8 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (15 🔗4) · Pillar 1: Robot Control (6 🔗3) · Pillar 2: RL Algorithms & Architecture (5) · Pillar 4: Generative Motion (3 🔗1)

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (15 papers)

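| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization | Proposes the AionHMR framework for age-inclusive 3D human mesh recovery, used for privacy-preserving data anonymization. | pose estimation, human mesh recovery, SMPL |
| 2 | RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS | RobustSplat++: decouples densification, dynamics, and illumination in 3DGS for robust modeling of in-the-wild scenes. | 3D gaussian splatting, 3DGS, gaussian splatting |
| 3 | Gaussian Entropy Fields: Driving Adaptive Sparsity in 3D Gaussian Optimization | Proposes Gaussian entropy fields to drive adaptive sparsity in 3D Gaussian optimization. | 3D gaussian splatting, 3DGS, gaussian splatting |
| 4 | MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching | Proposes MAFNet, a multi-frequency adaptive fusion network for real-time, high-accuracy stereo matching. | stereo matching, disparity estimation |
| 5 | LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging | LiteVGGT: accelerates VGGT via geometry-aware cached token merging for efficient 3D reconstruction of large-scale scenes. | VGGT |
| 6 | Equivariant symmetry-aware head pose estimation for fetal MRI | Proposes E(3)-Pose for symmetry-aware head pose estimation in fetal MRI. | pose estimation |
| 7 | Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing | Proposes BioTUCH, which uses bioimpedance sensing to refine human pose pseudo-labels in self-contact scenarios. | pose estimation, motion generation |
| 8 | Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting | Splannequin: freezes monocular Mannequin-Challenge footage using dual-detection splatting. | gaussian splatting, scene reconstruction |
| 9 | 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer | Proposes 4DLangVGGT for efficient and generalizable 4D language-visual geometry alignment. | gaussian splatting, scene understanding |
| 10 | Light-X: Generative 4D Video Rendering with Camera and Illumination Control | Light-X: a generative 4D video rendering framework with controllable camera and illumination. | point cloud |
| 11 | A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs | Proposes a dynamic memory assignment strategy to optimize the VANICP point cloud registration algorithm on embedded GPUs. | point cloud |
| 12 | Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition | Proposes a gating-based multimodal adaptive fusion network to improve human action recognition accuracy. | optical flow |
| 13 | You Only Train Once (YOTO): A Retraining-Free Object Detection Framework | Proposes the YOTO framework for retraining-free incremental learning of new products in object detection. | localization |
| 14 | Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence | Proposes HeFT, which uses video diffusion priors for robust zero-shot point tracking. | localization |
| 15 | Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot | Proposes a malicious image analysis method based on vision-language segmentation fusion that performs content detection, element identification, and localization in one shot. | localization |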

🔬 Pillar 1: Robot Control (6 papers)

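| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 16 | X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale | X-Humanoid: generates humanoid robot videos at scale by robotizing human videos. | humanoid, humanoid robot, world model |
| 17 | Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models | Proposes an explainable Parkinson's disease gait recognition framework based on RGB-D fusion and LLMs. | gait |
| 18 | FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization | FASTer: efficient autoregressive vision-language-action modeling via neural action tokenization. | manipulation, cross-embodiment |
| 19 | Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints | Proposes a method for reconstructing occluded objects using generative priors and contact constraints, improving robot manipulation performance. | manipulation |
| 20 | BulletTime: Decoupled Control of Time and Camera Pose for Video Generation | BulletTime: a video generation framework with decoupled control of time and camera pose. | manipulation |
| 21 | Towards Cross-View Point Correspondence in Vision-Language Models | Proposes CrossPoint-Bench and the CroPond model to address cross-view point correspondence in vision-language models. | manipulation |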

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

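| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 22 | Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks | Proposes a stable single-pixel contrastive learning method for semantic and geometric tasks. | contrastive learning, teacher-student |
| 23 | DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance | DuGI-MAE: improves infrared masked autoencoders via dual-domain guidance. | masked autoencoder, MAE |
| 24 | Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning | Semore: VLM-guided enhanced semantic motion representations for visual reinforcement learning. | reinforcement learning |
| 25 | ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching | ReflexFlow: rethinks the Flow Matching learning objective to alleviate exposure bias in generative models. | flow matching |
| 26 | Fourier-Attentive Representation Learning: A Fourier-Guided Framework for Few-Shot Generalization in Vision-Language Models | Proposes the FARL framework, which uses Fourier analysis to decouple visual representations and improve few-shot generalization in vision-language models. | representation learning |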

🔬 Pillar 4: Generative Motion (3 papers)

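| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 27 | Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model | Studies how motion representation affects human motion generation in motion diffusion models and offers optimization recommendations. | motion diffusion model, MDM, motion diffusion |
| 28 | Controllable Long-term Motion Generation with Extended Joint Targets | COMET: a Transformer-based framework for real-time, controllable long-term human motion generation. | motion generation, character control |
| 29 | Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image | Proposes MoRe4D, which jointly performs 3D geometry reconstruction and motion generation to synthesize 4D scenes from a single image. | motion generation |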
