cs.CV (2025-12-05)

📊 21 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (13 🔗2) · Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (2)

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (13 papers)

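# | Title | Key point | Tags | 🔗
1 | Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning | Proposes a manifold-aware point cloud completion framework that improves geometric consistency through a geodesic attention mechanism. | point cloud, geometric consistency |
2 | See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors | Proposes DepSeg, a training-free surgical scene segmentation method based on monocular depth priors. | depth estimation, monocular depth |
3 | Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth | Proposes a curvature-regularized VAE for reconstructing 3D scenes from sparse depth data. | scene reconstruction |
4 | Label-Efficient Point Cloud Segmentation with Active Learning | Proposes an active-learning point cloud segmentation method based on 2D grid partitioning and network ensembles, improving annotation efficiency. | point cloud |
5 | TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression | Proposes TED-4DGS for compressing dynamic 3D Gaussian splatting with rate-distortion optimization. | 3D gaussian splatting, 3DGS, gaussian splatting |
6 | YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications | Proposes a framework integrating YOLO and SGBM for autonomous tree branch detection and depth estimation in radiata pine pruning. | depth estimation |
7 | SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training | Proposes SplatPainter to address interactive editing of 3D Gaussian models. | 3D gaussian splatting, gaussian splatting |
8 | Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction | Proposes a physics-constrained shadow detection method based on approximate 3D geometry and light direction. | scene understanding |
9 | Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation | Proposes Track4DGen, which uses tracking-guided motion priors to generate high-quality 3D model animation. | gaussian splatting |
10 | Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light | Shoot-Bounce-3D: occlusion-aware 3D reconstruction using single-photon lidar and two-bounce light. | scene reconstruction |
11 | EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing | EgoEdit: a dataset, real-time model, and benchmark for egocentric video editing. | ego-motion |
12 | Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding | Proposes ZoomClick, which uses zoom priors to improve GUI element grounding. | localization |
13 | NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections | NormalView: a sensor-agnostic tree species classification method based on geometric projections. | point cloud |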

🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (4 papers)

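# | Title | Key point | Tags | 🔗
14 | Representation Learning for Point Cloud Understanding | Proposes a 3D point cloud representation learning method that incorporates 2D pretrained models to improve point cloud understanding. | representation learning, point cloud |
15 | World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty | Proposes C3, which gives controllable video generation models calibrated uncertainty estimates to mitigate hallucination. | world model |
16 | Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling | Proposes the ViSA framework, which uses spatial assertions to improve the test-time scaling of world models in spatial reasoning. | world model |
17 | Training Multi-Image Vision Agents via End2End Reinforcement Learning | Proposes IMAgent, a multi-image vision agent trained with end-to-end reinforcement learning to solve complex multi-image QA tasks. | reinforcement learning |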

🔬 Pillar 8: Physics-based Animation (2 papers)

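# | Title | Key point | Tags | 🔗
18 | SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations | SCAIL: studio-grade character animation via in-context learning of 3D-consistent pose representations. | character animation |
19 | Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction | Fast SceneScript: an efficient and accurate structured language model that uses multi-token prediction for 3D scene layout estimation. | ASE |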

🔬 Pillar 1: Robot Control (2 papers)

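# | Title | Key point | Tags | 🔗
20 | Explainable Adversarial-Robust Vision-Language-Action Model for Robotic Manipulation | Proposes an explainable, adversarially robust vision-language-action model to improve the robustness of robotic manipulation in smart agriculture. | manipulation |
21 | LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection | LeAD-M3D: real-time monocular 3D object detection via asymmetric distillation. | running |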
