cs.CV (2025-11-18)

📊 35 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (26 🔗3) · Pillar 2: RL Algorithms & Architecture (5 🔗1) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (26 papers)

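| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | iGaussian: Real-Time Camera Pose Estimation via Feed-Forward 3D Gaussian Splatting Inversion | Proposes iGaussian for real-time camera pose estimation | SLAM, 3D gaussian splatting, gaussian splatting | |
| 2 | SparseSurf: Sparse-View 3D Gaussian Splatting for Surface Reconstruction | SparseSurf: surface reconstruction from sparse views based on 3D Gaussian splatting | 3D gaussian splatting, gaussian splatting, novel view synthesis | |
| 3 | RTS-Mono: A Real-Time Self-Supervised Monocular Depth Estimation Method for Real-World Deployment | RTS-Mono: a real-time self-supervised monocular depth estimation method for real-world deployment | depth estimation, monocular depth, navigation | |
| 4 | Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video | Proposes Gaussian See, Gaussian Do for semantic 3D motion transfer from multiview video | 3D gaussian splatting, gaussian splatting, motion transfer | |
| 5 | EGSA-PT: Edge-Guided Spatial Attention with Progressive Training for Monocular Depth Estimation and Segmentation of Transparent Objects | Proposes EGSA-PT, which improves depth estimation and segmentation of transparent objects via edge-guided spatial attention and progressive training | depth estimation, monocular depth | |
| 6 | Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving | Proposes a texture-enhanced 3D physical adversarial attack that fools binocular depth estimation in autonomous driving | depth estimation, stereo matching | |
| 7 | IBGS: Image-Based Gaussian Splatting | Proposes image-based Gaussian splatting, improving novel view synthesis quality without increasing storage | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 8 | Dental3R: Geometry-Aware Pairing for Intraoral 3D Reconstruction from Sparse-View Photographs | Dental3R: a geometry-aware pairing method for 3D reconstruction from sparse-view intraoral photographs | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 9 | Enhancing Generalization of Depth Estimation Foundation Model via Weakly-Supervised Adaptation with Regularization | Proposes the WeSTAR framework, which improves the generalization of depth estimation foundation models via weakly supervised self-training and regularization | depth estimation, monocular depth, Depth Anything | |
| 10 | Interaction-Aware 4D Gaussian Splatting for Dynamic Hand-Object Interaction Reconstruction | Proposes interaction-aware 4D Gaussian splatting for dynamic hand-object interaction reconstruction | 3D gaussian splatting, gaussian splatting | |
| 11 | PuzzlePoles: Cylindrical Fiducial Markers Based on the PuzzleBoard Pattern | Proposes PuzzlePole cylindrical fiducial markers for precise calibration and localization in autonomous systems | SLAM, pose estimation, localization | |
| 12 | SLAM-AGS: Slide-Label Aware Multi-Task Pretraining Using Adaptive Gradient Surgery in Computational Cytology | SLAM-AGS: slide-label-aware multi-task pretraining with adaptive gradient surgery in computational cytology | SLAM | |
| 13 | Rethinking the Encoding and Annotating of 3D Bounding Box: Corner-Aware 3D Object Detection from Point Clouds | Proposes a corner-aligned regression method for 3D object detection, addressing the instability of center-aligned regression on LiDAR point clouds | point cloud | |
| 14 | Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery | Proposes GSLR, a Gaussian splatting-based low-rank tensor representation for multi-dimensional image recovery that better captures local high-frequency information | gaussian splatting | |
| 15 | V2VLoc: Robust GNSS-Free Collaborative Perception via LiDAR Localization | Proposes the V2VLoc framework for robust collaborative perception in GNSS-denied environments via LiDAR localization | localization | |
| 16 | SMGeo: Cross-View Object Geo-Localization with Grid-Level Mixture-of-Experts | SMGeo: a cross-view object geo-localization method based on a grid-level mixture-of-experts model | localization | |
| 17 | RISE: Single Static Radar-based Indoor Scene Understanding | RISE: indoor scene understanding from a single static radar, using multipath reflections to improve geometric reasoning | scene understanding | |
| 18 | 3D Ground Truth Reconstruction from Multi-Camera Annotations Using UKF | Proposes a UKF-based method that fuses multi-camera 2D annotations into a 3D reconstruction, for autonomous driving and similar scenarios | localization, navigation | |
| 19 | CPSL: Representing Volumetric Video via Content-Promoted Scene Layers | Proposes Content-Promoted Scene Layers (CPSL) for efficient representation and rendering of volumetric video | point cloud | |
| 20 | Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers | Proposes Co-Me, which accelerates visual geometric transformers by up to 11.3x without retraining | VGGT | |
| 21 | A Quantitative Method for Shoulder Presentation Evaluation in Biometric Identity Documents | Proposes SPE, a shoulder presentation evaluation algorithm for automatically checking shoulder compliance in biometric identity documents | pose estimation | |
| 22 | O3SLM: Open Weight, Open Data, and Open Vocabulary Sketch-Language Model | O3SLM: a sketch-language model with open weights, open data, and open vocabulary that improves understanding of abstract visual inputs | localization | |
| 23 | NeuralSSD: A Neural Solver for Signed Distance Surface Reconstruction | NeuralSSD: a neural-solver-based method for signed distance field surface reconstruction | point cloud | |
| 24 | Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution | Orion: a unified visual agent for multimodal perception, advanced visual reasoning, and execution | localization | |
| 25 | Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion | Wave-Former: through-occlusion 3D reconstruction via wireless-signal shape completion | point cloud | |
| 26 | Multi-view Phase-aware Pedestrian-Vehicle Incident Reasoning Framework with Vision-Language Models | Proposes the MP-PVIR framework, which uses multiple views and vision-language models to reason about pedestrian-vehicle incidents | scene understanding | |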

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (5 papers)

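| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | DoGCLR: Dominance-Game Contrastive Learning Network for Skeleton-Based Action Recognition | Proposes DoGCLR, which improves skeleton-based action recognition through dominance-game contrastive learning | contrastive learning, localization | |
| 28 | X-WIN: Building Chest Radiograph World Model via Predictive Sensing | X-WIN: building a chest radiograph world model via predictive sensing | world model, representation learning | |
| 29 | Parameter Aware Mamba Model for Multi-task Dense Prediction | Proposes PAMM, a parameter-aware Mamba model for multi-task dense prediction that strengthens the interconnection between tasks | Mamba, state space model | |
| 30 | GEN3D: Generating Domain-Free 3D Scenes from a Single Image | GEN3D: a method for generating domain-free 3D scenes from a single image | world model, gaussian splatting, point cloud | |
| 31 | Text-Driven Reasoning Video Editing via Reinforcement Learning on Digital Twin Representations | Proposes the RIVER model, which tackles text-driven reasoning video editing using digital twins and reinforcement learning | reinforcement learning | |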

🔬 Pillar 1: Robot Control (2 papers)

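| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 32 | ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation | Proposes ManipShield, a unified framework for image manipulation detection, localization, and explanation, and builds the large-scale benchmark ManipBench | manipulation, localization | |
| 33 | Error-Driven Scene Editing for 3D Grounding in Large Language Models | Proposes the DEER-3D framework, which improves the spatial understanding of 3D-LLMs through error-driven scene editing | manipulation, scene reconstruction, scene understanding | |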

🔬 Pillar 4: Generative Motion (1 paper)

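| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 34 | Diffusion As Self-Distillation: End-to-End Latent Diffusion In One Model | Proposes the DSD framework, which trains a latent diffusion model end to end in a single network and addresses the inefficiency of multi-stage training | classifier-free guidance | |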

🔬 Pillar 8: Physics-based Animation (1 paper)

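| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 35 | Improving segmentation of retinal arteries and veins using cardiac signal in doppler holograms | Uses the cardiac signal to improve retinal artery-vein segmentation in Doppler holograms | PULSE | |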
