cs.CV(2025-11-11)

📊 33 papers in total | 🔗 12 with code

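🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (18 🔗4) | Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 🔗6) | Pillar 1: Robot Control (Robot Control) (6 🔗1) | Pillar 4: Generative Motion (Generative Motion) (1 🔗1)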

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (18 papers)

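| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Perceptual Quality Assessment of 3D Gaussian Splatting: A Subjective Dataset and Prediction Metric | Proposes the 3DGS-QA dataset and a no-reference quality assessment model to address perceptual quality assessment for 3D Gaussian splatting | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 2 | RAPTR: Radar-based 3D Pose Estimation using Transformer | RAPTR: Transformer-based 3D human pose estimation from radar, trained with weak supervision | pose estimation, 3D pose estimation | |
| 3 | WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation | WEDepth: efficiently adapts world knowledge for monocular depth estimation | depth estimation, monocular depth | |
| 4 | UltraGS: Gaussian Splatting for Ultrasound Novel View Synthesis | UltraGS: a Gaussian splatting method for ultrasound novel view synthesis | gaussian splatting, novel view synthesis | |
| 5 | SkelSplat: Robust Multi-view 3D Human Pose Estimation with Differentiable Gaussian Rendering | SkelSplat: robust multi-view 3D human pose estimation based on differentiable Gaussian rendering | gaussian splatting, scene reconstruction, pose estimation | |
| 6 | EAGLE: Episodic Appearance- and Geometry-aware Memory for Unified 2D-3D Visual Query Localization in Egocentric Vision | EAGLE: episodic appearance- and geometry-aware memory for egocentric visual query localization | localization, VGGT | |
| 7 | DT-NVS: Diffusion Transformers for Novel View Synthesis | Proposes DT-NVS, a Transformer-based 3D diffusion model for novel view synthesis of real-world scenes | novel view synthesis | |
| 8 | Adaptive graph Kolmogorov-Arnold network for 3D human pose estimation | Proposes PoseKAN, an adaptive graph Kolmogorov-Arnold network for 3D human pose estimation | pose estimation | |
| 9 | Accurate and Efficient Surface Reconstruction from Point Clouds via Geometry-Aware Local Adaptation | Proposes a geometry-aware local adaptation method for surface reconstruction from point clouds, improving accuracy and efficiency | point cloud | |
| 10 | DANCE: Density-agnostic and Class-aware Network for Point Cloud Completion | DANCE: a density-agnostic and class-aware network for point cloud completion | point cloud | |
| 11 | Is It Truly Necessary to Process and Fit Minutes-Long Reference Videos for Personalized Talking Face Generation? | Proposes the ISExplore strategy, which accelerates personalized talking face generation by reducing the length of reference video that must be processed | 3DGS, NeRF | |
| 12 | Enhancing Rotation-Invariant 3D Learning with Global Pose Awareness and Attention Mechanisms | Proposes SiPF and RIAttnConv to strengthen global pose awareness and discriminative power in rotation-invariant 3D learning | point cloud | |
| 13 | Top2Ground: A Height-Aware Dual Conditioning Diffusion Model for Robust Aerial-to-Ground View Generation | Proposes Top2Ground, a height-aware dual-conditioning diffusion model for robust aerial-to-ground view generation | height map | |
| 14 | Pixel-level Quality Assessment for Oriented Object Detection | Proposes pixel-level quality assessment (PQA) to address the structural coupling problem of IoU prediction in oriented object detection | localization | |
| 15 | WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting | WarpGAN: 3D GAN inversion with warping guidance and style-based novel view inpainting | novel view synthesis | |
| 16 | VLMDiff: Leveraging Vision-Language Models for Multi-Class Anomaly Detection with Diffusion | VLMDiff: multi-class anomaly detection using vision-language models and diffusion models | localization | |
| 17 | Sparse3DPR: Training-Free 3D Hierarchical Scene Parsing and Task-Adaptive Subgraph Reasoning from Sparse RGB Views | Sparse3DPR: a training-free framework for hierarchical 3D scene parsing and task-adaptive subgraph reasoning from sparse RGB views | scene understanding | |
| 18 | Cross Modal Fine-Grained Alignment via Granularity-Aware and Region-Uncertain Modeling | Proposes a cross-modal fine-grained alignment method based on granularity-aware and region-uncertain modeling | navigation | |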

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 papers)

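| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | CloudMamba: Grouped Selective State Spaces for Point Cloud Analysis | CloudMamba: a grouped selective state space model for point cloud analysis that significantly reduces computational complexity and improves performance | Mamba, SSM, state space model | |
| 20 | ReIDMamba: Learning Discriminative Features with Visual State Space Model for Person Re-Identification | Proposes ReIDMamba, which learns discriminative features with a visual state space model for efficient person re-identification | Mamba, state space model | |
| 21 | Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning | Proposes DiPVNet, which uses atomic dot-product operators for hierarchical direction perception in rotation-invariant point cloud learning | representation learning, point cloud | |
| 22 | Non-Aligned Reference Image Quality Assessment for Novel View Synthesis | Proposes the NAR-IQA framework for quality assessment with non-aligned reference images in novel view synthesis | contrastive learning, novel view synthesis | |
| 23 | Multi-Modal Assistance for Unsupervised Domain Adaptation on Point Cloud 3D Object Detection | Proposes MMAssist, which uses multi-modal information to assist unsupervised domain adaptation for LiDAR point cloud 3D object detection | teacher-student, point cloud | |
| 24 | 3D4D: An Interactive, Editable, 4D World Model via 3D Video Generation | 3D4D: an interactive, editable 4D world model via 3D video generation | world model | |
| 25 | DI3CL: Contrastive Learning With Dynamic Instances and Contour Consistency for SAR Land-Cover Classification Foundation Model | Proposes the DI3CL framework, which uses contrastive learning with dynamic instances and contour consistency to build a foundation model for SAR land-cover classification | contrastive learning | |
| 26 | Compression then Matching: An Efficient Pre-training Paradigm for Multimodal Embedding | Proposes CoMa, an efficient pre-training paradigm for multimodal embedding that improves vision-language model performance | representation learning, contrastive learning | |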

🔬 Pillar 1: Robot Control (Robot Control) (6 papers)

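| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | RePose-NeRF: Robust Radiance Fields for Mesh Reconstruction under Noisy Camera Poses | RePose-NeRF: a robust radiance field method for mesh reconstruction under noisy camera poses | manipulation, NeRF, navigation | |
| 28 | Multi-modal Deepfake Detection and Localization with FPN-Transformer | Proposes an FPN-Transformer-based framework for multi-modal deepfake detection and localization, improving cross-modal generalization and temporal boundary regression accuracy | manipulation, localization | |
| 29 | Generating Sketches in a Hierarchical Auto-Regressive Process for Flexible Sketch Drawing Manipulation at Stroke-Level | Proposes a hierarchical auto-regressive sketch generation method that enables flexible stroke-level manipulation | manipulation | |
| 30 | Retrospective motion correction in MRI using disentangled embeddings | Proposes an MRI motion artifact correction method based on disentangled embeddings, improving model generalization | whole-body motion | |
| 31 | I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks | I2E: a real-time image-to-event conversion framework for high-performance spiking neural networks | sim-to-real | |
| 32 | Generalized-Scale Object Counting with Gradual Query Aggregation | GECO2: generalized-scale object counting via gradual query aggregation | running | |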

🔬 Pillar 4: Generative Motion (Generative Motion) (1 paper)

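| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Human Motion Synthesis in 3D Scenes via Unified Scene Semantic Occupancy | Proposes SSOMotion, which uses a unified scene semantic occupancy representation for human motion synthesis in 3D scenes | motion synthesis | |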
