cs.CV (2025-11-11)

📊 33 papers total | 🔗 12 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (18 🔗4) · Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 🔗6) · Pillar 1: Robot Control (Robot Control) (6 🔗1) · Pillar 4: Generative Motion (Generative Motion) (1 🔗1)

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (18 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	Perceptual Quality Assessment of 3D Gaussian Splatting: A Subjective Dataset and Prediction Metric	Proposes the 3DGS-QA dataset and a no-reference quality assessment model to address perceptual quality assessment for 3D Gaussian Splatting	3D gaussian splatting 3DGS gaussian splatting	✅
2	RAPTR: Radar-based 3D Pose Estimation using Transformer	RAPTR: Transformer-based 3D human pose estimation from radar, trained with weak supervision.	pose estimation 3D pose estimation	✅
3	WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation	WEDepth: efficiently adapts world knowledge for monocular depth estimation	depth estimation monocular depth
4	UltraGS: Gaussian Splatting for Ultrasound Novel View Synthesis	UltraGS: a Gaussian Splatting method for ultrasound novel view synthesis	gaussian splatting novel view synthesis	✅
5	SkelSplat: Robust Multi-view 3D Human Pose Estimation with Differentiable Gaussian Rendering	SkelSplat: robust multi-view 3D human pose estimation based on differentiable Gaussian rendering	gaussian splatting scene reconstruction pose estimation
6	EAGLE: Episodic Appearance- and Geometry-aware Memory for Unified 2D-3D Visual Query Localization in Egocentric Vision	EAGLE: an episodic appearance- and geometry-aware memory for visual query localization in egocentric vision	localization VGGT
7	DT-NVS: Diffusion Transformers for Novel View Synthesis	Proposes DT-NVS, a Transformer-based 3D diffusion model for novel view synthesis of real-world scenes	novel view synthesis
8	Adaptive graph Kolmogorov-Arnold network for 3D human pose estimation	Proposes PoseKAN, an adaptive graph Kolmogorov-Arnold network for 3D human pose estimation.	pose estimation
9	Accurate and Efficient Surface Reconstruction from Point Clouds via Geometry-Aware Local Adaptation	Proposes a geometry-aware local adaptation method for surface reconstruction from point clouds, improving accuracy and efficiency	point cloud
10	DANCE: Density-agnostic and Class-aware Network for Point Cloud Completion	DANCE: a density-agnostic and class-aware network for point cloud completion	point cloud
11	Is It Truly Necessary to Process and Fit Minutes-Long Reference Videos for Personalized Talking Face Generation?	Proposes the ISExplore strategy to accelerate personalized talking face generation by cutting reference-video processing time.	3DGS NeRF
12	Enhancing Rotation-Invariant 3D Learning with Global Pose Awareness and Attention Mechanisms	Proposes SiPF and RIAttnConv to enhance global pose awareness and discriminative ability in rotation-invariant 3D learning	point cloud
13	Top2Ground: A Height-Aware Dual Conditioning Diffusion Model for Robust Aerial-to-Ground View Generation	Proposes Top2Ground, a height-aware dual-conditioning diffusion model for robust aerial-to-ground view generation.	height map
14	Pixel-level Quality Assessment for Oriented Object Detection	Proposes Pixel-level Quality Assessment (PQA) to address the structural coupling problem of IoU prediction in oriented object detection.	localization
15	WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting	WarpGAN: 3D GAN inversion with warping guidance and style-based novel view inpainting	novel view synthesis
16	VLMDiff: Leveraging Vision-Language Models for Multi-Class Anomaly Detection with Diffusion	VLMDiff: multi-class anomaly detection using vision-language models and diffusion models	localization	✅
17	Sparse3DPR: Training-Free 3D Hierarchical Scene Parsing and Task-Adaptive Subgraph Reasoning from Sparse RGB Views	Sparse3DPR: a training-free framework for hierarchical 3D scene parsing and task-adaptive subgraph reasoning from sparse RGB views	scene understanding
18	Cross Modal Fine-Grained Alignment via Granularity-Aware and Region-Uncertain Modeling	Proposes a cross-modal fine-grained alignment method based on granularity-aware and region-uncertainty modeling	navigation

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 papers)

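#	Title	One-sentence summary	Tags	🔗	⭐
19	CloudMamba: Grouped Selective State Spaces for Point Cloud Analysis	CloudMamba: a grouped selective state space model for point cloud analysis that significantly reduces computational complexity and improves performance.	Mamba SSM state space model
20	ReIDMamba: Learning Discriminative Features with Visual State Space Model for Person Re-Identification	Proposes ReIDMamba, which uses a visual state space model to learn discriminative features for efficient person re-identification	Mamba state space model	✅
21	Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning	Proposes DiPVNet, which uses atomic dot-product operators for rotation-invariant hierarchical direction perception learning on point clouds	representation learning point cloud	✅
22	Non-Aligned Reference Image Quality Assessment for Novel View Synthesis	Proposes the NAR-IQA framework for quality assessment with non-aligned reference images in novel view synthesis	contrastive learning novel view synthesis	✅
23	Multi-Modal Assistance for Unsupervised Domain Adaptation on Point Cloud 3D Object Detection	Proposes MMAssist, which uses multi-modal information to assist unsupervised domain adaptation for LiDAR point cloud 3D object detection.	teacher-student point cloud	✅
24	3D4D: An Interactive, Editable, 4D World Model via 3D Video Generation	3D4D: an interactive, editable 4D world model built via 3D video generation	world model	✅
25	DI3CL: Contrastive Learning With Dynamic Instances and Contour Consistency for SAR Land-Cover Classification Foundation Model	Proposes the DI3CL framework, which uses contrastive learning with dynamic instances and contour consistency to build a foundation model for SAR land-cover classification.	contrastive learning	✅
26	Compression then Matching: An Efficient Pre-training Paradigm for Multimodal Embedding	Proposes CoMa, an efficient pre-training paradigm for multimodal embedding that improves vision-language model performance.	representation learning contrastive learning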

🔬 Pillar 1: Robot Control (Robot Control) (6 papers)

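#	Title	One-sentence summary	Tags	🔗	⭐
27	RePose-NeRF: Robust Radiance Fields for Mesh Reconstruction under Noisy Camera Poses	RePose-NeRF: proposes a robust radiance field method for mesh reconstruction under noisy camera poses	manipulation NeRF navigation
28	Multi-modal Deepfake Detection and Localization with FPN-Transformer	Proposes an FPN-Transformer-based framework for multi-modal deepfake detection and localization, improving cross-modal generalization and temporal boundary regression accuracy.	manipulation localization	✅
29	Generating Sketches in a Hierarchical Auto-Regressive Process for Flexible Sketch Drawing Manipulation at Stroke-Level	Proposes a hierarchical auto-regressive sketch generation method that enables flexible stroke-level manipulation	manipulation
30	Retrospective motion correction in MRI using disentangled embeddings	Proposes an MRI motion artifact correction method based on disentangled embeddings, improving model generalization.	whole-body motion
31	I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks	I2E: a real-time image-to-event conversion framework for high-performance spiking neural networks	sim-to-real
32	Generalized-Scale Object Counting with Gradual Query Aggregation	GECO2: generalized-scale object counting via gradual query aggregation	running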

🔬 Pillar 4: Generative Motion (Generative Motion) (1 paper)

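#	Title	One-sentence summary	Tags	🔗	⭐
33	Human Motion Synthesis in 3D Scenes via Unified Scene Semantic Occupancy	Proposes SSOMotion, which uses a unified scene semantic occupancy representation for human motion synthesis in 3D scenes.	motion synthesis	✅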
