cs.CV (2025-11-19)

📊 18 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception (Perception & SLAM) (6 🔗3) · Pillar 2: RL Algorithms & Architecture (6) · Pillar 1: Robot Control (4) · Pillar 5: Interaction & Reaction (1) · Pillar 6: Video Extraction & Matching (1 🔗1)

🔬 Pillar 3: Spatial Perception (Perception & SLAM) (6 papers)

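# | Title | One-sentence summary | Tags | 🔗
1 | Gaussian Blending: Rethinking Alpha Blending in 3D Gaussian Splatting | Proposes Gaussian Blending, rethinking alpha blending in 3D Gaussian Splatting to improve novel view synthesis quality. | 3D gaussian splatting, 3DGS, gaussian splatting |
2 | WALDO: Where Unseen Model-based 6D Pose Estimation Meets Occlusion | WALDO: proposes a novel model-based 6D pose estimation method that improves robustness in occluded scenes. | scene understanding, pose estimation |
3 | SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection | SceneEdited: proposes a city-scale benchmark for 3D HD map updating via image-guided change detection. | point cloud, navigation |
4 | Computer-Use Agents as Judges for Generative User Interface | Proposes the Coder-CUA collaborative framework, which uses computer-use agents to assist the design of code-generated GUIs, improving task-solving ability. | navigation |
5 | ShelfOcc: Native 3D Supervision beyond LiDAR for Vision-Based Occupancy Estimation | ShelfOcc: proposes a vision-only 3D voxel occupancy estimation method that achieves native 3D supervision without LiDAR. | scene understanding |
6 | Adapt-As-You-Walk Through the Clouds: Training-Free Online Test-Time Adaptation of 3D Vision-Language Foundation Models | Proposes Uni-Adapter, a training-free online test-time adaptation method for 3D vision-language models. | point cloud |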

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

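# | Title | One-sentence summary | Tags | 🔗
7 | Text2Loc++: Generalizing 3D Point Cloud Localization from Natural Language | Text2Loc++: proposes a general method for 3D point cloud localization from natural language. | contrastive learning, point cloud, localization |
8 | MambaTrack3D: A State Space Model Framework for LiDAR-Based Object Tracking under High Temporal Variation | MambaTrack3D: a state space model framework for LiDAR-based object tracking under high temporal variation. | Mamba, state space model, point cloud |
9 | Learning Depth from Past Selves: Self-Evolution Contrast for Robust Depth Estimation | Proposes SEC-Depth, a self-evolution contrastive learning framework that improves the robustness of self-supervised depth estimation in adverse weather. | contrastive learning, depth estimation |
10 | MambaIO: Global-Coordinate Inertial Odometry for Pedestrians via Multi-Scale Frequency-Decoupled Modeling | MambaIO: a multi-scale decoupled modeling method for pedestrian inertial odometry. | Mamba, localization |
11 | Towards Unbiased Cross-Modal Representation Learning for Food Image-to-Recipe Retrieval | Proposes a causal-inference-based debiasing method that improves food image-to-recipe cross-modal retrieval performance. | representation learning |
12 | BokehFlow: Depth-Free Controllable Bokeh Rendering via Flow Matching | Proposes BokehFlow, a depth-free controllable bokeh rendering method based on flow matching. | flow matching |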

🔬 Pillar 1: Robot Control (4 papers)

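# | Title | One-sentence summary | Tags | 🔗
13 | GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization | Proposes GeoVista, a web-augmented agentic visual reasoning model for geolocalization. | manipulation, reinforcement learning, localization |
14 | Box6D: Zero-shot Category-level 6D Pose Estimation of Warehouse Boxes | Box6D: zero-shot category-level 6D pose estimation for warehouse boxes. | manipulation, pose estimation |
15 | CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking | CompTrack: information bottleneck-guided low-rank dynamic token compression for single-object tracking in point clouds. | running, point cloud |
16 | Adaptive thresholding pattern for fingerprint forgery detection | Proposes a fingerprint forgery detection algorithm based on adaptive thresholding patterns, improving resistance to interference. | manipulation |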

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-sentence summary | Tags | 🔗
17 | UniHOI: Unified Human-Object Interaction Understanding via Unified Token Space | UniHOI: unified human-object interaction understanding via a unified token space. | human-object interaction, HOI |

🔬 Pillar 6: Video Extraction & Matching (1 paper)

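# | Title | One-sentence summary | Tags | 🔗
18 | RoMa v2: Harder Better Faster Denser Feature Matching | RoMa v2: significantly improves the accuracy and speed of dense feature matching through changes to architecture, training, and optimization. | feature matching |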
