| 28 |
Joint Optimization for 4D Human-Scene Reconstruction in the Wild |
Proposes JOSH for joint 4D human-scene reconstruction from in-the-wild monocular videos |
scene reconstruction human-scene interaction human mesh recovery |
|
|
| 29 |
Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation |
BriGeS: fuses geometric and semantic foundation models to improve monocular depth estimation |
depth estimation monocular depth foundation model |
|
|
| 30 |
Distractor-free Generalizable 3D Gaussian Splatting |
Proposes DGGS for distractor-free scene reconstruction in generalizable 3D Gaussian splatting |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 31 |
Proxy-GS: Unified Occlusion Priors for Training and Inference in Structured 3D Gaussian Splatting |
Proxy-GS: uses unified occlusion priors to accelerate training and inference of structured 3D Gaussian splatting |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 32 |
AeroDGS: Physically Consistent Dynamic Gaussian Splatting for Single-Sequence Aerial 4D Reconstruction |
AeroDGS: physically consistent dynamic Gaussian splatting for 4D reconstruction from monocular aerial footage |
gaussian splatting splatting scene reconstruction |
|
|
| 33 |
GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views |
GIFSplat: iterative feed-forward 3D Gaussian splatting guided by generative priors, for reconstruction from sparse views |
3D gaussian splatting gaussian splatting splatting |
|
|
| 34 |
Latent Gaussian Splatting for 4D Panoptic Occupancy Tracking |
Proposes a latent Gaussian splatting method for 4D panoptic occupancy tracking |
gaussian splatting splatting scene understanding |
|
|
| 35 |
G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior |
G4Splat: uses generative priors and geometry guidance in Gaussian splatting to improve 3D scene reconstruction quality |
gaussian splatting splatting NeRF |
|
|
| 36 |
ST-GS: Vision-Based 3D Semantic Occupancy Prediction with Spatial-Temporal Gaussian Splatting |
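Proposes the ST-GS framework, which uses spatial-temporal Gaussian splatting to improve 3D semantic occupancy prediction for vision-centric autonomous driving |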
gaussian splatting splatting scene understanding |
|
|
| 37 |
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes |
Proposes a monocular open-vocabulary occupancy prediction method for indoor scenes, improving understanding of complex environments |
splatting open-vocabulary open vocabulary |
|
|
| 38 |
GSTurb: Gaussian Splatting for Atmospheric Turbulence Mitigation |
GSTurb: uses Gaussian splatting to mitigate atmospheric turbulence and improve long-range imaging quality |
gaussian splatting splatting optical flow |
|
|
| 39 |
Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning |
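Pix2Key: proposes a controllable open-vocabulary image retrieval method based on semantic decomposition and self-supervised visual dictionary learning |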
open-vocabulary open vocabulary |
|
|
| 40 |
Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation? |
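Proposes a retrieval-augmented test-time adapter that bridges the supervision gap in open-vocabulary segmentation with a few examples |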
open-vocabulary open vocabulary |
|
|
| 41 |
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects |
Proposes OWEL and MSCAL, giving open-vocabulary object detection models the ability to detect novel objects in open-world settings |
open-vocabulary open vocabulary |
|
|
| 42 |
BetterScene: 3D Scene Synthesis with Representation-Aligned Generative Model |
Proposes BetterScene for novel view synthesis from sparse photos |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 43 |
SplatSDF: Boosting SDF-NeRF via Architecture-Level Fusion with Gaussian Splats |
SplatSDF: accelerates SDF-NeRF training and convergence through architecture-level fusion with Gaussian splats |
3DGS NeRF |
|
|
| 44 |
SwiftNDC: Fast Neural Depth Correction for High-Fidelity 3D Reconstruction |
SwiftNDC: fast neural depth correction for high-fidelity 3D reconstruction |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 45 |
Instruction-based Image Editing with Planning, Reasoning, and Generation |
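Proposes an instruction-based image editing framework built on planning, reasoning, and generation, improving editing quality in complex scenes |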
scene understanding large language model chain-of-thought |
|
|
| 46 |
Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era |
Shadow detection, removal, and generation in the deep learning era: a unified survey, benchmark, and future directions |
scene understanding foundation model multimodal |
|
|
| 47 |
Loc$^2$: Interpretable Cross-View Localization via Depth-Lifted Local Feature Matching |
Proposes Loc$^2$, which achieves interpretable cross-view localization via depth-lifted local feature matching |
monocular depth feature matching |
|
|
| 48 |
DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces |
DICArt: proposes a discrete-diffusion-based method for category-level articulated object pose estimation |
6D pose estimation embodied AI |
|
|
| 49 |
Motion-aware Event Suppression for Event Cameras |
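Proposes a motion-aware event suppression framework that filters, in real time, event-camera events caused by independently moving objects and ego-motion |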
visual odometry IMoS |
|
|
| 50 |
PackUV: Packed Gaussian UV Maps for 4D Volumetric Video |
PackUV: proposes a compact UV-map-based Gaussian representation for efficient storage and transmission of 4D volumetric video |
gaussian splatting splatting |
|
|
| 51 |
FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time |
FLIGHT: real-time geometric heading estimation based on Fibonacci lattice inference |
visual odometry |
|
|
| 52 |
Velocity and stroke rate reconstruction of canoe sprint team boats based on panned and zoomed video recordings |
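Proposes a framework that reconstructs canoe velocity and stroke rate from panned and zoomed video, with no on-board sensors required |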
optical flow |
|
|
| 53 |
SuperQuadricOcc: Multi-Layer Gaussian Approximation of Superquadrics for Real-Time Self-Supervised Occupancy Estimation |
Proposes SuperQuadricOcc, which uses superquadrics for real-time self-supervised occupancy estimation and significantly reduces memory usage |
scene understanding |
|
|