| 1 |
Depth-Consistent 3D Gaussian Splatting via Physical Defocus Modeling and Multi-View Geometric Supervision |
Proposes a depth-consistent 3D Gaussian Splatting method based on physical defocus modeling and multi-view geometric supervision |
depth estimation monocular depth 3D gaussian splatting |
|
|
| 2 |
AHA! Animating Human Avatars in Diverse Scenes with Gaussian Splatting |
Proposes a Gaussian Splatting-based human animation framework that enables realistic free-viewpoint rendering of humans in scenes. |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 3 |
TSPE-GS: Probabilistic Depth Extraction for Semi-Transparent Surface Reconstruction via 3D Gaussian Splatting |
TSPE-GS: a probabilistic depth extraction method for semi-transparent surfaces based on 3D Gaussian Splatting |
3D gaussian splatting gaussian splatting |
|
|
| 4 |
OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer |
OmniVGGT: an omni-modality driven visual geometry grounded Transformer that improves performance on 3D vision tasks |
depth estimation point cloud pose estimation |
|
|
| 5 |
GFT: Graph Feature Tuning for Efficient Point Cloud Analysis |
Proposes Graph Feature Tuning (GFT) for efficient point cloud analysis with a significantly reduced parameter count. |
point cloud |
✅ |
|
| 6 |
MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation |
Proposes MSGNav, built on a multi-modal 3D scene graph, for zero-shot embodied navigation |
navigation |
|
|
| 7 |
RWKV-PCSSC: Exploring RWKV Model for Point Cloud Semantic Scene Completion |
Proposes RWKV-PCSSC, which uses the RWKV mechanism for lightweight and efficient point cloud semantic scene completion. |
point cloud |
|
|
| 8 |
IPCD: Intrinsic Point-Cloud Decomposition |
Proposes IPCD for intrinsic decomposition of point clouds, enabling applications such as lighting editing and texture modification |
point cloud |
|
|
| 9 |
Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals |
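Evaluates the accessibility of lightweight VLMs in video understanding for blind and low-vision users, and proposes a custom evaluation framework. |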
navigation social interaction |
|
|
| 10 |
RobIA: Robust Instance-aware Continual Test-time Adaptation for Deep Stereo |
Proposes the RobIA framework for robust, instance-aware continual test-time adaptation in deep stereo matching |
depth estimation stereo depth |
|
|
| 11 |
LiNeXt: Revisiting LiDAR Completion with Efficient Non-Diffusion Architectures |
LiNeXt: proposes an efficient non-diffusion architecture that speeds up LiDAR point cloud completion and improves accuracy. |
point cloud |
|
|
| 12 |
Split-Layer: Enhancing Implicit Neural Representation by Maximizing the Dimensionality of Feature Space |
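Proposes Split-Layer to increase the feature-space dimensionality of implicit neural representations, strengthening their representational capacity |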
novel view synthesis |
|
|
| 13 |
Toward bilipshiz geometric models |
Proposes neural network models for 3D point clouds that preserve bi-Lipschitz geometric structure |
point cloud |
|
|
| 14 |
LoG3D: Ultra-High-Resolution 3D Shape Modeling via Local-to-Global Partitioning |
LoG3D: ultra-high-resolution 3D shape modeling via local-to-global partitioning |
point cloud |
|
|
| 15 |
AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models |
AffordBot: fine-grained 3D embodied reasoning using multimodal large language models |
point cloud |
|
|
| 16 |
DBGroup: Dual-Branch Point Grouping for Weakly Supervised 3D Semantic Instance Segmentation |
Proposes DBGroup, a dual-branch point grouping network for weakly supervised 3D semantic instance segmentation |
scene understanding |
✅ |
|
| 17 |
MosaicDoc: A Large-Scale Bilingual Benchmark for Visually Rich Document Understanding |
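Proposes MosaicDoc, a large-scale bilingual benchmark for visually rich document understanding that addresses the limitations of existing benchmarks. |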
localization |
|
|
| 18 |
HCC-3D: Hierarchical Compensatory Compression for 98% 3D Token Reduction in Vision-Language Models |
Proposes HCC-3D, which achieves a 98% token reduction in 3D vision-language models through hierarchical compensatory compression |
point cloud |
|
|