3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection

Authors: Yang Cao, Yuanliang Jv, Dan Xu

Category: cs.CV

Published: 2024-10-02

Notes: Code Page: https://github.com/yangcaoai/3DGS-DET

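💡 One-Sentence Summary

3DGS-DET: strengthening 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D object detection

🎯 Matched area: Pillar 3, Spatial Perception & Semantics

Keywords: 3D object detection, Gaussian Splatting, Neural Radiance Fields, Boundary Guidance, Box-Focused Sampling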

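📋 Key Points

NeRF-based 3D object detection is limited by NeRF's implicit representation, which restricts its representational capacity, and by slow rendering.
3DGS-DET uses 2D Boundary Guidance to sharpen the spatial distribution of Gaussian blobs and Box-Focused Sampling to reduce background noise.
3DGS-DET outperforms NeRF-Det by +6.6 mAP@0.25 and +8.1 mAP@0.5 on ScanNet, and by +31.5 mAP@0.25 on ARKITScenes.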

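📝 Abstract (translated)

Neural Radiance Fields (NeRF) are widely used for novel-view synthesis and have been applied to 3D object detection (3DOD), where view-synthesis representations offer a promising route. NeRF has two inherent limitations, however: (i) its implicit nature limits its representational capacity for 3DOD, and (ii) rendering is slow. 3D Gaussian Splatting (3DGS) has recently emerged as an explicit 3D representation that addresses both. Motivated by these advantages, the paper introduces 3DGS into 3DOD for the first time and identifies two main challenges: (i) ambiguous spatial distribution of Gaussian blobs: 3DGS relies mainly on 2D pixel-level supervision, so the 3D distribution of blobs is unclear and objects are poorly separated from background, which hinders 3DOD; (ii) excessive background blobs: 2D images usually contain many background pixels, so a densely reconstructed 3DGS contains many noisy blobs that represent background and hurt detection. For challenge (i), the authors exploit the fact that 3DGS reconstruction is derived from 2D images and incorporate 2D Boundary Guidance, which markedly improves the spatial distribution of the blobs and separates objects from background more clearly. For challenge (ii), they propose Box-Focused Sampling, which uses 2D boxes to build an object probability distribution in 3D space and samples from it so that more object blobs are kept and noisy background blobs are reduced. With these designs, 3DGS-DET significantly outperforms the state-of-the-art NeRF-based method NeRF-Det, by +6.6 mAP@0.25 and +8.1 mAP@0.5 on ScanNet and by +31.5 mAP@0.25 on ARKITScenes.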

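🔬 Method Details

Problem definition: Existing NeRF-based 3D object detectors inherit NeRF's implicit representation, which limits their representational capacity for 3DOD, and NeRF's slow rendering limits their practical efficiency. 3DGS is an explicit representation, but applying it directly to 3D object detection runs into two problems: the spatial distribution of Gaussian blobs is ambiguous, and there are too many noisy background blobs.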

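Core idea: Use prior information from 2D images to guide 3DGS reconstruction and thereby improve 3D object detection. 2D Boundary Guidance sharpens the spatial distribution of Gaussian blobs so that object boundaries are represented more clearly, and Box-Focused Sampling suppresses background noise by giving object blobs a higher probability of being sampled.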

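Technical framework: The 3DGS-DET pipeline has four stages: 1) 3DGS reconstruction: reconstruct the scene as 3DGS from multi-view images. 2) Boundary guidance: use 2D boundary information to sharpen the spatial distribution of the corresponding 3D Gaussian blobs. 3) Box-Focused Sampling: build an object probability distribution in 3D space from 2D boxes and sample from it, keeping more object blobs and fewer background blobs. 4) 3D object detection: run 3D object detection on the resulting 3DGS scene.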
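The boundary-guidance stage can be illustrated with a minimal sketch. This summary does not say how the 2D boundary cues are injected into reconstruction, so the sketch assumes one simple option: painting a binary boundary mask onto each training image before 3DGS fitting. The function name `overlay_boundaries`, the mask source, and the highlight color are all hypothetical and are not taken from the paper's code.

```python
import numpy as np

def overlay_boundaries(image, boundary_mask, color=(255, 0, 0)):
    """Return a copy of `image` with boundary pixels painted in `color`.

    image:         (H, W, 3) uint8 RGB training image.
    boundary_mask: (H, W) boolean array, True on object boundaries
                   (assumed to come from some 2D boundary detector).
    """
    if image.shape[:2] != boundary_mask.shape:
        raise ValueError("image and boundary_mask must share height and width")
    guided = image.copy()  # leave the original image untouched
    guided[boundary_mask] = np.asarray(color, dtype=image.dtype)
    return guided

# Toy example: a 4x4 black image with a boundary along its second row.
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1, :] = True
guided = overlay_boundaries(image, mask)
```

Under this assumption, the 2D pixel-level supervision itself carries the boundary signal, so no change to the 3DGS optimizer is needed; other ways of injecting the cue are equally compatible with the description above.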

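Key innovation: The paper brings 2D Boundary Guidance and Box-Focused Sampling to 3DGS-based 3D object detection. Compared with existing NeRF-based methods, it works on an explicit 3DGS representation, which avoids NeRF's slow rendering, and it adds priors from 2D images, which improves detection accuracy.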

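Key design: For boundary guidance, 2D boundary information is used to adjust the positions and shapes of the corresponding 3D Gaussian blobs so that they follow object boundaries more closely. For Box-Focused Sampling, 2D boxes are used to build an object probability distribution in 3D space, and blobs are sampled according to it so that more object blobs and fewer background blobs are kept. This summary does not give the loss functions or network architecture; those details are in the paper.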
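Box-Focused Sampling can be sketched as follows. The summary does not specify how the 3D object probability is computed, so this sketch makes its own assumptions: each blob center is projected into every view with a pinhole camera, its probability is the fraction of views in which it lands inside any 2D box, and a hypothetical `background_prob` floor keeps a few background blobs. All function and parameter names are illustrative, not the paper's API.

```python
import numpy as np

def project_points(points, K, world_to_cam):
    """Project (N, 3) world points; returns (N, 2) pixels and (N,) depths."""
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam = (world_to_cam @ homo.T).T[:, :3]
    depth = cam[:, 2]
    safe_depth = np.where(depth > 1e-6, depth, 1.0)  # avoid division by zero
    pix = (K @ cam.T).T
    return pix[:, :2] / safe_depth[:, None], depth

def object_probability(centers, views, background_prob=0.1):
    """Fraction of views where a center falls in a 2D box, floored at background_prob.

    views: list of (K, world_to_cam, boxes); boxes is (M, 4) as [x1, y1, x2, y2].
    """
    hits = np.zeros(len(centers))
    for K, world_to_cam, boxes in views:
        uv, depth = project_points(centers, K, world_to_cam)
        in_any_box = np.zeros(len(centers), dtype=bool)
        for x1, y1, x2, y2 in boxes:
            in_any_box |= ((uv[:, 0] >= x1) & (uv[:, 0] <= x2)
                           & (uv[:, 1] >= y1) & (uv[:, 1] <= y2))
        hits += in_any_box & (depth > 0)  # ignore points behind the camera
    return np.maximum(hits / max(len(views), 1), background_prob)

def box_focused_sample(centers, views, rng, background_prob=0.1):
    """Keep each blob independently with its object probability."""
    prob = object_probability(centers, views, background_prob)
    keep = rng.random(len(centers)) < prob
    return centers[keep], keep

# Toy scene: one camera at the origin looking down +z, one 2D box.
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
views = [(K, np.eye(4), np.array([[40.0, 40.0, 60.0, 60.0]]))]
centers = np.array([[0.0, 0.0, 2.0],   # projects to (50, 50): inside the box
                    [1.0, 0.0, 2.0]])  # projects to (100, 50): outside the box
prob = object_probability(centers, views)
kept, keep = box_focused_sample(centers, views, np.random.default_rng(0),
                                background_prob=0.0)
```

In the toy scene the first center gets probability 1.0 and the second falls back to the floor; with the floor set to 0.0 only the first center survives sampling. A real pipeline would apply the same `keep` mask to every per-blob attribute, not only the centers.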

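📊 Experimental Highlights

Compared with NeRF-Det, 3DGS-DET improves mAP@0.25 by +6.6 points and mAP@0.5 by +8.1 points on ScanNet, and improves mAP@0.25 by +31.5 points on ARKITScenes. These are absolute gains in mAP, not relative percentages, and they show that the method clearly outperforms the existing NeRF-based approach.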

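🎯 Application Scenarios

The results are relevant to robot navigation, autonomous driving, and augmented reality. More accurate and efficient 3D object detection helps a robot understand its surroundings and so supports better interaction and decision-making. The method may also be useful for 3D scene reconstruction and virtual reality.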

📄 Abstract (original)

Neural Radiance Fields (NeRF) are widely used for novel-view synthesis and have been adapted for 3D Object Detection (3DOD), offering a promising approach to 3DOD through view-synthesis representation. However, NeRF faces inherent limitations: (i) limited representational capacity for 3DOD due to its implicit nature, and (ii) slow rendering speeds. Recently, 3D Gaussian Splatting (3DGS) has emerged as an explicit 3D representation that addresses these limitations. Inspired by these advantages, this paper introduces 3DGS into 3DOD for the first time, identifying two main challenges: (i) Ambiguous spatial distribution of Gaussian blobs: 3DGS primarily relies on 2D pixel-level supervision, resulting in unclear 3D spatial distribution of Gaussian blobs and poor differentiation between objects and background, which hinders 3DOD; (ii) Excessive background blobs: 2D images often include numerous background pixels, leading to densely reconstructed 3DGS with many noisy Gaussian blobs representing the background, negatively affecting detection. To tackle the challenge (i), we leverage the fact that 3DGS reconstruction is derived from 2D images, and propose an elegant and efficient solution by incorporating 2D Boundary Guidance to significantly enhance the spatial distribution of Gaussian blobs, resulting in clearer differentiation between objects and their background. To address the challenge (ii), we propose a Box-Focused Sampling strategy using 2D boxes to generate object probability distribution in 3D spaces, allowing effective probabilistic sampling in 3D to retain more object blobs and reduce noisy background blobs. Benefiting from our designs, our 3DGS-DET significantly outperforms the SOTA NeRF-based method, NeRF-Det, achieving improvements of +6.6 on mAP@0.25 and +8.1 on mAP@0.5 for the ScanNet dataset, and impressive +31.5 on mAP@0.25 for the ARKITScenes dataset.
