RANRAC: Robust Neural Scene Representations via Random Ray Consensus

📄 arXiv: 2312.09780v2

Authors: Benno Buschmann, Andreea Dogaru, Elmar Eisemann, Michael Weinmann, Bernhard Egger

Category: cs.CV

Published: 2023-12-15 (updated: 2024-04-19)


💡 One-Line Takeaway

Proposes RANRAC, a fuzzy adaptation of RANSAC that detects and excludes inconsistent views, making neural scene representations robust to image inconsistencies.

🎯 Matched Domain: Pillar 3: Spatial Perception & Semantics (Perception & Semantics)

Keywords: neural radiance fields, light field networks, robust reconstruction, multi-view reconstruction, computer vision, RANSAC, image processing

📋 Key Points

  1. Existing learning-based scene representation methods handle image inconsistencies poorly, degrading reconstruction quality.
  2. RANRAC adapts the RANSAC paradigm in a fuzzy fashion to reliably detect and exclude inconsistent views, improving image quality.
  3. Experiments show that RANRAC significantly improves performance for both multi-view and single-shot reconstruction, performing especially well under occlusions and noisy camera poses.

📝 Abstract (Translated)

Learning-based scene representation methods such as neural radiance fields and light field networks often struggle with image inconsistencies caused by occlusions, inaccurately estimated camera parameters, or lens flare. To address this, the paper introduces RANdom RAy Consensus (RANRAC), an efficient approach for eliminating the influence of inconsistent data. Unlike down-weighting schemes based on robust losses, RANRAC reliably detects and excludes inconsistent views, producing clean images. The authors formulate a fuzzy adaptation of the RANSAC paradigm that makes it applicable to large-scale models, and study both hypothesis generation with data-driven models and hypothesis validation in noisy environments. Experiments show that RANRAC significantly outperforms existing robust methods for multi-view reconstruction from real-world images and for single-shot reconstruction based on light field networks.

🔬 Method Details

Problem definition: the paper addresses reconstruction with learning-based scene representations under image inconsistencies such as occlusions and inaccurately estimated camera parameters. Existing methods typically rely on robust losses to down-weight outliers, which is only partially effective.

Core idea: RANRAC formulates a fuzzy adaptation of the RANSAC paradigm to reliably detect and exclude inconsistent views, yielding higher-quality images. The design follows classical RANSAC's hypothesize-and-verify scheme for robust model fitting.

Technical framework: the overall pipeline consists of four main stages: data sampling, hypothesis generation, hypothesis validation, and model refinement. Samples (rays and their observations) are first drawn from the images, multiple model hypotheses are generated from random subsets, each hypothesis is validated against the remaining observations in a noisy setting, and the model parameters are finally refined on the consensus set for the best reconstruction, as sketched below.
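To make the pipeline concrete, the following is a minimal Python sketch of a RANSAC-style consensus loop over rays. It is an illustration under assumptions, not the paper's implementation: the callables `condition_model` and `render`, the default subset size, and the hard inlier threshold are hypothetical stand-ins for the actual scene model (e.g., a light field network conditioned on observations).

```python
import numpy as np

def ranrac_fit(rays, observations, condition_model, render,
               sample_size=32, num_iters=100, inlier_thresh=0.05):
    """RANSAC-style consensus over rays (illustrative sketch).

    rays / observations: per-ray inputs and observed colors, shape (N, ...)
    condition_model:     maps a subset of rays + observations to model
                         parameters (e.g., a conditioning latent code)
    render:              maps (params, rays) -> predicted colors
    """
    n = len(rays)
    best_params, best_mask, best_count = None, None, -1

    for _ in range(num_iters):
        # 1. Hypothesis generation: condition the scene model on a random
        #    subset of rays; the subset size is a tunable hyperparameter.
        idx = np.random.choice(n, size=sample_size, replace=False)
        params = condition_model(rays[idx], observations[idx])

        # 2. Hypothesis validation: score every ray against the hypothesis;
        #    rays with a small rendering error count as inliers.
        errors = np.linalg.norm(render(params, rays) - observations, axis=-1)
        mask = errors < inlier_thresh

        # 3. Keep the hypothesis with the largest consensus set.
        if mask.sum() > best_count:
            best_params, best_mask, best_count = params, mask, int(mask.sum())

    # 4. Refine: re-condition the model on the consensus (inlier) rays only,
    #    so occluded or otherwise inconsistent rays no longer influence it.
    refined = condition_model(rays[best_mask], observations[best_mask])
    return refined, best_mask
```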

Key innovation: RANRAC's main novelty is its fuzzy adaptation of the RANSAC paradigm, which makes it applicable to large-scale models. Treating the number of samples used to form a hypothesis as a tunable hyperparameter, rather than a fixed minimal set, adds flexibility and adaptability; see the sweep example below.
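Because the subset size trades off hypothesis quality against outlier contamination, it can be tuned like any other hyperparameter. A hypothetical sweep using the `ranrac_fit` sketch above (with an assumed `validation_psnr` helper) might look like:

```python
# Hypothetical sweep: larger subsets condition the model better but are
# more likely to contain outlier rays; smaller subsets are purer but
# yield noisier hypotheses.
for sample_size in (8, 16, 32, 64, 128):
    params, inliers = ranrac_fit(rays, observations, condition_model,
                                 render, sample_size=sample_size)
    print(sample_size, inliers.sum(), validation_psnr(params))  # assumed helper
```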

Key design: the minimal sample count is exposed as a user-tunable parameter to match the needs of different scenes. Hypotheses are generated with a data-driven model and validated in noisy environments, which underpins the method's robustness; a sketch of the soft scoring follows.
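The "fuzzy" aspect can be pictured as replacing the hard inlier test in step 2 above with a soft, noise-tolerant weight per ray. The kernel and bandwidth below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def fuzzy_consensus(errors, sigma=0.05):
    """Soft inlier weighting for hypothesis validation (illustrative).

    Instead of a hard threshold, each ray contributes a weight in [0, 1]
    that decays smoothly with its rendering error; the summed weight
    scores the hypothesis, which is more stable when the observations
    themselves are noisy.
    """
    weights = np.exp(-(errors / sigma) ** 2)  # assumed Gaussian-style falloff
    return weights.sum(), weights  # consensus score and per-ray weights
```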

📊 Experimental Highlights

On novel-view synthesis for both synthetic and captured scenes, RANRAC delivers significant gains over state-of-the-art robust methods. Under inconsistencies such as occlusions, noisy camera pose estimates, and unfocused views, its reconstruction quality clearly exceeds the compared baselines, demonstrating strong practicality and effectiveness.

🎯 Application Scenarios

The results apply broadly across computer vision, particularly multi-view reconstruction, virtual reality, and augmented reality. By improving the robustness and accuracy of image reconstruction, the method can deliver higher-quality visual experiences in practical applications and advance related technologies.

📄 Abstract (Original)

Learning-based scene representations such as neural radiance fields or light field networks, that rely on fitting a scene model to image observations, commonly encounter challenges in the presence of inconsistencies within the images caused by occlusions, inaccurately estimated camera parameters or effects like lens flare. To address this challenge, we introduce RANdom RAy Consensus (RANRAC), an efficient approach to eliminate the effect of inconsistent data, thereby taking inspiration from classical RANSAC based outlier detection for model fitting. In contrast to the down-weighting of the effect of outliers based on robust loss formulations, our approach reliably detects and excludes inconsistent perspectives, resulting in clean images without floating artifacts. For this purpose, we formulate a fuzzy adaption of the RANSAC paradigm, enabling its application to large scale models. We interpret the minimal number of samples to determine the model parameters as a tunable hyperparameter, investigate the generation of hypotheses with data-driven models, and analyze the validation of hypotheses in noisy environments. We demonstrate the compatibility and potential of our solution for both photo-realistic robust multi-view reconstruction from real-world images based on neural radiance fields and for single-shot reconstruction based on light-field networks. In particular, the results indicate significant improvements compared to state-of-the-art robust methods for novel-view synthesis on both synthetic and captured scenes with various inconsistencies including occlusions, noisy camera pose estimates, and unfocused perspectives. The results further indicate significant improvements for single-shot reconstruction from occluded images. Project Page: https://bennobuschmann.com/ranrac/