From Pixels to Primitives: Scene Change Detection in 3D Gaussian Splatting

📄 arXiv: 2605.07203v1 📥 PDF

Authors: Chamuditha Jayanga Galappaththige, Jason Lai, Timothy Patten, Donald Dansereau, Niko Suenderhauf, Dimity Miller

Categories: cs.CV

Published: 2026-05-08

Comments: Project Page: https://chumsy0725.github.io/GS-DIFF/


💡 One-Sentence Summary

Proposes GS-DIFF, which detects scene changes by directly analyzing the attributes of 3D Gaussian primitives

🎯 Matched area: Pillar 3: Spatial Perception and Semantics (Perception & Semantics)

Keywords: 3D Gaussian Splatting, scene change detection, computer vision, 3D reconstruction, multi-view consistency, primitive analysis

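📋 Key Points

  1. Existing methods rely on a render-then-compare paradigm, which is computationally expensive, struggles to guarantee multi-view consistency, and is easily disturbed by rendering artifacts.
  2. GS-DIFF compares scenes directly in 3D Gaussian primitive space, characterizing change through primitive attributes and introducing drift models plus an observability term.
  3. Experiments show that the method separates structural from appearance changes without supervision and improves mIoU on real-world scenes by approximately 17% over the prior state of the art.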

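📝 Summary

Scene change detection methods built on Gaussian splatting usually follow a render-then-compare paradigm: the pre-change scene is rendered into 2D and compared against post-change images via pixel or feature residuals. This paper moves the problem from pixel space to primitive space and shows that the native attributes of Gaussian primitives (position, anisotropic covariance, and color) are sufficient to characterize scene change. Because Gaussian splatting optimization is under-constrained, independently optimized primitives differ in count, position, shape, and color even where the scene has not changed. GS-DIFF addresses this with anisotropic models of geometric and photometric drift, together with a per-primitive observability term. The method yields multi-view consistent change detection and distinguishes structural changes from surface-level ones. On real-world benchmarks, its mean Intersection over Union (mIoU) exceeds the prior state-of-the-art approach by approximately 17%.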
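To make "anisotropic covariance" concrete, the sketch below builds a primitive's 3x3 covariance from a per-axis scale and a rotation quaternion, using the factorization Σ = R S Sᵀ Rᵀ that is standard in 3D Gaussian Splatting. The function name and the (w, x, y, z) quaternion order are assumptions made for this illustration; nothing here is taken from the GS-DIFF code.

```python
import numpy as np

def covariance_from_scale_rotation(scale, quat_wxyz):
    """Build a 3x3 covariance as R S S^T R^T (standard 3DGS parameterization).

    `scale` holds the three per-axis standard deviations; `quat_wxyz` is a
    rotation quaternion in (w, x, y, z) order (assumed convention).
    """
    q = np.asarray(quat_wxyz, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)  # normalize so R is a proper rotation
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])
    S = np.diag(np.asarray(scale, dtype=float))
    return R @ S @ S.T @ R.T

# Axis-aligned ellipsoid, then the same ellipsoid rotated 90 degrees about z.
cov_axis = covariance_from_scale_rotation([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0])
half = np.sqrt(0.5)
cov_rot = covariance_from_scale_rotation([1.0, 2.0, 3.0], [half, 0.0, 0.0, half])
```

Two optimization runs can place differently shaped and oriented ellipsoids on the same surface, which is one way to picture the drift that a primitive-space comparison has to tolerate.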

🔬 Method Details

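Problem definition: Existing Gaussian-splatting-based change detection methods are limited by the render-then-compare paradigm. They collapse the 3D scene into 2D pixels before comparing, ignore the information carried by the 3D primitives themselves, and cope poorly with the differences in primitive distribution that independent optimization produces.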

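Core idea: Treat change detection as a comparison of attributes in primitive space. The central challenge is that Gaussian splatting is non-unique (under-constrained): different optimizations of the same scene yield inconsistent primitive parameters. The paper models geometric and photometric drift and adds a per-primitive observability term, so that differences caused by optimization are not mistaken for real change.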

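Framework: GS-DIFF first builds separate Gaussian splatting models of the pre-change and post-change scenes. It then compares primitive attributes under anisotropic drift models, computes a difference score for each primitive, and finally uses the scores to separate structural changes (geometric position or shape) from surface-level changes (color or illumination).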
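The pipeline above can be pictured with a toy scorer. This is a minimal sketch under loudly stated assumptions, not the paper's method: primitives are matched by brute-force nearest neighbor on their centers, the geometric score is a squared Mahalanobis distance under the summed covariances, and the appearance score is a Euclidean distance between RGB colors. The function `score_primitives` and all array names are hypothetical.

```python
import numpy as np

def score_primitives(mu_pre, cov_pre, rgb_pre, mu_post, cov_post, rgb_post):
    """Return (geometric, appearance) scores for every post-change Gaussian.

    Assumptions (not from the paper): nearest-neighbor matching on centers,
    Mahalanobis distance under summed covariances, Euclidean color distance.
    """
    # Pairwise squared center distances, shape (N_post, N_pre).
    diff = mu_post[:, None, :] - mu_pre[None, :, :]
    nearest = np.argmin(np.einsum("ijk,ijk->ij", diff, diff), axis=1)

    d = mu_post - mu_pre[nearest]               # center offsets, (N_post, 3)
    s = cov_post + cov_pre[nearest]             # summed covariances, (N_post, 3, 3)
    solved = np.linalg.solve(s, d[..., None])[..., 0]
    geo = np.einsum("ij,ij->i", d, solved)      # anisotropic geometric score
    app = np.linalg.norm(rgb_post - rgb_pre[nearest], axis=1)
    return geo, app

# Toy scene: two flat, disc-like grey Gaussians lying in the z = 0 plane.
disc = np.diag([1.0, 1.0, 0.01])
grey = [0.5, 0.5, 0.5]
mu_pre = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
cov_pre = np.stack([disc, disc])
rgb_pre = np.array([grey, grey])

# Post-change: slight in-plane drift, a recolored disc, and a new primitive above the plane.
mu_post = np.array([[0.1, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
cov_post = np.stack([disc, disc, disc])
rgb_post = np.array([grey, [1.0, 0.0, 0.0], grey])

geo, app = score_primitives(mu_pre, cov_pre, rgb_pre, mu_post, cov_post, rgb_post)
```

In this toy case the in-plane drift yields a small geometric score, the primitive lifted off the plane yields a large one, and only the recolored disc has a nonzero appearance score. Keeping the two scores separate is what lets a detector report which kind of change occurred.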

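Key innovation: Unlike render-then-compare methods, GS-DIFF needs no additional multi-view consistency objective. Because it operates directly on 3D primitives, its change maps are multi-view consistent by construction, and it can also classify the type of change.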

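Key design: An observability term reflects how strongly each Gaussian primitive is constrained by the camera geometry, which helps filter out spurious changes caused by optimization noise. Anisotropic covariance matrices are used to measure geometric drift, so that the comparison accounts for the direction in which a primitive has moved as well as the distance.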
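One way to picture an observability term is as a parallax measure. The sketch below is a hypothetical proxy, not the paper's definition: it returns the widest angle between any two viewing rays from the camera centers to a Gaussian's center, and it ignores visibility and occlusion entirely. The function name `parallax_observability` is an assumption.

```python
import numpy as np

def parallax_observability(center, cam_centers):
    """Widest angle (radians) between any two viewing rays to `center`.

    Hypothetical proxy: a Gaussian seen from nearly one direction is poorly
    constrained along the viewing ray, so its widest parallax is near zero.
    Occlusion and camera frusta are ignored in this sketch.
    """
    rays = np.asarray(cam_centers, dtype=float) - np.asarray(center, dtype=float)
    rays = rays / np.linalg.norm(rays, axis=1, keepdims=True)
    cosines = np.clip(rays @ rays.T, -1.0, 1.0)
    return float(np.arccos(cosines.min()))

# Cameras on orthogonal sides of the primitive: well constrained.
wide = parallax_observability([0, 0, 0], [[1, 0, 0], [0, 1, 0]])
# Two nearby cameras far away: depth along the ray is poorly constrained.
narrow = parallax_observability([0, 0, 0], [[0, 0, 10], [0.1, 0, 10]])
```

A detector could then discount drift scores for primitives with small parallax, since an apparent displacement of such a primitive is more likely to come from optimization freedom than from a real change.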

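📊 Experimental Highlights

On real-world benchmarks, GS-DIFF outperforms the prior state-of-the-art approach by approximately 17% in mean Intersection over Union (mIoU). Beyond accuracy, it distinguishes structural changes (e.g., an added or moved object) from surface-level changes (e.g., a color change) without supervision, and its change maps are multi-view consistent by construction, with no additional optimization objective.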
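For readers unfamiliar with the metric, the sketch below computes IoU for binary change masks and averages it over a set of mask pairs. Averaging over mask pairs and scoring an empty-versus-empty pair as 1.0 are assumptions of this example; the benchmark's exact averaging protocol is not stated in this summary.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union of two binary masks; empty-vs-empty counts as 1.0 (assumption)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

def mean_iou(preds, gts):
    """Mean IoU over paired predicted and ground-truth masks."""
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))

# First pair overlaps on 1 of 2 changed pixels; second pair matches exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
gts = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
score = mean_iou(preds, gts)
```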

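🎯 Application Scenarios

The technique is relevant to robot navigation, autonomous driving, urban digital twins, and industrial inspection. By reporting both structural changes (e.g., a newly added obstacle) and appearance changes (e.g., a color change), it can supply change information for dynamic-environment perception, long-term localization, and map updating on mobile robots.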

📄 Abstract (Original)

Scene change detection methods built on Gaussian splatting universally follow a render-then-compare paradigm: the pre-change scene is rendered into 2D and compared against post-change images via pixel or feature residuals. This change detection problem with Gaussian Splatting has been treated as a question about pixels; we treat it as a question about primitives. We provide direct evidence that native primitive attributes alone -- position, anisotropic covariance, and color -- carry sufficient signal for scene change detection. What makes primitive-space comparison hard is the under-constrained nature of Gaussian splatting representation: independent optimizations yield primitive solutions whose count, positions, shapes, and colors differ even where nothing has changed. We address this challenge with anisotropic models of geometric and photometric drift, complemented by a per-primitive observability term that reflects the extent to which each Gaussian is constrained by the camera geometry. Operating directly on primitives gives our method, GS-DIFF, two properties that distinguish it from render-then-compare methods. First, change maps are multi-view consistent by construction, where prior work had to learn this through an additional optimization objective. Second, geometric and appearance changes are scored separately, identifying not just where but what kind of change occurred, distinguishing structural changes (e.g., an added object) from surface-level ones (e.g., a color change) without supervision or external model dependencies. On real-world benchmarks, GS-DIFF surpasses the prior state-of-the-art approach by approximately 17% in mean Intersection over Union.