SurgicalMamba: Dual-Path SSD with State Regramming for Online Surgical Phase Recognition

Authors: Sukju Oh, Sukkyu Sun

Categories: cs.CV, cs.AI

Published: 2026-05-14

Comments: 28 pages, 7 figures, 10 tables; Code available at https://github.com/sukjuoh/Surgical-Mamba

🔗 Code/Project: GitHub (https://github.com/sukjuoh/Surgical-Mamba)

💡 One-Sentence Summary

SurgicalMamba: a dual-path SSD model with state regramming for online surgical phase recognition

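🎯 Matched Area: Pillar 2: RL Algorithms and Architecture (RL & Architecture)

Keywords: surgical phase recognition, online recognition, Mamba, state-space models, time-series analysis, medical video analysis, deep learning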

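📋 Key Points

Online surgical phase recognition must cope with very long surgical videos, non-uniform passage of time, and strongly correlated visual features; existing methods do not address all three together.
SurgicalMamba uses a dual-path SSD block, intensity-modulated stepping, and state regramming to handle long- and short-term dependencies, time warping, and cross-channel mixing, respectively.
In experiments, SurgicalMamba achieves state-of-the-art results on multiple SPR benchmarks and runs inference at 119 fps on a single GPU.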

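📝 Abstract (Translated Summary)

This paper presents SurgicalMamba, a causal model for online surgical phase recognition (SPR) built on the structured state-space duality (SSD) of Mamba2. The model predicts at every frame from past context alone and is designed for the specific demands of surgical video: procedures that span tens of thousands of frames, non-uniform passage of time, and a narrow visual domain. SurgicalMamba introduces three SSD-compatible components: a dual-path SSD block that separates long- and short-term state; intensity-modulated stepping, which adapts the effective rate of the slow path; and state regramming, which enables cross-channel mixing through a Cayley rotation. Across seven public SPR benchmarks, SurgicalMamba reaches state-of-the-art accuracy and phase-level Jaccard, for example 94.6%/82.7% on Cholec80 and 89.5%/68.9% on AutoLaparo.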
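Accuracy and phase-level Jaccard are the two metrics quoted for each benchmark. The sketch below shows one plain way to compute them from per-frame labels. It is an illustration only: the function name and the toy labels are made up here, and individual benchmarks may use protocol details (per-video averaging, relaxed boundaries) that this summary does not specify.

```python
def phase_accuracy_and_jaccard(y_true, y_pred, num_phases):
    """Frame-level accuracy and mean per-phase Jaccard (intersection over union)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    jaccards = []
    for phase in range(num_phases):
        inter = sum(t == phase and p == phase for t, p in pairs)
        union = sum(t == phase or p == phase for t, p in pairs)
        if union:                      # skip phases absent from both sequences
            jaccards.append(inter / union)
    return accuracy, sum(jaccards) / len(jaccards)


# Toy example: six frames, three phases, one misclassified frame.
acc, jac = phase_accuracy_and_jaccard([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 2], 3)
```

A single wrong frame lowers the Jaccard of both the true phase and the predicted phase, which is why Jaccard tends to be more sensitive than accuracy to errors around short phases.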

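🔬 Method Details

Problem definition: Online surgical phase recognition (SPR) must predict the current surgical phase at every video frame using only the preceding video content. Existing methods either incur a per-frame cost that grows with the elapsed video length, or keep the cost bounded but advance their state at a uniform rate with channel-independent dynamics. As a result they do not handle the non-uniform passage of time in surgical video (long routine stretches punctuated by brief, phase-defining transitions) or the strong correlation among visual feature channels.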
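To make the online constraint concrete, the following minimal sketch commits to a prediction at each frame while carrying only a fixed-size state, so the work per frame does not grow with video length. It is not the SurgicalMamba model: the exponential-moving-average update, the linear readout `W_out`, and the random features are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def online_phase_predictions(features, decay, W_out):
    """Causal per-frame prediction with a fixed-size recurrent state.

    features: (T, d) per-frame backbone features (toy input).
    decay:    scalar in (0, 1); toy stand-in for a learned state transition.
    W_out:    (num_phases, d) toy linear readout.
    """
    state = np.zeros(features.shape[1])
    preds = []
    for x_t in features:                               # frames arrive one at a time
        state = decay * state + (1.0 - decay) * x_t    # constant-size state update
        preds.append(int(np.argmax(W_out @ state)))    # commit immediately
    return preds


rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 8))
W = rng.normal(size=(4, 8))

full = online_phase_predictions(feats, 0.9, W)
prefix = online_phase_predictions(feats[:20], 0.9, W)
```

Because the loop never looks ahead, the predictions for the first 20 frames are identical whether or not the remaining 30 frames exist, which is the property strict online evaluation requires.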

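Core idea: SurgicalMamba builds on the structured state-space duality (SSD) of Mamba2 to obtain an efficient SPR model that captures the temporal structure of surgical video. A dual-path structure, intensity-modulated stepping, and state regramming address long- and short-term dependencies, time warping, and cross-channel mixing, respectively.

Technical framework: SurgicalMamba is a causal model that takes surgical video frames as input and outputs the surgical phase of the current frame. It has three key modules: 1) Dual-path SSD block: separates long- and short-term state so that information at different time scales is handled separately. 2) Intensity-modulated stepping: adapts the effective rate of the slow path to phase-relevant information, implementing a time warp. 3) State regramming: applies a Cayley rotation to enable cross-channel mixing, strengthening the feature representation.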
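The first two modules can be pictured with a toy diagonal recurrence. The sketch below is an assumption-laden illustration, not the paper's formulation: the decay constants `a_fast` and `a_slow`, the discretization `h = exp(-a * dt) * h + dt * x`, and the hand-written `intensity` signal are all chosen here for clarity. In a real model the step size would be predicted from the input.

```python
import numpy as np


def dual_path_scan(x, intensity, a_fast=1.0, a_slow=0.01):
    """Toy dual-path recurrence with a time-warped slow path.

    x:         (T, d) per-frame features.
    intensity: (T,) non-negative step sizes for the slow path (hypothetical).
    Returns a (T, 2 * d) array holding [fast state, slow state] per frame.
    """
    T, d = x.shape
    h_fast = np.zeros(d)
    h_slow = np.zeros(d)
    out = np.empty((T, 2 * d))
    for t in range(T):
        # Fast path: unit step and strong decay, so memory is short.
        h_fast = np.exp(-a_fast) * h_fast + x[t]
        # Slow path: the step size dt warps time; dt = 0 freezes the state.
        dt = intensity[t]
        h_slow = np.exp(-a_slow * dt) * h_slow + dt * x[t]
        out[t] = np.concatenate([h_fast, h_slow])
    return out


x = np.ones((6, 2))
intensity = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 1.0])
states = dual_path_scan(x, intensity)
```

In this toy setup the slow state stays fixed over frames 2 to 4, where the intensity is zero, and moves again at frame 5. This mirrors the idea of letting long routine stretches pass without advancing the long-term state.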

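Key innovations: The main contribution is the introduction of three SSD-compatible components: the dual-path SSD block, intensity-modulated stepping, and state regramming. The dual-path SSD block lets the model handle long- and short-term dependencies at the same time; intensity-modulated stepping adjusts the time step dynamically according to video content; and state regramming fuses information across channels through a learned rotation. Together these make SurgicalMamba better suited to the characteristics of surgical video.

Key design: The dual-path SSD block has a fast path and a slow path, which handle short- and long-term dependencies respectively. Intensity-modulated stepping is a continuous-time time warp that adjusts the slow path's rate according to phase-relevant information. State regramming applies a per-chunk Cayley rotation whose learned rotation planes provide cross-channel mixing. The model is trained with a cross-entropy loss and evaluated on multiple public datasets.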
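The Cayley transform itself is standard linear algebra: it maps a skew-symmetric matrix A to an orthogonal matrix Q = (I + A)^{-1}(I - A) with determinant +1. The sketch below shows only that mapping and its effect on a toy state vector. How the paper parameterizes the rotation planes or applies the rotation per chunk inside SSD is not described in this summary, so the function name and inputs here are illustrative assumptions.

```python
import numpy as np


def cayley_rotation(M):
    """Map an arbitrary square matrix to a rotation via the Cayley transform."""
    A = M - M.T                      # skew-symmetric part: A.T == -A
    I = np.eye(M.shape[0])
    # I + A is invertible because a skew-symmetric A has purely imaginary eigenvalues.
    return np.linalg.solve(I + A, I - A)    # Q = (I + A)^{-1} (I - A)


rng = np.random.default_rng(0)
Q = cayley_rotation(0.1 * rng.normal(size=(4, 4)))

h = rng.normal(size=4)   # toy state vector with 4 channels
h_mixed = Q @ h          # channels are mixed, the norm is unchanged
```

Because Q is orthogonal, the rotation mixes channels without amplifying or shrinking the state, which is one reason a rotation is a natural choice for adding cross-channel interaction to an otherwise axis-aligned recurrence.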

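📊 Experimental Highlights

SurgicalMamba achieves state-of-the-art results on seven public SPR benchmarks. On Cholec80 it reaches 94.6% accuracy and an 82.7% Jaccard index, improvements of 0.7 and 2.2 percentage points over the strongest prior method. On AutoLaparo it reaches 89.5% accuracy and a 68.9% Jaccard index, improvements of 1.7 and 2.0 percentage points. SurgicalMamba also runs inference at 119 fps on a single GPU.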
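The quoted numbers imply two things that are easy to check by hand: the scores of the strongest prior results, and the per-frame time budget at 119 fps. The short calculation below derives both from the figures stated in this summary. The derived baselines are arithmetic consequences only; the summary does not name the prior methods, and the best prior accuracy and best prior Jaccard may come from different methods.

```python
# dataset: (accuracy %, Jaccard %, accuracy gain in pp, Jaccard gain in pp)
results = {
    "Cholec80": (94.6, 82.7, 0.7, 2.2),
    "AutoLaparo": (89.5, 68.9, 1.7, 2.0),
}

# Strongest prior score = reported score minus reported gain.
implied_prior = {
    name: (round(acc - d_acc, 1), round(jac - d_jac, 1))
    for name, (acc, jac, d_acc, d_jac) in results.items()
}

# Average time available per frame at the reported throughput.
ms_per_frame = 1000.0 / 119   # about 8.4 ms
```

This gives implied prior scores of 93.9%/80.5% on Cholec80 and 87.8%/66.9% on AutoLaparo, and roughly 8.4 ms per frame, well within the frame interval of typical surgical video.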

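🎯 Application Scenarios

SurgicalMamba could be used in intelligent operating-room systems to give surgeons real-time phase recognition and context-aware assistance. The technology could help improve surgical efficiency, reduce human error, and supply data for postoperative analysis and training. In the future, the model could be extended to other medical video analysis tasks such as instrument recognition and anomalous-event detection.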

📄 Abstract (Original)

Online surgical phase recognition (SPR) underpins context-aware operating-room systems and requires committing to a prediction at every frame from past context alone. Surgical video poses three demands that natural-video recognizers do not jointly address: procedures span tens of thousands of frames, time flows non-uniformly as long routine stretches are punctuated by brief phase-defining transitions, and the visual domain is narrow so backbone features are strongly correlated across channels. Existing recognizers either let per-frame cost grow with elapsed length, or hold cost bounded but advance state at a uniform rate with channel-independent dynamics, leaving the latter two demands unaddressed. We present SurgicalMamba, a causal SPR model built on Mamba2's structured state-space duality (SSD) that holds per-frame cost at O(d). It introduces three SSD-compatible components, each targeting one demand: a dual-path SSD block that separates long- and short-term regimes at the level of recurrent state; intensity-modulated stepping, a continuous-time time-warp that adapts the slow path's effective rate to phase-relevant information; and state regramming, a per-chunk Cayley rotation that opens cross-channel mixing in the otherwise axis-aligned SSM recurrence. The learned rotation planes inherit a phase-aligned structure without any direct supervision, offering an interpretable internal signature of surgical workflow. Across seven public SPR benchmarks, SurgicalMamba reaches state-of-the-art accuracy and phase-level Jaccard under strict online evaluation: 94.6%/82.7% on Cholec80 (+0.7 pp/+2.2 pp over the strongest prior) and 89.5%/68.9% on AutoLaparo (+1.7 pp/+2.0 pp), at 119 fps on a single GPU. Ablations isolate the contribution of each component. The code is publicly available at https://github.com/sukjuoh/Surgical-Mamba.
