A New Segment Routing method with Swap Node Selection Strategy Based on Deep Reinforcement Learning for Software Defined Network

Authors: Miao Ye, Jihao Zheng, Qiuxiang Jiang, Yuan Huang, Ziheng Wang, Yong Wang

Category: cs.AI

Published: 2025-03-21

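💡 One-Sentence Summary

Proposes a segment routing method for SDN based on deep reinforcement learning that reduces flow table issuance time and improves network performance.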

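🎯 Matched Area: Pillar 2: RL Algorithms and Architecture (RL & Architecture)

Keywords: software-defined networking, segment routing, deep reinforcement learning, flow table issuance, network optimization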

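📋 Key Points

Existing segment routing methods must re-segment the path whenever the route changes, and they ignore flow table issuance time; both hurt network performance.
The paper proposes an intelligent segment routing algorithm based on deep reinforcement learning that optimizes the routing strategy and the path segmentation strategy together, reducing flow table issuance time.
Experiments show that the method reduces the time overhead of establishing a segment route while optimizing throughput, delay, and packet loss.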

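📝 Abstract (Translated from Chinese)

Existing segment routing (SR) methods must determine the route first and then segment the path by selecting swap nodes, and they must re-segment whenever the route changes. To address these problems while also accounting for flow table issuance time, this paper proposes a new optimization model. The model forms the routing strategy and the path segmentation strategy at the same time, selecting suitable swap nodes to reduce flow table issuance time. The paper also designs an intelligent segment routing algorithm based on deep reinforcement learning (DRL-SR) to solve the model. The algorithm uses a traffic matrix as the state space of the deep reinforcement learning agent; this matrix includes multiple QoS performance indicators, the flow table issuance time overhead, and the SR label stack depth. It also defines an action selection strategy and a matching reward function: when choosing the next node, the agent considers routing as well as the time cost for the controller to issue flow tables to swap nodes. Experimental results show that, compared with existing methods, the optimization model and the DRL-SR algorithm reduce the time overhead of completing the segment route establishment task while optimizing performance metrics such as throughput, delay, and packet loss.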

🔬 Method Details

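Problem definition: Existing segment routing methods usually determine the route first and then segment the path by selecting swap nodes. When the route changes, the path must be segmented again, which is inefficient. These methods also tend to ignore the time the controller needs to issue flow tables to the swap nodes, which affects overall network performance. What is needed is a method that optimizes routing and segmentation together and takes flow table issuance time into account.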

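Core idea: The paper uses deep reinforcement learning (DRL) to choose the route and the swap nodes intelligently, so as to minimize flow table issuance time and optimize network performance metrics such as throughput, delay, and packet loss. The network state (traffic matrix, QoS indicators, and so on) serves as the agent's state. An action consists of choosing the next node and deciding whether that node becomes a swap node. With a suitable reward function, the agent can learn a routing and segmentation policy that targets these objectives.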
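The joint action described above (next node plus a swap-node decision) can be flattened into a single discrete index, which is a common way to feed such an action to a value-based DRL agent. The sketch below is only an illustration under that assumption: the summary does not say how the paper encodes actions, and the function names `encode_action`, `decode_action`, and `action_space_size` are hypothetical.

```python
from typing import Tuple


def encode_action(next_node: int, is_swap: bool) -> int:
    """Pack (next node, swap flag) into one discrete action index.

    Hypothetical encoding: even indices mean "forward only",
    odd indices mean "forward and mark the node as a swap node".
    """
    return next_node * 2 + int(is_swap)


def decode_action(action: int) -> Tuple[int, bool]:
    """Recover (next node, swap flag) from a discrete action index."""
    next_node, swap_bit = divmod(action, 2)
    return next_node, bool(swap_bit)


def action_space_size(num_nodes: int) -> int:
    """Two choices (swap or not) for every candidate next node."""
    return 2 * num_nodes
```

With this layout a network of `n` nodes yields `2 * n` discrete actions, so one policy output covers both the routing decision and the segmentation decision.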

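Technical framework: The method has three main parts. 1) Build an optimization model that covers both the routing strategy and the path segmentation strategy, with the goal of minimizing flow table issuance time. 2) Design the deep reinforcement learning agent: its state space, action space, and reward function. The state space comprises the traffic matrix, QoS performance indicators, flow table issuance time overhead, and SR label stack depth. The action space consists of choosing the next node and deciding whether that node becomes a swap node. The reward function accounts for network performance metrics and flow table issuance time. 3) Train the agent with a deep reinforcement learning algorithm so that it learns the routing and segmentation policy.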
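To make the quantities in the optimization model concrete, the sketch below evaluates one candidate solution: a path plus a set of swap nodes. It rests on assumptions that this summary does not confirm: each segment is taken to need one label per hop, and the controller is taken to push flow entries to the source and all swap nodes in parallel, so the slowest node determines the issuance time. All names (`split_into_segments`, `fits_label_stack`, `issuance_time`, `ctrl_delay`) are hypothetical.

```python
from typing import Dict, List, Sequence


def split_into_segments(path: Sequence[str],
                        swap_nodes: Sequence[str]) -> List[List[str]]:
    """Cut a path at each swap node.

    A swap node ends one segment and starts the next one.
    """
    swaps = set(swap_nodes)
    segments: List[List[str]] = []
    current = [path[0]]
    for node in path[1:]:
        current.append(node)
        # The destination never starts a new segment.
        if node in swaps and node != path[-1]:
            segments.append(current)
            current = [node]
    segments.append(current)
    return segments


def fits_label_stack(segments: List[List[str]], max_depth: int) -> bool:
    """Assumption: one label per hop, so hops per segment <= max_depth."""
    return all(len(seg) - 1 <= max_depth for seg in segments)


def issuance_time(path: Sequence[str],
                  swap_nodes: Sequence[str],
                  ctrl_delay: Dict[str, float]) -> float:
    """Assumption: entries go to the source and swap nodes in parallel,
    so the largest controller-to-node delay is the total time."""
    targets = [path[0], *swap_nodes]
    return max(ctrl_delay[node] for node in targets)
```

For a five-node path `s1` to `s5` with swap node `s3`, `split_into_segments` returns two segments of two hops each, which fit a label stack depth of 2, whereas the unsegmented path (four hops) does not. Choosing a swap node with a small controller delay lowers `issuance_time`, which is the trade-off the model is described as optimizing.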

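Key innovation: The main technical contribution is optimizing routing and segmentation jointly while accounting for flow table issuance time. Compared with existing methods, this uses network resources more effectively and reduces the time overhead of establishing a segment route. Because the policy is learned with deep reinforcement learning, it adapts without manual intervention.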

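Key design: In the state space, the traffic matrix carries multiple QoS performance indicators, the flow table issuance time overhead, and the SR label stack depth, so it reflects the network state comprehensively. In the action space, choosing the next node and deciding whether it becomes a swap node are considered together, so routing and segmentation are optimized jointly. The reward function combines network performance metrics with flow table issuance time to guide the agent's learning. The specific neural network architecture is not stated.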
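A reward that combines QoS metrics with flow table issuance time could take the form of a weighted sum, as in the sketch below. This is an assumed form for illustration only: the summary does not give the paper's reward formula, and the weights, the normalization to the range 0 to 1, and the names `LinkMetrics` and `step_reward` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkMetrics:
    """Per-step QoS observations, assumed normalized to [0, 1]."""
    throughput: float  # higher is better
    delay: float       # lower is better
    loss: float        # lower is better


def step_reward(metrics: LinkMetrics,
                is_swap: bool,
                issue_time: float,
                w_tp: float = 1.0,
                w_delay: float = 1.0,
                w_loss: float = 1.0,
                w_issue: float = 1.0) -> float:
    """Hypothetical weighted-sum reward for one hop.

    QoS terms reward throughput and penalize delay and loss.
    The issuance penalty applies only when the chosen node is
    marked as a swap node, since only then does the controller
    have to push flow entries to it.
    """
    qos = (w_tp * metrics.throughput
           - w_delay * metrics.delay
           - w_loss * metrics.loss)
    penalty = w_issue * issue_time if is_swap else 0.0
    return qos - penalty
```

Under this form, marking a node as a swap node always costs reward in proportion to its issuance time, so the agent is pushed toward few swap nodes with low controller delay; a label stack depth limit would have to be enforced separately, for example by masking invalid actions.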

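📊 Experimental Highlights

Experiments show that, compared with existing methods, the approach reduces the time overhead of completing the segment route establishment task while optimizing throughput, delay, and packet loss. The abstract does not quantify the improvement.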

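🎯 Application Scenarios

The results apply to software-defined networks (SDN), where they can optimize routing and segmentation strategies and improve network performance and resource utilization. In large, highly dynamic networks in particular, the method can adapt its routing strategy and reduce flow table issuance time, improving network stability and reliability. Possible future application areas include intelligent transportation, cloud computing, and the Internet of Things.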

📄 Abstract (Original)

The existing segment routing (SR) methods need to determine the routing first and then use path segmentation approaches to select swap nodes to form a segment routing path (SRP). They require re-segmentation of the path when the routing changes. Furthermore, they do not consider the flow table issuance time, which cannot maximize the speed of issuance flow table. To address these issues, this paper establishes an optimization model that can simultaneously form routing strategies and path segmentation strategies for selecting the appropriate swap nodes to reduce flow table issuance time. It also designs an intelligent segment routing algorithm based on deep reinforcement learning (DRL-SR) to solve the proposed model. First, a traffic matrix is designed as the state space for the deep reinforcement learning agent; this matrix includes multiple QoS performance indicators, flow table issuance time overhead and SR label stack depth. Second, the action selection strategy and corresponding reward function are designed, where the agent selects the next node considering the routing; in addition, the action selection strategy whether the newly added node is selected as the swap node and the corresponding reward function are designed considering the time cost factor for the controller to issue the flow table to the swap node. Finally, a series of experiments and their results show that, compared with the existing methods, the designed segmented route optimization model and the intelligent solution algorithm (DRL-SR) can reduce the time overhead required to complete the segmented route establishment task while optimizing performance metrics such as throughput, delays and packet losses.
