RLPlanner: Reinforcement Learning based Floorplanning for Chiplets with Fast Thermal Analysis

Authors: Yuanyuan Duan, Xingchen Liu, Zhiping Yu, Hanming Wu, Leilai Shao, Xiaolei Zhu

Categories: cs.LG, cs.AR

Published: 2023-12-28 (updated: 2024-01-16)

💡 One-Sentence Summary

RLPlanner: reinforcement-learning-based chiplet floorplanning accelerated by fast thermal analysis

🎯 Matched Area: Pillar 2: RL Algorithms and Architecture (RL & Architecture)

Keywords: chiplet floorplanning, reinforcement learning, thermal analysis, fast evaluation

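📋 Key Points

In chiplet system design, traditional floorplanning methods struggle to balance multiple optimization objectives such as wirelength, interconnect delay, and thermal limits.
RLPlanner combines a reinforcement learning algorithm with a fast thermal evaluation method to jointly optimize wirelength and temperature, enabling more efficient chiplet floorplanning.
Experiments show that RLPlanner's fast thermal evaluation method is over 120x faster than HotSpot, and that with it integrated, RLPlanner improves the combined wirelength-and-temperature objective by 20.28% on average over simulated annealing with HotSpot.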

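📝 Abstract (Translated from Chinese)

This paper presents RLPlanner, an early-stage floorplanning tool for chiplet-based systems that combines advanced reinforcement learning with a fast thermal evaluation method to minimize total wirelength and temperature at the same time. As chiplet-based systems become more complex and compact, microbump assignment, interconnect delay, and thermal limits become critical at the floorplanning stage. To speed up iteration and optimization, RLPlanner integrates a fast thermal evaluation method developed by the authors, which substantially reduces time-consuming thermal calculations. Experimental results show that, compared with the open-source thermal solver HotSpot, the fast thermal evaluation method achieves a mean absolute error (MAE) of 0.25 K and a speed-up of over 120x. With the fast thermal evaluation method integrated, RLPlanner improves the target objective (a combination of wirelength and temperature) by 20.28% on average, within a similar running time, compared with the classic simulated annealing method using HotSpot.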

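🔬 Method Details

Problem definition: Chiplet floorplanning determines where each chiplet is placed in the package so as to minimize objectives such as wirelength, interconnect delay, and temperature. Traditional methods such as simulated annealing are computationally expensive, which makes it hard to evaluate and optimize thermal effects quickly in early design stages. Existing thermal analysis tools such as HotSpot are slow, which limits how fast floorplanning can iterate.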
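To make the combined objective concrete, here is a minimal Python sketch of a weighted wirelength-plus-temperature cost. Everything in it is an assumption for illustration: the half-perimeter wirelength (HPWL) metric, the weight `alpha`, the normalization constants `wl_ref` and `temp_ref`, and the chiplet names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(net: List[str], centers: Dict[str, Point]) -> float:
    """Half-perimeter wirelength of one net: width plus height of its bounding box."""
    xs = [centers[name][0] for name in net]
    ys = [centers[name][1] for name in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets: List[List[str]], centers: Dict[str, Point]) -> float:
    """Sum of HPWL over all nets (an assumed wirelength metric)."""
    return sum(hpwl(net, centers) for net in nets)


def floorplan_cost(wirelength: float, peak_temp: float, alpha: float = 0.5,
                   wl_ref: float = 1.0, temp_ref: float = 1.0) -> float:
    """Hypothetical weighted sum of normalized wirelength and temperature.

    Lower is better. alpha, wl_ref, and temp_ref are illustrative parameters.
    """
    return alpha * wirelength / wl_ref + (1.0 - alpha) * peak_temp / temp_ref


# Toy example with made-up chiplet centers and nets.
centers = {"cpu": (0.0, 0.0), "gpu": (4.0, 0.0), "hbm": (4.0, 3.0)}
nets = [["cpu", "gpu"], ["gpu", "hbm"], ["cpu", "gpu", "hbm"]]
wl = total_wirelength(nets, centers)
cost = floorplan_cost(wl, peak_temp=350.0, alpha=0.5, wl_ref=10.0, temp_ref=350.0)
```

Normalizing both terms before weighting keeps a length and a temperature in Kelvin on comparable scales, so `alpha` can trade one off against the other.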

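Core idea: RLPlanner uses reinforcement learning to explore the chiplet placement space and pairs it with a fast thermal evaluation method to speed up evaluation. The reinforcement learning agent learns a floorplanning policy, while the fast thermal evaluation method quickly returns thermal feedback for each candidate layout, which guides the agent's exploration. Combining the two lets RLPlanner find a good floorplan in a relatively short time.

Framework: RLPlanner consists of three main modules: 1) Reinforcement learning agent: explores the chiplet placement space and updates its policy based on feedback from the environment. 2) Fast thermal evaluation module: quickly evaluates the thermal performance of the current layout. 3) Environment: simulates the chiplet floorplanning process and provides the reward signal to the agent. The flow is as follows: the agent selects an action (a chiplet placement) based on the current state; the environment updates the state according to that action, uses the fast thermal evaluation module to compute the temperature, and then computes a reward that is fed back to the agent. The agent updates its policy from the reward and repeats the process until it finds a good layout.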
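The agent-environment loop described above can be sketched as a toy Python example. This is an illustration under stated assumptions, not RLPlanner's implementation: `ToyFloorplanEnv`, the grid of cells, the end-of-episode reward, `random_policy` (a placeholder where a learned policy would go), and `manhattan_cost` (a placeholder for wirelength plus fast thermal evaluation) are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Placement = Dict[str, Cell]
Policy = Callable[[Placement, List[Cell], random.Random], Cell]


class ToyFloorplanEnv:
    """Toy environment: one chiplet is placed per step on a grid of cells."""

    def __init__(self, chiplets: List[str], grid: int,
                 cost_fn: Callable[[Placement], float]) -> None:
        self.chiplets = chiplets
        self.grid = grid
        self.cost_fn = cost_fn  # stands in for wirelength + thermal evaluation
        self.placement: Placement = {}

    def reset(self) -> Placement:
        self.placement = {}
        return dict(self.placement)

    def legal_actions(self) -> List[Cell]:
        used = set(self.placement.values())
        return [(x, y) for x in range(self.grid) for y in range(self.grid)
                if (x, y) not in used]

    def step(self, action: Cell) -> Tuple[Placement, float, bool]:
        if action not in self.legal_actions():
            raise ValueError(f"cell {action} is occupied or off-grid")
        name = self.chiplets[len(self.placement)]
        self.placement[name] = action
        done = len(self.placement) == len(self.chiplets)
        # Assumed sparse reward: only a complete floorplan is scored.
        reward = -self.cost_fn(self.placement) if done else 0.0
        return dict(self.placement), reward, done


def run_episode(env: ToyFloorplanEnv, policy: Policy,
                rng: random.Random) -> Tuple[Placement, float]:
    """Roll out one episode: the agent acts, the environment responds."""
    state = env.reset()
    done, reward = False, 0.0
    while not done:
        action = policy(state, env.legal_actions(), rng)
        state, reward, done = env.step(action)
    return state, reward


def random_policy(state: Placement, legal: List[Cell], rng: random.Random) -> Cell:
    """Placeholder policy; a trained agent would choose actions here."""
    return rng.choice(legal)


def manhattan_cost(placement: Placement) -> float:
    """Placeholder cost: sum of pairwise Manhattan distances between chiplets."""
    cells = list(placement.values())
    return float(sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                     for i, a in enumerate(cells) for b in cells[i + 1:]))


rng = random.Random(0)
env = ToyFloorplanEnv(["c0", "c1", "c2"], grid=3, cost_fn=manhattan_cost)
best_reward = max(run_episode(env, random_policy, rng)[1] for _ in range(50))
```

In a learning setup, the `(state, action, reward)` tuples collected in `run_episode` would be used to update the policy instead of being discarded.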

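Key innovation: RLPlanner's main contribution is a fast thermal evaluation method that substantially reduces thermal computation time and therefore speeds up floorplanning iterations. RLPlanner also uses reinforcement learning to explore the chiplet placement space and find a good layout. Compared with traditional simulated annealing, RLPlanner balances multiple optimization objectives, such as wirelength and temperature, more effectively.

Key design: RLPlanner's reinforcement learning agent uses an Actor-Critic architecture, in which the Actor selects actions and the Critic estimates their value. The design of the reward function is critical, since it must combine several factors such as wirelength and temperature. The fast thermal evaluation method is based on a simplified thermal model and uses a precomputed thermal resistance network to estimate temperature quickly. Specific parameter settings, network structure, and similar details are not described in detail in the paper and remain unknown.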
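One way to read "a precomputed thermal resistance network" is as linear superposition: each chiplet's temperature rise is the sum of every chiplet's power multiplied by a thermal resistance between the two. The Python sketch below shows only that general idea. It is not the paper's method; the resistance matrix `r_matrix` (in K/W), the power values, and the ambient temperature are made up, and the matrix is simply assumed to have been characterized offline.

```python
from typing import List, Sequence


def estimate_temperatures(r_matrix: Sequence[Sequence[float]],
                          power: Sequence[float],
                          t_ambient: float) -> List[float]:
    """Steady-state estimate: T_i = T_ambient + sum_j R[i][j] * P[j].

    r_matrix[i][j] is an assumed thermal resistance (K/W) describing how much
    chiplet j's power heats chiplet i; power is in watts.
    """
    if any(len(row) != len(power) for row in r_matrix):
        raise ValueError("each row of r_matrix must match the length of power")
    return [t_ambient + sum(r_ij * p_j for r_ij, p_j in zip(row, power))
            for row in r_matrix]


# Hypothetical two-chiplet example: self-heating on the diagonal,
# mutual heating off the diagonal.
r_matrix = [[2.0, 0.5],
            [0.5, 1.5]]
power = [10.0, 4.0]
temps = estimate_temperatures(r_matrix, power, t_ambient=300.0)
peak_temp = max(temps)
```

Because the evaluation reduces to a small matrix-vector product, it can be called once per candidate layout inside an optimization loop, which is where a full thermal solver becomes the bottleneck.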

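📊 Experimental Highlights

Measured against the open-source thermal solver HotSpot, RLPlanner's fast thermal evaluation method achieves a mean absolute error (MAE) of 0.25 K while delivering a speed-up of over 120x. Compared with the classic simulated annealing method using HotSpot, RLPlanner improves the target objective (a combination of wirelength and temperature) by 20.28% on average within a similar running time. These results support the effectiveness of RLPlanner for chiplet floorplanning.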
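For reference, the three kinds of metric quoted here (MAE, speed-up, and relative improvement) can be computed as in the short Python sketch below. The input numbers are toy values chosen for easy arithmetic. They are not data from the paper, and the function names are illustrative.

```python
from typing import Sequence


def mean_absolute_error(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Average absolute difference between reference and estimated temperatures."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def speedup(baseline_seconds: float, fast_seconds: float) -> float:
    """How many times faster the fast method is than the baseline."""
    return baseline_seconds / fast_seconds


def relative_improvement(baseline_cost: float, new_cost: float) -> float:
    """Fractional reduction of a minimized objective relative to the baseline."""
    return (baseline_cost - new_cost) / baseline_cost


# Toy values only (not measurements from the paper).
mae = mean_absolute_error([350.0, 342.0], [351.0, 341.0])
ratio = speedup(baseline_seconds=6.0, fast_seconds=0.5)
gain = relative_improvement(baseline_cost=2.0, new_cost=1.5)
```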

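🎯 Application Scenarios

RLPlanner can be applied to early-stage floorplanning for a wide range of chiplet-based systems, helping engineers evaluate and optimize layouts quickly, reduce design cost, and improve chip performance and reliability. The tool can substantially shorten the design cycle and supports more efficient chip solutions for high-performance computing, artificial intelligence, mobile devices, and similar fields. In the future, the method could be extended to 3D chip design and to the optimization of more complex cooling structures.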

📄 Abstract (Original)

Chiplet-based systems have gained significant attention in recent years due to their low cost and competitive performance. As the complexity and compactness of a chiplet-based system increase, careful consideration must be given to microbump assignments, interconnect delays, and thermal limitations during the floorplanning stage. This paper introduces RLPlanner, an efficient early-stage floorplanning tool for chiplet-based systems with a novel fast thermal evaluation method. RLPlanner employs advanced reinforcement learning to jointly minimize total wirelength and temperature. To alleviate the time-consuming thermal calculations, RLPlanner incorporates the developed fast thermal evaluation method to expedite the iterations and optimizations. Comprehensive experiments demonstrate that our proposed fast thermal evaluation method achieves a mean absolute error (MAE) of 0.25 K and delivers over 120x speed-up compared to the open-source thermal solver HotSpot. When integrated with our fast thermal evaluation method, RLPlanner achieves an average improvement of 20.28% in minimizing the target objective (a combination of wirelength and temperature), within a similar running time, compared to the classic simulated annealing method with HotSpot.
