Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach

📄 arXiv: 2412.04074v3

Authors: Xiaowen Ye, Yuyi Mao, Xianghao Yu, Shu Sun, Liqun Fu, Jie Xu

Categories: cs.NI, cs.LG

Published: 2024-12-05 (updated: 2025-01-02)

Comments: submitted for an IEEE publication


💡 One-Line Takeaway

Proposes a deep reinforcement learning approach to optimize an integrated sensing and communications system for the low-altitude economy.

🎯 Matched Area: Pillar 2: RL Algorithms & Architecture (RL & Architecture)

Keywords: integrated sensing and communications, deep reinforcement learning, UAV, low-altitude economy, Markov decision process, optimization algorithms, communication systems, trajectory optimization

📋 Core Points

  1. Existing methods struggle to simultaneously satisfy the communication-rate and sensing-accuracy constraints in the low-altitude economy, leading to inefficiency.
  2. The paper proposes a deep-reinforcement-learning-based integrated sensing and communications scheme that jointly optimizes beamforming and UAV trajectories.
  3. Experiments show that DeepLSC achieves a higher communication sum-rate than baseline methods, converges faster, and is more robust.

📝 Abstract (Translated)

This paper studies an integrated sensing and communications (ISAC) system serving the low-altitude economy (LAE), in which a ground base station (GBS) provides communication and navigation services to authorized unmanned aerial vehicles (UAVs) while sensing the low-altitude airspace to monitor an unauthorized mobile target. The communication sum-rate over a given flight period is maximized by jointly optimizing the beamforming at the GBS and the UAVs' trajectories, subject to constraints on the sensing signal-to-noise ratio, the flight mission, and collision avoidance. To this end, the problem is cast as a specific Markov decision process (MDP) model, and a novel LAE-oriented ISAC scheme, termed Deep LAE-ISAC (DeepLSC), is proposed based on deep reinforcement learning. Simulation results show that DeepLSC achieves a higher communication sum-rate while meeting the constraints, converges faster, and is more robust across different settings.

🔬 Method Details

Problem definition: The paper addresses the optimization of an integrated sensing and communications system for the low-altitude economy; existing methods are inefficient at jointly satisfying the communication and sensing constraints and fall short of the best achievable performance.
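
From the abstract, the joint design can be summarized as the following optimization sketch (the symbols $\mathbf{W}_t$, $R_{k,t}$, $\gamma_t^{\mathrm{sense}}$, $\Gamma$, $d_{\min}$ are illustrative placeholders, not the paper's exact notation):

```latex
\max_{\{\mathbf{W}_t\},\,\{\mathbf{q}_{k,t}\}} \;
  \mathbb{E}\!\left[\sum_{t=1}^{T}\sum_{k=1}^{K} R_{k,t}(\mathbf{W}_t, \mathbf{q}_{k,t})\right]
\quad \text{s.t.} \quad
\frac{1}{T}\sum_{t=1}^{T}\gamma_t^{\mathrm{sense}} \ge \Gamma, \qquad
\operatorname{tr}\!\left(\mathbf{W}_t\mathbf{W}_t^{\mathsf{H}}\right) \le P_{\max}, \qquad
\|\mathbf{q}_{k,t}-\mathbf{q}_{j,t}\| \ge d_{\min}\;\; (k \ne j),
```

together with flight-mission (waypoint) constraints on the trajectories $\mathbf{q}_{k,t}$.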

Core idea: The problem is modeled as a specific Markov decision process (MDP) — an episode task over the given flight period — and solved with deep reinforcement learning via a new scheme, DeepLSC, which raises the communication sum-rate while preserving sensing capability.
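
A minimal sketch of how such an episode-task MDP might look as an environment (class name, state layout, and reward terms are all illustrative assumptions, not the paper's formulation):

```python
import numpy as np

class LAEISACEnvSketch:
    """Illustrative episode-task MDP sketch: fixed-length episodes matching
    the flight period, with a reward mixing sum-rate and constraint penalties."""

    def __init__(self, num_uavs=2, horizon=50):
        self.num_uavs = num_uavs
        self.horizon = horizon  # fixed flight period -> fixed episode length
        self.t = 0
        self.uav_pos = np.zeros((num_uavs, 3))

    def reset(self):
        self.t = 0
        self.uav_pos = np.random.uniform(0.0, 100.0, size=(self.num_uavs, 3))
        return self._state()

    def _state(self):
        # state: time step + UAV positions (a real system would also include
        # channel state and target information)
        return np.concatenate([[float(self.t)], self.uav_pos.ravel()])

    def step(self, action):
        # action: per-UAV displacement; beamforming would add further dims
        self.uav_pos += action.reshape(self.num_uavs, 3)
        self.t += 1
        reward = self._sum_rate() - self._constraint_penalty()
        done = self.t >= self.horizon  # episode task: terminate at horizon
        return self._state(), reward, done

    def _sum_rate(self):
        # placeholder stand-in for the communication sum-rate
        return float(np.sum(1.0 / (1.0 + np.linalg.norm(self.uav_pos, axis=1))))

    def _constraint_penalty(self):
        # placeholder collision-avoidance penalty between the first two UAVs
        d = np.linalg.norm(self.uav_pos[0] - self.uav_pos[1])
        return 10.0 if d < 1.0 else 0.0
```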

Technical framework: The overall architecture jointly optimizes beamforming at the ground base station (GBS) and the trajectories of the UAVs, using a hierarchical experience replay mechanism and a symmetric experience augmentation mechanism to improve learning efficiency and convergence speed.
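
One way to read the hierarchical experience replay — all experiences generated within an episode are used to jointly train the network — is an episode-grouped buffer. The sketch below is an interpretation under that assumption, not the paper's code:

```python
import random
from collections import deque

class HierarchicalEpisodeReplay:
    """Sketch of episode-level replay: transitions are grouped by episode,
    and a sampled episode contributes all of its transitions to one update."""

    def __init__(self, capacity_episodes=100):
        self.episodes = deque(maxlen=capacity_episodes)
        self.current = []

    def add(self, transition, episode_done):
        self.current.append(transition)
        if episode_done:
            self.episodes.append(self.current)
            self.current = []

    def sample_episode(self):
        # returns every transition of one randomly chosen stored episode,
        # so a whole episode is replayed jointly rather than one step at a time
        return random.choice(self.episodes)
```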

Key innovation: The most important contributions are the constrained noise-exploration policy and the hierarchical experience replay mechanism, which make learning under the complex constraints more efficient and significantly improve system performance.
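
A plausible reading of constrained noise exploration is: perturb the actor's action with Gaussian noise, then project the result back into the feasible set. The action layout and projection rules below (power-ball scaling, per-step displacement clipping) are illustrative assumptions:

```python
import numpy as np

def constrained_noise_exploration(actor_action, noise_std, power_budget, max_step):
    """Sketch: noisy exploration followed by projection onto the constraints."""
    a = actor_action + np.random.normal(0.0, noise_std, size=actor_action.shape)
    # illustrative layout: first 4 dims = beamforming, rest = UAV displacements
    beam, move = a[:4], a[4:]
    # enforce the maximum transmit power: scale onto the power ball if needed
    p = np.linalg.norm(beam)
    if p > np.sqrt(power_budget):
        beam = beam * np.sqrt(power_budget) / p
    # enforce a maximum per-step displacement for each UAV coordinate
    move = np.clip(move, -max_step, max_step)
    return np.concatenate([beam, move])
```

This keeps exploratory actions feasible by construction, so the agent never wastes samples on actions the system could not execute.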

Key design: A reward function is judiciously designed to fulfill the various constraints, deep neural networks are used for training, and a symmetric experience augmentation mechanism enriches the available experience sets to accelerate convergence.
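
Since the abstract describes symmetric experience augmentation as permuting the indexes of all variables, one transition over interchangeable UAVs can be expanded into several equivalent ones. The per-UAV layout below is an illustrative assumption:

```python
import itertools
import numpy as np

def symmetric_augment(state, action, next_state, num_uavs, dim=3):
    """Sketch: permute UAV indexes consistently across state, action, and
    next state to generate num_uavs! equally valid experiences from one."""
    s = state.reshape(num_uavs, dim)
    a = action.reshape(num_uavs, dim)
    ns = next_state.reshape(num_uavs, dim)
    augmented = []
    for perm in itertools.permutations(range(num_uavs)):
        p = list(perm)
        augmented.append((s[p].ravel(), a[p].ravel(), ns[p].ravel()))
    return augmented
```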

📊 Experiment Highlights

Experiments show that DeepLSC achieves a notably higher communication sum-rate than the baseline methods, is more robust across different settings, and converges markedly faster, demonstrating the method's effectiveness and practicality.

🎯 Application Scenarios

Potential application areas include UAV surveillance, low-altitude traffic management, and smart-city infrastructure. By jointly optimizing the integrated communication and sensing functions, the approach can improve the operational efficiency of UAVs in complex environments, giving it substantial practical value.

📄 Abstract (Original)

This paper studies an integrated sensing and communications (ISAC) system for low-altitude economy (LAE), where a ground base station (GBS) provides communication and navigation services for authorized unmanned aerial vehicles (UAVs), while sensing the low-altitude airspace to monitor the unauthorized mobile target. The expected communication sum-rate over a given flight period is maximized by jointly optimizing the beamforming at the GBS and UAVs' trajectories, subject to the constraints on the average signal-to-noise ratio requirement for sensing, the flight mission and collision avoidance of UAVs, as well as the maximum transmit power at the GBS. Typically, this is a sequential decision-making problem with the given flight mission. Thus, we transform it to a specific Markov decision process (MDP) model called episode task. Based on this modeling, we propose a novel LAE-oriented ISAC scheme, referred to as Deep LAE-ISAC (DeepLSC), by leveraging the deep reinforcement learning (DRL) technique. In DeepLSC, a reward function and a new action selection policy termed constrained noise-exploration policy are judiciously designed to fulfill various constraints. To enable efficient learning in episode tasks, we develop a hierarchical experience replay mechanism, where the gist is to employ all experiences generated within each episode to jointly train the neural network. Besides, to enhance the convergence speed of DeepLSC, a symmetric experience augmentation mechanism, which simultaneously permutes the indexes of all variables to enrich available experience sets, is proposed. Simulation results demonstrate that compared with benchmarks, DeepLSC yields a higher sum-rate while meeting the preset constraints, achieves faster convergence, and is more robust against different settings.