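Joint Trajectory and Resource Optimization of MEC-Assisted UAVs in Sub-THz Networks: A Resources-based Multi-Agent Proximal Policy Optimization DRL with Attention Mechanism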

Paper Title

Joint Trajectory and Resource Optimization of MEC-Assisted UAVs in Sub-THz Networks: A Resources-based Multi-Agent Proximal Policy Optimization DRL with Attention Mechanism

Authors

Park, Yu Min; Hassan, Sheikh Salman; Tun, Yan Kyaw; Han, Zhu; Hong, Choong Seon

Abstract

THz band communication technology will be used in 6G networks to meet high-speed and high-capacity data service demands. However, THz communication suffers losses owing to limitations such as molecular absorption, rain attenuation, and coverage range. Furthermore, to maintain steady THz communications and overcome coverage distances in rural and suburban regions, the required number of BSs is very high. Consequently, a new communication platform that enables aerial communication services is required. In addition, an airborne platform supports LoS communications rather than NLoS communications, which helps overcome these losses. Therefore, in this work, we investigate the deployment and resource optimization of MEC-enabled UAVs, which can provide THz-based communications in remote regions. To this end, we formulate an optimization problem to minimize the sum of the energy consumption of both the MEC-UAVs and the MUs and the delay incurred by the MUs under the given task information. The formulated problem is a MINLP problem, which is NP-hard. To address it, we decompose the main problem into two subproblems. We solve the first subproblem with a standard optimization solver, i.e., CVXPY, due to its convex nature. To solve the second subproblem, we design an RMAPPO DRL algorithm with an attention mechanism. The considered attention mechanism is utilized for encoding a diverse number of observations. This is designed by the network coordinator to provide a differentiated fit reward to each agent in the network. The simulation results show that the proposed algorithm outperforms the benchmarks and yields a network utility that is $2.22\%$, $15.55\%$, and $17.77\%$ higher than the benchmarks.
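The abstract says the attention mechanism is used to encode a diverse number of observations. The sketch below illustrates one common way to do this, scaled dot-product attention pooling, which maps any number of observation vectors to a single fixed-size embedding. It is only an illustration under stated assumptions, not the authors' architecture: the function name `attention_pool`, the hand-picked `query` vector, and the sample observations are hypothetical, and the learned query/key/value projections that a trained network would use are omitted for brevity.

```python
import math


def attention_pool(observations, query):
    """Pool a variable-length list of d-dimensional observations into one
    d-dimensional embedding using scaled dot-product attention.

    Assumption: raw observations act as both keys and values; a trained
    model would normally apply learned projections first.
    """
    d = len(query)
    # Similarity of each observation to the query, scaled by sqrt(d).
    scores = [
        sum(q * o for q, o in zip(query, obs)) / math.sqrt(d)
        for obs in observations
    ]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the observations gives a fixed-size embedding.
    pooled = [
        sum(w * obs[i] for w, obs in zip(weights, observations))
        for i in range(d)
    ]
    return pooled, weights


# Hypothetical example: an agent sees 2 neighbours, then 5 neighbours.
query = [1.0, 0.0, 0.5]
few = [[0.2, 0.1, 0.4], [0.9, 0.3, 0.1]]
many = few + [[0.5, 0.5, 0.5], [0.0, 1.0, 0.2], [0.7, 0.2, 0.9]]

emb_few, w_few = attention_pool(few, query)
emb_many, w_many = attention_pool(many, query)
print(len(emb_few), len(emb_many))
```

Both embeddings have the same length as the query regardless of how many observations were supplied, which is the property that lets a fixed-size policy network consume inputs from a changing number of neighbours.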
