Paper Title
Edge Intelligence for Energy-efficient Computation Offloading and Resource Allocation in 5G Beyond
Paper Authors
Paper Abstract
5G Beyond is an end-edge-cloud orchestrated network that can exploit the heterogeneous capabilities of end devices, edge servers, and the cloud, and thus has the potential to enable computation-intensive and delay-sensitive applications via computation offloading. However, in multi-user wireless networks, diverse application requirements and the possibility of various radio access modes for communication among devices make it challenging to design an optimal computation offloading scheme. In addition, obtaining complete network information, including variables such as wireless channel states, available bandwidth, and computation resources, is a major challenge. Deep Reinforcement Learning (DRL) is an emerging technique for addressing such problems with only limited and imprecise network information. In this paper, we utilize DRL to design an optimal computation offloading and resource allocation strategy that minimizes system energy consumption. We first present a multi-user end-edge-cloud orchestrated network in which all devices and base stations have computation capabilities. Then, we formulate the joint computation offloading and resource allocation problem as a Markov Decision Process (MDP) and propose a new DRL algorithm to minimize system energy consumption. Numerical results based on a real-world dataset demonstrate that the proposed DRL-based algorithm significantly outperforms the benchmark policies in terms of system energy consumption. Extensive simulations show that the learning rate, discount factor, and number of devices have a considerable influence on the performance of the proposed algorithm.
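The abstract does not detail the paper's MDP design or network architecture, so the following is only a minimal illustrative sketch of how a joint offloading and resource-allocation problem of this kind could be cast as an MDP and trained with a DQN-style agent in PyTorch. The state features, the three-way action space (local / edge / cloud execution), the toy energy model in `step`, and all hyperparameters are assumptions made for illustration, not the paper's actual method.

```python
# Illustrative sketch only: the state, action, reward, and energy model below
# are assumptions; the abstract does not specify the paper's actual MDP design.
import random
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

N_ACTIONS = 3   # 0 = local execution, 1 = offload to edge, 2 = offload to cloud
STATE_DIM = 3   # assumed features: [channel gain, task size, edge load]

def step(state, action):
    """Toy energy model (assumption): returns (energy consumed, next state)."""
    gain, size, load = state
    if action == 0:                              # local: energy grows with task size
        energy = 0.8 * size
    elif action == 1:                            # edge: transmit energy + load penalty
        energy = 0.3 * size / max(gain, 0.1) + 0.2 * load
    else:                                        # cloud: cheap compute, costly uplink
        energy = 0.5 * size / max(gain, 0.1)
    next_state = np.random.rand(STATE_DIM).astype(np.float32)
    return energy, next_state

class QNet(nn.Module):
    """Small Q-network mapping a network-state vector to per-action values."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_ACTIONS))
    def forward(self, x):
        return self.net(x)

qnet = QNet()
optimizer = optim.Adam(qnet.parameters(), lr=1e-3)  # learning rate: a key knob per the abstract
gamma, eps = 0.9, 0.1                               # discount factor, epsilon-greedy exploration
replay = []

state = np.random.rand(STATE_DIM).astype(np.float32)
for t in range(2000):
    # Epsilon-greedy selection over offloading decisions.
    if random.random() < eps:
        action = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            action = int(qnet(torch.from_numpy(state)).argmax())
    energy, next_state = step(state, action)
    replay.append((state, action, -energy, next_state))  # reward = negative energy
    replay = replay[-5000:]
    state = next_state

    if len(replay) >= 64:
        # Standard one-step Q-learning update on a sampled minibatch.
        batch = random.sample(replay, 64)
        s, a, r, s2 = map(np.array, zip(*batch))
        s, s2 = torch.from_numpy(s), torch.from_numpy(s2)
        a, r = torch.from_numpy(a).long(), torch.from_numpy(r).float()
        q = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + gamma * qnet(s2).max(1).values
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Defining the reward as negative energy means that maximizing the discounted return corresponds to minimizing long-run system energy consumption, and the learning rate and discount factor `gamma` are exactly the hyperparameters the abstract identifies as having considerable influence on performance.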