Paper Title
Batch-Constrained Reinforcement Learning for Dynamic Distribution Network Reconfiguration
Paper Authors
Abstract
Dynamic distribution network reconfiguration (DNR) algorithms perform hourly status changes of remotely controllable switches to improve distribution system performance. The problem is typically solved by physical model-based control algorithms, which not only rely on accurate network parameters but also lack scalability. To address these limitations, this paper develops a data-driven batch-constrained reinforcement learning (RL) algorithm for the dynamic DNR problem. The proposed RL algorithm learns the network reconfiguration control policy from a finite historical operational dataset without interacting with the distribution network. Numerical studies on three distribution networks show that the proposed algorithm not only outperforms state-of-the-art RL algorithms but also improves on the behavior control policy that generated the historical operational data. The proposed algorithm is also highly scalable and can find a desirable network reconfiguration solution in real time.
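To make the batch-constrained idea concrete, the following is a minimal tabular sketch, not the paper's algorithm. It assumes a generic batch-constrained Q-learning rule: the value estimate bootstraps only from state-action pairs that appear in the historical dataset, so the learner never relies on actions the behavior policy did not try. The state labels ("A", "B"), the action names ("keep", "switch"), the rewards, and the function name are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def batch_constrained_q_learning(batch, gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular batch-constrained Q-learning over a fixed list of transitions.

    batch: list of (state, action, reward, next_state, done) tuples.
    Returns the learned Q-table and a greedy policy restricted to
    actions that were observed in the batch.
    """
    # Record which actions the behavior policy actually took in each state.
    seen = defaultdict(set)
    for s, a, _, _, _ in batch:
        seen[s].add(a)

    q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s2, done in batch:
            allowed = seen.get(s2)  # .get avoids creating empty entries
            if done or not allowed:
                # Terminal, or next state never observed as a source state:
                # no bootstrapping (an assumption of this sketch).
                target = r
            else:
                # The batch constraint: maximize only over observed actions.
                target = r + gamma * max(q[(s2, a2)] for a2 in allowed)
            q[(s, a)] += alpha * (target - q[(s, a)])

    # Greedy policy, again limited to actions present in the data.
    policy = {s: max(acts, key=lambda a: q[(s, a)]) for s, acts in seen.items()}
    return dict(q), policy


# Hypothetical historical data: two network configurations, rewards are
# negative operating costs. The pair ("B", "switch") was never logged.
batch = [
    ("A", "keep", -2.0, "A", False),
    ("A", "switch", -1.0, "B", False),
    ("B", "keep", -0.5, "B", False),
]

q, policy = batch_constrained_q_learning(batch)
print(policy)
```

Because ("B", "switch") never occurs in the data, the sketch neither evaluates nor selects it, which mirrors the abstract's point that the policy is learned from a finite dataset without interacting with the network.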