Batch-Constrained Reinforcement Learning for Dynamic Distribution Network Reconfiguration

Paper Title

Batch-Constrained Reinforcement Learning for Dynamic Distribution Network Reconfiguration

Authors

Gao, Yuanqi; Wang, Wei; Shi, Jie; Yu, Nanpeng

Abstract

Dynamic distribution network reconfiguration (DNR) algorithms perform hourly status changes of remotely controllable switches to improve distribution system performance. The problem is typically solved by physical model-based control algorithms, which not only rely on accurate network parameters but also lack scalability. To address these limitations, this paper develops a data-driven batch-constrained reinforcement learning (RL) algorithm for the dynamic DNR problem. The proposed RL algorithm learns the network reconfiguration control policy from a finite historical operational dataset without interacting with the distribution network. The numerical study results on three distribution networks show that the proposed algorithm not only outperforms state-of-the-art RL algorithms but also improves the behavior control policy, which generated the historical operational data. The proposed algorithm is also very scalable and can find a desirable network reconfiguration solution in real-time.
