Reinforcement Learning for Volt-Var Control: A Novel Two-stage Progressive Training Strategy [article]

Si Zhang, Mingzhi Zhang, Rongxing Hu, David Lubkeman, Yunan Liu, Ning Lu
2021 arXiv pre-print
This paper develops a reinforcement learning (RL) approach to solve a cooperative, multi-agent Volt-Var Control (VVC) problem for distribution systems with high solar penetration. The ingenuity of our RL method lies in a novel two-stage progressive training strategy that effectively improves the training speed and convergence of the machine learning algorithm. In Stage 1 (individual training), while holding all other agents inactive, we train each agent separately to obtain its own optimal VVC ... ns in the action space: consume, generate, do-nothing. In Stage 2 (cooperative training), all agents are trained again in coordination to share VVC responsibility. Rewards and costs in our RL scheme include (i) a system-level reward (for taking an action), (ii) an agent-level reward (for doing nothing), and (iii) an agent-level action cost function. This new framework allows rewards to be dynamically allocated to each agent based on its contribution while accounting for the trade-off between control effectiveness and action cost. The proposed methodology is tested and validated on a modified IEEE 123-bus system using realistic PV and load profiles. Simulation results confirm that the proposed approach is robust and computationally efficient, and that it achieves desirable volt-var control performance under a wide range of operating conditions.
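The two-stage schedule and the three reward and cost terms described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `training_schedule` and `agent_reward`, their parameters, and the specific way the terms are combined (system-level reward minus action cost when acting, agent-level reward when idle) are assumptions made only for illustration. The abstract does not give the actual reward formulas or the contribution-based allocation rule.

```python
from typing import Iterator, List, Tuple

# Action space as named in the abstract.
ACTIONS = ("consume", "generate", "do-nothing")


def training_schedule(agent_ids: List[str]) -> Iterator[Tuple[int, List[str]]]:
    """Yield (stage, active_agents) pairs for a two-stage progressive schedule."""
    # Stage 1 (individual training): one agent is active, all others are held inactive.
    for agent in agent_ids:
        yield 1, [agent]
    # Stage 2 (cooperative training): all agents are trained again together.
    yield 2, list(agent_ids)


def agent_reward(action: str, system_reward: float,
                 idle_reward: float, action_cost: float) -> float:
    """Hypothetical combination of the three terms; the paper's formula may differ."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action == "do-nothing":
        # Assumed: an idle agent receives only the agent-level reward.
        return idle_reward
    # Assumed: an acting agent receives the system-level reward minus its action cost.
    return system_reward - action_cost


if __name__ == "__main__":
    for stage, active in training_schedule(["pv_1", "pv_2", "pv_3"]):
        print(f"stage {stage}: active agents = {active}")
```

In this sketch, the trade-off between control effectiveness and action cost appears as the subtraction in `agent_reward`; a real scheme would also need a rule for splitting the system-level reward among agents according to their contribution, which is left out here because the abstract does not specify it.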
arXiv:2111.11987v1 fatcat:rluyf4qmsjhulmqhayuoi43xau