Power Smoothing Control for Wind-Storage Integrated Systems With Hierarchical Safe Reinforcement Learning and Curriculum Learning
Abstract: As the penetration of wind energy increases, the Wind Storage Integrated System (WSIS) has become a critical solution for ensuring stable wind power output and maximizing economic benefits. However, existing data-driven power smoothing control strategies struggle to satisfy the power fluctuation constraint in complex environments, resulting in inefficient coordination performance. To address this problem, this paper proposes a novel Hierarchical Safe Deep Reinforcement Learning (HSDRL) control framework for the WSIS. The control problem is first reformulated as two interconnected Constrained Markov Decision Processes, and a hierarchical primal-dual-based safe Deep Deterministic Policy Gradient algorithm is proposed to learn an optimal policy that satisfies the power output constraint. Furthermore, a curriculum learning scheme is designed, and a Constraint Violation Prioritized Experience Replay method is proposed to address the unstable convergence caused by the imbalance between constraint-violating and constraint-satisfying experience data. Finally, a hierarchical shared-feature neural network structure is designed to share the parameters of the Q networks across hierarchies and improve learning efficiency. Simulation results in WindFarmSimulator validate the efficacy of the proposed control framework, demonstrating a 15.3% improvement in profit and a 46.0% reduction in fluctuation compared to existing methods.
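The primal-dual approach mentioned in the abstract can be illustrated with a minimal sketch: the policy parameters ascend the Lagrangian of reward minus a multiplier-weighted constraint cost, while the dual variable grows whenever the constraint is violated. This is a generic Constrained MDP update, not the paper's exact algorithm; the function signature, learning rates, and scalar-parameter setting are all assumptions for illustration.

```python
def primal_dual_step(theta, lam, reward_grad, cost_grad, cost_value,
                     cost_limit, lr_theta=0.01, lr_lam=0.05):
    """One generic primal-dual step for a Constrained MDP (illustrative only).

    Lagrangian: L(theta, lam) = J_r(theta) - lam * (J_c(theta) - d),
    where J_r is the return, J_c the constraint cost, and d the limit.
    """
    # Primal update: ascend reward gradient, penalized by lam times cost gradient.
    theta = theta + lr_theta * (reward_grad - lam * cost_grad)
    # Dual update: lam rises when the constraint J_c > d is violated and
    # decays toward zero when it is satisfied; projected to stay nonnegative.
    lam = max(0.0, lam + lr_lam * (cost_value - cost_limit))
    return theta, lam
```

In the full algorithm these scalars would be replaced by neural-network parameters and stochastic gradient estimates from replayed transitions; the projection `max(0, ...)` keeps the multiplier feasible.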
External IDs: dblp:journals/tsg/WangZSPLZ25