Abstract: Highlights
• A new hybrid reinforcement learning method for GBWH with extensive practical features.
• New separate neural networks embedded in the hybrid policy achieve higher utility.
• Sampling the policy from a Student's t-distribution outperforms other distributions.