Abstract: Highlights
• A novel approach for configuring the reward function of DRL-based models.
• Our approach lets us define desired values for various metrics (TPR, FPR, etc.).
• The DRL model automatically adapts its behavior to reach the desired values of these metrics.
• A process for "transferring" effective policies from one domain to another.
• Our proposed approach is highly robust against adaptive adversarial attacks.