Adversarial poisoning attacks on reinforcement learning-driven energy pricing

2022 (modified: 17 Apr 2023) · BuildSys@SenSys 2022
Abstract: Complex controls are increasingly common in power systems, and reinforcement learning (RL) has emerged as a strong candidate for implementing various controllers. One common use of RL in this context is prosumer pricing aggregation, where prosumers are buildings with both solar generation and energy storage. Specifically, supply and demand data serve as the observation space for many microgrid controllers, each acting on a policy passed from a central RL agent. Each controller outputs an action space consisting of hourly "buy" and "sell" prices for energy throughout the day; in turn, each prosumer can choose whether to transact with the RL agent or the utility. The RL agent, which learns online, is rewarded for its ability to generate a profit. We ask: what happens when some of the microgrid controllers are compromised by a malicious entity? We demonstrate a novel poisoning attack on RL and a simple defense against it. Our attack perturbs each compromised trajectory so as to reverse the direction of the estimated gradient. We show that if data from even a small fraction of microgrid controllers is adversarially perturbed, the learning of the RL agent can be significantly slowed. With larger perturbations, the RL aggregator can be manipulated into learning a catastrophic pricing policy that causes it to operate at a loss. Other outcomes worsen as well: prosumers face higher energy costs, use their batteries less, and suffer higher peak demand when the pricing aggregator is adversarially poisoned. We address this vulnerability with a "defense" module, i.e., a robustification of RL algorithms against this attack. Our defense identifies the trajectories with the largest influence on the gradient and removes them from the training data. It is computationally light and reasonable to include in any RL algorithm.
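The attack and defense described above can be illustrated with a minimal, hypothetical sketch. The snippet below is not the paper's implementation: it assumes a toy linear softmax policy trained with a REINFORCE-style gradient, models the attack as negating the return of trajectories from compromised controllers (one simple way to flip the sign of their gradient contribution, standing in for the paper's perturbation of supply and demand observations), and models the defense as a leave-one-out influence score that drops the trajectories whose removal most changes the aggregate gradient. All names (sample_trajectory, filtered_gradient, obs_dim, drop_frac, and so on) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, n_actions = 8, 5
theta = np.zeros((obs_dim, n_actions))   # toy linear softmax policy parameters

def policy_probs(obs, theta):
    """Softmax action probabilities for a single observation."""
    logits = obs @ theta
    z = np.exp(logits - logits.max())
    return z / z.sum()

def sample_trajectory(theta, horizon=24):
    """Toy rollout: random observations; reward 1 when action 0 is chosen."""
    traj = []
    for _ in range(horizon):
        obs = rng.normal(size=obs_dim)
        a = rng.choice(n_actions, p=policy_probs(obs, theta))
        traj.append((obs, a, 1.0 if a == 0 else 0.0))
    return traj

def trajectory_gradient(traj, theta):
    """REINFORCE gradient contribution of one trajectory (undiscounted return)."""
    ret = sum(r for _, _, r in traj)
    grad = np.zeros_like(theta)
    for obs, a, _ in traj:
        probs = policy_probs(obs, theta)
        grad_logp = -np.outer(obs, probs)   # d log pi(a|s) / d theta for a softmax policy
        grad_logp[:, a] += obs
        grad += ret * grad_logp
    return grad

# Attack (assumed form): a compromised controller reports data whose net effect
# is to flip the sign of its trajectory's gradient contribution; negating the
# return is the simplest stand-in for that perturbation.
def poison(traj):
    return [(obs, a, -r) for obs, a, r in traj]

# Defense (assumed form): score each trajectory by how much removing it changes
# the aggregate gradient, then drop the highest-influence trajectories.
def filtered_gradient(trajs, theta, drop_frac=0.1):
    grads = [trajectory_gradient(t, theta) for t in trajs]
    mean_grad = np.mean(grads, axis=0)
    influence = [np.linalg.norm(mean_grad - np.mean(grads[:i] + grads[i + 1:], axis=0))
                 for i in range(len(grads))]
    keep = np.argsort(influence)[: int(len(grads) * (1 - drop_frac))]
    return np.mean([grads[i] for i in keep], axis=0)

# Online training loop with a small fraction of poisoned trajectories.
lr, n_trajs, poisoned_frac = 0.05, 20, 0.1
for step in range(50):
    trajs = [sample_trajectory(theta) for _ in range(n_trajs)]
    n_bad = int(poisoned_frac * n_trajs)
    trajs = [poison(t) for t in trajs[:n_bad]] + trajs[n_bad:]
    theta += lr * filtered_gradient(trajs, theta)   # defense applied before each update
```

In this sketch, dropping the highest-influence trajectories before the parameter update plays the role of the paper's "defense" module; the actual influence measure, threshold, and environment used in the paper may differ.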