Abstract: Model-Based Reinforcement Learning (MBRL) has been widely studied for Heating, Ventilation, and Air Conditioning (HVAC) control in buildings. One of the fundamental problems is the large amount of data required to train a neural network for building dynamics modeling. In this paper, we develop CLUE, a safe MBRL HVAC control approach that achieves low human comfort violation with a dynamics model trained on a small dataset. We use a Gaussian Process (GP) as the building dynamics model, which provides an uncertainty estimate for each of its outputs; these uncertainty estimates are then integrated into a safe HVAC control algorithm. Although GPs have been studied for HVAC control, this work provides a data-efficient GP modeling method. We design a novel meta kernel learning technique that incorporates domain knowledge from the historical data of multiple buildings to set the GP kernel hyperparameters, significantly reducing the amount of data required for hyperparameter setting. Furthermore, we incorporate the GP-based uncertainty into a Model Predictive Path Integral (MPPI) process to find a safe, energy-efficient action for each control cycle. We generate a large number of action trajectories with the GP building dynamics model and select the optimal trajectory using a novel MPPI objective function that accounts for the uncertainty of every action in every trajectory; the first action of the optimal trajectory is then executed. Extensive experiments in a simulated five-zone building show that CLUE needs only seven days of training data to achieve energy savings comparable to the state-of-the-art MBRL method, with fewer comfort violations. Our code and dataset are available at https://github.com/ryeii/CLUE.
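To make the pipeline described above concrete, the minimal sketch below shows one way a GP dynamics model with fixed (e.g., meta-learned) kernel hyperparameters could feed an uncertainty-penalized MPPI control step. It is not CLUE's implementation: it uses scikit-learn's GaussianProcessRegressor, a toy one-dimensional dynamics target, and illustrative names and constants (mppi_step, the comfort band, cost weights, and the placeholder length scales are all assumptions), and the soft trajectory weighting is generic MPPI rather than the paper's exact objective.

```python
# Illustrative sketch only: a GP dynamics model with fixed kernel hyperparameters
# driving an uncertainty-penalized MPPI step. Names and constants are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# GP dynamics model: predicts next zone temperature from (zone temp, action).
# Kernel hyperparameters are fixed up front (standing in for meta-learned values)
# instead of being re-optimized on the small on-site dataset.
meta_length_scales = np.ones(2)  # placeholder for meta-learned length scales
kernel = ConstantKernel(1.0, constant_value_bounds="fixed") * RBF(
    length_scale=meta_length_scales, length_scale_bounds="fixed")
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-4, optimizer=None)

# Toy "seven days" of training data: next temp depends on current temp and action.
rng = np.random.default_rng(0)
X_train = rng.uniform([15.0, -1.0], [30.0, 1.0], size=(50, 2))
y_train = X_train[:, 0] + 0.5 * X_train[:, 1]
gp.fit(X_train, y_train)

def mppi_step(gp, state, horizon=12, num_samples=256,
              lam=1.0, beta=5.0, comfort=(20.0, 24.0)):
    """Return the first action of an MPPI-weighted action sequence.

    The per-step cost combines a proxy energy term, a comfort-violation term,
    and a penalty on the GP's predictive uncertainty (weighted by beta).
    """
    samples = rng.uniform(-1.0, 1.0, size=(num_samples, horizon, 1))
    costs = np.zeros(num_samples)
    for k in range(num_samples):
        temp = state.copy()
        for t in range(horizon):
            x = np.concatenate([temp, samples[k, t]])[None, :]
            mean, std = gp.predict(x, return_std=True)  # GP mean + uncertainty
            temp = mean.reshape(-1)
            energy = np.abs(samples[k, t]).sum()        # proxy energy cost
            violation = max(0.0, comfort[0] - temp[0]) + max(0.0, temp[0] - comfort[1])
            costs[k] += energy + 10.0 * violation + beta * std.sum()
    # Soft MPPI weighting over sampled trajectories, then execute the first action.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    best_seq = np.tensordot(weights, samples, axes=1)   # weighted action sequence
    return best_seq[0]

action = mppi_step(gp, np.array([26.0]))  # one control cycle from 26 degrees C
```

In this sketch, setting optimizer=None and "fixed" kernel bounds is what keeps the hyperparameters at their externally supplied values, which mirrors the role the abstract attributes to meta kernel learning without claiming to reproduce it.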