Optimization Proxies using Limited Labeled Data and Training Time -- A Semi-Supervised Bayesian Neural Network Approach
TL;DR: Train optimization proxies more efficiently by augmenting unlabeled datasets with feasibility labels, enabled by a sandwich-style training approach that reduces the need for extensive labeled data.
Abstract: Constrained optimization problems arise in various engineering systems such as inventory management and power grids. Standard deep neural network (DNN) based machine learning proxies are ineffective in practical settings where labeled data is scarce and training times are limited. We propose a semi-supervised Bayesian Neural Network (BNN)-based optimization proxy for this complex regime, wherein training proceeds in a sandwiched fashion, alternating between a supervised learning step for minimizing cost and an unsupervised learning step for enforcing constraint feasibility. We show that the proposed semi-supervised BNN outperforms DNN architectures on important non-convex constrained optimization problems from energy network operations, achieving up to a tenfold reduction in expected maximum equality gap and halving the inequality gaps. Further, the BNN's ability to provide posterior samples is leveraged to construct practically meaningful probabilistic confidence bounds on performance using limited validation data, unlike prior methods.
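A minimal sketch of the sandwich-style alternation described above, assuming a generic stochastic-gradient setup; the names `cost_loss` (supervised, fit to solved instances) and `feasibility_loss` (unsupervised penalty on constraint violations) are illustrative assumptions, not the authors' API — see the linked repository for the actual implementation.

```python
# Hypothetical sketch of sandwich-style semi-supervised training.
import torch

def sandwich_train(model, labeled_loader, unlabeled_loader,
                   cost_loss, feasibility_loss, epochs=50, lr=1e-3):
    """Alternate a supervised cost step with an unsupervised feasibility step."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        # Supervised step: fit predictions to solved optimization instances.
        for x, y_opt in labeled_loader:
            opt.zero_grad()
            loss = cost_loss(model(x), y_opt)  # e.g., error vs. optimal solutions
            # For a variational BNN, the loss would also include a KL term
            # against the weight prior (omitted here for brevity).
            loss.backward()
            opt.step()
        # Unsupervised step: penalize constraint violations on unlabeled inputs,
        # pushing predictions toward feasibility without needing solved labels.
        for x in unlabeled_loader:
            opt.zero_grad()
            loss = feasibility_loss(model(x), x)  # equality/inequality penalties
            loss.backward()
            opt.step()
    return model
```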
Lay Summary: Constrained optimization problems arise frequently when we try to optimize certain features of a system (like cost of operation or power produced) while staying feasible, i.e., within valid operating parameters. Optimization algorithms can find the optimum of such problems, but they become prohibitively expensive as the system size increases. Hence it is necessary to turn to ML-based approaches that reuse past solutions to predict solutions to new instances of the problem almost instantly. In this work, we propose a semi-supervised approach to train Bayesian Neural Networks (BNNs) as ML surrogates for constrained optimization problems. BNNs, compared to vanilla neural nets, are known for their superior performance in the low-data regime. We train the BNN in an alternating fashion, first on supervised data (instances of solved optimization problems) and then on unsupervised data (un-optimized but feasible instances). This alternating pattern is repeated until convergence. We test our method on publicly available datasets for electrical power flow optimization and find that it works exceedingly well compared to other methods, especially in the low-data and time-constrained setting. BNNs, being inherently probabilistic, produce a distribution of predictions for every problem instance. We leverage this property to give confidence intervals for the optimal operating point predicted by the BNN. This allows practitioners to see not just solutions, but also how confident the model is about each solution, making our approach well suited for safety-critical applications like power flow optimization.
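A minimal sketch of how such confidence intervals could be computed from posterior samples, assuming a BNN whose forward pass draws fresh weights on each call (as in standard variational or Monte Carlo BNN layers); the function name and interface are hypothetical, not the authors' code.

```python
# Hypothetical sketch: empirical confidence bounds from BNN posterior samples.
import torch

def predictive_interval(model, x, n_samples=100, alpha=0.05):
    """Per-output empirical (1 - alpha) confidence interval from S posterior draws."""
    with torch.no_grad():
        # Each call to model(x) uses a fresh weight sample from the posterior.
        draws = torch.stack([model(x) for _ in range(n_samples)])  # shape [S, ...]
    lo = torch.quantile(draws, alpha / 2, dim=0)
    hi = torch.quantile(draws, 1 - alpha / 2, dim=0)
    return lo, hi  # interval width signals how confident the model is
```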
Link To Code: https://github.com/kaarthiksundar/BNN-OPF
Primary Area: Probabilistic Methods->Bayesian Models and Methods
Keywords: Bayesian Neural Network, Optimization Proxy, Semi-Supervised Learning
Submission Number: 9628