TL;DR: We propose fast differentiable neural layers that turn unconstrained network predictions into outputs satisfying hard linear constraints by solving implicit convex optimization problems.
Abstract: A key limitation of neural networks is the difficulty of enforcing hard constraints on their predictions. We propose a plug-in, differentiable layer that enforces general linear constraints through a fast implicit convex optimization procedure, which minimizes a divergence between the unconstrained and constrained outputs. Connecting to and going beyond existing handcrafted layers, we show that our layer reduces to classic layers such as Softmax, Sinkhorn, and tanh when the corresponding constraint is enforced by KL-divergence minimization. We further show that replacing the KL divergence with the Euclidean distance yields a closed-form solution for highly efficient constraint enforcement. We evaluate these two variants, termed BLCLayer and GLCLayer, together with their corresponding neural solvers BLCNet and GLCNet, which use simple MLP/GNN-like backbones. Experiments cover linear programming as well as two real-world problems, partial graph matching and portfolio allocation, which involve other discrete constraints.
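The two projection views mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's BLCLayer or GLCLayer; it only shows the textbook special cases under stated assumptions: (1) Softmax as the minimizer of a KL-type objective over the probability simplex, and (2) the closed-form Euclidean projection onto equality constraints `A x = b`, assuming `A` has full row rank. The function names `kl_objective` and `project_affine` are hypothetical and chosen here for illustration.

```python
import numpy as np


def softmax(y):
    """Numerically stable softmax of a 1-D array."""
    z = np.exp(y - np.max(y))
    return z / z.sum()


def kl_objective(x, y):
    """KL-type objective sum_i x_i * (log x_i - y_i).

    On the simplex (x > 0, sum x = 1) this equals KL(x || exp(y))
    up to an additive constant, and it is minimized by softmax(y).
    """
    return float(np.sum(x * (np.log(x) - y)))


def project_affine(y, A, b):
    """Euclidean projection of y onto {x : A x = b}.

    Assumes A has full row rank (an assumption of this sketch), so
    A A^T is invertible and x = y - A^T (A A^T)^{-1} (A y - b).
    """
    residual = A @ y - b
    correction = A.T @ np.linalg.solve(A @ A.T, residual)
    return y - correction


if __name__ == "__main__":
    y = np.array([0.5, -1.0, 2.0])  # unconstrained network output

    # KL view: softmax(y) lies on the simplex.
    x_kl = softmax(y)
    print("softmax output:", x_kl, "sum =", x_kl.sum())

    # Euclidean view: enforce sum(x) = 1 and x_0 = x_1 exactly.
    A = np.array([[1.0, 1.0, 1.0],
                  [1.0, -1.0, 0.0]])
    b = np.array([1.0, 0.0])
    x_l2 = project_affine(y, A, b)
    print("projected output:", x_l2, "A x =", A @ x_l2)
```

Both functions are built from differentiable operations, so in an autodiff framework gradients could flow through them to the backbone. Inequality constraints, which the general setting may require, are not handled by this sketch.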
Lay Summary: Many AI systems are used to make predictions in settings where an answer is only useful if it obeys strict rules, such as budgets, capacities, or one-to-one matching requirements. Standard neural networks are powerful, but they can output answers that violate these rules, and fixing such violations after prediction can be slow or unreliable.
Our paper asks how to make rule-satisfaction part of the network itself. We let the network first make an unconstrained prediction, then pass it through a trainable layer that moves the prediction to a nearby valid answer. This view also explains familiar tools such as Softmax and Sinkhorn as special cases of the same idea. Based on it, we design two layers: one for selection-like nonnegative rules and one for more general linear rules.
We test them on graph matching, portfolio allocation, and linear programming. The results show that neural networks can produce outputs that are both accurate and valid by construction, which is important for decision-making problems where breaking the rules is not acceptable.
Primary Area: Optimization->Convex
Keywords: Linear Constrained Neural Layer, Convex Optimization
Submission Number: 11487