Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming

Philip Sosnin, Jodie Knapp, Fraser Kennedy, Josh Collyer, Calvin Tsay

Published: 2026, Last Modified: 15 Apr 2026CoRR 2026EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: This work introduces a verification framework that provides both sound and complete guarantees for data poisoning attacks during neural network training. We formulate adversarial data manipulation, model training, and test-time evaluation in a single mixed-integer quadratic programming (MIQCP) problem. Finding the global optimum of the proposed formulation provably yields worst-case poisoning attacks, while simultaneously bounding the effectiveness of all possible attacks on the given training pipeline. Our framework encodes both the gradient-based training dynamics and model evaluation at test time, enabling the first exact certification of training-time robustness. Experimental evaluation on small models confirms that our approach delivers a complete characterization of robustness against data poisoning.
Loading