Evaluating black-box vulnerabilities with Wasserstein-constrained data perturbations

Published: 02 Mar 2026, Last Modified: 02 Mar 2026
Venue: AFAA 2026 Poster
License: CC BY 4.0
Track: Main Papers Track (6 to 9 pages)
Keywords: Wasserstein projection, explainability, distributional robustness, sensitivity, stress testing
Abstract: The growing use of Machine Learning (ML) tools comes with critical challenges, such as limited model explainability. We propose a global explainability framework that leverages Optimal Transport and Distributionally Robust Optimization to analyze how ML algorithms respond to constrained data perturbations. Our approach enforces constraints on feature-level statistics (e.g., brightness, age distribution), generating realistic perturbations that preserve semantic structure. We provide a model-agnostic diagnostic benchmark that applies to both tabular and image domains and comes with solid theoretical guarantees. We validate the approach on real-world datasets, providing interpretable robustness diagnostics that complement standard evaluation and fairness-auditing tools.
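The abstract describes perturbing data within a Wasserstein constraint but gives no implementation details. As a minimal illustration of the general idea (not the authors' method), the sketch below projects a perturbed 1-D feature sample back onto a Wasserstein-1 ball of radius `eps` around the original sample; the function names and the sorted-interpolation projection are assumptions for illustration only.

```python
import numpy as np

def w1_distance(x, y):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equally weighted empirical measures, W1 is the mean absolute
    difference between the sorted samples (the monotone coupling is optimal).
    """
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

def project_to_w1_ball(original, perturbed, eps):
    """Shrink `perturbed` toward `original` so that W1(original, result) <= eps.

    Works at the distribution level: both samples are sorted, so the
    pairing between individual rows is not preserved. A convex combination
    of two sorted arrays is itself sorted, which makes the resulting W1
    distance exactly t * W1(original, perturbed).
    """
    x = np.sort(np.asarray(original, dtype=float))
    y = np.sort(np.asarray(perturbed, dtype=float))
    d = float(np.mean(np.abs(x - y)))
    if d <= eps:
        return y  # already inside the ball
    t = eps / d  # interpolation factor placing the result on the ball's boundary
    return x + t * (y - x)
```

For example, a brightness shift of a whole feature column can be capped at a W1 budget of 0.1 by calling `project_to_w1_ball(column, column + shift, 0.1)` before feeding the perturbed data to the model under test.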
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 12