Evaluating black-box vulnerabilities with Wasserstein-constrained data perturbations

Published: 02 Mar 2026, Last Modified: 02 Mar 2026
Venue: AFAA 2026 Poster
License: CC BY 4.0
Track: Main Papers Track (6 to 9 pages)
Keywords: Wasserstein projection, explainability, distributional robustness, sensitivity, stress testing
Abstract: The growing use of Machine Learning (ML) tools comes with critical challenges, such as limited model explainability. We propose a global explainability framework that leverages Optimal Transport and Distributionally Robust Optimization to analyze how ML algorithms respond to constrained data perturbations. Our approach enforces constraints on feature-level statistics (e.g., brightness, age distribution), generating realistic perturbations that preserve semantic structure. We provide a model-agnostic diagnostic benchmark that applies to both tabular and image domains and comes with solid theoretical guarantees. We validate the approach on real-world datasets, providing interpretable robustness diagnostics that complement standard evaluation and fairness-auditing tools.
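The abstract describes perturbing data within a Wasserstein constraint but gives no implementation details. As a minimal illustration of the general idea (not the authors' method), the sketch below projects a perturbed 1-D feature sample back onto a Wasserstein-1 ball of radius `eps` around the original sample; the function names and the sorted-interpolation projection are assumptions for illustration only.

```python
import numpy as np

def w1_distance(x, y):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equally weighted empirical measures, W1 is the mean absolute
    difference between the sorted samples (the monotone coupling is optimal).
    """
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

def project_to_w1_ball(original, perturbed, eps):
    """Shrink `perturbed` toward `original` so that W1(original, result) <= eps.

    Works at the distribution level: both samples are sorted, so the
    pairing between individual rows is not preserved. A convex combination
    of two sorted arrays is itself sorted, which makes the resulting W1
    distance exactly t * W1(original, perturbed).
    """
    x = np.sort(np.asarray(original, dtype=float))
    y = np.sort(np.asarray(perturbed, dtype=float))
    d = float(np.mean(np.abs(x - y)))
    if d <= eps:
        return y  # already inside the ball
    t = eps / d  # interpolation factor placing the result on the ball's boundary
    return x + t * (y - x)
```

For example, a brightness shift of a whole feature column can be capped at a W1 budget of 0.1 by calling `project_to_w1_ball(column, column + shift, 0.1)` before feeding the perturbed data to the model under test.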
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 12