Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings
TL;DR: The suitability filter detects performance degradation in machine learning models by comparing estimated accuracy on unlabeled user data with accuracy measured on labeled test data.
Abstract: Deploying machine learning models in safety-critical domains poses a key challenge: ensuring reliable model performance on downstream user data without access to ground truth labels for direct validation. We propose the _suitability filter_, a novel framework designed to detect performance deterioration by utilizing _suitability signals_—model output features that are sensitive to covariate shifts and indicative of potential prediction errors. The suitability filter evaluates whether classifier accuracy on unlabeled user data shows significant degradation compared to the accuracy measured on the labeled test dataset. Specifically, it ensures that this degradation does not exceed a pre-specified margin, which represents the maximum acceptable drop in accuracy. To achieve reliable performance evaluation, we aggregate suitability signals for both test and user data and compare these empirical distributions using statistical hypothesis testing, thus providing insights into decision uncertainty. Our modular method adapts to various models and domains. Empirical evaluations across different classification tasks demonstrate that the suitability filter reliably detects performance deviations due to covariate shift. This enables proactive mitigation of potential failures in high-stakes applications.
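The abstract describes comparing aggregated suitability signals on test and user data via a hypothesis test with a pre-specified accuracy margin. The following is a minimal sketch of that idea, not the paper's implementation: it assumes the suitability signal is the model's maximum softmax confidence, treated as a per-sample proxy for prediction correctness, and the names `signals_test`, `signals_user`, `margin`, and `suitability_check` are illustrative placeholders.

```python
# Minimal sketch of a margin-based (non-inferiority-style) comparison of
# suitability signals on labeled test data vs. unlabeled user data.
# Assumption: each signal value approximates the probability that the
# corresponding prediction is correct (e.g., max softmax confidence).
import numpy as np
from scipy import stats


def suitability_check(signals_test: np.ndarray,
                      signals_user: np.ndarray,
                      margin: float = 0.05,
                      alpha: float = 0.05) -> bool:
    """Return True if user-data accuracy is not significantly worse than
    test accuracy by more than `margin`.

    H0: mean(signals_user) <= mean(signals_test) - margin   (unacceptable drop)
    H1: mean(signals_user) >  mean(signals_test) - margin   (drop within margin)
    """
    # Shift the user signals by the margin, then run a one-sided Welch t-test
    # for H1: mean(signals_user + margin) > mean(signals_test).
    shifted_user = signals_user + margin
    result = stats.ttest_ind(shifted_user, signals_test,
                             equal_var=False, alternative="greater")
    return result.pvalue < alpha


# Example usage with synthetic confidence scores (hypothetical data).
rng = np.random.default_rng(0)
signals_test = rng.beta(8, 2, size=2000)  # proxy accuracy around 0.8 on test data
signals_user = rng.beta(7, 3, size=2000)  # proxy accuracy around 0.7 on user data
print("model deemed suitable:", suitability_check(signals_test, signals_user))
```

In this sketch, rejecting the null hypothesis means the estimated accuracy drop on user data stays within the acceptable margin; failing to reject flags potential performance deterioration for further inspection.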
Lay Summary: Machine learning models learn from data to make decisions, but it can be tricky to ensure they remain dependable when they encounter new, real-world situations. This research introduces a new way to check if these models are starting to make more mistakes with new data, particularly when we can't easily verify whether their decisions are correct. The method works by examining subtle clues in how the model behaves with both familiar and new data to detect if its decision-making quality has declined. Experiments showed this approach can successfully flag when a model is struggling because the new information is different from what it was prepared for. This helps build confidence that these machine learning models are working correctly and can be trusted, especially in important everyday applications.
Link To Code: https://github.com/cleverhans-lab/suitability
Primary Area: Social Aspects->Everything Else
Keywords: suitability, reliability, robustness, classifier, unlabeled data
Submission Number: 1934