Keywords: Safety, Recourse, Interventions, Auditing
TL;DR: We present responsiveness verification, a procedure for testing whether a model's predictions can change under feasible interventions on features, and demonstrate how it can be used to make models more reliable.
Abstract: Many safety failures in machine learning arise when models are used to assign
predictions to people – often in settings like lending, hiring, or content moderation –
without accounting for how individuals can change their inputs under realistic
constraints and with imperfect data. In this work, we introduce a formal validation
procedure for the responsiveness of predictions with respect to interventions on
their features. Our procedure frames responsiveness as a type of sensitivity analysis
in which practitioners control a set of changes by specifying constraints over
interventions and distributions over downstream effects, allowing uncertainty from
biased, truncated, or missing data to be made explicit. We describe how to estimate
responsiveness for the predictions of any model on any dataset using only black-box
access, and how to use these estimates to support tasks such as falsification and
failure probability estimation. We develop algorithms that construct these estimates
by generating a uniform sample of reachable points, and demonstrate how they
can promote safety in real-world applications such as recidivism prediction, organ
transplant prioritization, and content moderation.
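
To make the black-box estimation step concrete, the Python sketch below illustrates the general recipe the abstract describes: draw a uniform sample of points reachable from an input under interventional constraints, then report the fraction whose prediction differs from the original. This is a minimal sketch under assumed names and constraints (estimate_responsiveness, sample_reachable, and the toy income model are all hypothetical), not the paper's actual algorithm.

    import numpy as np

    def estimate_responsiveness(predict, x, sample_reachable, n_samples=1000, seed=0):
        """Monte Carlo estimate of how often reachable points flip predict(x).

        predict          -- black-box model: feature vector -> label
        x                -- the point whose prediction is being audited
        sample_reachable -- draws one point uniformly from the reachable set
                            of x, folding in stochastic downstream effects
        """
        rng = np.random.default_rng(seed)
        y0 = predict(x)
        flips = sum(predict(sample_reachable(x, rng)) != y0 for _ in range(n_samples))
        return flips / n_samples  # share of reachable points with a changed prediction

    # Toy usage: approve (1) iff income >= 60; income can rise by at most 20.
    predict = lambda x: int(x[0] >= 60)

    def sample_reachable(x, rng):
        z = x.copy()
        z[0] = x[0] + rng.uniform(0.0, 20.0)  # actionability constraint on income
        return z

    x = np.array([50.0, 1.0])
    print(estimate_responsiveness(predict, x, sample_reachable))  # approx. 0.5
    # An estimate of 0.0 supports falsification: no sampled intervention
    # changes the prediction, flagging a point with no apparent recourse.

In the paper's setting, the reachable set would instead be induced by practitioner-specified constraints over interventions and distributions over downstream effects; the sketch only shows the shape of the estimator.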
Submission Number: 132