Keywords: Safety, Recourse, Interventions, Auditing
TL;DR: We demonstrate how to use statistical inference to audit the responsiveness of a machine learning model.
Abstract: Many safety failures in machine learning arise when models assign predictions
to people – e.g., in lending, hiring, or content moderation – without accounting
for how individuals can change their inputs. We introduce a formal validation
procedure for the responsiveness of predictions with respect to interventions on
their features. Our procedure frames responsiveness as a type of sensitivity analysis
in which practitioners control a set of changes by specifying constraints over
interventions and distributions over downstream effects. We describe how to
estimate responsiveness for the predictions of any model on any dataset using only
black-box access, and design algorithms that use these estimates to support tasks
such as falsification and failure probability analysis. The resulting audits reveal
whether predictions lack (or exceed) the desired responsiveness and enable community
or regulatory oversight: when the gap is negligible, off-the-shelf models suffice;
when it is material, the findings motivate redesign (e.g., via strategic
classification) or policy changes. We demonstrate these safety benefits and
illustrate how stakeholders can collectively help steer AI systems.
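
As a rough illustration of the black-box estimation described in the abstract, the sketch below estimates responsiveness for a single individual by Monte Carlo sampling: interventions are drawn from a practitioner-specified distribution over allowed changes, and the fraction of sampled interventions that flip the model's prediction is reported. The names `predict_fn` and `sample_intervention` are hypothetical placeholders standing in for the practitioner's model and action set, not functions from the paper.

```python
import numpy as np

def estimate_responsiveness(predict_fn, x, sample_intervention, n_samples=1000, seed=0):
    """Monte Carlo estimate of how often a point's prediction changes under
    interventions drawn from a practitioner-specified distribution.

    predict_fn: black-box model, maps a 2D feature array to predicted labels
    x: 1D feature vector for the individual being audited
    sample_intervention: draws one constrained change to x (returns a new 1D vector)
    """
    rng = np.random.default_rng(seed)
    baseline = predict_fn(x.reshape(1, -1))[0]
    flips = 0
    for _ in range(n_samples):
        x_changed = sample_intervention(x, rng)  # respects the feasibility constraints
        flips += int(predict_fn(x_changed.reshape(1, -1))[0] != baseline)
    return flips / n_samples  # estimated probability that the prediction responds


# Hypothetical usage: audit one applicant under an "income may rise by up to 10k" action set,
# where `model`, `X`, and the income feature index (2) are placeholders.
# score = estimate_responsiveness(
#     model.predict, X[0],
#     lambda x, rng: x + rng.uniform(0, 10_000) * np.eye(len(x))[2],
# )
```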
Submission Number: 28