Formalizing Audits of ML Models as a Sequential Decision-Making Problem

ICLR 2026 Conference Submission 15010 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: XAI, ML model audits, application audits, model explanations
TL;DR: We formalize ML model application audits as a sequential decision-making problem and propose a novel solution that automates the audit process, thereby improving the scalability of application audits.
Abstract: Auditing is a governance mechanism for evaluating ML models to identify and mitigate potential risks. This process is critical, as undetected issues in models, such as incorrect predictions or inappropriate feature use, can lead to adverse consequences. In this work, we focus on application audits, which aim to detect errors in domain-specific ML applications and assess the risks that models pose in order to guide mitigation. Currently, application audits are predominantly manual: domain experts identify model errors by inspecting predictions and their explanations, which limits the scalability of audits. To complement human auditors, we explore algorithmic approaches to application auditing and formalize the auditing task as a sequential decision-making problem. We propose SAFAAI, a novel conceptual framework inspired by principles of situational awareness, to formally define the objectives of the application audit problem. Building on this foundation, we introduce RLAuditor, a reinforcement learning method for automating application audits of ML models. We validate our approach on multiple ML models and datasets, both with and without human auditors, demonstrating its effectiveness in facilitating audits across different contexts. To our knowledge, this work is the first to formalize application audits of ML models as a sequential decision-making problem, informing the design of future automated and human-AI collaborative auditing approaches.
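The abstract does not reproduce the paper's formulation, but its framing suggests a standard sequential decision-making view of auditing. Below is a minimal, hypothetical Python sketch of that view, assuming the state tracks which predictions have been inspected so far, actions select the next instance to inspect under a fixed budget, and rewards mark discovered model errors. The class and parameter names (AuditEnv, error_rate, budget) are illustrative assumptions, not the paper's actual API or reward design.

```python
# Hypothetical sketch: application auditing as an episodic sequential
# decision-making problem. Names and reward design are illustrative
# assumptions, not the formulation from the submission.
import random
from dataclasses import dataclass, field


@dataclass
class AuditEnv:
    """Each step, the auditor picks an uninspected prediction;
    the reward is +1 if inspection reveals a model error."""
    n_instances: int = 100    # size of the audit pool
    error_rate: float = 0.1   # assumed fraction of erroneous predictions
    budget: int = 20          # inspection budget per audit episode
    _errors: set = field(default_factory=set, init=False)
    _inspected: set = field(default_factory=set, init=False)
    _steps: int = field(default=0, init=False)

    def reset(self):
        # Randomly plant errors for this episode (stand-in for a real model).
        self._errors = {i for i in range(self.n_instances)
                        if random.random() < self.error_rate}
        self._inspected = set()
        self._steps = 0
        return frozenset(self._inspected)  # state: inspected instances

    def step(self, action: int):
        assert action not in self._inspected, "instance already inspected"
        self._inspected.add(action)
        self._steps += 1
        reward = 1.0 if action in self._errors else 0.0  # error discovered
        done = self._steps >= self.budget
        return frozenset(self._inspected), reward, done


# Usage: a random-inspection baseline auditor; an RL policy (as in
# RLAuditor, per the abstract) would replace the random choice below.
env = AuditEnv()
state = env.reset()
total, done = 0.0, False
while not done:
    remaining = [i for i in range(env.n_instances) if i not in state]
    state, r, done = env.step(random.choice(remaining))
    total += r
print(f"errors found within budget: {total:.0f}")
```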
Primary Area: interpretability and explainable AI
Submission Number: 15010