Abstract: Explainable AI (XAI) offers a growing number of algorithms that aim to answer specific questions about black-box models. What is missing is a principled way to consolidate explanatory information about a fixed black-box model into a persistent, auditable artefact that accompanies the black-box throughout its life cycle. In this conceptual work we address this gap by introducing the notion of a scientific theory of a black box (SToBB). Grounded in Constructive Empiricism, a SToBB fulfils three obligations:
(i) empirical adequacy with respect to all available observations of black-box behaviour, (ii) adaptability via explicit update commitments that restore adequacy when new observations arrive, and (iii) auditability through transparent documentation of assumptions, construction choices, and update behaviour.
We operationalise these obligations as a general framework that specifies an extensible observation base, a traceable hypothesis class, algorithmic components for construction and revision, and documentation sufficient for third-party assessment. Explanations for concrete stakeholder needs are then obtained by querying the maintained record through interfaces, rather than by producing isolated method outputs.
To illustrate the framework, we develop a step-by-step example for a neural network on a tabular task.
Together, these contributions position SToBBs as a life-cycle-scale, inspectable point of reference that supports consistent, reusable analyses and systematic external scrutiny.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Simone_Scardapane1
Submission Number: 7237