Abstract: Validating and debugging machine learning models relies on testing them on unseen data. During this process, analyzing model performance on various subsets of the test dataset is critical for fairness, trust, bias detection, and explainability. We describe a new approach to this task. Our solution, InfoMoD, applies recent work in information-theoretic data summarization to model diagnostics. To improve performance, we implemented InfoMoD in a distributed fashion using Apache Spark. Across four use cases spanning finance, computer vision, and hate speech detection, we show that InfoMoD concisely describes how a model performs on different subsets of the data and produces expected performance indicators for individual test instances.
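As context for the subset-level diagnostics the abstract refers to, the sketch below shows the general idea of slicing a test set by an attribute and reporting per-subset performance. It is not InfoMoD's algorithm or its Spark implementation; the column names and grouping attribute are hypothetical, and the metric (plain accuracy) is only an illustrative stand-in.

```python
# Minimal sketch of subset-level model diagnostics (not InfoMoD itself).
# All column names (e.g. "age_group", "label", "prediction") are hypothetical.
import pandas as pd


def subset_performance(df: pd.DataFrame, group_cols,
                       label_col: str = "label",
                       pred_col: str = "prediction") -> pd.DataFrame:
    """Report accuracy and support for each subset defined by group_cols."""
    df = df.assign(correct=(df[label_col] == df[pred_col]).astype(int))
    return (
        df.groupby(group_cols)["correct"]
          .agg(accuracy="mean", support="size")
          .reset_index()
    )


if __name__ == "__main__":
    # Toy test set with model predictions attached.
    data = pd.DataFrame({
        "age_group":  ["<30", "<30", "30-60", "30-60", ">60", ">60"],
        "label":      [1, 0, 1, 1, 0, 1],
        "prediction": [1, 1, 1, 0, 0, 1],
    })
    print(subset_performance(data, group_cols=["age_group"]))
```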