Detecting Systematic Weaknesses in Vision Models along Predefined Human-Understandable Dimensions

Published: 24 Jul 2025, Last Modified: 24 Jul 2025
Accepted by TMLR
License: CC BY 4.0
Abstract: Slice discovery methods (SDMs) are prominent algorithms for finding systematic weaknesses in DNNs. They identify the top-k semantically coherent slices (subsets) of data on which a DNN-under-test performs poorly. To be directly useful, slices should be aligned with human-understandable and relevant dimensions, which are defined, for example, by safety and domain experts as part of the operational design domain (ODD). While SDMs can be applied effectively to structured data, their application to image data is complicated by the lack of semantic metadata. To address this issue, we present an algorithm that combines foundation models for zero-shot image classification, which generate semantic metadata, with combinatorial search methods that find systematic weaknesses in images. In contrast to existing approaches, ours identifies weak slices that are in line with predefined human-understandable dimensions. Because the algorithm relies on foundation models, its intermediate and final results may not always be exact. We therefore include an approach to address the impact of noisy metadata. We validate our algorithm on both synthetic and real-world datasets, demonstrating its ability to recover human-understandable systematic weaknesses. Furthermore, using our approach, we identify systematic weaknesses of multiple pre-trained and publicly available state-of-the-art computer vision DNNs.
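The combinatorial search step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that per-image metadata has already been produced (for example by a zero-shot classifier), it ranks slices by plain error rate, and it ignores the noisy-metadata correction. The function name `find_weak_slices`, its parameters, and the attribute names in the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_weak_slices(metadata, errors, max_order=2, min_support=2, top_k=3):
    """Rank attribute-value combinations (slices) by error rate.

    metadata: list of dicts mapping a dimension name to its value per image.
    errors:   list of booleans, True where the model under test failed.
    """
    stats = defaultdict(lambda: [0, 0])  # slice -> [n_samples, n_errors]
    for attrs, err in zip(metadata, errors):
        items = sorted(attrs.items())
        # Count this image in every slice of up to max_order dimensions.
        for order in range(1, max_order + 1):
            for combo in combinations(items, order):
                stats[combo][0] += 1
                stats[combo][1] += int(err)
    # Drop slices with too few samples, then compute error rates.
    ranked = [
        (dict(combo), n, n_err / n)
        for combo, (n, n_err) in stats.items()
        if n >= min_support
    ]
    # Highest error rate first; larger slices break ties.
    ranked.sort(key=lambda s: (-s[2], -s[1]))
    return ranked[:top_k]


# Toy data with hypothetical dimensions; not taken from the paper.
metadata = [
    {"gender": "female", "hair": "blond"},
    {"gender": "male", "hair": "blond"},
    {"gender": "male", "hair": "blond"},
    {"gender": "male", "hair": "dark"},
    {"gender": "female", "hair": "dark"},
    {"gender": "female", "hair": "blond"},
]
errors = [False, True, True, False, False, False]

result = find_weak_slices(metadata, errors)
for slice_desc, n, rate in result:
    print(slice_desc, n, round(rate, 3))
```

In this toy example the two-dimensional slice (male, blond) ranks first, although neither dimension alone has a 100% error rate, which is the kind of interaction a search over combinations is meant to surface.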
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Camera-ready version
Code: https://github.com/sujan-sai-g/Systematic-Weakness-Detection
Supplementary Material: zip
Assigned Action Editor: ~Quanshi_Zhang1
Submission Number: 4415