Keywords: auditing deep learning, verification, interpretability
Abstract: Auditing trained deep learning (DL) models prior to deployment is vital for preventing unintended consequences. One of the biggest challenges in auditing is obtaining human-interpretable specifications that are directly useful to the end-user. We address this challenge through a sequence of semantically aligned unit tests, where each unit test verifies whether a predefined specification (e.g., accuracy above 95\%) is satisfied with respect to controlled and semantically aligned variations in the input space (e.g., in face recognition, the angle relative to the camera). We perform these unit tests by building a bridge between the DL model and an interpretable latent space of a generative model, and verifying the semantically aligned variations directly in that latent space. Our framework, AuditAI, bridges the gap between interpretable formal verification and scalability. With evaluations on four datasets, covering images of chest X-rays, human faces, ImageNet classes, and towers, we show how AuditAI enables controlled variations for verification and certified training, and we address the limitations of the standard approach of verifying only against pixel-space perturbations.
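To make the unit-test idea concrete, the following is a minimal sketch (not the authors' implementation) of how one such semantically aligned test might be expressed in Python with PyTorch: a single latent dimension of a generative model is varied within a bounded range, the perturbed latents are decoded back to the input space, and the audited classifier's accuracy is checked against the specified threshold. The names `generator`, `classifier`, and `semantic_dim` are hypothetical placeholders.

```python
# Hedged sketch of a semantically aligned "unit test" in the spirit of AuditAI.
# `generator`, `classifier`, and `semantic_dim` are illustrative placeholders,
# not the paper's actual API.
import torch

def semantic_unit_test(generator, classifier, latents, labels,
                       semantic_dim, max_shift=1.0, n_steps=5,
                       accuracy_spec=0.95):
    """Check whether accuracy stays above `accuracy_spec` while one
    semantically aligned latent dimension varies within [-max_shift, max_shift]."""
    correct, total = 0, 0
    for shift in torch.linspace(-max_shift, max_shift, n_steps):
        perturbed = latents.clone()
        perturbed[:, semantic_dim] += shift          # controlled semantic variation
        images = generator(perturbed)                # decode back to the input space
        preds = classifier(images).argmax(dim=1)     # DL model under audit
        correct += (preds == labels).sum().item()
        total += labels.numel()
    accuracy = correct / total
    return accuracy >= accuracy_spec, accuracy
```

In the paper's formulation the check is carried out by verifying over the latent-space variation directly rather than by sampling a few discrete shifts; the loop above is only an empirical stand-in for that verification step.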
One-sentence Summary: We enable auditing of deep learning models by building a generative bridge that supports verification of semantically aligned specifications.
Community Implementations: 7 code implementations (https://www.catalyzex.com/paper/arxiv:2109.12456/code)