Track: Full Paper (8 pages)
Keywords: convolutional neural networks, weight matrices, interpretability, image property prediction
TL;DR: This work seeks to understand the mathematical fingerprints that properties of the training images leave on the learned weights of CNNs.
Abstract: The ability to understand deep learning models by analyzing their weights is key to advancing the growing field of model interpretability. In this article, we study information about the training data of convolutional neural network (CNN) models that can be gleaned from analyzing just the first layer of their learned filters. While gradient updates to the model weights during training become increasingly complex in the deeper layers of typical CNNs, the updates to the initial layer can be simple enough that high-level dataset properties such as image sharpness, noisiness, and color distribution are prominently featured. We give a simple mathematical justification for this and demonstrate how training dataset properties appear in this way for several standard CNNs on a number of datasets.
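The abstract's core claim, that simple first-layer gradient updates let input statistics imprint on the learned weights, can be illustrated with a toy sketch (not the paper's actual method or models). Below, a single linear "filter" is trained by early-stopped gradient descent on data whose three input channels have different variances, standing in for a color-distribution property of a dataset; the channel variances and step counts are arbitrary choices for illustration. Because each gradient step scales with the input covariance, the early-stopped weight for a high-variance channel grows faster, so the channel variances are recoverable from the weights alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset": 3 input channels with different standard deviations,
# mimicking a channel-wise color-distribution property of training images.
sigma = np.array([2.0, 1.0, 0.5])
X = rng.normal(size=(10_000, 3)) * sigma
y = X.sum(axis=1)  # target: ideal weights would be (1, 1, 1)

# Early-stopped gradient descent on mean-squared error from zero init.
w = np.zeros(3)
lr, steps = 0.05, 5
for _ in range(steps):
    grad = X.T @ (X @ w - y) / len(X)  # gradient scales with input covariance
    w -= lr * grad

# High-variance channels receive larger updates, so the (arbitrary) ordering
# sigma[0] > sigma[1] > sigma[2] is visible in the learned weights.
print(w)
```

In expectation the learned weight for channel c after t steps is 1 - (1 - lr * sigma_c^2)^t, so before convergence the weights order themselves by channel variance; at full convergence all weights reach 1 and the fingerprint fades, which is why early or finite training matters for this effect.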
Supplementary Material: zip
Submission Number: 12