When Less is More: Simplifying Inputs Aids Neural Network Understanding

TMLR Paper1339 Authors

30 Jun 2023 (modified: 13 Nov 2023) · Rejected by TMLR
Abstract: How do neural network image classifiers respond to simpler and simpler inputs? And what do such responses reveal about the characteristics of the data and their interaction with the learning process? To answer these questions, we need measures of input simplicity (or, inversely, complexity) and of class-discriminative input information, as well as a framework to optimize both during training. Finally, we need experiments that test whether this framework can simplify data enough to remove injected distractors, and that evaluate the impact of such simplification on real-world data. In this work, we measure simplicity by the encoding bit size assigned by a pretrained generative model, and we minimize this bit size to simplify inputs during training. At the same time, we minimize a gradient- and activation-based distance metric between original and simplified inputs to retain discriminative information. We investigate the resulting trade-off between input simplicity and task performance. For images with injected distractors, simplification naturally removes the superfluous information. For real-world datasets, simplification retains visually discriminative features and yields learned features that may be more robust in some settings.
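The abstract's objective combines two terms: an encoding-cost term that pushes inputs toward simplicity, and a distance term that preserves class-discriminative information. Below is a minimal sketch of that combined loss, assuming PyTorch-style objects; `generative_model.log_prob`, `classifier.features`, and the weight `lam` are hypothetical names introduced here for illustration, and the L2 activation distance is only a stand-in for the paper's actual gradient- and activation-based metric.

```python
import math
import torch

def bit_size(x_simple, generative_model):
    """Encoding cost of the simplified input under a pretrained
    generative model: negative log-likelihood, converted to bits.
    (`log_prob` is an assumed interface, not the paper's API.)"""
    return -generative_model.log_prob(x_simple) / math.log(2.0)

def discriminative_distance(classifier, x, x_simple):
    """Distance between original and simplified inputs in the
    classifier's feature space; an L2 stand-in for the paper's
    gradient- and activation-based metric."""
    return torch.norm(classifier.features(x) - classifier.features(x_simple))

def simplification_loss(classifier, generative_model, x, x_simple, lam=0.1):
    """Trade off input simplicity against retained discriminative
    information, as the abstract describes."""
    return (discriminative_distance(classifier, x, x_simple)
            + lam * bit_size(x_simple, generative_model).mean())
```

The weight `lam` would control the simplicity/performance trade-off that the paper investigates.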
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Rewrote Sections 3 and 4 for clarity, including adding pseudocode for Section 4. Toned down claims regarding how much discriminative information is still present in the images. Added a supplementary section about ViTs. Added cross-architecture experiments.
Assigned Action Editor: ~ERIC_EATON1
Submission Number: 1339