Great Minds Think Alike: The Universal Convergence Trend of Input Salience

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: explainable artificial intelligence, saliency maps, model distributions
TL;DR: Leveraging input saliency maps, we find that as model capacity increases, the distributions of optimized models converge to a nearly shared population mean; the limiting model can therefore be estimated from the population mean of small models.
Abstract: Uncertainty is introduced into optimized DNNs through stochastic algorithms, forming specific distributions; training a model can be seen as randomly sampling from this distribution of optimized models. In this work, we study the distribution of optimized DNNs as a family of functions using a pointwise approach. We focus on input saliency maps, since the input gradient field is decisive for a model's mathematical behavior. Our investigation of saliency maps reveals a counter-intuitive trend: two stochastically optimized models tend to resemble each other more as either of their capacities increases. We therefore hypothesize several properties of these distributions: (1) within the same model architecture (e.g., CNNs, ResNets), family variants of different capacities tend to align in the population mean directions of their input saliency, and (2) the distributions of optimized models converge toward this shared population mean as capacity increases. Furthermore, we propose semi-parametric distributions based on the Saw distribution that model this convergence trend and are consistent with all of the counter-intuitive observations. Our experiments highlight the practical implications of these hypotheses in various application domains, including black-box attacks and deep ensembles. These findings not only deepen our understanding of DNN behavior but also offer valuable insights for practical applications across deep learning.
Supplementary Material: zip
Primary Area: Interpretability and explainability
Submission Number: 11460
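
The abstract builds on input saliency maps, i.e., gradients of the model output with respect to the input, and on comparing their directions across independently trained models. Below is a minimal sketch of that idea; the specific models (torchvision ResNets), the cosine-similarity alignment metric, and the dummy data are illustrative assumptions, not the authors' exact protocol.

```python
# Sketch: compute input-saliency maps for two independently trained models
# and measure how aligned their directions are on the same inputs.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, resnet50

def input_saliency(model, x, target):
    """Gradient of the target-class logit with respect to the input."""
    model.eval()
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits[torch.arange(x.size(0)), target].sum().backward()
    return x.grad.detach()

def saliency_alignment(g1, g2):
    """Per-example cosine similarity between flattened saliency maps."""
    return F.cosine_similarity(g1.flatten(1), g2.flatten(1), dim=1)

# Two stochastically optimized models of different capacity (untrained here,
# purely for illustration; the paper compares trained model families).
model_a, model_b = resnet18(weights=None), resnet50(weights=None)
x = torch.randn(4, 3, 224, 224)          # dummy image batch
y = torch.randint(0, 1000, (4,))         # dummy labels

sim = saliency_alignment(
    input_saliency(model_a, x, y),
    input_saliency(model_b, x, y),
)
print(sim)  # the paper's observation: alignment grows with model capacity
```

In this framing, the convergence trend corresponds to these per-example similarities increasing as the capacity of either model grows, which is what the hypothesized convergence to a shared population mean direction would predict.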