I-ASIDE: Towards the Global Interpretability of Image Model Robustness through the Lens of Axiomatic Spectral Importance Decomposition
Abstract: Robust decisions leverage a high proportion of robust features. The spectral energy of natural images is non-uniform: the majority is concentrated in low-frequency components, so a change carrying an infinitesimal amount of energy in the high-frequency components can rewrite the raw features dominated by those components. Image models are parameterized general non-linear signal filters, and the spectral structure of a model's responses to inputs determines the fragility of the learned feature representations. The spectral importance decomposition of a model can thus reflect its robustness to feature perturbations. To this end, we formulate the spectral importance decomposition problem and present Image Axiomatic Spectral Importance Decomposition Explanation (I-ASIDE), a model-agnostic global interpretability method, to quantify model global robustness and understand how models respond to perturbations. We theoretically show that I-ASIDE decomposes the mutual information between feature representations and labels onto the spectrum. Our approach provides a unique insight into interpreting model global robustness from the perspective of information theory and enables a considerable number of research applications, from understanding model robustness and studying learning dynamics, to assessing label noise, investigating adversarial vulnerability, and studying out-of-distribution robustness. We showcase multiple applications to support these claims.
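As a concrete illustration of the spectral non-uniformity claim above, here is a minimal sketch (not the paper's implementation; the radial band partition, the `band_energy_shares` helper, and the synthetic 1/f-like test image are illustrative assumptions): it measures the fraction of spectral energy in each radial frequency band, and on natural-like images the lowest bands carry most of the energy, so perturbing high-frequency bands requires only a small energy budget.

```python
# Hedged sketch: estimate how spectral energy distributes over radial frequency
# bands of an image. Band count, partition, and test image are illustrative.
import numpy as np

def band_energy_shares(image: np.ndarray, num_bands: int = 8) -> np.ndarray:
    """Fraction of total spectral energy in each radial frequency band."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    energy = np.abs(spectrum) ** 2
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    radius /= radius.max()                                  # normalize radii to [0, 1]
    edges = np.linspace(0.0, 1.0, num_bands + 1)
    shares = np.array([
        energy[(radius >= lo) & (radius < hi)].sum()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    shares[-1] += energy[radius >= edges[-1]].sum()         # fold in the boundary points
    return shares / energy.sum()

# Synthetic "natural-like" image: integrated white noise has a 1/f-like spectrum,
# so the low-frequency bands should dominate the energy shares printed below.
rng = np.random.default_rng(0)
img = np.cumsum(np.cumsum(rng.standard_normal((256, 256)), axis=0), axis=1)
print(band_energy_shares(img).round(3))
```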
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=D2qMFfYlYb
Changes Since Last Submission: # Important changes on v9:
We have addressed the concerns and requested changes raised by reviewers pLTm and LUjU. We thank our reviewers again for their valuable comments. The changes are summarized below.
1. The reviewers raised concerns regarding the notation. We added a 'Notations and preliminaries' subsection to clarify the notation used in the theoretical analyses, e.g., what Q(y|x) and Q(x) denote. We mainly adopt conventions from pure mathematics (topology and analysis, e.g., real and functional analysis), in tandem with related literature from the deep learning community and coalition game theory.
2. We restructured Section 1 and added a separate "Contributions" section to state and clarify our contributions, addressing the concern from reviewer LUjU. We sincerely thank reviewer LUjU for raising it.
3. We agree with the comments from reviewer pLTm regarding the reading flow of the "Global interpretability" section due to the newly added content. We have revised the 'Global interpretability' part of the Related work to improve its flow as requested. In particular, we used an example from the literature to clarify (1) the limits of providing global interpretability in the spatial domain and (2) the benefits of interpreting in the frequency domain.
4. We revised and restructured the 'Information theory in deep learning' paragraph of the Related work. This review is designed to provide background on how information theory can be used to justify our work. We narrowed the review and restructured the presentation into a more logical flow. We use some results from information theory to show that I-ASIDE itself is theoretically interpretable at the design level.
5. We moved the B.A. bound proof to the appendix.
6. We corrected some errors spotted by reviewer pLTm.
7. We updated the supplementary material to provide the extra experiments requested by reviewer LUjU. We thank reviewer LUjU for raising the concern.
8. We will further improve the readability in the final version and correct grammatical errors.
9. We moved some content from the figure captions to the main text.
10. We sincerely appreciate the generous help and attention from all reviewers.
11. We also increased the font size in some figures to improve the visualization and aesthetics.
12. We also made other changes to improve the logical flow and readability.
# Important changes on v8:
We have addressed the concerns and requested changes raised by reviewers pLTm and K5tv. We also improved the readability of the manuscript and further strengthened the justification by reporting more results. The revisions are highlighted in blue. The manuscript is v8 and has the following major changes:
(1). We changed the Strong Efficiency axiom to the usual Efficiency axiom and accordingly reformulated our approach within rigorous coalition game theory, so that we now directly deploy Shapley value theory as the decomposition equation (a generic sketch of the Shapley decomposition is given after these change notes). The Strong Efficiency axiom (every subset of players satisfies the usual Efficiency axiom) only leads to an approximate estimate of the Shapley values: starting from the Strong Efficiency assumption, an exact solution to the derived linear system CV = R may not exist, since C is not a full-rank matrix. In fact, the decomposition equation with Strong Efficiency in manuscript v7 is equivalent to the Additive Importance Measures method in 'Understanding Global Feature Contributions With Additive Importance Measures'. Please refer to Sections 3.1 to 3.4 in the updated manuscript v8.
(2). We adjusted and theoretically characterized the characteristic function in v8. We theoretically show that I-ASIDE decomposes the mutual information between feature representations and labels. In tandem with Shapley value theory, this justification gives a sound theoretical guarantee for our method; please refer to Section 3.4, where we also provide a brief theoretical proof. Similar proof techniques can be found in several works in the literature.
(3). We extended the Related work section according to the requests and references provided by our reviewers. We added a brief review of "mutual information in deep learning" to provide background for the theoretical characterization of the characteristic function in Section 3.4.
(4). We revised the mathematical notation to follow more standard conventions from coalition game theory and mathematics.
(5). We re-ran all experiments. Our major results still hold.
(6). We changed the 3D visualization to a 2D heat-map visualization for better readability. Please refer to Figure 7.
(7). In the learning dynamics experiments, we added the training/validation loss curves as references. The loss curves indicate, for example, at which epochs over-fitting emerges.
...
For the full list of v8 changes, please refer to the revision history.
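For reference regarding change (1) above, the decomposition equation is the standard Shapley value over a finite set N of players (the spectral bands). The formula below is the textbook form with a generic characteristic function v; the specific characteristic function used by I-ASIDE is the one characterized in Section 3.4 (whose total, per change (2), is the mutual information between feature representations and labels), which we do not restate here.

```latex
% Standard Shapley value for player set N and characteristic function v;
% a generic sketch, not the exact characteristic function of the paper.
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
    \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}
    \Bigl( v\bigl(S \cup \{i\}\bigr) - v(S) \Bigr),
\qquad
\text{(Efficiency)}\quad \sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\varnothing).
```

Under the Efficiency axiom, the per-band importances sum to v(N) - v(∅), which is what allows a global quantity to be decomposed onto the spectrum.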
Assigned Action Editor: ~Jeremias_Sulam1
Submission Number: 1613