We have addressed the concerns and requested changes raised by reviewers pLTm and LUjU. We again thank our reviewers for their valuable comments. The changes are summarized below.
The reviewers raised concerns regarding the notation. We added a 'Notations and preliminaries' subsection to clarify the notation used in the theoretical analyses, e.g., the meaning of Q(y|x) and Q(x). We mainly adopt conventions from pure mathematics (topology and analysis, e.g., real and functional analysis), in tandem with the related literature of the deep learning community and coalition game theory.
We restructured Section 1 and added a separate "Contributions" section to state and clarify our contributions, addressing the concern from reviewer LUjU. We sincerely thank LUjU for raising it.
We agree with reviewer pLTm's comments that the reading flow of the "Global interpretability" section suffered from the newly added content. We have revised the 'Global interpretability' part of the Related work to improve its flow, as requested by reviewer pLTm. In particular, we used an example from the literature to clarify (1) the limits of providing global interpretability in the spatial domain and (2) the benefits of interpreting in the frequency domain.
We have revised and restructured the 'Information theory in deep learning' paragraph of the Related work. This review is intended to provide background on how information theory can be used to justify our work. We narrowed the review and restructured its presentation into a more logical order. We use results from information theory to show that IASIDE itself is theoretically interpretable at the design level.
We moved the B.A. bound proof to the appendix.
We corrected some errors spotted by reviewer pLTm.
We updated the supplementary material to provide the extra experiments requested by reviewer LUjU. We thank LUjU for raising the concern.
We will further improve the readability and correct grammatical errors in the final version.
We moved some content from figure captions into the main text.
We sincerely appreciate the great help and careful attention from all reviewers.
We also increased the font size in some figures to improve visualization and aesthetics.
We also made other changes to improve the logical flow and readability.
We have addressed the concerns and requested changes raised by reviewers pLTm and K5tv. We also improved the readability of the manuscript and further strengthened the justification by reporting more results. The revisions are highlighted in blue. The manuscript is now v8 and has the following major changes:
(1). We changed the Strong Efficiency axiom to the usual Efficiency axiom, and accordingly reformulated our approach within rigorous coalition game theory; we now directly use the Shapley value as the decomposition equation. The Strong Efficiency axiom -- requiring that every subset of players satisfy the usual Efficiency axiom -- only yields an approximation of the Shapley value: starting from the Strong Efficiency assumption, the derived linear system CV = R need not admit an exact solution, since C is not a full-rank matrix. In fact, our previous decomposition equation with Strong Efficiency in manuscript v7 is equivalent to the Additive Importance Measures method of 'Understanding Global Feature Contributions With Additive Importance Measures'. Please refer to Sections 3.1 to 3.4 in our updated manuscript v8; a brief sketch of the Shapley formulation is also given after this list.
(2). We adjusted and theoretically characterized the characteristic function in v8. We show that IASIDE decomposes the mutual information between feature representations and labels. Together with Shapley value theory, this justification provides a sound theoretical guarantee for our method. Please refer to Section 3.4, where we also provide a brief proof; similar proof techniques can be found in several prior works. The resulting decomposition is sketched after this list.
(3). We extended the Related work section according to the requests and references provided by our reviewers. We added a brief review of "mutual information in deep learning" to provide theoretical background for the characterization of the characteristic function in Section 3.4.
(4). We revised the mathematical notation to follow the conventions more common in coalition game theory and the broader mathematical literature.
(5). We re-ran all experiments. Our major results still hold.
(6). We changed the 3D visualization to a 2D heat-map visualization for easier reading. Please refer to Figure 7.
(7). In the learning-dynamics experiments, we added the training/validation loss curves as references. The loss curves indicate, for example, at which epochs over-fitting emerges.
...
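To make the reformulation described in items (1)-(2) concrete, the short sketch below states the decomposition we appeal to. The symbols used here (Z_S for the representation restricted to a coalition S, Y for the labels, N for the full player set with n = |N|) are chosen for this response and may differ slightly from the notation in the manuscript.

```latex
% Characteristic function (assumed form for this sketch): the mutual
% information carried by the coalition S.
v(S) = I(Z_S;\, Y), \qquad v(\varnothing) = 0.

% Standard Shapley value of player i.
\phi_i = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}\,\bigl[ v(S \cup \{i\}) - v(S) \bigr].

% Efficiency axiom: the attributions sum to the total mutual information.
\sum_{i \in N} \phi_i = v(N) - v(\varnothing) = I(Z_N;\, Y).
```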
For the detailed changes in v8, please refer to the revision history.
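As a purely illustrative companion to item (1), the minimal Python sketch below computes exact Shapley values for a toy three-player game and checks the Efficiency axiom numerically. The characteristic function `v` here is a hypothetical stand-in with hand-picked values, not the mutual-information estimator used in IASIDE.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values for a characteristic function v over `players`."""
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi += weight * (v(frozenset(S) | {i}) - v(frozenset(S)))
        values[i] = phi
    return values

# Toy characteristic function (hypothetical values chosen only for illustration).
def v(S):
    scores = {frozenset(): 0.0, frozenset({0}): 0.4, frozenset({1}): 0.3,
              frozenset({2}): 0.1, frozenset({0, 1}): 0.6, frozenset({0, 2}): 0.5,
              frozenset({1, 2}): 0.35, frozenset({0, 1, 2}): 0.7}
    return scores[frozenset(S)]

players = [0, 1, 2]
phi = shapley_values(players, v)
# Efficiency: the attributions sum exactly to v(N) - v(empty set).
assert abs(sum(phi.values()) - (v(set(players)) - v(set()))) < 1e-9
print(phi)
```

Because these values come directly from the Shapley formula, Efficiency holds exactly, in contrast with the Strong Efficiency system CV = R discussed in item (1), which need not admit an exact solution.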