Changes
In the Appendix, we have added an experiment similar to the one in Figure 5, but varying the depth (complexity) of the Explanation Shift Detector.
We have also done a full grammar check on the paper.
We have also implemented the following requests:
This claim is not supported with clear evidence in this paper. I suggest toning down the sentence: as such domain assumptions may be available in the domain of images but are rarely or never available for tabular data.
[Check]
Please define dom()
[Check]
In Definition 2.1, D should be D_X?
[Check]
The following part seems overly complicated if we use empirical distributions
[Check]
In Definition 2.3, the first has a different font than other ones
[Check]
Please define all the acronyms (OOD, KS, etc.).
[Check]
In and above Eq. (1), val should be val_f,x?
[Check]
I'm not sure if this information is already somewhere in the paper, but if not, it would be nice to add some more details on the task and the dataset used in Section 5.4.
[Check]
In Figure 3, are the legend for "Input, f = XGB" and "Input, f = Log" correct? Their lines and the circles seem to be showing the same plot
Yes, it's the same plot. Distribution shifts on the input data are independent of the estimator $f_\theta$ used, so the results are identical.
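The point can be illustrated with a minimal sketch, which is not the code used in the paper. It assumes a hand-written two-sample Kolmogorov-Smirnov statistic (the paper reports p-values; only the statistic is computed here), synthetic one-dimensional data, and two hypothetical stand-in models, `f_linear` and `f_step`. A shift measure computed on the inputs never calls the model, so it returns the same value whichever model is passed in, whereas a measure computed on the predictions depends on the model.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Step both empirical CDFs past every sample equal to v.
        while i < len(a) and a[i] == v:
            i += 1
        while j < len(b) and b[j] == v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Synthetic data (assumption): the new inputs have a shifted mean.
rng = random.Random(0)
x_train = [rng.gauss(0.0, 1.0) for _ in range(500)]
x_new = [rng.gauss(0.5, 1.0) for _ in range(500)]


# Hypothetical stand-ins for two different estimators.
def f_linear(x):
    return 2.0 * x + 1.0


def f_step(x):
    return 1.0 if x > 0.0 else 0.0


def input_shift(model, train, new):
    # The model argument is deliberately unused: the test sees only the inputs.
    return ks_statistic(train, new)


def prediction_shift(model, train, new):
    # Here the model matters, because the test is run on its outputs.
    return ks_statistic([model(x) for x in train], [model(x) for x in new])


print("input shift, f_linear:     ", input_shift(f_linear, x_train, x_new))
print("input shift, f_step:       ", input_shift(f_step, x_train, x_new))
print("prediction shift, f_linear:", prediction_shift(f_linear, x_train, x_new))
print("prediction shift, f_step:  ", prediction_shift(f_step, x_train, x_new))
```

The first two printed values coincide by construction, which mirrors why the two "Input" curves in Figure 3 overlap.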
Typos and Formatting Errors from reviewer 'Crub'
[Check]
Gradient boosting decision tree / gradient-boosting decision tree ⇒ Should be consistently referred to as gradient-boosted decision tree.
[Check]
Avoid Redundancy: Repeating information multiple times should be avoided. For instance, the table caption text like “Displayed results are the one-tailed p-values of the Kolmogorov-Smirnov test comparison between two underlying distributions” or “Novel Covariate Group Shift for the 'Asian' group with a fraction ratio of 0.5 as described in Section 5” only needs to be mentioned once in the main text.
[Check]
Equations formatting from reviewer 'Crub' and Action Editor 'hxK6'.
[Check]
I don't think the following argument is very accurate. The optimal classification rule for dog images may change if the species evolves and they start to look different. Furthermore, for the second example, we could think about a task of predicting buying behaviour based on images, to say there will be a concept shift for image classification by the same argument. In any case, giving a few examples like these will not serve as a proof of the authors' statement about the relationships between variables only based on the type of data. It is fine to focus on tabular data, but please try to avoid making claims without solid evidence.
[Check]
Changes 2.0
Updated Definition 2.5 to $\frac{P(\D^{tr}_Y, \D^{tr}_X)}{P(\D^{tr}_X)}$.
Changes 2.1
On page 8, updated to $\frac{P(\D^{new}_Y, \D^{new}_X)}{P(\D^{new}_X)}$.
Updated to "for $j \in \{1, 2\}$" in Example 4.1.
Removed "for $i \in \{1, 2\}$" in Example 4.3.
Added missing end-of-paragraph punctuation.