Abstract: We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images. To probe the OOD behavior, we introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set. Our motivation stems from understanding the prediction patterns of adversarially robust networks, since previous work has identified unexpected consequences of training to be robust to norm-bounded perturbations. We find that robust networks have consistently higher NCG accuracy than naturally trained networks, even when the OOD data is much farther away than the robustness radius. This implies that the local regularization of robust training has a significant impact on the network’s decision regions. We replicate our findings using many datasets, comparing new and existing training methods. Overall, adversarially robust networks resemble a nearest neighbor classifier when it comes to OOD data.
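The NCG measure defined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked code repository for that); it assumes Euclidean distance for the nearest-neighbor search, and the names `ncg_score`, `predict`, and the toy data are hypothetical, chosen only for illustration.

```python
from math import dist
from typing import Callable, Sequence

Point = Sequence[float]


def ncg_score(
    train_x: Sequence[Point],
    train_y: Sequence[int],
    ood_x: Sequence[Point],
    predict: Callable[[Point], int],
) -> float:
    """Fraction of OOD inputs whose predicted label equals the label of
    their nearest training input (Euclidean distance is an assumption)."""
    if not ood_x or not train_x:
        raise ValueError("need at least one training input and one OOD input")
    matches = 0
    for x in ood_x:
        # Index of the training input closest to this OOD input.
        nn_idx = min(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
        if predict(x) == train_y[nn_idx]:
            matches += 1
    return matches / len(ood_x)


# Toy data (hypothetical): two training inputs, four far-away OOD inputs.
train_x = [(0.0, 0.0), (4.0, 0.0)]
train_y = [0, 1]
ood_x = [(1.0, 5.0), (3.5, -6.0), (1.9, 9.0), (2.5, 7.0)]

# A classifier whose decision boundary matches the 1-nearest-neighbor rule.
nn_like_score = ncg_score(train_x, train_y, ood_x, lambda x: int(x[0] > 2.0))  # 1.0

# A classifier whose boundary is unrelated to the training geometry.
other_score = ncg_score(train_x, train_y, ood_x, lambda x: int(x[1] > 6.0))  # 0.5
```

In this toy setting, the first classifier agrees with the nearest training label on every OOD input, while the second agrees on only half of them. The brute-force search is linear in the training set size per query, so a real evaluation would likely use an indexed nearest-neighbor search instead.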
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Other changes:
- Added Figure 1 for clarity.
Comments addressed:
`1) According to Reviewer Svr8, the metric and its properties need improved motivation and explanation and many things are unclear to a reader (see our reviews and discussions). Furthermore, various interpretations of empirical results do not appear to be well supported by the actual results, necessitating improved presentation of the results.`
We have added more discussion on the exemplar-based generalization to motivate NCG in the new Introduction section.
`2.1) According to Reviewer TdWQ, "NCG accuracy provides a new way to evaluate training methods on unlabeled data that cannot be labeled using other information": If "evaluate" means to measure some goodness, this might be an overstatement.`
We have toned down this sentence by using "understand a model's prediction" instead of "evaluate" (in Section 1.1).
`2.2) "This indicates that having higher NCG accuracy may be a desirable property, as it can encourage better predictions on corrupted data.": This sentence sounds like the causal statement that "making NCG accuracy higher leads to better predictions on corrupted data." This is not confirmed by the presented experiments. Consider adding experiments or adjusting the statement.`
We have replaced "it can encourage better predictions" with "it is positively correlated with better predictions", which is a more accurate statement.
`2.3) "Nonetheless, they both suffice to provide insight into the OOD prediction behavior of robust and normal networks." This sounds as if there were nothing further to be studied.`
We have revised it as "Nonetheless, they both can be used by future work to provide insight into the OOD prediction behavior of robust and normal networks."
`2.4) Consider rephrasing "accuracy" in the term "NCG accuracy". Accuracy sounds like a goodness measure of performance on a task that we wish to solve.`
We have replaced all occurrences of "NCG accuracy" with "NCG score".
`2.5) "NCG can be much easier" than OOD detection: This can mean the OOD detection problem cannot be reduced to NCG. If so, that would be a contradiction. Please consider adding comments on that.`
We have rephrased this sentence as "our result suggests that achieving a high NCG score can be much easier than achieving a good detection rate in some cases".
`3) According to Reviewer eC2G, the manuscript does not fully support the validity of NCG accuracy. The authors claim "robust networks are more likely to classify OOD data with the class label of the nearest training input", which seems a reasonable claim given the experimental results. However, robust networks could classify in-distribution data with the class label of the nearest training input as well. If this happened, NCG accuracy would not be sufficiently informative to probe OOD data. I would like to see whether this actually happens or not if possible.`
We have added an experiment for this in Appendix D.5. In short, robust networks do have a slightly higher NCG score on in-distribution data than naturally trained networks. However, the increase in NCG score from natural to robust training is smaller for in-distribution data than for OOD data. Our claim therefore holds for both OOD and in-distribution data, but the NCG phenomenon is more prominent on OOD data.
Code: https://github.com/yangarbiter/nearest-category-generalization
Assigned Action Editor: ~bo_han2
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 567