Exploring the Spatial Dynamics of In-Distribution and Out-of-Distribution Data in Logit Space

TMLR Paper4171 Authors

10 Feb 2025 (modified: 27 Apr 2025) | Rejected by TMLR | CC BY 4.0

Abstract: Out-of-distribution (OOD) data pose a significant challenge to deep learning (DL) classifiers, prompting extensive research into methods for detecting them. Current state-of-the-art OOD detection methods employ a scoring technique designed to assign lower scores to OOD samples than to in-distribution (ID) ones. Nevertheless, these approaches do not examine how OOD and ID data are arranged within the latent space; instead, they implicitly assume that the two are inherently separated. As a result, most OOD detection methods rely on complicated and hard-to-validate scoring techniques. This study conducts a thorough analysis of the logit embedding landscape, revealing that ID and OOD data each exhibit a distinct trend. Specifically, we demonstrate that OOD data tend to reside near the center of the logit space. In contrast, ID data tend to be situated farther from the center, predominantly in the positive regions of the logit space, forming class-wise clusters along the orthogonal axes that span the logit space. This study highlights the critical role of the DL classifier in differentiating between ID and OOD logits.
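The geometric claim in the abstract can be illustrated with a minimal sketch. It assumes that the "center" of the logit space is the origin and that the L2 distance from it can serve as a simple separating score; both are assumptions made for illustration, not the authors' stated method. The function name `distance_from_center` and the hand-made logit values are hypothetical.

```python
import numpy as np

def distance_from_center(logits, center=None):
    """L2 distance of each logit vector from a reference center.

    The center defaults to the origin, which is an assumption of this sketch.
    """
    logits = np.asarray(logits, dtype=float)
    if center is None:
        center = np.zeros(logits.shape[-1])
    return np.linalg.norm(logits - center, axis=-1)

# Hypothetical logits for a 3-class classifier (made up for illustration).
# ID-like rows: one large positive entry, so each lies far out along one axis.
id_like = np.array([[9.0, 0.5, -0.2],
                    [0.1, 8.0, 0.3],
                    [-0.4, 0.2, 7.5]])
# OOD-like rows: all entries small, so each lies close to the origin.
ood_like = np.array([[0.3, -0.2, 0.1],
                     [-0.5, 0.4, 0.2],
                     [0.1, 0.1, -0.3]])

id_scores = distance_from_center(id_like)
ood_scores = distance_from_center(ood_like)

# Midpoint between the closest ID-like point and the farthest OOD-like point.
threshold = 0.5 * (id_scores.min() + ood_scores.max())

# Flag a sample as OOD when it falls closer to the center than the threshold.
all_logits = np.vstack([id_like, ood_like])
is_ood = distance_from_center(all_logits) < threshold
print(is_ood)
```

On real data the threshold would have to be chosen on held-out ID samples, and whether a plain norm separates the two groups depends on the classifier and dataset.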

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: In response to the comments, we have made the following revisions to improve the manuscript:
1. **Assumption refinement**: We have updated our theoretical assumption regarding the relationship between out-of-distribution (OOD) and in-distribution (ID) data relative to the deep learning (DL) model, ensuring better alignment with empirical observations.
2. **Corollary update**: Following the revised assumption, we have updated Corollary 1 to keep our theoretical framework consistent.
3. **Abstract**: As suggested by Reviewer Tuv2, we have refined the abstract to articulate our contributions and key findings more clearly.
4. **Related work**: The Related Methods section has been expanded to provide a more comprehensive discussion of prior work, ensuring proper context for our approach.
5. **Supplementary analysis**: We have included a new observation in the Supplementary Material regarding ID-in behavior under input MixUp (mixing two different-class samples). This analysis demonstrates that MixUp-generated near-OOD samples exhibit logits shifted toward the center of the logit space, further supporting our insights.
6. **Improved plot explanations**: Additional clarification has been provided for all key plots, explicitly stating their purpose and the conclusions drawn from them.
7. **Expanded experimental discussion**: Each Experiment section now includes a more detailed discussion of results, their implications, and connections to our theoretical claims.
8. **Late update**: Added a legend to Figure 5.

We believe these revisions have significantly strengthened the paper, and we thank each reviewer for their time and their constructive comments and suggestions. Please let us know if any further clarifications are needed.

Assigned Action Editor: ~Jasper_Snoek1

Submission Number: 4171
