Abstract: Although deep learning (DL) models have revolutionized the field of machine learning (ML), DL classification models cannot easily distinguish in-distribution (ID) data from out-of-distribution (OOD) data at test time.
This paper analyzes the landscape of ID and OOD data embeddings and demonstrates that OOD data are always embedded toward the center of the logit space.
Furthermore, ID data are embedded far from the center, toward the positive regions of the logit space, thus ensuring minimal overlap between ID and OOD embeddings.
Based on these observations, we propose to make the classification model sensitive to the OOD data by incorporating the configuration of the logit space into the predictive response.
Hence, we estimate the distribution of the ID logits by utilizing a density estimator over the training data logits.
Our proposed approach is data- and architecture-agnostic and can be easily incorporated into a trained model without exposure to OOD data.
We ran experiments on popular image datasets and obtained state-of-the-art performance, as well as an improvement of up to 10$\%$ in AUCROC on the Google genome dataset.
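As an illustration of the density-estimation step described in the abstract, the following is a minimal sketch that fits a density estimator to training logits and scores new logits by their log-density. The abstract does not say which estimator is used, so the isotropic Gaussian kernel density estimate, the `bandwidth` value, the helper names `fit_kde` and `log_density`, and the synthetic three-class logits are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def fit_kde(train_logits, bandwidth=1.0):
    """Keep the training logits and bandwidth of an isotropic Gaussian KDE.

    Hypothetical helper: the estimator choice is an assumption.
    """
    return {"points": np.asarray(train_logits, dtype=float), "h": float(bandwidth)}


def log_density(kde, logits):
    """Return the KDE log-density of each row of `logits`."""
    x = np.atleast_2d(np.asarray(logits, dtype=float))
    pts, h = kde["points"], kde["h"]
    n, d = pts.shape
    # Pairwise squared distances between query logits and training logits.
    sq_dist = ((x[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    log_kernel = -0.5 * sq_dist / h**2 - 0.5 * d * np.log(2.0 * np.pi * h**2)
    # Numerically stable log-mean-exp over the training points.
    peak = log_kernel.max(axis=1, keepdims=True)
    log_sum = peak + np.log(np.exp(log_kernel - peak).sum(axis=1, keepdims=True))
    return log_sum.ravel() - np.log(n)


# Synthetic stand-in for real logits (assumption): ID logits sit far from the
# origin along the positive axis of their class, OOD logits sit near the center.
rng = np.random.default_rng(0)
num_classes = 3
train_labels = rng.integers(0, num_classes, size=300)
train_logits = 8.0 * np.eye(num_classes)[train_labels] + rng.normal(size=(300, num_classes))
test_labels = rng.integers(0, num_classes, size=200)
id_logits = 8.0 * np.eye(num_classes)[test_labels] + rng.normal(size=(200, num_classes))
ood_logits = rng.normal(size=(100, num_classes))

kde = fit_kde(train_logits, bandwidth=1.0)
id_scores = log_density(kde, id_logits)
ood_scores = log_density(kde, ood_logits)
```

Under these assumptions, a low log-density flags an input as likely OOD. A decision threshold could be chosen from held-out ID scores alone, for example a low quantile, which is consistent with the claim that no OOD data are needed to fit the detector.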
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: 1. Added experiment on DenseNet
2. Added experiment on CIFAR-100
3. Added experiments on different activation functions that suppress the negative values.
5. Adopted the suggested writing and rearrangements in the main text and Appendix.
Assigned Action Editor: ~bo_han2
Submission Number: 1561