Abstract: This project addresses concerns about the reliability of large neural networks in Natural Language Processing (NLP), despite their impressive performance in recent years. The primary goal is to develop more resilient algorithms that improve the ability of NLP systems to withstand data drift and adversarial attacks. While state-of-the-art models perform exceptionally well on inputs similar to their training data, they can become unreliable in real NLP scenarios because languages evolve continuously and distributions shift. To tackle this challenge, the project proposes a methodology for measuring and detecting distributional shifts across corpora and sentences by analyzing the latent representations of tokens. This analysis relies on classical discrepancy measures adapted to the high dimensionality of transformer layers. Such research is essential for the responsible deployment of promising NLP methods in critical systems, where robustness is a key requirement. The project focuses on whether combining information from all layers improves out-of-distribution (OOD) detectors.
All our experiments and figures can be reproduced with the code provided in our GitHub repository: https://github.com/BenJMaurel/NLP_project
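To illustrate the kind of detector the abstract describes, here is a minimal sketch of a layer-wise OOD score. It assumes a per-layer Mahalanobis distance as the classical discrepancy measure and mean-pooled token representations from bert-base-uncased; the model name, pooling, distance, and mean aggregation over layers are all illustrative assumptions, not the exact method from the repository.

```python
# Minimal sketch (not the authors' exact method) of an all-layer OOD score.
# Assumption: Mahalanobis distance per transformer layer as the discrepancy
# measure; model, pooling, and aggregation are illustrative choices.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def layer_embeddings(sentences):
    """Mean-pooled token representations from every hidden layer."""
    enc = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        out = model(**enc, output_hidden_states=True)
    mask = enc["attention_mask"].unsqueeze(-1)  # (batch, seq, 1)
    # out.hidden_states: one (batch, seq, dim) tensor per layer (incl. embeddings)
    pooled = [(h * mask).sum(1) / mask.sum(1) for h in out.hidden_states]
    return [p.numpy() for p in pooled]  # list of (batch, dim) arrays

def fit_gaussians(in_dist_sentences):
    """Per-layer mean and pseudo-inverse covariance from in-distribution data."""
    return [(x.mean(0), np.linalg.pinv(np.cov(x, rowvar=False)))
            for x in layer_embeddings(in_dist_sentences)]

def ood_score(sentences, gaussians):
    """Aggregate Mahalanobis distances over ALL layers (mean; max also works)."""
    per_layer = []
    for x, (mu, prec) in zip(layer_embeddings(sentences), gaussians):
        d = x - mu
        per_layer.append(np.sqrt(np.einsum("bi,ij,bj->b", d, prec, d)))
    return np.mean(per_layer, axis=0)  # higher = more out-of-distribution
```

Averaging over layers is only one simple aggregation; the abstract's question is precisely whether such all-layer combinations outperform a detector built on the last layer alone.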