Abstract: With the democratization of Large Language Models through the release of products such as ChatGPT, the public is becoming increasingly sensitive to flaws that have long been known in the community. One example is overconfidence in responses to Out-Of-Distribution (OOD) queries, which are likely to be irrelevant and detrimental to public sentiment. This could be prevented by OOD detectors able to determine whether the model will be able to produce a satisfactory answer. We focus our work on out-of-the-box detectors that are model-independent: they do not leverage the model's structure, only an input's latent representations. We first reproduce the results of Colombo et al. (Beyond Mahalanobis-Based Scores for Textual OOD Detection) and Guerreiro et al. (Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation) on a restricted benchmark, then argue that better leveraging latent representations can lead to improved performance. To this end, we introduce an exponential-based and a Euclidean-distance-based method to make our point. The code producing the experimental results is available at github.com/lilianmarey/nlp_ood_detection .
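A minimal sketch of the kind of detector the abstract describes: a model-independent, distance-based OOD score computed only from latent representations. The functions and variable names below are illustrative assumptions, not the paper's actual implementation; the score here is a plain Euclidean distance to the centroid of in-distribution features, so a larger score suggests a query is more likely OOD.

```python
import numpy as np

def fit_centroid(in_dist_feats):
    """Centroid of in-distribution latent representations.

    in_dist_feats: array of shape (n_samples, hidden_dim),
    e.g. encoder embeddings of known-good queries.
    """
    return in_dist_feats.mean(axis=0)

def ood_score(latent, centroid):
    """Euclidean distance to the in-distribution centroid.

    A higher score means the input lies farther from the
    training distribution, hence is more likely OOD.
    """
    return np.linalg.norm(latent - centroid)
```

In practice one would calibrate a threshold on held-out in-distribution scores and flag any query whose score exceeds it before handing it to the model.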