Abstract: Modern deep learning models can be trained effectively on large datasets for Natural Language Processing (NLP) tasks, achieving exceptional performance. However, these models are vulnerable when there is a distribution shift between the training data and the data encountered in deployment. It is therefore important to develop methods for detecting inputs that do not come from the training distribution. This article provides an overview of the out-of-distribution (OOD) detection problem and surveys several detection methods, starting from their mathematical foundations. In addition, the DistilBERT model is modified to incorporate diverse aggregations of its hidden states. Finally, the detectors are applied across multiple datasets and aggregations for a comparative analysis.
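To make the "diverse aggregations" idea concrete, the sketch below shows one plausible way to pool DistilBERT's token-level hidden states into a single sentence embedding under several aggregation schemes (CLS token, max, and mean pooling), which an OOD detector could then score. The function name `embed` and the set of aggregations are illustrative assumptions, not the paper's exact setup.

```python
# Hypothetical sketch: pooling DistilBERT hidden states with different
# aggregations before passing the embeddings to an OOD detector.
import torch
from transformers import DistilBertTokenizer, DistilBertModel

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")

def embed(texts, aggregation="mean"):
    """Return one embedding per input text under the chosen aggregation."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (batch, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)    # 1 for real tokens, 0 for padding
    if aggregation == "cls":
        return hidden[:, 0]                          # first ([CLS]) token only
    if aggregation == "max":
        # exclude padding positions before taking the element-wise maximum
        return hidden.masked_fill(mask == 0, float("-inf")).max(dim=1).values
    # default: mean over non-padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```

Embeddings produced under each aggregation could then be scored by a detector, for instance the Mahalanobis distance to statistics estimated on in-distribution training data.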