Keywords: Out-of-distribution Detection, Natural Language Generation, Selective Generation
Abstract: Much work has shown that high-performing ML classifiers can degrade significantly and produce overconfident, wrong predictions, particularly for out-of-distribution (OOD) inputs. Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence, and may suffer even worse degradation on OOD inputs because prediction is done auto-regressively over many steps. We present a highly accurate and lightweight OOD detection method for CLMs, and demonstrate its effectiveness on abstractive summarization and translation. We also show how our method can be used under the common and realistic setting of distribution shift for selective generation of high-quality outputs, while automatically abstaining from low-quality ones, enabling safer deployment of generative language models.
Supplementary Material: zip