Abstract: Large generative language models (GLMs) provide a versatile tool for solving a wide variety of natural language processing tasks. GLM responses, however, are provided as text, without an indication of the model's confidence in the answer. This limits the usability of these models in high-risk applications, where decisions based on an incorrect answer can have severe consequences. In this work, we focus on the problem of generating reliable class posterior distributions for text classification tasks, which can be used both for decision making and for producing interpretable confidence scores for the user. We show that the naive approach of computing posteriors from the token posteriors produced by the GLM results in extremely poor posteriors. We then explore different adaptation approaches for improving the quality of the posteriors, focusing on low-resource scenarios where only a small amount of data is available for adaptation. We show that parameter-efficient supervised fine-tuning (SFT), while providing large gains in decision quality, produces suboptimal posteriors due to overfitting. To address this problem, we propose an approach that combines SFT and post-hoc calibration using a three-stage training strategy, improving the quality of both the posteriors and the categorical decisions.
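As a concrete illustration of the naive approach the abstract criticizes, the sketch below scores each candidate class label with the GLM's token log-probabilities and normalizes over the label set. This is a minimal sketch, not the paper's code: the model name, prompt template, and label verbalizers (" positive", " negative") are illustrative assumptions, and the `temperature` parameter stands in for post-hoc calibration (temperature scaling is one standard choice; the paper's exact calibration method is not specified in the abstract).

```python
# Minimal sketch of the "naive" class-posterior computation: sum the
# token log-probabilities of each label continuation under a causal LM,
# then softmax over labels. All names below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any Hugging Face causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def class_log_scores(prompt: str, labels: list[str]) -> torch.Tensor:
    """Sum the token log-probabilities of each label continuation."""
    scores = []
    for label in labels:
        full = tokenizer(prompt + label, return_tensors="pt").input_ids
        # Tokenization boundary effects are ignored in this sketch.
        n_prompt = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
        with torch.no_grad():
            logits = model(full).logits  # (1, seq_len, vocab)
        logprobs = torch.log_softmax(logits, dim=-1)
        # log P(token_t | tokens_<t), accumulated over the label tokens only
        score = sum(
            logprobs[0, t - 1, full[0, t]] for t in range(n_prompt, full.shape[1])
        )
        scores.append(score)
    return torch.stack(scores)

def posteriors(prompt: str, labels: list[str], temperature: float = 1.0) -> torch.Tensor:
    # Naive posterior: softmax over the label scores. Per the abstract,
    # these raw posteriors are extremely poor; dividing by a temperature
    # fit on held-out data is one standard post-hoc remedy.
    return torch.softmax(class_log_scores(prompt, labels) / temperature, dim=0)

probs = posteriors("Review: great movie!\nSentiment:", [" positive", " negative"])
```

In a temperature-scaling setup, the temperature would be fit on a held-out calibration set by minimizing the negative log-likelihood of the true labels; the abstract's point is that SFT alone, without such a post-hoc stage, leaves the posteriors overconfident due to overfitting.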
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Hsuan-Tien_Lin1
Submission Number: 5498