Regression (and Scoring) Aware Inference with LLMs

ACL ARR 2024 June Submission3489 Authors

16 Jun 2024 (modified: 06 Aug 2024) · ACL ARR 2024 June Submission · License: CC BY 4.0
Abstract: Large language models (LLMs) have shown strong results on a range of applications, including regression and scoring tasks. Typically, one obtains outputs from an LLM via autoregressive sampling from the model’s output distribution. We show that this inference strategy can be sub-optimal for common regression and scoring evaluation metrics. As a remedy, we build on prior work on Minimum Bayes Risk decoding, and propose alternate inference strategies for regression and scoring that estimate the Bayes-optimal solution for the given metric in closed-form from sampled responses. We show that our proposal yields significant improvements over baselines across datasets and models.
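The closed-form idea in the abstract can be illustrated with a minimal sketch (the helper name and sample values below are hypothetical, not from the paper): for regression, the Bayes-risk minimizer under squared error is the mean of the sampled predictions, while under absolute error it is the median, so the metric-aware estimate can be computed directly from samples instead of taking a single autoregressive output.

```python
import statistics

def bayes_optimal_prediction(samples, metric="squared_error"):
    """Closed-form Bayes-optimal point estimate from sampled numeric outputs.

    Hypothetical illustration: under squared error the risk minimizer is the
    sample mean; under absolute error it is the sample median.
    """
    if metric == "squared_error":
        return statistics.fmean(samples)
    if metric == "absolute_error":
        return statistics.median(samples)
    raise ValueError(f"unsupported metric: {metric}")

# e.g. scores sampled from an LLM asked to rate an item on a 1-5 scale
samples = [3.0, 4.0, 4.0, 5.0, 4.0]
print(bayes_optimal_prediction(samples, "squared_error"))   # 4.0
print(bayes_optimal_prediction(samples, "absolute_error"))  # 4.0
```

Note how the two metrics can disagree on skewed sample sets, which is exactly why the choice of estimator should match the evaluation metric.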
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: MBR, regression, scoring, ranking, LLM
Contribution Types: NLP engineering experiment, Theory
Languages Studied: English
Submission Number: 3489