Regression (and Scoring) Aware Inference with LLMs

ACL ARR 2024 June Submission3489 Authors

16 Jun 2024 (modified: 06 Aug 2024) · ACL ARR 2024 June Submission · License: CC BY 4.0
Abstract: Large language models (LLMs) have shown strong results on a range of applications, including regression and scoring tasks. Typically, one obtains outputs from an LLM via autoregressive sampling from the model’s output distribution. We show that this inference strategy can be sub-optimal for common regression and scoring evaluation metrics. As a remedy, we build on prior work on Minimum Bayes Risk decoding, and propose alternate inference strategies for regression and scoring that estimate the Bayes-optimal solution for the given metric in closed-form from sampled responses. We show that our proposal yields significant improvements over baselines across datasets and models.
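The closed-form idea in the abstract can be illustrated with a minimal sketch (the helper name and sample values below are hypothetical, not from the paper): for regression, the Bayes-risk minimizer under squared error is the mean of the sampled predictions, while under absolute error it is the median, so the metric-aware estimate can be computed directly from samples instead of taking a single autoregressive output.

```python
import statistics

def bayes_optimal_prediction(samples, metric="squared_error"):
    """Closed-form Bayes-optimal point estimate from sampled numeric outputs.

    Hypothetical illustration: under squared error the risk minimizer is the
    sample mean; under absolute error it is the sample median.
    """
    if metric == "squared_error":
        return statistics.fmean(samples)
    if metric == "absolute_error":
        return statistics.median(samples)
    raise ValueError(f"unsupported metric: {metric}")

# e.g. scores sampled from an LLM asked to rate an item on a 1-5 scale
samples = [3.0, 4.0, 4.0, 5.0, 4.0]
print(bayes_optimal_prediction(samples, "squared_error"))   # 4.0
print(bayes_optimal_prediction(samples, "absolute_error"))  # 4.0
```

Note how the two metrics can disagree on skewed sample sets, which is exactly why the choice of estimator should match the evaluation metric.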
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: MBR, regression, scoring, ranking, LLM
Contribution Types: NLP engineering experiment, Theory
Languages Studied: English
Submission Number: 3489