Better autoregressive regression with LLMs

Published: 22 Jan 2025, Last Modified: 11 Feb 2025 · ICLR 2025 Spotlight · CC BY 4.0
Keywords: regression, LLMs
Abstract: Large language models (LLMs) have proven successful on many machine learning tasks, including those that do not involve language generation. In particular, LLMs have been shown to be effective at solving regression problems, where the targets are real numbers. One common approach is to fine-tune the LLM with the log-perplexity loss and use autoregressive sampling at inference time. Another approach adds a predictive head and fine-tunes it with a suitable loss. Despite these successes, there has been no principled study of how to use decoder-based LLMs for regression. In this work, we compare prior approaches under a unified view and introduce RAFT, regression-aware fine-tuning, a novel approach based on the Bayes-optimal decision rule. We demonstrate how RAFT improves over established baselines on several benchmarks and model families.
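
To illustrate the distinction the abstract draws between autoregressive decoding and a Bayes-optimal decision rule, below is a minimal, self-contained Python sketch. It assumes a toy setting where a decoder LM assigns log-probabilities to a small set of candidate numeric outputs; the candidate values and log-probabilities are hypothetical, and taking the expectation under squared-error loss is one standard instance of a Bayes-optimal decision rule, not necessarily the paper's exact RAFT procedure.

import math

# Hypothetical candidate numeric outputs and their model log-probabilities.
# In practice these would come from scoring or sampling a fine-tuned decoder LLM;
# here they are hard-coded for illustration.
candidate_values = [1.0, 2.0, 3.0]
log_probs = [-2.3, -0.4, -1.6]

# Normalize over the candidate set to obtain a predictive distribution.
probs = [math.exp(lp) for lp in log_probs]
total = sum(probs)
probs = [p / total for p in probs]

# Greedy autoregressive decoding roughly corresponds to taking the mode.
mode_prediction = candidate_values[probs.index(max(probs))]

# Under squared-error loss, the Bayes-optimal prediction is the posterior mean,
# i.e. the expectation of the target under the model's predictive distribution.
bayes_prediction = sum(v * p for v, p in zip(candidate_values, probs))

print(f"mode (greedy-style) prediction: {mode_prediction:.3f}")
print(f"Bayes-optimal (squared-loss) prediction: {bayes_prediction:.3f}")

The point of the sketch is that the two rules can disagree: the mode picks the single most likely numeric string, while the Bayes-optimal rule for squared error averages over the model's distribution, which is what a regression-aware objective would target.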
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11905