Keywords: LLMs for structured data, numeracy of LLMs, in-context learning, uncertainty estimation, mechanistic interpretability
TL;DR: We demonstrate that LLM hidden states encode rich numerical information, enabling us to recover point predictions and uncertainty estimates of LLMs without autoregressive decoding.
Abstract: Large Language Models (LLMs) have recently been successfully applied to regression tasks—such as time series forecasting and tabular prediction—by leveraging their in-context learning abilities. However, their autoregressive decoding process is ill-suited to continuous-valued outputs, and obtaining numerical predictive distributions typically requires repeated sampling, leading to high computational cost. In this work, we investigate whether distributional properties of LLM predictions (e.g., mean, median, quantiles) can be recovered directly from an LLM's internal representations, without explicit autoregressive generation. Our results suggest that LLM embeddings carry informative signals about numerical uncertainty, and that summary statistics of their predictive distributions can be approximated with reduced computational overhead.
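To make the idea concrete, the sketch below illustrates one plausible instantiation of the approach described in the abstract: training a linear probe on cached LLM hidden states to predict quantiles of the model's predictive distribution, so that inference needs only a single forward pass instead of repeated sampling. This is a minimal assumption-laden sketch, not the paper's actual method: the probe architecture, quantile levels, and all variable names (hidden, targets, pinball_loss) are hypothetical, and real hidden states and sampling-based quantile targets are stood in for by random tensors.

import torch

# Hypothetical setup: assume we have cached last-token hidden states from an
# LLM prompted with in-context regression examples (n_prompts x d_model), and,
# for training only, empirical quantiles of each prompt's sampled outputs.
torch.manual_seed(0)
n, d = 512, 4096                                  # prompts, hidden-state width
quantile_levels = torch.tensor([0.1, 0.5, 0.9])   # assumed target quantiles

hidden = torch.randn(n, d)                        # stand-in for LLM hidden states
targets = torch.randn(n, 1) + quantile_levels    # stand-in quantile targets

probe = torch.nn.Linear(d, len(quantile_levels))  # one output per quantile
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

def pinball_loss(pred, target, taus):
    # Standard quantile (pinball) loss: an asymmetric absolute error whose
    # minimizer is the tau-quantile of the target distribution.
    err = target - pred
    return torch.maximum(taus * err, (taus - 1) * err).mean()

for step in range(200):
    opt.zero_grad()
    loss = pinball_loss(probe(hidden), targets, quantile_levels)
    loss.backward()
    opt.step()

# At inference, one LLM forward pass plus this linear probe yields approximate
# quantiles directly, avoiding repeated autoregressive decoding.

Under these assumptions, the probe is trained once against sampling-based targets; afterward, uncertainty estimates come essentially for free with each forward pass, which is the computational saving the abstract refers to.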
Submission Number: 40