Text-to-Distribution Prediction with Quantile Tokens and Neighbor Context

ACL ARR 2026 January Submission 7036 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Quantile Regression, Text Regression, Large Language Models
Abstract: Many applications of LLM-based text regression require predicting a full conditional distribution rather than a single point value. We study \emph{distributional regression} under empirical-quantile supervision, where each input is paired with multiple observed quantile outcomes, and the target distribution is represented by a dense grid of quantiles. We address two key limitations of current approaches: the lack of local grounding for distribution estimates, and the reliance on shared representations that create an indirect bottleneck between inputs and quantile outputs. In this paper, we introduce \emph{Quantile Token Regression}, which, to our knowledge, is the first work to insert dedicated quantile tokens into the input sequence, enabling direct input-output pathways for each quantile through self-attention. We further augment these quantile tokens with retrieval, incorporating semantically similar \emph{neighbor} instances and their empirical distributions to ground predictions with local evidence from similar instances. We also provide the first theoretical analysis of loss functions for quantile regression, clarifying which distributional objectives each optimizes. Experiments on the Inside Airbnb and StackSample benchmark datasets with LLMs ranging from 1.7B to 14B parameters show that quantile tokens with neighbors consistently outperform baselines ($\sim$4 points lower MAPE and 2$\times$ narrower prediction), with especially large gains on smaller and more challenging datasets where quantile tokens produce substantially sharper and more accurate distributions.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: NLP Applications
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Theory
Languages Studied: English
Submission Number: 7036