Keywords: probability, sampling, compression, speculative decoding, multi-draft speculative sampling, large language models
TL;DR: We introduce a technique for coupling probability distributions when several samples are available from one of the distributions, and give applications to multi-draft speculative decoding and distributed lossy compression with side information.
Abstract: We study a relaxation of the problem of coupling probability distributions: a list of samples is generated from one distribution, and an *accept* is declared if any one of these samples matches the single sample generated from the other distribution.
We propose a novel method for generating samples that extends the Gumbel-max sampling suggested in Daliri et al. (2025) for coupling probability distributions. We also establish a corresponding lower bound on the acceptance probability, which we call the *list matching lemma*.
We next discuss two applications of our setup.
First, we develop a new mechanism for multi-draft speculative sampling that is simple to implement and achieves performance competitive with baselines such as SpecTr and SpecInfer across a range of language tasks.
Our method also guarantees a certain degree of *drafter invariance* with respect to the output tokens, a property not supported by existing schemes.
We also provide a theoretical lower bound on the token-level acceptance probability.
As our second application, we consider distributed lossy compression with side information in a setting where a source sample is compressed and available to multiple decoders, each with independent side information.
We propose a compression technique that is based on our generalization of Gumbel-max sampling and show that it provides significant gains in experiments involving synthetic Gaussian sources and the MNIST image dataset.
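To make the coupling idea concrete, below is a minimal NumPy sketch of one natural way to extend Gumbel-max coupling to a list of drafts. The function names, the toy distributions, and the specific choice of reusing the first shared Gumbel vector for the target sample are illustrative assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

def gumbel_max(log_probs, gumbel_noise):
    """Gumbel-max trick: argmax(log p + G), with G i.i.d. Gumbel(0,1), is an exact sample from p."""
    return int(np.argmax(log_probs + gumbel_noise))

def list_coupled_sample(p, q, num_drafts, rng):
    """Draw `num_drafts` samples from p and one sample from q using shared Gumbel noise.

    Illustrative scheme only: each draft i uses its own Gumbel vector G_i, and the
    target sample from q reuses G_0, so every marginal stays exact while the chance
    that the target appears in the draft list is boosted.
    """
    gumbels = rng.gumbel(size=(num_drafts, len(p)))       # shared randomness
    drafts = [gumbel_max(np.log(p), g) for g in gumbels]  # i.i.d. samples from p
    target = gumbel_max(np.log(q), gumbels[0])            # exact sample from q, coupled to draft 0
    return drafts, target

# Monte Carlo comparison of the "list matching" acceptance rate with and without coupling.
rng = np.random.default_rng(0)
p = np.array([0.40, 0.30, 0.15, 0.10, 0.05])   # toy draft distribution
q = np.array([0.35, 0.30, 0.20, 0.10, 0.05])   # toy target distribution
trials, k = 20_000, 4

coupled = sum(t in d for d, t in (list_coupled_sample(p, q, k, rng) for _ in range(trials)))
independent = sum(
    rng.choice(len(q), p=q) in rng.choice(len(p), size=k, p=p) for _ in range(trials)
)
print(f"accept rate  coupled: {coupled / trials:.3f}   independent: {independent / trials:.3f}")
```

On this toy pair of close distributions, the coupled acceptance rate comes out noticeably higher than the independent baseline; the gap between these two rates is the kind of quantity the list matching lemma is meant to lower-bound.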
Supplementary Material: zip
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 18712