Abstract: Even when decoding with temperature $T=0$, large language models (LLMs) can produce divergent outputs for identical inputs. Recent work consistently points to implementation-level sources of nondeterminism, including batch-size variation, kernel non-invariance, and floating-point non-associativity. In this work, we formalize this behavior by introducing the notion of background temperature $T_{\mathrm{bg}}$: the effective temperature induced by an implementation-dependent perturbation process that persists even when the nominal temperature is $T=0$. We give precise definitions, show how $T_{\mathrm{bg}}$ relates to a stochastic perturbation governed by the inference environment $I$, and propose an empirical protocol to estimate $T_{\mathrm{bg}}$ via the equivalent temperature $T_n(I)$ of an ideal reference system. We conclude with a set of pilot experiments, run on a representative pool of models from the major LLM providers, that demonstrate the idea, and we outline implications for reproducibility, evaluation, and deployment.
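As a minimal illustration of one source named in the abstract, floating-point non-associativity, the sketch below sums the same three terms in two groupings and then decodes greedily against a near-tied competitor. It is not the paper's protocol: the values, the `rival` logit, and the `greedy_token` helper are hypothetical and were chosen only to make the effect visible in plain Python.

```python
def greedy_token(logits):
    """Return the index of the largest logit (ties go to the lowest index)."""
    return max(range(len(logits)), key=lambda i: logits[i])

# The same three terms accumulated in two groupings. These are hypothetical
# stand-ins for two reduction orders that different kernels or batch sizes
# might use; real kernels reduce over many more terms.
order_a = (0.1 + 0.2) + 0.3
order_b = 0.1 + (0.2 + 0.3)

# Hypothetical competing logit that is nearly tied with the sums above.
rival = 0.6

# Greedy (T = 0) decoding over [rival, candidate] under each grouping.
pick_a = greedy_token([rival, order_a])
pick_b = greedy_token([rival, order_b])

print(order_a == order_b)  # the two groupings disagree in the last bit
print(pick_a, pick_b)      # so the greedy choice differs between them
```

The two sums differ by a single unit in the last place, which is enough to change the argmax when two logits are nearly tied. This is the kind of implementation-dependent perturbation that can surface even at nominal $T=0$.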
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The paper has been extensively revised in response to the reviewers’ feedback and now incorporates new material that strengthens the overall contribution. Several additional experiments were designed and conducted during the review period, further supporting the validity and robustness of the background temperature concept. An additional reference model has been introduced to improve the estimation of $T_{\mathrm{bg}}$, together with two supplementary measures of variability. Additional analyses investigate how the estimate depends on the time at which it is performed and validate the background temperature estimates using a reference model with a known sampling temperature.
Moreover, the theoretical component of the work has been substantially strengthened through a rigorous mathematical reformulation of $T_{\mathrm{bg}}$, including a clearer and more formal description of the estimation procedure. The introduction and conclusions have also been revised to better reflect the expanded set of results and their implications.
Assigned Action Editor: ~Yonatan_Bisk1
Submission Number: 6133