Pareto-optimal Non-uniform Language Generation

Published: 18 Dec 2025 · Last Modified: 21 Feb 2026 · ALT 2026 · CC BY 4.0
Keywords: Language Generation in the Limit, Non-uniform Language Generation, Pareto-optimality
TL;DR: We study Pareto-optimality for non-uniform language generation in the limit, giving algorithms that are (almost) Pareto-optimal in different settings.
Abstract: Kleinberg and Mullainathan (2024) recently proposed an interesting model for language generation in the limit: given a countable collection of languages, and an adversary enumerating the strings of some language $L$ from the collection, the objective is to generate _new_ strings from the target language, such that all strings generated beyond some finite time are valid. Li, Raman, and Tewari (2024) and Charikar and Pabbaraju (2024) showed strong _non-uniform_ generation guarantees in this model, giving algorithms that generate new valid strings from $L$ after seeing a number of distinct input strings $t(L)$ that depends only on $L$ (and the collection), but _not_ on the enumeration order. However, for both of these works, the language-wise _generation times_ $t(L)$ of the algorithm can be strictly sub-optimal. In this work, we study _Pareto-optimality_ of non-uniform language generation in the limit. We propose an algorithm whose generation times $t^\star(L)$ are (almost) Pareto-optimal: any other algorithm whose generation time for some language $L$ is strictly smaller than $t^\star(L)$ _must have_ a generation time for some _other_ language $L'$ that is strictly worse than $t^\star(L')$. Pareto-optimality is essentially the best that one can achieve for non-uniform generation. Our algorithmic framework conveniently adapts to further give Pareto-optimal non-uniform generation algorithms in the practically motivated settings of _noisy_ as well as _representative_ generation.
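As an illustration (not part of the paper), the Pareto-optimality condition on language-wise generation-time profiles can be sketched as a dominance check between two hypothetical profiles, each mapping a language to the number of distinct input strings needed before generation succeeds:

```python
from math import inf

def pareto_dominates(t_a, t_b):
    """Return True if profile t_a Pareto-dominates profile t_b.

    t_a, t_b: dicts mapping (hypothetical) language ids to generation
    times. A dominates B if A is no worse on every language and strictly
    better on at least one. A Pareto-optimal profile t_star is one that
    no other achievable profile dominates.
    """
    langs = t_a.keys() | t_b.keys()
    no_worse = all(t_a.get(L, inf) <= t_b.get(L, inf) for L in langs)
    strictly_better = any(t_a.get(L, inf) < t_b.get(L, inf) for L in langs)
    return no_worse and strictly_better

# Illustrative profiles: 'other' is faster on L1 but strictly worse on L2,
# so it does not dominate t_star -- matching the abstract's guarantee that
# beating t_star on some L forces a strictly worse time on some other L'.
t_star = {"L1": 3, "L2": 5}
other = {"L1": 2, "L2": 7}
assert not pareto_dominates(other, t_star)
```

The language ids and the concrete time values here are made up for illustration; the paper's guarantee is that no algorithm can achieve a profile dominating $t^\star$ in this sense.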
Submission Number: 150