Abstract: We study language generation in the limit, which was introduced by Kleinberg and
Mullainathan [KM24] building on classical works of Gold [Gol67] and Angluin [Ang79]. The
result of [KM24] is an algorithm for generating from any countable language collection in the
limit. While their algorithm eventually generates strings from the target language K, it
sacrifices breadth, i.e., its ability to output all strings in K. The main open question of [KM24] was
whether this trade-off between consistency and breadth is necessary for language generation.
Recent work by Kalavasis, Mehrotra, and Velegkas [KMV24] proposed three definitions
for consistent language generation with breadth in the limit: generation with exact breadth,
generation with approximate breadth, and unambiguous generation. Concurrent and
independent work by Charikar and Pabbaraju [CP24a] introduced a different notion, called exhaustive
generation. Both of these works explore when language generation with (different notions of)
breadth is possible.
In this work, we fully characterize language generation for all these notions of breadth and
their natural combinations. Building on [CP24a; KMV24], we give an unconditional lower
bound for generation with exact breadth, removing a technical condition needed in [KMV24]
and extending the unconditional lower bound of [CP24a], which holds for specific collections;
our result shows that generation with exact breadth is characterized by Angluin’s condition
for identification from positive examples [Ang80]. Furthermore, we introduce a weakening of
Angluin’s condition and show that it tightly characterizes both generation with approximate
breadth and exhaustive generation, thus showing that these two notions are equivalent.
Moreover, we show that Angluin’s condition further characterizes unambiguous generation in the
limit as a corollary of a more general result that applies to a family of notions of breadth. We
discuss the implications of our results in the statistical setting of Bousquet, Hanneke, Moran,
van Handel, and Yehudayoff [BHMvY21]. Finally, we provide unconditional lower bounds for
stable generators, strengthening the results of [KMV24], and we show that for stable generators
all the aforementioned notions of breadth are characterized by Angluin’s condition. This gives
a separation between stable and unstable generators for generation with approximate breadth.