On Union-Closedness of Language Generation

Steve Hanneke; Amin Karbasi; Anay Mehrotra; Grigoris Velegkas

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, Venue: NeurIPS 2025 poster, License: CC BY 4.0

Keywords: language generation, union-closedness, online learning, learning theory

TL;DR: We show that language generation in the limit is not closed under finite unions of collections of languages.

Abstract: We investigate language generation in the limit, a model introduced by Kleinberg and Mullainathan and extended by Li, Raman, and Tewari. While Kleinberg and Mullainathan proved that generation is possible for all countable collections, Li, Raman, and Tewari defined a hierarchy of generation notions (uniform, non-uniform, and generatable) and explored their feasibility for uncountable collections. Our first set of results resolves two open questions of Li et al. by proving that finite unions of generatable or non-uniformly generatable classes need not be generatable. These follow from a stronger result: there is a non-uniformly generatable class and a uniformly generatable class whose union is non-generatable. This adds to the ways in which language generation in the limit differs from traditional tasks in statistical learning theory, such as classification, which are closed under finite unions. In particular, it implies that, given two generators for different collections, one cannot in general combine them to obtain a single "more powerful" generator, which rules out this notion of boosting. Our construction also addresses a third open question of Li et al.: whether there are uncountable classes that are non-uniformly generatable and do not satisfy the eventually unbounded closure (EUC) condition that they introduced. Our approach uses carefully constructed classes along with a novel diagonalization argument that could be of independent interest in the growing area of language generation.

Supplementary Material: zip

Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)

Submission Number: 18302
