Keywords: self-referential learning, model collapse, entropy reservoir, Bregman projection, information geometry, generative AI
TL;DR: Self-training is a stochastic Bregman-projection loop whose entropy inevitably vanishes unless the loop is continuously mixed with a high-entropy reservoir; this single mechanism explains both model collapse and the success of existing anti-collapse heuristics.
Abstract: Self-referential learning---training a model on data it generated itself---promises
boundless scalability but chronically suffers from \emph{model collapse}: language
models degenerate into repetitive text, GANs drop modes, and reinforcement-learning
policies over-exploit. Although practitioners employ ad~hoc fixes such as real-data
mixing, entropy bonuses, knowledge distillation, or retrieval-augmented generation,
a single principle that explains both the failure mode and the success of these
fixes has remained elusive.
We present \textbf{Entropy-Reservoir Bregman Projection} (ERBP), an
information-geometric framework that unifies these phenomena. We model the closed
loop as a stochastic Bregman projection sequence in distribution space. Without
external coupling, finite-sample noise forces the system to project onto an
ever-shrinking empirical support, causing exponential entropy decay and eventual
collapse. Introducing an \emph{Entropy Reservoir}---a high-entropy distribution
mixed into each projection---injects a controllable entropy flux that provably
stabilises the dynamics.
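In symbols, a schematic reading of the loop (our notation, not verbatim from the paper): with $\Pi$ the Bregman projection onto the model family, $\hat{p}_t$ the empirical distribution of the self-generated samples, $r$ the reservoir, and $\lambda \in [0,1]$ the coupling coefficient, the mixed update is
\[
p_{t+1} = (1-\lambda)\,\Pi(\hat{p}_t) + \lambda\, r,
\qquad\text{so}\qquad
H(p_{t+1}) \;\ge\; (1-\lambda)\,H\!\big(\Pi(\hat{p}_t)\big) + \lambda\, H(r) \;\ge\; \lambda\, H(r)
\]
by concavity of the Shannon entropy $H$; at $\lambda = 0$ the floor vanishes and the pure self-projection loop is free to collapse.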
Our theory yields (i) a necessary condition for collapse, (ii) a sufficient
condition that guarantees a non-trivial entropy floor, and (iii) closed-form rates
that depend only on sample size and the strong-convexity/Lipschitz constants of
the Bregman generator. Experiments on large-language-model self-training, Soft
Actor-Critic in reinforcement learning, and GAN optimisation validate our
predictions and show that disparate stabilisation heuristics correspond to
specific reservoir choices and coupling coefficients. ERBP thus transforms a
collection of folk remedies into a single, quantitative design rule: monitor and
budget your entropy flux.
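A minimal simulation sketch of these dynamics (illustrative only; the constants \texttt{k}, \texttt{n}, \texttt{steps}, and \texttt{lam} below are assumptions, not values from the paper). With the KL divergence as the Bregman generator, refitting a categorical model by maximum likelihood on its own samples is exactly the projection onto the empirical distribution, so the closed loop reduces to iterated multinomial resampling: without mixing, the support shrinks and entropy decays toward zero, while a uniform reservoir holds it above a floor.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    # Shannon entropy in nats, ignoring zero-probability atoms.
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def self_train(k=50, n=50, steps=500, lam=0.0):
    # Iterate: draw n samples from p, refit p as the empirical
    # distribution (the KL/MLE Bregman projection), then mix in
    # the reservoir with coupling coefficient lam.
    p = np.full(k, 1.0 / k)          # initial model: uniform over k atoms
    reservoir = np.full(k, 1.0 / k)  # high-entropy reservoir (uniform)
    for _ in range(steps):
        counts = rng.multinomial(n, p)
        p = counts / n                        # projection onto the sample
        p = (1.0 - lam) * p + lam * reservoir # reservoir coupling
    return entropy(p)

print(f"closed loop (lam=0.0):    final entropy {self_train(lam=0.0):.3f} nats")
print(f"with reservoir (lam=0.1): final entropy {self_train(lam=0.1):.3f} nats")
\end{verbatim}
The uniform reservoir is only the simplest choice; on the abstract's reading, real-data mixing, entropy bonuses, distillation, and retrieval all amount to different choices of $r$ and $\lambda$.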
Supplementary Material: zip
Primary Area: generative models
Submission Number: 24491