A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective

Published: 11 Jun 2025, Last Modified: 13 Jul 2025 · MemFM · CC BY 4.0
Keywords: Diffusion model, memorization, generative model
Abstract: This paper identifies a transition from generalization to memorization during the recursive training of diffusion models, offering a novel perspective on model collapse. Specifically, as models are iteratively trained on their own generated samples, they increasingly replicate training data instead of producing novel content. This transition is driven by the declining entropy of the synthetic training data produced in each training cycle, which serves as a clear indicator of model degradation. Motivated by this insight, we propose an entropy-based data selection strategy to mitigate the shift toward memorization and the accompanying quality degradation. Empirical results show that our approach significantly enhances visual quality and diversity in recursive generation, effectively preventing model collapse.
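The abstract does not specify how the entropy-based selection is implemented. The sketch below illustrates one plausible reading under stated assumptions: per-sample entropy is estimated via a Shannon-entropy proxy over pixel-intensity histograms, and the highest-entropy fraction of synthetic samples is retained for the next recursive training cycle. The function names, the histogram estimator, and the `keep_ratio` parameter are all hypothetical, not the paper's actual method.

```python
import numpy as np

def sample_entropy(images, bins=256):
    """Shannon entropy of each image's pixel-intensity histogram.

    NOTE: this histogram proxy is an illustrative assumption; the paper
    does not state which entropy estimator it uses.
    """
    entropies = []
    for img in images:
        hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
        p = hist / hist.sum()
        p = p[p > 0]  # drop empty bins to avoid log(0)
        entropies.append(-np.sum(p * np.log(p)))
    return np.array(entropies)

def select_high_entropy(images, keep_ratio=0.5):
    """Keep the highest-entropy fraction of synthetic samples for the
    next recursive training cycle (hypothetical selection rule)."""
    ent = sample_entropy(images)
    k = max(1, int(keep_ratio * len(images)))
    idx = np.argsort(ent)[-k:]  # indices of the k most "diverse" samples
    return [images[i] for i in idx]

# Toy usage: 100 random 32x32 grayscale "generations" with values in [0, 1]
rng = np.random.default_rng(0)
synthetic = [rng.random((32, 32)) for _ in range(100)]
retained = select_high_entropy(synthetic, keep_ratio=0.5)
print(len(retained))  # 50 samples kept for the next training round
```

The intuition matching the abstract: as recursive training lowers the entropy of the synthetic pool, filtering out the lowest-entropy (most repetitive) samples before each cycle should slow the drift toward memorization.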
Submission Number: 24