Fast Multi-Mode Adaptive Generative Distillation for Continually Learning Diffusion Models

23 Sept 2024 (modified: 14 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: diffusion model, continual learning, transfer learning
TL;DR: We enhance diffusion models for continual learning by proposing methods that maintain image quality, reduce computational costs, and ensure balanced image generation across classes.
Abstract: Diffusion models are powerful generative models, but their computational demands, vulnerability to catastrophic forgetting, and class imbalance in generated data pose significant challenges in continual learning scenarios. In this paper, we introduce Fast Multi-Mode Adaptive Generative Distillation (MAGD), a novel approach designed to address these three core challenges. MAGD combines generative replay and knowledge distillation, enhancing the continual training of diffusion models through three key innovations: (1) Noisy Intermediate Generative Distillation (NIGD), which leverages intermediate noisy images during the reverse diffusion process to improve data utility and preserve image quality without additional computational cost; (2) Class-Guided Generative Distillation (CGGD), which uses classifier guidance to ensure balanced class representation in generated images, addressing the class imbalance found in traditional methods; and (3) Signal-Guided Generative Distillation (SGGD), which reduces computational overhead while maintaining image clarity by reusing the model’s denoising capabilities across tasks. Our experimental results on Fashion-MNIST, CIFAR-10, and CIFAR-100 demonstrate that MAGD significantly outperforms existing methods in both image quality, measured by Fréchet Inception Distance (FID), and class balance, measured by Kullback-Leibler Divergence (KLD). Moreover, MAGD achieves competitive results with far fewer generation steps than traditional methods, making it a practical solution for real-world continual learning applications.
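Since this page carries only the abstract, the sketch below is a rough illustration of the replay-plus-distillation training step the abstract describes, not the authors' implementation. It assumes a standard epsilon-prediction DDPM interface (`student(x_t, t)` returns predicted noise), a precomputed `alphas_cumprod` schedule, and a batch `replay_x0` generated offline by a frozen copy of the previous-task model; every helper name here is hypothetical.

```python
import torch
import torch.nn.functional as F

def q_sample(x0, t, alphas_cumprod, noise):
    # Standard forward diffusion: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise

def continual_distillation_step(student, teacher, new_x0, replay_x0,
                                alphas_cumprod, T=1000):
    """One generic generative-distillation step (illustrative only).

    `teacher` is a frozen copy of the model after the previous task;
    `replay_x0` is a batch it generated offline. All tensors are assumed
    to live on the same device.
    """
    device = new_x0.device

    # New-task objective: ordinary noise-prediction loss on real data.
    t_new = torch.randint(0, T, (new_x0.size(0),), device=device)
    eps_new = torch.randn_like(new_x0)
    x_t_new = q_sample(new_x0, t_new, alphas_cumprod, eps_new)
    loss_new = F.mse_loss(student(x_t_new, t_new), eps_new)

    # Distillation objective: on noisy versions of teacher-generated replay
    # images, match the frozen teacher's noise predictions so the denoising
    # behaviour learned on earlier tasks is retained.
    t_old = torch.randint(0, T, (replay_x0.size(0),), device=device)
    eps_old = torch.randn_like(replay_x0)
    x_t_old = q_sample(replay_x0, t_old, alphas_cumprod, eps_old)
    with torch.no_grad():
        target = teacher(x_t_old, t_old)
    loss_old = F.mse_loss(student(x_t_old, t_old), target)

    return loss_new + loss_old
```

In such a pipeline the teacher would be frozen before each new task (e.g. `teacher = copy.deepcopy(student).eval()`). The paper's three variants refine where the distillation signal comes from: NIGD taps the intermediate noisy images already produced during the teacher's reverse pass rather than re-noising samples, CGGD uses classifier guidance to draw replay samples evenly across classes, and SGGD cuts the number of generation steps; none of those refinements is shown in this generic sketch.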
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3101