Keywords: diffusion models, compositional models, model coordination, music generation, conditional image generation
TL;DR: We propose to expand the generative domain of pre-trained diffusion models by training a coordinator that reconciles the outputs of pre-trained denoisers.
Abstract: In this paper, we propose Diffusion Domain Expansion (DDE), a method that efficiently extends pre-trained diffusion models to generate larger objects and handle more complex conditioning beyond their original capabilities. Our method employs a compact trainable network designed to coordinate the denoised outputs of pre-trained diffusion models. We demonstrate that the coordinator can be universally simple while generalizing to domains larger than those seen during training. We evaluate DDE on long audio track generation and conditional image generation, demonstrating its applicability across domains. DDE outperforms other approaches to coordinated generation with diffusion models in both qualitative and quantitative evaluations.
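The abstract describes a compact coordinator that reconciles denoised outputs from a frozen pre-trained model so that generation extends beyond the model's native domain size. A minimal sketch of that idea, using an illustrative sliding-window setup in which the "coordinator" is a simple averaging step (the window size, stride, stand-in denoiser, and averaging rule are all assumptions for illustration, not the paper's actual architecture, where the coordinator is a trainable network):

```python
import numpy as np

WINDOW = 8   # assumed native domain size of the pretrained denoiser
STRIDE = 4   # assumed overlap between adjacent windows

def pretrained_denoiser(x_window, t):
    """Stand-in for a frozen pretrained denoiser: shrinks noise each step."""
    return x_window * (1.0 - 1.0 / t)

def coordinate(windows, total_len):
    """Toy coordinator: blend overlapping denoised windows by averaging.
    In DDE this role is played by a compact trainable network."""
    out = np.zeros(total_len)
    weight = np.zeros(total_len)
    for start, w in windows:
        out[start:start + WINDOW] += w
        weight[start:start + WINDOW] += 1.0
    return out / np.maximum(weight, 1.0)

def denoise_long(x, steps=10):
    """Denoise a sample longer than WINDOW by denoising overlapping
    windows with the frozen model and reconciling them each step."""
    for t in range(steps, 1, -1):
        windows = [(s, pretrained_denoiser(x[s:s + WINDOW], t))
                   for s in range(0, len(x) - WINDOW + 1, STRIDE)]
        x = coordinate(windows, len(x))
    return x

rng = np.random.default_rng(0)
long_sample = denoise_long(rng.standard_normal(24))  # 3x the window size
print(long_sample.shape)
```

The key point the sketch illustrates is that the pretrained model only ever sees inputs of its original size; the coordinator alone is responsible for making the overlapping windows agree on the larger domain.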
Submission Number: 80