Efficient Global Data Attribution for Diffusion Models

ICLR 2024 Workshop DPFM Submission57 Authors

Published: 04 Mar 2024, Last Modified: 03 May 2024, DPFM 2024 Poster, CC BY 4.0

Keywords: Data Attribution, Diffusion Models

Abstract: With the widespread use of diffusion models, effective data attribution is needed to ensure fair acknowledgment for contributors of high-quality training samples and to identify potential sources of harmful content. In this early work, we introduce a novel framework for removal-based data attribution in diffusion models that leverages sparsified unlearning. This approach significantly improves both the computational scalability and the effectiveness of removal-based data attribution. In our experiments, we use datamodel attributions to attribute the FID of a diffusion model back to CIFAR-10 training images, and we obtain a better linear datamodeling score (LDS) than datamodel attributions based on naive retraining.

Submission Number: 57
