Abstract: Latent diffusion models excel at producing high-quality images from text. Yet, concerns have been raised about the lack of diversity in the generated imagery. To address this, we introduce Diverse Diffusion, a method for boosting image diversity beyond gender and ethnicity, extending to richer dimensions such as color diversity.
Diverse Diffusion is a general unsupervised technique that can be applied to existing text-to-image models. Our approach finds vectors in the Stable Diffusion latent space that are distant from one another in Euclidean distance: we repeatedly sample latent vectors until we obtain a set whose pairwise distances meet the desired threshold and whose size matches the required batch size.
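A minimal sketch of one plausible reading of this sampling loop follows (greedy rejection sampling in PyTorch; `min_dist` is a hypothetical threshold and the acceptance rule is an assumption, since the exact procedure is specified in the paper, not here):

```python
import torch

def sample_diverse_latents(batch_size, latent_shape=(4, 64, 64),
                           min_dist=181.0, max_tries=10_000,
                           generator=None):
    """Greedily collect latent vectors whose pairwise Euclidean
    distances all exceed `min_dist`.

    `latent_shape` matches Stable Diffusion's 4x64x64 latents for
    512x512 images. `min_dist` is a hypothetical value: two independent
    N(0, I) latents of this size sit roughly sqrt(2 * 16384) ~ 181
    apart, so this threshold keeps only above-average-distance sets.
    """
    kept = []
    for _ in range(max_tries):
        z = torch.randn(latent_shape, generator=generator)
        # Accept the candidate only if it is far from every kept latent.
        if all(torch.dist(z, other) >= min_dist for other in kept):
            kept.append(z)
            if len(kept) == batch_size:
                return torch.stack(kept)  # (batch_size, 4, 64, 64)
    raise RuntimeError("No sufficiently distant set found; "
                       "lower min_dist or raise max_tries.")

# Example: four mutually distant latents to feed a Stable Diffusion pipeline.
latents = sample_diverse_latents(batch_size=4)
```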
To evaluate the effectiveness of our method, we conduct experiments examining several characteristics, including color diversity, the LPIPS metric, and ethnicity/gender representation in images featuring humans.
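As an illustration, one common way to turn LPIPS into a diversity score is to average it over all pairs of images generated from the same prompt (a sketch using the `lpips` package; the pairwise-averaging protocol here is an assumption, not necessarily the paper's exact evaluation setup):

```python
import itertools
import lpips
import torch

# LPIPS with an AlexNet backbone; inputs are RGB tensors in [-1, 1],
# shaped (N, 3, H, W).
loss_fn = lpips.LPIPS(net='alex')

def mean_pairwise_lpips(images: torch.Tensor) -> float:
    """Average LPIPS over all image pairs; higher means more diverse."""
    dists = [loss_fn(images[i:i + 1], images[j:j + 1]).item()
             for i, j in itertools.combinations(range(len(images)), 2)]
    return sum(dists) / len(dists)
```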
Our results underscore the importance of diversity in generating realistic and varied images, offering valuable insights for improving text-to-image models. By enhancing image diversity, our approach contributes to the creation of more inclusive and representative AI-generated art.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The definition of latent vectors is clarified.
The difference between LPIPS and LPIPS loss is clarified.
Citations are moved to the claim “While they have shown impressive results in generating high-quality images from textual descriptions, there have been concerns regarding potential causes of their usage for various sensitive applications.”
A clarification is added that the distance is Euclidean.
The choice of k, the minimal number of computation cycles, and prompt variations are discussed.
Assigned Action Editor: ~Simon_Kornblith1
Submission Number: 1956