Best Prompts for Text-to-Image Models and How to Find Them

Nikita Pavlichenko, Dmitry Ustalov

Published: 17 Jul 2023, Last Modified: 23 Apr 2026, SIGIR '23, Everyone, CC BY 4.0

Abstract: Advancements in text-guided diffusion models have allowed for the creation of visually appealing images similar to those created by professional artists. The effectiveness of these models depends on the composition of the textual description, known as the prompt, and its accompanying keywords. Evaluating aesthetics computationally is difficult, so human input is necessary to determine the ideal prompt formulation and keyword combination. In this study, we propose a human-in-the-loop method for discovering the most effective combination of prompt keywords using a genetic algorithm. Our approach demonstrates how this can lead to an improvement in the visual appeal of images generated from the same description.
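
As a rough illustration of the idea described in the abstract, and not the authors' implementation, a genetic algorithm over keyword subsets might be sketched as below. The names `build_prompt`, `evolve_keywords`, and the `fitness` callable are hypothetical. In a human-in-the-loop setting, `fitness` would be backed by human judgments of the generated images; here it is replaced by a simulated rater. Truncation selection, uniform crossover, bit-flip mutation, and elitism are generic choices assumed for the sketch, and the keyword list is made up for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Genome = Tuple[bool, ...]  # one on/off flag per candidate keyword


def build_prompt(description: str, keywords: Sequence[str], genome: Genome) -> str:
    """Append the keywords switched on in `genome` to the image description."""
    chosen = [kw for kw, on in zip(keywords, genome) if on]
    return ", ".join([description, *chosen])


def evolve_keywords(
    keywords: Sequence[str],
    fitness: Callable[[Genome], float],
    population_size: int = 8,
    generations: int = 10,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Genome:
    """Search for a high-scoring keyword subset (assumes population_size >= 2)."""
    rng = random.Random(seed)
    n = len(keywords)
    # Random initial population: each keyword is included with probability 0.5.
    population: List[Genome] = [
        tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(population_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, population_size // 2)]  # truncation selection
        children: List[Genome] = [ranked[0]]  # elitism: keep the current best
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: take each flag from one of the two parents.
            child = tuple(rng.choice(pair) for pair in zip(a, b))
            # Bit-flip mutation: toggle each flag with a small probability.
            child = tuple((not g) if rng.random() < mutation_rate else g for g in child)
            children.append(child)
        population = children
    return max(population, key=fitness)


# Hypothetical keyword pool and a simulated rater standing in for human input.
KEYWORDS = ["highly detailed", "oil painting", "4k", "blurry", "trending"]
PREFERRED = {"highly detailed", "oil painting", "4k"}


def simulated_rating(genome: Genome) -> float:
    """Toy score: +1 per preferred keyword, -1 per other keyword selected."""
    return sum(
        (1.0 if kw in PREFERRED else -1.0)
        for kw, on in zip(KEYWORDS, genome)
        if on
    )


best = evolve_keywords(KEYWORDS, simulated_rating)
print(build_prompt("a castle on a hill", KEYWORDS, best))
```

In practice each fitness evaluation would require generating images for the candidate prompt and collecting human ratings, so the population size and number of generations would be constrained by annotation cost rather than compute.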