Steering Large Text-to-Image Model for Kandinsky Synthesis Through Preference-Based Prompt Optimization

Aven Le Zhou, Wei Wu, Yu-Ao Wang, Kang Zhang

Published: 01 Jan 2025, Last Modified: 21 Jul 2025. Venue: EvoMUSART 2025. License: CC BY-SA 4.0

Abstract: With the advancement of neural generative capabilities, the art community has increasingly embraced GenAI (Generative Artificial Intelligence), particularly large text-to-image models, for producing aesthetically compelling results. However, generation is often non-deterministic and demands tedious trial and error, as users frequently struggle to devise prompts that achieve their desired outcomes. This paper introduces a prompting-free generative approach that applies a genetic algorithm and real-time iterative human feedback to optimize prompt generation, enabling the creation of user-preferred abstract art, e.g., in Kandinsky’s Bauhaus style. The proposed two-part approach begins with constructing an Artist Model capable of deterministically generating Kandinsky-style paintings. The second phase integrates real-time user feedback to optimize prompt generation, yielding an “Optimized Prompting Model” that adapts to user preferences and automatically generates prompts. Combined with the Artist Model, this approach allows users to create Kandinsky-style paintings tailored to their preferences.
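
To make the idea of preference-based prompt optimization concrete, the following is a minimal sketch of a genetic algorithm that evolves prompts under a rating function. It is not the authors' implementation: the abstract does not specify the prompt encoding, the genetic operators, or the Artist Model interface, so the vocabulary `VOCAB`, the functions `crossover`, `mutate`, and `evolve`, and the `simulated_user` rating function (a stand-in for real-time human feedback) are all assumptions made for illustration.

```python
import random

# Hypothetical prompt vocabulary; the paper's actual tokens are not given.
VOCAB = ["circles", "triangles", "grid", "diagonal lines", "primary colors",
         "muted palette", "overlapping shapes", "sparse layout"]


def random_prompt(rng, k=3):
    """A prompt is modeled as a tuple of k distinct vocabulary tokens."""
    return tuple(rng.sample(VOCAB, k))


def crossover(a, b, rng):
    """Sample a child from the pooled, de-duplicated tokens of both parents."""
    pool = list(dict.fromkeys(a + b))
    return tuple(rng.sample(pool, len(a)))


def mutate(prompt, rng, rate=0.2):
    """Replace each token with an unused one with probability `rate`."""
    out = list(prompt)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = rng.choice([t for t in VOCAB if t not in out])
    return tuple(out)


def simulated_user(prompt):
    """Stand-in for human feedback: this 'user' likes two specific tokens."""
    return sum(t in ("circles", "primary colors") for t in prompt)


def evolve(rate_fn, generations=5, pop_size=6, seed=0):
    """Keep the top-rated half each generation and refill with offspring."""
    rng = random.Random(seed)
    pop = [random_prompt(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=rate_fn, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return max(pop, key=rate_fn)


best_prompt = evolve(simulated_user)
print(", ".join(best_prompt))
```

In an interactive setting, `rate_fn` would presumably be replaced by ratings a user gives to images rendered from each prompt. Because the top-rated half survives every generation, the best rating in the population never decreases under a fixed rating function.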