Personalized Preference Optimization for Text-to-Image Generation using Large Language Models

ACL ARR 2024 June Submission5591 Authors

16 Jun 2024 (modified: 03 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Preference optimization is a crucial aspect of generative models, ensuring that the generated content aligns with users' preferences. While previous research has focused on optimizing for average preferences, text-to-image tasks require a personalized approach due to the diversity of individual preferences. In this study, we propose a two-stage framework for personalized preference optimization in text-to-image generation. The first stage, personalized image aesthetic assessment (PIAA), learns user preferences from a small amount of user image rating data. The second stage, prompt optimization, optimizes the text-to-image model's prompt to generate images that receive high scores from the learned preference model. We employ Large Language Models (LLMs) for the prompt optimization process. Through extensive experimentation with various configurations in the PIAA and prompt optimization stages, we demonstrate that our approach can generate novel images that align with individual user preferences, even with limited user data. Our research lays the foundation for future work on personalized content generation.
Paper Type: Short
Research Area: Generation
Research Area Keywords: cross-modal content generation
Languages Studied: English
Submission Number: 5591
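
Illustrative sketch (not from the paper): the two-stage framework described in the abstract can be pictured with a toy example. Everything below is an assumption made for illustration. The linear scorer stands in for the PIAA model, `toy_image_features` stands in for "generate an image and embed it", and `propose_rewrites` stands in for the LLM that rewrites prompts. The style terms, ratings, and function names are all hypothetical, and the authors' actual models and search procedure may differ substantially.

```python
from typing import List, Sequence, Tuple

Vector = List[float]

# Hypothetical style vocabulary; a real system would use learned image features.
STYLE_TERMS = ["watercolor", "neon", "minimalist"]


def predict(w: Vector, x: Vector) -> float:
    """Linear score: dot(w[:-1], x) + bias, where the bias is w[-1]."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def fit_preference_model(features: Sequence[Vector], ratings: Sequence[float],
                         lr: float = 0.2, epochs: int = 5000) -> Vector:
    """Stage 1 stand-in: fit a linear scorer to a few user ratings
    by batch gradient descent on mean squared error."""
    dim, n = len(features[0]), len(features)
    w = [0.0] * (dim + 1)  # last entry is the bias
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, ratings):
            err = predict(w, x) - y
            for i in range(dim):
                grad[i] += err * x[i]
            grad[dim] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def toy_image_features(prompt: str) -> Vector:
    """Assumed stand-in for running a text-to-image model and embedding the result."""
    return [1.0 if term in prompt else 0.0 for term in STYLE_TERMS]


def propose_rewrites(prompt: str) -> List[str]:
    """Assumed stand-in for an LLM proposing candidate prompt rewrites."""
    return [f"{prompt}, {term}" for term in STYLE_TERMS if term not in prompt]


def optimize_prompt(prompt: str, w: Vector, rounds: int = 3,
                    min_gain: float = 1e-3) -> Tuple[str, float]:
    """Stage 2 stand-in: greedy search that keeps a rewrite only if the
    learned preference score improves by at least min_gain."""
    best, best_score = prompt, predict(w, toy_image_features(prompt))
    for _ in range(rounds):
        candidates = propose_rewrites(best)
        if not candidates:
            break
        top_score, top = max((predict(w, toy_image_features(c)), c) for c in candidates)
        if top_score <= best_score + min_gain:
            break
        best, best_score = top, top_score
    return best, best_score


# Six made-up ratings from one user who likes watercolor and dislikes neon.
rated_features = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [1, 1, 0], [1, 0, 1]]
rated_scores = [5.0, 1.0, 3.0, 3.0, 3.0, 5.0]

weights = fit_preference_model(rated_features, rated_scores)
best_prompt, best_score = optimize_prompt("a quiet harbor at dawn", weights)
print(best_prompt, round(best_score, 2))
```

In this toy setup the search adds the style the user rated highly and stops once no rewrite raises the predicted score. The point is only the division of labor: a small preference model is fit from a handful of ratings, and that model is then used as the objective for prompt search.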