Enhancing Reinforcement Learning Finetuned Text-to-Image Generative Model Using Reward Ensemble

Published: 01 Jan 2024 · Last Modified: 21 May 2025 · ITS (2) 2024 · CC BY-SA 4.0
Abstract: In recent years, advanced diffusion models have shown strong performance in converting text prompts into high-quality images. However, aligning the generated images with human preferences remains challenging due to biases introduced during training. Previous research has attempted to address this problem by incorporating reinforcement learning and human feedback into denoising diffusion models. However, such approaches often suffer from over-optimization, commonly referred to as the reward hacking problem. This paper introduces a simple and effective ensemble approach that combines multiple reward models to optimize the overall reward structure. The proposed method successfully overcomes the over-optimization problem in the diffusion model's fine-tuning process. Both quantitative and qualitative results demonstrate the effectiveness of the proposed approach in generating more realistic images.
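The abstract does not specify how the reward models are combined, so the following is a minimal sketch of one plausible reward-ensemble scheme: each reward model's batch scores are standardized and then averaged with fixed weights, so that no single reward dominates the fine-tuning signal. The `RewardModel` signature, the `ensemble_reward` function, and the normalization choice are all illustrative assumptions, not the paper's exact formulation.

```python
from typing import Callable, Optional, Sequence

import torch

# Hypothetical reward-model interface (an assumption, not from the paper):
# maps a batch of images and their prompts to per-sample scalar rewards.
RewardModel = Callable[[torch.Tensor, Sequence[str]], torch.Tensor]


def ensemble_reward(
    images: torch.Tensor,
    prompts: Sequence[str],
    reward_models: Sequence[RewardModel],
    weights: Optional[Sequence[float]] = None,
) -> torch.Tensor:
    """Combine several reward models into one fine-tuning signal.

    Each model's scores are standardized across the batch before
    weighting, so the policy cannot over-optimize (hack) any single
    reward model in isolation.
    """
    if weights is None:
        # Default: equal weight for every reward model.
        weights = [1.0 / len(reward_models)] * len(reward_models)

    total = torch.zeros(images.shape[0], device=images.device)
    for model, w in zip(reward_models, weights):
        scores = model(images, prompts)  # shape: (batch,)
        # Standardize so rewards on different scales are comparable.
        scores = (scores - scores.mean()) / (scores.std() + 1e-8)
        total = total + w * scores
    return total
```

In an RLHF-style fine-tuning loop, `ensemble_reward` would simply replace the single reward model when scoring sampled images; the per-batch standardization is one common way to keep heterogeneous rewards on a comparable scale.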