Reinforcement Learning based Image Generation via Visual Consensus Evaluation

20 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Image Synthesis; Reinforcement Learning; Diffusion Model
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Image generation models are typically trained with an L2 or cross-entropy loss but evaluated with IS or FID. This mismatch between the training objective and the evaluation metrics results in suboptimal model performance. We address this issue by finetuning pre-trained generative models with reinforcement learning. Current evaluation metrics cannot serve as training objectives, since obtaining an accurate score typically requires more than ten thousand images. We therefore introduce an automated metric that captures consensus and use it as the reward signal for finetuning image generation models with reinforcement learning. The metric correlates strongly with commonly used metrics such as FID and is more robust than FID to the number of images. Experiments indicate that when varying degrees of noise are introduced to the generated images, such as ImageNet contamination or Gaussian noise, our metric quantifies the level of disruption more accurately than IS. Finetuning generative models with our proposed method improves image generation performance on multiple benchmarks, such as LSUN 256x256 and ImageNet 64x64.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2528