Language-Guided Artistic Style Transfer Using the Latent Space of DALL-E

22 Sept 2022 (modified: 13 Feb 2023), ICLR 2023 Conference Withdrawn Submission, Readers: Everyone
Keywords: Language-Guided Style Transfer, Non-Autoregressive Transformer, Deep Reinforcement Learning
Abstract: Despite progress on the style transfer task, most previous work transfers only relatively simple features such as color or texture, and misses more abstract and creative concepts such as a painter's specific artistic traits or the overall feeling of a scene. These more abstract concepts can, however, be captured by the semantics of the latent spaces of models like DALL-E or CLIP, which have been trained on huge datasets of images and textual documents. In this paper, we propose a style transfer method that exploits both of these models and uses natural language to describe abstract artistic styles. Specifically, we formulate language-guided style transfer as non-autoregressive token sequence translation in the DALL-E discrete latent space. Moreover, we propose a textual-prompt-based Reinforcement Learning strategy that incorporates style-specific information into the translation network using the CLIP space as the only guidance. Our empirical results show that we can transfer artistic styles using language instructions at different granularities to content images that are not restricted to a specific domain. Our code will be made publicly available.
Please Choose The Closest Area That Your Submission Falls Into: Applications (e.g., speech processing, computer vision, NLP)
TL;DR: We propose a language-guided style transfer method that manipulates the discrete DALL-E latent space using a non-autoregressive sequence translation approach.
Supplementary Material: zip
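
The abstract describes sampling a whole sequence of discrete latent tokens at once and training the translator with a reward computed in CLIP space. The toy sketch below illustrates only that general idea, a REINFORCE update on a non-autoregressive categorical policy, and is not the authors' implementation. Everything in it is an assumption made for illustration: the tiny vocabulary and sequence length, the free per-position logits standing in for a translation network (so nothing is conditioned on a content image), and `toy_reward`, a hypothetical stand-in for a CLIP image-text similarity score.

```python
import math
import random

VOCAB_SIZE = 8    # toy codebook size (assumption; real codebooks are far larger)
SEQ_LEN = 16      # toy length of the latent token sequence (assumption)
STYLE_TOKEN = 3   # hypothetical token favoured by the stand-in reward


def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(logits, rng):
    """Sample every position independently, i.e. non-autoregressively."""
    return [
        rng.choices(range(VOCAB_SIZE), weights=softmax(row))[0]
        for row in logits
    ]


def toy_reward(tokens):
    """Stand-in for a CLIP similarity score: fraction of 'style' tokens."""
    return sum(t == STYLE_TOKEN for t in tokens) / len(tokens)


def reinforce_step(logits, rng, lr, baseline):
    """One REINFORCE update; returns the reward of the sampled sequence."""
    tokens = sample_tokens(logits, rng)
    reward = toy_reward(tokens)
    advantage = reward - baseline
    for row, tok in zip(logits, tokens):
        probs = softmax(row)
        for v in range(VOCAB_SIZE):
            # Gradient of log p(tok) w.r.t. logit v is 1[v == tok] - p_v.
            grad = (1.0 if v == tok else 0.0) - probs[v]
            row[v] += lr * advantage * grad
    return reward


def train(steps=2000, lr=1.0, seed=0):
    """Train the toy policy with a moving-average reward baseline."""
    rng = random.Random(seed)
    logits = [[0.0] * VOCAB_SIZE for _ in range(SEQ_LEN)]
    baseline = 0.0
    for _ in range(steps):
        reward = reinforce_step(logits, rng, lr, baseline)
        baseline = 0.9 * baseline + 0.1 * reward
    return logits


if __name__ == "__main__":
    trained = train()
    greedy = [max(range(VOCAB_SIZE), key=lambda v: row[v]) for row in trained]
    print("greedy tokens:", greedy)
    print("reward:", toy_reward(greedy))
```

In a setting closer to the one the abstract describes, the free logits would presumably be produced by a transformer from the content image's tokens, and the reward would come from comparing the decoded image with the text prompt in CLIP space; both are left out here to keep the sketch self-contained.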