Abstract: Over the past few years, several studies have been conducted on text-to-image synthesis, which translates input textual descriptions into realistic images. However, facial image synthesis and manipulation from input sentences have not been widely explored, largely due to the lack of suitable datasets. My research interests center on multi-modality technology and facial image generation with Generative Adversarial Networks. Toward that end, we propose an approach for facial image generation and manipulation from text descriptions, and we introduce the first text-to-face synthesis dataset with large-scale facial attribute annotations. In this extended abstract, we first present the current status and future directions of the Ph.D. research I have pursued during my first year. We then introduce the proposed method (accepted at IEEE FG 2021), the newly annotated dataset, and experimental results. Finally, we discuss remaining challenges, the proposed dataset, and the expected impact. Code and a curated list of papers on text-to-image synthesis are summarized at https://github.com/Yutong-Zhou-cv/Awesome-Text-to-Image.