Abstract: Due to semantic entanglement in fusion strategies or unstable training under complicated image transformations, existing few-shot image generation methods still suffer from low generation quality and diversity. To tackle these problems, we propose a novel fusion- and transformation-based framework named content Fusion with style Transformation Generative Adversarial Network (FiTGAN) for few-shot image generation. The basic assumption is that any image consists of a collection of content-related and style-related features. FiTGAN disentangles internal representations with two independent encoders and combines the fused contents and transformed styles to generate new images. Specifically, we design a multi-scale content fusion strategy and a reparameterized style transformation mechanism to learn more fine-grained semantics without changing category-relevant attributes. Furthermore, we formulate a content reconstruction loss and a style divergence loss to provide better training stability and generation performance. Comprehensive experiments on three well-known datasets demonstrate that FiTGAN not only produces more realistic and diverse images for few-shot image generation but also achieves better classification accuracy in downstream visual applications with limited data.
External IDs: dblp:conf/icassp/ZhouZ0YW025