Abstract: We present a simple image preprocessing method for training VAEs that improves disentanglement compared to training directly on the images. Specifically, we propose using regionally aggregated feature maps extracted from CNNs pretrained on ImageNet. Our method ranks first on 3 of 5 metrics on the challenge’s public leaderboard.
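
To make "regionally aggregated feature maps" concrete, the sketch below shows one plausible form of the aggregation step only. It assumes average pooling over a fixed grid of spatial regions; the abstract does not specify the aggregation operator, the grid size, or the CNN layer used, so the function name `aggregate_regions`, the `grid` parameter, and the toy feature map are all hypothetical. In practice the input would come from a pretrained CNN rather than a hand-written nested list.

```python
def aggregate_regions(feature_map, grid=2):
    """Average-pool a C x H x W feature map into grid x grid regions per channel.

    feature_map is a nested list [channel][row][column]; this stands in for
    the output of a pretrained CNN layer (an assumption of this sketch).
    Returns a flat list of length C * grid * grid.
    """
    pooled = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        if height % grid or width % grid:
            raise ValueError("H and W must be divisible by grid in this sketch")
        region_h, region_w = height // grid, width // grid
        for gy in range(grid):
            for gx in range(grid):
                # Collect every activation that falls inside region (gy, gx).
                values = [
                    channel[y][x]
                    for y in range(gy * region_h, (gy + 1) * region_h)
                    for x in range(gx * region_w, (gx + 1) * region_w)
                ]
                pooled.append(sum(values) / len(values))
    return pooled


# Toy stand-in for a CNN feature map: 2 channels, 4 x 4 spatial positions.
toy_features = [
    [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]],
    [[0, 2, 0, 2], [2, 0, 2, 0], [4, 4, 8, 8], [4, 4, 8, 8]],
]
vae_input = aggregate_regions(toy_features, grid=2)
print(vae_input)  # [1.0, 2.0, 3.0, 4.0, 1.0, 1.0, 4.0, 8.0]
```

The resulting flat vector, rather than the raw pixels, would then serve as the representation the VAE is trained on. Pooling over regions keeps coarse spatial layout while shrinking the input considerably; whether the authors pool by mean, max, or another operator is not stated here.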