Rethinking the Truly Unsupervised Image-to-Image Translation

28 Sept 2020 (modified: 22 Oct 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone
Keywords: unsupervised approach, image-to-image translation, representation learning
Abstract: Every recent image-to-image translation model uses either image-level (i.e. input-output pairs) or set-level (i.e. domain labels) supervision at a minimum. However, even set-level supervision can be a serious bottleneck for data collection in practice. In this paper, we tackle image-to-image translation in a fully unsupervised setting, i.e., with neither paired images nor domain labels. To this end, we propose a truly unsupervised image-to-image translation model (TUNIT) that simultaneously learns to separate image domains and translate input images into the estimated domains. Experimental results show that our model achieves comparable or even better performance than the set-level supervised model trained with full labels, generalizes well on various datasets, and is robust against the choice of hyperparameters (e.g. the preset number of pseudo domains). In addition, TUNIT extends well to the semi-supervised scenario with varying amounts of labels provided.
One-sentence Summary: We propose a truly unsupervised image-to-image translation model that requires no set-level supervision.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2006.06500/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=YDfDgbWlRS