Keywords: representation learning, disentanglement, real-world
Abstract: Learning meaningful representations that disentangle the underlying structure of the data generating process is considered to be of key importance in machine learning. While disentangled representations were found to be useful for diverse tasks such as abstract reasoning and fair classification, their scalability and real-world impact remain questionable. We introduce a new high-resolution dataset with 1M simulated images and over 1,800 annotated real-world images of the same setup. In contrast to previous work, this new dataset exhibits correlations, a complex underlying structure, and allows to evaluate transfer to unseen simulated and real-world settings where the encoder i) remains in distribution or ii) is out of distribution. We propose new architectures in order to scale disentangled representation learning to realistic high-resolution settings and conduct a large-scale empirical study of disentangled representations on this dataset. We observe that disentanglement is a good predictor for out-of-distribution (OOD) task performance.
One-sentence Summary: We scale disentangled representation learning to a new realistic dataset and conduct a large-scale empirical study on OOD generalization.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Data: [Sprites](https://paperswithcode.com/dataset/sprites), [dSprites](https://paperswithcode.com/dataset/dsprites), [smallNORB](https://paperswithcode.com/dataset/smallnorb)