Keywords: Sequential Disentanglement, Deep Learning, Generative Models
Abstract: Unsupervised representation learning, and in particular sequential disentanglement, where the goal is to learn disentangled static and dynamic factors of variation, remains a significant challenge due to the absence of labels. Existing models, based on variational autoencoders and generative adversarial networks, have achieved success in certain domains, but they often struggle to disentangle sequences, especially when faced with real-world complexity and variability. Further, there is no real-world evaluation protocol for assessing the effectiveness of sequential disentanglement models. Recently, diffusion autoencoders have emerged as a promising new class of generative models, offering semantically rich representations through gradual noise-to-data transformations. Despite their advantages, these models face limitations: they are non-sequential, they fail to disentangle the latent space effectively, and they are computationally intensive, making them difficult to scale to sequences. In this work, we introduce our diffusion sequential disentanglement autoencoder (DiffSDA), a novel approach that is effective on real-world visual data and is accompanied by a new and challenging evaluation protocol. DiffSDA is based on a new probabilistic model and is implemented using latent diffusion models and efficient samplers, enabling the processing of high-resolution videos. We evaluate our approach on several real-world datasets and metrics, and we demonstrate its effectiveness in comparison to recent state-of-the-art sequential disentanglement methods.
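To make the static/dynamic split described in the abstract concrete, below is a minimal illustrative sketch, not the authors' implementation: a toy encoder that pools a sequence of latent frames into one time-invariant (static) code and per-frame (dynamic) codes, which then condition a simple denoising network standing in for a latent diffusion model. All module names, architectures, and dimensions are assumptions made for illustration.

```python
# Minimal sketch of sequential disentanglement conditioning a latent denoiser.
# This is NOT the DiffSDA implementation; every design choice here is assumed.
import torch
import torch.nn as nn

class SequentialDisentangler(nn.Module):
    """Splits a sequence of latent frames into a static code and dynamic codes."""
    def __init__(self, frame_dim=256, static_dim=64, dynamic_dim=32):
        super().__init__()
        self.frame_enc = nn.Linear(frame_dim, 128)                       # per-frame features
        self.static_head = nn.Linear(128, static_dim)                    # time-invariant factor
        self.dynamic_rnn = nn.GRU(128, dynamic_dim, batch_first=True)    # per-frame factors

    def forward(self, frames):                    # frames: (batch, time, frame_dim)
        h = torch.relu(self.frame_enc(frames))
        static = self.static_head(h.mean(dim=1))  # pool over time -> static code
        dynamic, _ = self.dynamic_rnn(h)          # sequence of dynamic codes
        return static, dynamic

class ConditionedDenoiser(nn.Module):
    """Toy stand-in for a latent diffusion denoiser, conditioned on both codes."""
    def __init__(self, frame_dim=256, static_dim=64, dynamic_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(frame_dim + static_dim + dynamic_dim, 256),
            nn.ReLU(),
            nn.Linear(256, frame_dim),
        )

    def forward(self, noisy_frames, static, dynamic):
        # Broadcast the static code over time and concatenate with dynamic codes.
        cond = torch.cat(
            [static.unsqueeze(1).expand(-1, noisy_frames.size(1), -1), dynamic], dim=-1
        )
        return self.net(torch.cat([noisy_frames, cond], dim=-1))  # predicted noise per frame

# Usage: encode a batch of latent frame sequences and run one denoising prediction.
frames = torch.randn(4, 16, 256)                  # (batch, time, latent dim)
encoder, denoiser = SequentialDisentangler(), ConditionedDenoiser()
s, d = encoder(frames)
noise_pred = denoiser(frames + 0.1 * torch.randn_like(frames), s, d)
print(s.shape, d.shape, noise_pred.shape)         # (4, 64) (4, 16, 32) (4, 16, 256)
```

The sketch only illustrates the factorization of the latent space into static and dynamic parts and how such codes could condition a denoiser; the paper's actual probabilistic model, samplers, and training objective are described in the submission itself.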
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6056