Autoencoders with Intrinsic Dimension Constraints for Learning Low Dimensional Image Representations

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Self-supervised Representation Learning, Intrinsic Dimension
TL;DR: We propose a novel framework that incorporates global and local Intrinsic Dimension (ID) constraints into autoencoders, enhancing feature discriminability and improving performance on downstream tasks.
Abstract: Autoencoders have achieved great success in various computer vision applications. An autoencoder learns low-dimensional image representations through a self-supervised paradigm, namely reconstruction. Existing studies mainly focus on minimizing the pixel-level reconstruction error of an image, while largely ignoring properties that reveal the manifold structure of the data, such as Intrinsic Dimension (ID). The learning process of most autoencoders is observed to involve dimensionality compression followed by dimensionality expansion, which plays a crucial role in acquiring low-dimensional image representations. Motivated by the important role of ID, we propose a novel deep representation learning approach that incorporates global and local ID constraints as regularizers into the reconstruction objective of an autoencoder. This approach preserves the global manifold structure of the whole dataset while also maintaining the local manifold structure of the feature maps of each point, which makes the learned low-dimensional features more discriminative and improves performance on downstream tasks. To the best of our knowledge, few existing works exploit both global and local ID-invariance properties for the regularization of DNNs. Numerical experiments on benchmark datasets (Extended Yale B, Caltech101, and ImageNet) show that the resulting regularized models learn more discriminative representations for downstream tasks, including image classification and clustering.
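The submission's own implementation is not shown on this page. As a hypothetical sketch of the general idea only, the snippet below pairs a TwoNN-style maximum-likelihood intrinsic-dimension estimate with a penalty that pulls the latent batch's ID toward a target value; the function names, the penalty form, and the regularization weight `lam` are illustrative assumptions, not the paper's method.

```python
import numpy as np

def twonn_id(Z):
    """TwoNN-style maximum-likelihood intrinsic-dimension estimate.

    For each point, take the distances r1, r2 to its two nearest
    neighbours. Under a local-uniformity assumption, the ratios
    mu = r2 / r1 follow a Pareto law whose shape parameter is the ID,
    with maximum-likelihood estimate n / sum(log mu).
    """
    n = Z.shape[0]
    dists = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    dists.sort(axis=1)                  # column 0 is the zero self-distance
    mu = dists[:, 2] / dists[:, 1]      # r2 / r1 for every point
    return n / np.sum(np.log(mu))

def regularized_loss(x, x_hat, z, target_id, lam=0.1):
    """Pixel-level reconstruction error plus a global ID penalty
    (illustrative stand-in for the paper's global/local ID constraints)."""
    recon = np.mean((x - x_hat) ** 2)
    id_penalty = (twonn_id(z) - target_id) ** 2
    return recon + lam * id_penalty
```

In the framework the abstract describes, the ID terms would need to be differentiable and are applied both globally (over the dataset) and locally (per feature map); the numpy estimator above only illustrates the quantity being constrained.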
Supplementary Material: pdf
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1711