- Original Pdf: pdf
- Keywords: unsupervised, representation learning, Laplacian
- TL;DR: We propose a new denoising autoencoder with Laplacian pyramid editing, results in improved representation learning capability.
- Abstract: While deep neural networks have been shown to perform remarkably well in many machine learning tasks, labeling a large amount of supervised data is usually very costly to scale. Therefore, learning robust representations with unlabeled data is critical in relieving human effort and vital for many downstream applications. Recent advances in unsupervised and self-supervised learning approaches for visual data benefit greatly from domain knowledge. Here we are interested in a more generic unsupervised learning framework that can be easily generalized to other domains. In this paper, we propose to learn data representations with a novel type of denoising autoencoder, where the input noisy data is generated by corrupting the clean data in gradient domain. This can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data. In this way, the agent has to learn more robust representations that can exploit the underlying data structures across multiple scales. Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach, compared to its counterpart with single-scale corruption. Besides, we also demonstrate that the learned representations perform well when transferring to other vision tasks.