- Keywords: representation learning, unsupervised learning, anomaly detection, clustering
- TL;DR: This paper introduces a novel Random Distance Prediction model to learn expressive feature representations in a fully unsupervised fashion by predicting random distances, enabling substantially improved anomaly detection and clustering performance.
- Abstract: Deep neural networks have achieved tremendous success in a broad range of machine learning tasks due to their remarkable capability to learn semantically rich features from high-dimensional data. However, they often require large-scale labelled data to learn such features, which significantly hinders their adoption in unsupervised learning tasks, such as anomaly detection and clustering, and limits their application in critical domains where obtaining massive labelled data is prohibitively expensive. To enable downstream unsupervised learning in those domains, we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space. Random mapping is a highly efficient and theoretically proven approach to obtaining approximately preserved distances. To predict these random distances well, the representation learner is optimised to learn class structures that are implicitly embedded in the randomly projected space. Experimental results on 19 real-world datasets show that our learned representations substantially outperform state-of-the-art competing methods in both anomaly detection and clustering tasks.
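
The idea of predicting distances in a randomly projected space can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract does not specify the network, the distance measure, or the loss, so the linear learner `W`, the squared Euclidean distance, the squared-error loss, and the name `train_rdp_sketch` with all its parameters are assumptions made for illustration only.

```python
import numpy as np


def train_rdp_sketch(X, proj_dim=8, rep_dim=4, n_pairs=256,
                     lr=0.05, steps=300, seed=0):
    """Fit a linear map W so that distances between learned features
    match distances in a fixed random projection of the data.

    Assumptions (not from the abstract): linear learner, squared
    Euclidean distance, squared-error loss, full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Fixed, untrained Gaussian random projection that supplies the targets.
    R = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)

    # Trainable representation learner (a single linear layer here).
    W = 0.1 * rng.standard_normal((d, rep_dim))

    # Sample a fixed set of instance pairs and precompute their differences.
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    D = X[i] - X[j]

    # Supervisory signal: squared distances in the randomly projected space.
    target = np.sum((D @ R) ** 2, axis=1)

    losses = []
    for _ in range(steps):
        Z = D @ W                               # pair differences in learned space
        resid = np.sum(Z ** 2, axis=1) - target  # predicted minus random distance
        losses.append(float(np.mean(resid ** 2)))
        # Gradient of mean(resid^2) with respect to W.
        grad = (4.0 / n_pairs) * D.T @ (resid[:, None] * Z)
        W -= lr * grad
    return W, losses


# Toy data, scaled so that pairwise squared distances are of order one.
rng = np.random.default_rng(1)
X = rng.standard_normal((64, 10)) / np.sqrt(20)
W, losses = train_rdp_sketch(X)
features = X @ W  # learned representations for downstream use
```

No labels are used at any point: the only training signal is the set of distances produced by the fixed random matrix `R`. In this sketch, `features` would be what is handed to a downstream anomaly detector or clustering algorithm.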