Probability Correlation Learning for Anomaly Detection based on Distribution-Constrained Autoencoder

Jihua Wu, Lei Zhang, Cong Liu, Qi Qi, Jingyu Wang, Tong Xu, Jianxin Liao

Published: 01 Jan 2022, Last Modified: 08 Apr 2025, APNOMS 2022, CC BY-SA 4.0

Abstract: Network anomaly detection supports reliable and stable service by detecting faults and preventing security attacks. However, existing detection methods still face many challenges. Supervised learning is ill-suited because anomaly samples are very sparse and hard to label. Unsupervised learning is promising and widely used, but it ignores discriminative features when reconstructing from the normal feature space. This paper proposes PCDetect, a novel semi-supervised probability correlation learning method based on an autoencoder. Because we assume that anomaly samples deviate from the distribution of normal samples, we propose approximating the distribution of the original data as an efficient preprocessing step to capture the discriminative features. Moreover, an encoder-decoder neural network with the proposed loss function is designed to learn a low-dimensional feature representation from raw data and to constrain the latent representation to follow different reference distributions based on a few anomaly labels. The correlation between the reference distribution and the reconstruction of the latent representation is then used to quantify the probability of an anomaly. Extensive experiments are conducted on two public real-world datasets, NSL-KDD and UNSW-NB15. The results show that PCDetect copes efficiently with class imbalance and high dimensionality compared with several popular supervised learning methods, and that it significantly improves accuracy and reduces the overall false rate compared with unsupervised learning.
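
The abstract does not give the loss function or the scoring rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes two isotropic unit-variance Gaussian reference distributions (one for normal samples, one for the few labeled anomalies), a loss that adds a latent negative log-likelihood penalty to the reconstruction error, and a likelihood-ratio score standing in for the paper's correlation measure. All function names, the `weight` parameter, and the choice of reference means are hypothetical.

```python
import numpy as np

def gaussian_log_density(z, mean):
    """Log density of each row of z under an isotropic unit-variance Gaussian."""
    d = z.shape[-1]
    sq_dist = np.sum((z - mean) ** 2, axis=-1)
    return -0.5 * sq_dist - 0.5 * d * np.log(2.0 * np.pi)

def constrained_loss(x, x_hat, z, labels, mean_normal, mean_anomaly, weight=1.0):
    """Reconstruction error plus a penalty tying each latent code to its reference.

    labels: 1 for the few labeled anomalies, 0 otherwise (assumed convention).
    """
    # Mean squared reconstruction error per sample, averaged over the batch.
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=-1))
    # Pick the reference mean per sample according to its label.
    ref_mean = np.where(labels[:, None] == 1, mean_anomaly, mean_normal)
    # Negative log-likelihood of the latent code under its reference (assumed form).
    nll = -gaussian_log_density(z, ref_mean)
    return recon + weight * np.mean(nll)

def anomaly_probability(z, mean_normal, mean_anomaly):
    """Posterior probability of the anomaly reference under equal priors (assumed)."""
    log_ratio = gaussian_log_density(z, mean_normal) - gaussian_log_density(z, mean_anomaly)
    # Clip to keep the exponential numerically safe for far-away codes.
    return 1.0 / (1.0 + np.exp(np.clip(log_ratio, -50.0, 50.0)))

# Hypothetical latent codes: one near each reference mean, one halfway between.
mean_normal = np.array([0.0, 0.0])
mean_anomaly = np.array([3.0, 3.0])
z = np.array([[0.0, 0.0], [3.0, 3.0], [1.5, 1.5]])
scores = anomaly_probability(z, mean_normal, mean_anomaly)
```

In this sketch a latent code close to the anomaly reference scores near 1, a code close to the normal reference scores near 0, and a code equidistant from both scores 0.5. In a real model the encoder and decoder would be trained by gradient descent on such a loss; that part is omitted here.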