Abstract: Accurate network traffic classification is urgently needed in the big data era, as anomalous traffic has become difficult to classify in today's complex network environments. Deep Learning (DL) techniques excel at detecting anomalous data because of their capacity to fit training data. However, this capacity depends on the correctness of the training data, which also makes DL models sensitive to annotation errors. We propose that, by measuring model uncertainty, annotation errors can be accurately corrected for network traffic classification. We use dropout to approximate the prior distribution and calculate the Mutual Information (MI) and Softmax Variance (SV) of the output. In this paper, we present a framework named Uncertainty Based Annotation Error Correction (UAEC), based on both MI and SV, whose accuracy outperforms that of other proposed methods. We simulate a real-life annotation scenario by modifying the labels of a public dataset. On the regenerated dataset, we compare the detection effectiveness of Euclidean Distance, MI, SV, and UAEC. The experiments show that UAEC attains an average increase of 47.92% in detection accuracy.
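As an illustration of the two uncertainty measures named above, the following is a minimal sketch of how MI and SV might be computed from several stochastic forward passes with dropout enabled. It assumes the commonly used definitions (MI as the entropy of the mean prediction minus the mean per-pass entropy, SV as the per-class variance across passes averaged over classes); the abstract does not state the exact formulas, and it does not say how UAEC combines the two scores, so that step is not shown. The function names and the toy probabilities are hypothetical.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along the class axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def mutual_information(probs):
    """Assumed definition: H[mean_t p_t] - mean_t H[p_t].

    probs has shape (T, N, C): T stochastic passes, N samples, C classes.
    """
    predictive_entropy = entropy(probs.mean(axis=0))   # shape (N,)
    expected_entropy = entropy(probs).mean(axis=0)     # shape (N,)
    return predictive_entropy - expected_entropy


def softmax_variance(probs):
    """Assumed definition: variance across passes, averaged over classes."""
    return probs.var(axis=0).mean(axis=-1)             # shape (N,)


# Toy softmax outputs for 3 passes, 2 samples, 2 classes (made-up values).
# Sample 0 gets the same prediction on every pass; sample 1 does not.
probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.9, 0.1], [0.5, 0.5]],
])

mi = mutual_information(probs)
sv = softmax_variance(probs)
print("MI:", mi)
print("SV:", sv)
```

Under these assumptions, the sample whose predictions agree across passes scores near zero on both measures, while the sample whose predictions disagree scores clearly higher, which is the kind of signal an uncertainty-based method could use to flag a suspicious label.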