FlowRefiner: A Robust Traffic Classification Framework against Label Noise

Mingwei Zhan; Ruijie Zhao; Xianwen Deng; Zhi Xue; Qi Li; Zhuotao Liu; Guang Cheng; Ke Xu

Published: 18 Sept 2025, Last Modified: 29 Oct 2025. Venue: NeurIPS 2025 poster. License: CC BY 4.0

Keywords: Traffic Classification, Label Noise Learning, Robust Network Traffic Analysis

Abstract: Network traffic classification is essential for network management and security. In recent years, deep learning (DL) algorithms have emerged as essential tools for classifying complex traffic. However, they rely heavily on high-quality labeled training data. In practice, traffic labels are often noisy due to human error or inaccurate automated labeling, which can render classification unreliable and lead to severe consequences. Although some studies have alleviated the label noise issue in specific scenarios, their methods are hard to extend to general traffic classification tasks because of the inherent semantic complexity of traffic data. In this paper, we propose FlowRefiner, a robust and general traffic classification framework against label noise. FlowRefiner consists of three core components: a traffic semantics-driven noise detector, a confidence-guided label correction mechanism, and a cross-granularity robust classifier. First, the noise detector uses traffic semantics extracted from a pre-trained encoder to identify mislabeled flows. Next, the confidence-guided label correction module fine-tunes a label predictor to correct noisy labels and construct refined flows. Finally, the cross-granularity robust classifier learns generalized patterns at both the flow level and the packet level, improving classification robustness against noisy labels. We evaluate our method on four traffic datasets covering various classification scenarios and noise ratios. Experimental results demonstrate that FlowRefiner mitigates the impact of label noise and consistently outperforms state-of-the-art baselines by a large margin. The code is available at https://github.com/NSSL-SJTU/FlowRefiner.
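The first two stages described in the abstract (detect suspect labels, then correct them only when a predictor is confident) can be illustrated with a minimal sketch. This is not the authors' implementation: the k-nearest-neighbour disagreement test stands in for the semantics-driven noise detector, the fixed threshold stands in for the confidence-guided correction, and the function names, embeddings, and probabilities are all hypothetical toy values. The cross-granularity classifier is not sketched here.

```python
from collections import Counter
from math import dist


def flag_suspect_labels(embeddings, labels, k=3):
    """Flag sample i when its label disagrees with the majority label
    of its k nearest neighbours in embedding space (an assumed stand-in
    for a semantics-driven noise detector)."""
    suspects = []
    for i, x in enumerate(embeddings):
        # Indices of the k nearest other samples, by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: dist(x, embeddings[j]),
        )[:k]
        majority, _ = Counter(labels[j] for j in neighbours).most_common(1)[0]
        if majority != labels[i]:
            suspects.append(i)
    return suspects


def correct_labels(labels, suspects, probs, threshold=0.8):
    """Relabel a suspect sample with the predicted class only when the
    predictor's top probability reaches the threshold (assumed rule)."""
    refined = list(labels)
    for i in suspects:
        confidence = max(probs[i])
        if confidence >= threshold:
            refined[i] = probs[i].index(confidence)
    return refined


# Toy flow embeddings: two well-separated clusters (hypothetical values).
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
# Sample 3 sits in the first cluster but carries the second cluster's label.
labels = [0, 0, 0, 1, 1, 1, 1, 1]
# Hypothetical class probabilities from a fine-tuned label predictor.
probs = [
    [0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.95, 0.05],
    [0.1, 0.9], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9],
]

suspects = flag_suspect_labels(embeddings, labels, k=3)
refined = correct_labels(labels, suspects, probs, threshold=0.8)
print(suspects, refined)
```

Gating the correction on confidence means a suspect flow keeps its original label when the predictor is unsure, which avoids replacing one wrong label with another.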

Supplementary Material: zip

Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)

Submission Number: 11434
