Nethira: A Heterogeneity-aware Hierarchical Pre-trained Model for Network Traffic Classification

Chungang Lin, Weiyao Zhang, Haitong Luo, Xuying Meng, Yujun Zhang

Published: 2026, Last Modified: 13 Mar 2026, CoRR 2026, CC BY-SA 4.0

Abstract: Network traffic classification is vital for network security and management. Pre-training has shown promise by learning general traffic representations from raw byte sequences, thereby reducing reliance on labeled data. However, existing pre-trained models struggle with the gap between traffic heterogeneity (i.e., hierarchical traffic structures) and input homogeneity (i.e., flattened byte sequences). To address this gap, we propose Nethira, a heterogeneity-aware pre-trained model based on hierarchical reconstruction and augmentation. In pre-training, Nethira introduces hierarchical reconstruction at multiple levels (byte, protocol, and packet), capturing comprehensive traffic structural information. During fine-tuning, Nethira employs a consistency-regularized strategy with hierarchical traffic augmentation to reduce label dependence. Experiments on four public datasets demonstrate that Nethira outperforms seven existing pre-trained models, achieving an average F1-score improvement of 9.11%, and reaching comparable performance with only 1% labeled data on high-heterogeneity network tasks.
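
To make the three reconstruction levels concrete, the following is a minimal illustrative sketch of how masking at byte, protocol, and packet granularity could differ on a toy flow. It is not the authors' implementation: the abstract does not describe the masking scheme, so the flow representation (a list of packets, each mapping a protocol layer name to raw bytes), the `MASK` value, and all function names are assumptions made for illustration. In each case the reconstruction target would be the original, unmasked flow.

```python
import random

MASK = 0x00  # hypothetical mask byte; the abstract does not state the actual choice

# Assumed toy representation: a flow is a list of packets,
# and each packet maps a protocol layer name to that layer's raw bytes.
flow = [
    {"ip": b"\x45\x00", "tcp": b"\x01\xbb", "payload": b"hello"},
    {"ip": b"\x45\x00", "udp": b"\x00\x35"},
]

def mask_byte_level(flow, rate, rng):
    """Mask each byte independently with probability `rate`."""
    return [
        {layer: bytes(MASK if rng.random() < rate else b for b in data)
         for layer, data in pkt.items()}
        for pkt in flow
    ]

def mask_protocol_level(flow, layer):
    """Mask every byte of one protocol layer in each packet that contains it."""
    return [
        {name: (bytes([MASK]) * len(data)) if name == layer else data
         for name, data in pkt.items()}
        for pkt in flow
    ]

def mask_packet_level(flow, index):
    """Mask every byte of the packet at position `index`; copy the others."""
    return [
        {name: bytes([MASK]) * len(data) for name, data in pkt.items()}
        if i == index else dict(pkt)
        for i, pkt in enumerate(flow)
    ]

rng = random.Random(0)  # seeded only so the sketch is repeatable
byte_masked = mask_byte_level(flow, 0.15, rng)
proto_masked = mask_protocol_level(flow, "ip")
packet_masked = mask_packet_level(flow, 1)
```

The point of the sketch is only that the three granularities hide different structural units: scattered bytes, a whole header type across the flow, or one entire packet. A flattened byte sequence would expose only the first of these.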

External IDs: dblp:journals/corr/abs-2601-22494