Multi-Scale Contrastive Attention Representation Learning for Encrypted Traffic Classification

Shuo Yang, Xinran Zheng, Jinze Li, Jinfeng Xu, Edith C. H. Ngai

Published: 2024, Last Modified: 21 Jan 2026, CIKM 2024, CC BY-SA 4.0

Abstract: Encrypted traffic classification is essential for network security and management. However, the encrypted nature makes it challenging to extract representative features from raw traffic data. Existing end-to-end methods ignore byte correlations within packets and potential correlations among packets, hindering the learning of real traffic semantics and leading to suboptimal performance. This paper proposes MsETC, a multi-scale contrastive attention representation learning method for encrypted traffic classification. MsETC divides the raw packet byte sequence into multi-scale patches and then extracts dual views for contrastive learning from both the inter-patch and intra-patch perspectives. This allows the model to capture correlations among bytes within a packet as well as the potential interactions between packets. Extensive experiments on real-world datasets demonstrate that the proposed method achieves superior classification performance with lower complexity.
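The multi-scale patching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `multi_scale_patches`, the patch sizes, the non-overlapping split, and the zero padding of the tail are all assumptions made for illustration, since the abstract does not specify them. Under these assumptions, the bytes inside one patch would feed an intra-patch view, and the sequence of patches at a given scale would feed an inter-patch view.

```python
from typing import Dict, List, Sequence


def multi_scale_patches(
    payload: bytes, patch_sizes: Sequence[int] = (4, 8, 16)
) -> Dict[int, List[List[int]]]:
    """Split a raw byte sequence into non-overlapping patches at several scales.

    Hypothetical helper: patch sizes and zero padding are illustrative
    assumptions, not values taken from the paper.
    """
    result: Dict[int, List[List[int]]] = {}
    for size in patch_sizes:
        if size <= 0:
            raise ValueError("patch size must be positive")
        # Round the length up to the next multiple of the patch size.
        padded_len = -(-len(payload) // size) * size
        # Pad the tail with zero bytes so every patch has the same length.
        padded = payload.ljust(padded_len, b"\x00")
        # Each patch is a list of byte values in the range 0..255.
        result[size] = [
            list(padded[i:i + size]) for i in range(0, padded_len, size)
        ]
    return result


if __name__ == "__main__":
    demo = bytes(range(10))  # stand-in for raw packet bytes
    for size, patches in multi_scale_patches(demo, (4, 8)).items():
        print(size, patches)
```

Smaller patch sizes yield more, shorter patches (fine-grained byte context), while larger sizes yield fewer, longer ones; an encoder could then attend within and across patches at each scale.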