Abstract: Network traffic refers to the data sent and received over the Internet or any system that connects computers. Analyzing network traffic is vital for security and management, yet remains challenging due to the heterogeneity of plain-text packet headers and encrypted payloads. To capture the latent semantics of traffic, recent studies have adopted Transformer-based pretraining techniques to learn network representations from massive traffic data. However, these methods pretrain on purely data-driven tasks and overlook network knowledge, for example by masking partial digits of indivisible network port numbers for prediction, thereby limiting semantic understanding. In addition, they struggle to extend classification to new classes during fine-tuning due to distribution shift. Motivated by these limitations, we propose Lens, a unified knowledge-guided foundation model for both network traffic classification and generation. In pretraining, we propose a Knowledge-Guided Mask Span Prediction method with textual context for learning knowledge-enriched representations. To extend to new classes during fine-tuning, we reframe traffic classification as a closed-ended generation task and introduce context-aware fine-tuning to adapt to the distribution shift. Evaluation results across various benchmark datasets demonstrate that Lens achieves superior performance on both classification and generation tasks. For traffic classification, Lens substantially outperforms competitive baselines on 8 out of 12 tasks with an average accuracy of 96.33% and extends to novel classes with significantly better performance. For traffic generation, Lens produces higher-fidelity network traffic for network simulation, improving accuracy and F1 by up to 30.46% and 33.3%, respectively, in fuzzing tests. We will open-source the code upon publication.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yanwei_Fu2
Submission Number: 6673