SSCL-IDS: Enhancing Generalization of Intrusion Detection with Self-Supervised Contrastive Learning

Pegah Golchin, Nima Rafiee, Mehrdad Hajizadeh, Ahmad Khalil, Ralf Kundel, Ralf Steinmetz

Published: 01 Jan 2024, Last Modified: 11 May 2025, IFIP Networking 2024, CC BY-SA 4.0

Abstract: With the increasing diversity and complexity of cyber attacks on computer networks, there is a growing demand for Intrusion Detection Systems (IDS) that can accurately categorize new, unknown network flows. Machine learning-based IDS (ML-IDS) offers a potential solution by learning the underlying characteristics of network traffic. However, ML-IDS performance degrades when predicting traffic whose distribution differs from that of the training dataset (i.e., new unseen data), especially for attacks that mimic benign (non-attack) traffic (e.g., multi-stage attacks). The diversity of attack types also intensifies the scarcity of labeled attack traffic, which reduces both the detection performance and the generalization capability of ML-IDS. Generalization here refers to the model's capacity to identify new and unseen samples, even when their distribution deviates from that of the data used to train the ML-IDS. To address these issues, this paper introduces SSCL-IDS, a Self-Supervised Contrastive Learning IDS designed to increase the generalization of ML-IDS. SSCL-IDS is trained exclusively on benign flows, enabling it to acquire a generic representation of benign traffic patterns and reducing the reliance on annotated network traffic datasets. SSCL-IDS demonstrates a substantial improvement in detection and generalization across diverse datasets compared to supervised (over 27%) and unsupervised (over 15%) baselines, due to its ability to learn a more effective representation of benign flow attributes. Additionally, by leveraging transfer learning with SSCL-IDS as a pretrained model, we achieve AUROC scores surpassing 80% when fine-tuning with fewer than 20 training samples. Without fine-tuning, the average AUROC score across different datasets resembles random guessing.
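To make the idea of contrastive training on benign flows concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which loss or augmentation SSCL-IDS uses, so the sketch assumes a SimCLR-style NT-Xent loss and a Gaussian-jitter augmentation. The names `augment`, `nt_xent`, `noise`, and `temperature` are hypothetical. Two augmented views of the same benign flow are treated as a positive pair, and all other flows in the batch act as negatives.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def augment(flow, rng, noise=0.05):
    """Hypothetical augmentation (an assumption): Gaussian jitter per feature."""
    return [x + rng.gauss(0.0, noise) for x in flow]


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss for paired views, where view_a[i] and view_b[i] match.

    Assumed loss, chosen for illustration; the abstract does not name one.
    """
    z = view_a + view_b          # concatenate both views: 2n embeddings
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the matching view for sample i
        # Similarities to every other embedding (self excluded).
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)
```

In a training loop, each benign flow's feature vector would be augmented twice, passed through an encoder, and the encoder updated to minimize this loss. The encoder and optimizer are omitted here because the abstract gives no details about them.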