Enhancing VPN Traffic Recognition Through CatBoost Feature Extraction and Stacking Ensemble Learning
Abstract: Virtual private networks (VPNs) are often used to conceal the identities of those behind malicious online activity, so identifying VPN tunnels has become a common way to detect potential security threats and anomalies. However, existing approaches based on deep packet inspection and deep learning suffer from limited scalability or low accuracy. To address these problems, we propose a supervised, protocol-wide flow representation learning approach that leverages the semantic information inherent in the protocol to generate optimal feature embeddings automatically. We further propose a stacking ensemble learning algorithm that uses the generated feature embeddings to improve the accuracy of VPN tunnel identification. We have implemented a prototype named SA-VPN and comprehensively evaluated its effectiveness and efficiency on a large volume of VPN traffic flows. The results show that SA-VPN outperforms current state-of-the-art VPN tunnel identification tools.
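For readers unfamiliar with stacking, the general idea behind the ensemble step can be sketched as follows. This is a toy illustration only, not the SA-VPN implementation: it does not use CatBoost, the synthetic "flows" and their two features are invented for the example, and every name in it (`make_flows`, `CentroidScorer`, `stack_fit`, `stack_predict`) is hypothetical. Base learners are trained on separate features, their out-of-fold scores become the inputs of a logistic-regression meta-learner, and the meta-learner makes the final VPN / non-VPN decision.

```python
import math
import random


def make_flows(n, seed):
    """Toy data (assumption): two made-up flow features; label 1 = VPN, 0 = non-VPN."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 0.5), rng.gauss(-2.0 * label, 0.5)])
        y.append(label)
    return X, y


class CentroidScorer:
    """Base learner: scores one feature by its relative distance to the class centroids."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        sums, counts = [0.0, 0.0], [0, 0]
        for row, label in zip(X, y):
            sums[label] += row[self.feature]
            counts[label] += 1
        self.c0, self.c1 = sums[0] / counts[0], sums[1] / counts[1]
        return self

    def score(self, row):
        d0 = abs(row[self.feature] - self.c0)
        d1 = abs(row[self.feature] - self.c1)
        return d0 / (d0 + d1 + 1e-12)  # close to 1 when nearer the class-1 centroid


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(Z, y, lr=0.5, epochs=300):
    """Meta-learner: logistic regression trained by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - t
            for j, zj in enumerate(z):
                gw[j] += err * zj
            gb += err
        w = [wi - lr * g / len(Z) for wi, g in zip(w, gw)]
        b -= lr * gb / len(Z)
    return w, b


def stack_fit(X, y, k=5, n_features=2):
    """Build out-of-fold base scores, fit the meta-learner, then refit bases on all data."""
    n = len(X)
    Z = [[0.0] * n_features for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        for j in range(n_features):
            base = CentroidScorer(j).fit([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                Z[i][j] = base.score(X[i])  # each score comes from a model that never saw row i
    w, b = fit_logistic(Z, y)
    bases = [CentroidScorer(j).fit(X, y) for j in range(n_features)]
    return bases, w, b


def stack_predict(model, row):
    bases, w, b = model
    z = [base.score(row) for base in bases]
    return 1 if sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) >= 0.5 else 0


X_train, y_train = make_flows(200, seed=0)
X_test, y_test = make_flows(100, seed=1)
model = stack_fit(X_train, y_train)
accuracy = sum(stack_predict(model, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Training the meta-learner on out-of-fold scores rather than in-sample scores is the design choice that distinguishes stacking from simple averaging: the meta-learner only ever sees base predictions made on data the base learner was not fitted on, which limits leakage of training labels into the second stage.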