MEMENTO: A novel approach for class incremental learning of encrypted traffic

Francesco Cerasuolo, Alfredo Nascita, Giampaolo Bovenzi, Giuseppe Aceto, Domenico Ciuonzo, Antonio Pescapè, Dario Rossi

Published: 01 Jan 2024, Last Modified: 01 Oct 2024, Comput. Networks 2024, CC BY-SA 4.0

Abstract: In the ever-changing digital environment, ensuring the ongoing effectiveness of traffic analysis and security measures is crucial. Therefore, Class Incremental Learning (CIL) in encrypted Traffic Classification (TC) is essential for adapting to evolving network behaviors and the rapid development of new applications. However, the application of CIL techniques in the TC domain is not straightforward, usually leading to unsatisfactory performance figures. Specifically, the improvement goal is to reduce forgetting on old apps and increase the capacity to learn new ones, in order to improve overall classification performance, i.e., to reduce the drop with respect to a model “trained-from-scratch”. The contribution of this work is the design of a novel fine-tuning approach called MEMENTO, which is obtained through the careful design of different building blocks: memory management, model training, and rectification strategies.
In detail, we propose the application of traffic biflow augmentation strategies to better capitalize on old apps' biflows, we introduce improvements in the distillation stage, and we design a general rectification strategy that includes several existing proposals. To assess our proposal, we leverage two publicly-available encrypted network traffic datasets, i.e., MIRAGE19 and CESNET-TLS22. As a result, on both datasets MEMENTO achieves a significant improvement in classifying new apps (w.r.t. the best-performing alternative, i.e., BiC) while maintaining stable performance on old ones.
Equally important, MEMENTO achieves satisfactory overall TC performance, filling the gap toward a trained-from-scratch model and offering a considerable gain in terms of time (up to 10× speed-up) to obtain up-to-date and running classifiers. The experimental evaluation relies on a comprehensive performance evaluation workbench for CIL proposals, which is based on a wider set of metrics (as opposed to the existing literature in TC).
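
For readers unfamiliar with the building blocks named in the abstract, the sketch below illustrates two generic CIL ingredients: a temperature-softened distillation loss on old-class outputs and a linear rectification of new-class logits in the style of BiC. It is not MEMENTO's implementation, whose details the abstract does not give. All function names, the temperature, and the `alpha`/`beta` values are hypothetical choices made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between the softened outputs of the old model and those
    of the updated model, restricted to the old classes (generic sketch)."""
    n_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:n_old], temperature)
    return -sum(p * math.log(q) for p, q in zip(p_old, p_new))


def rectify_logits(logits, n_old, alpha=1.0, beta=0.0):
    """BiC-style correction: old-class logits are left untouched, while
    new-class logits are rescaled as alpha * z + beta (illustrative values)."""
    return logits[:n_old] + [alpha * z + beta for z in logits[n_old:]]


if __name__ == "__main__":
    old_model_out = [2.0, 0.5]          # old model: 2 old apps
    new_model_out = [1.5, 0.4, 3.0]     # updated model: 2 old apps + 1 new app
    print(distillation_loss(old_model_out, new_model_out))
    print(rectify_logits(new_model_out, n_old=2, alpha=0.5, beta=0.0))
```

In this toy setting the distillation term penalizes drift of the updated model on old apps, while the rectification step dampens the typical bias toward newly added apps. In BiC the two correction parameters are learned on a small held-out validation set rather than fixed by hand.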