GATS-C4.5: An Algorithm for Optimizing Features in Flow Classification

Published: 2008, Last Modified: 13 Jan 2026, Venue: CCNC 2008, License: CC BY-SA 4.0
Abstract: Flow classifiers deal with huge amounts of data that contain irrelevant and redundant features, which cause slower training and testing, higher resource consumption, and poorer classification accuracy. Optimizing features is therefore an important issue in flow classification. In this paper, we propose a wrapper feature selection algorithm, GATS-C4.5, aimed at building a lightweight flow classifier by (1) using a hybrid genetic-tabu approach as the search strategy to specify candidate subsets for evaluation; and (2) using the C4.5 algorithm as the wrapper approach to obtain the optimal feature subset. We examined the feasibility of our algorithm by conducting several experiments on flow datasets categorized as WWW, MAIL, P2P, etc. The experimental results show that a classifier using our approach can greatly improve computational performance without a negative impact on classification accuracy. Furthermore, our approach not only consumes fewer resources but also achieves higher classification accuracy than the Naive Bayes method with kernel density estimation after the Fast Correlation-Based Filter (NBK-FCBF).
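The abstract does not give the genetic or tabu operators, so the following is only a minimal sketch of how a hybrid genetic-tabu wrapper search might look, not the authors' implementation. Every name and parameter here (`genetic_tabu_search`, `toy_score`, `pop_size`, `tabu_size`, and so on) is hypothetical. In a real wrapper the `evaluate` callback would train a C4.5 decision tree on the selected features and return its validation accuracy; here a toy scoring function stands in for that step so the example runs with the standard library alone.

```python
import random
from collections import deque


def genetic_tabu_search(n_features, evaluate, pop_size=12, generations=30,
                        tabu_size=50, mutation_rate=0.1, seed=0):
    """Search feature subsets (tuples of 0/1 flags) with a genetic algorithm
    whose offspring are diversified by a tabu list of recent candidates."""
    rng = random.Random(seed)
    tabu = deque(maxlen=tabu_size)   # recently generated subsets (FIFO)
    cache = {}                       # subset -> score, avoids re-evaluation

    def fitness(subset):
        if subset not in cache:
            cache[subset] = evaluate(subset)
        return cache[subset]

    # Start from the full feature set plus random subsets.
    population = [(1,) * n_features]
    while len(population) < pop_size:
        population.append(tuple(rng.randint(0, 1) for _ in range(n_features)))
    best = max(population, key=fitness)

    for _ in range(generations):
        # Selection: keep the better half as parents.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)        # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(n_features):               # bit-flip mutation
                if rng.random() < mutation_rate:
                    child[i] ^= 1
            child = tuple(child)
            if child in tabu:                         # tabu move: perturb it
                i = rng.randrange(n_features)
                child = child[:i] + (1 - child[i],) + child[i + 1:]
            tabu.append(child)
            children.append(child)
        population = parents + children
        best = max(population + [best], key=fitness)
    return best, fitness(best)


def toy_score(subset):
    """Hypothetical stand-in for C4.5 accuracy: reward three 'relevant'
    features (indices 0, 3, 5) and penalize every selected feature."""
    relevant = {0, 3, 5}
    hits = sum(1 for i, bit in enumerate(subset) if bit and i in relevant)
    return hits - 0.1 * sum(subset)


best_subset, best_score = genetic_tabu_search(8, toy_score, seed=1)
print(best_subset, best_score)
```

The per-feature penalty in `toy_score` mimics the paper's goal of a lightweight classifier: among subsets with similar accuracy, the search prefers the one with fewer features.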