Selection of sub-optimal feature set of network data to implement Machine Learning models to develop an efficient NIDS

Jashanpreet Singh Sadioura, Satbir Singh, Amitava Das

Published: 2019, Last Modified: 13 Nov 2024 | ICDSE 2019 | CC BY-SA 4.0

Abstract: With the rapidly growing dependence on technology and the internet in personal and professional life, computer networks have become heavily congested, and intrusions into networks have become more frequent. An active IDS (Intrusion Detection System) protects the network from intrusions and provides security to the system. Machine learning techniques are the most efficient technologies for developing an IDS, as ML models can readily be trained and tested on substantial volumes of network data. A typical machine learning workflow has three phases: data pre-processing, feature selection, and training and testing the developed models. The major contribution of this paper is the extraction of a sub-optimal feature set from the NSL-KDD data set, which has 41 features, followed by the implementation of different ML models to find the most suitable model for this set of features. The ML models SVC and MLPClassifier performed better than CNN in terms of complexity, accuracy and training time when trained and tested using the selected feature set. CNN is an excellent deep learning algorithm that gives good results for image data, but on simple text data, machine learning models like SVC and MLPClassifier perform better. MLPClassifier gave the higher accuracy, 98.19%.
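The three phases named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. The abstract does not state the feature-selection method, the number of features retained, or the model hyperparameters, so `SelectKBest` with `f_classif`, the value `K = 15`, and the model settings below are all assumptions made for illustration. A synthetic 41-feature data set stands in for NSL-KDD, so the printed accuracies say nothing about the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for NSL-KDD (assumption): 41 numeric features, binary label.
X, y = make_classification(
    n_samples=1000, n_features=41, n_informative=10, n_redundant=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

K = 15  # hypothetical size of the reduced feature set; not taken from the paper


def build_pipeline(model):
    """Phase 1: scale features. Phase 2: keep the K best-scoring features.
    Phase 3: fit the given classifier on the reduced feature set."""
    return make_pipeline(StandardScaler(), SelectKBest(f_classif, k=K), model)


# Hyperparameters are illustrative assumptions.
models = {
    "SVC": SVC(kernel="rbf"),
    "MLPClassifier": MLPClassifier(
        hidden_layer_sizes=(32,), max_iter=500, random_state=0
    ),
}

scores = {}
selected_counts = {}
for name, model in models.items():
    pipe = build_pipeline(model)
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)  # accuracy on the held-out split
    selected_counts[name] = int(pipe.named_steps["selectkbest"].get_support().sum())
    print(f"{name}: kept {selected_counts[name]} features, "
          f"test accuracy = {scores[name]:.3f}")
```

Placing the scaler and the selector inside the pipeline means both are fitted on the training split only, which keeps information from the test split out of feature selection.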