Abstract: The Internet of Things (IoT) is revolutionizing society by seamlessly connecting people, devices, and environments while providing enhanced user experiences and functionality. However, security and privacy issues remain largely neglected. Attackers can compromise devices, inject spurious packets into an IoT network, and cause severe damage. Machine learning-based Network Intrusion Detection Systems (NIDS) are often designed to detect such attacks. Most of these systems train classifiers on labeled data, which is difficult to obtain in real-world settings. In this work, we propose a novel unsupervised machine learning approach that uses properties of IoT datasets for anomaly detection. Specifically, we propose the use of Local Intrinsic Dimensionality (LID), a theoretical complexity measure that characterizes the local manifold surrounding a point. We use LID to evaluate three modern IoT network datasets empirically, showing that, for network data generated using IoT methodologies, benign network packets have low LID estimates, whereas malicious examples exhibit higher LID estimates. We use this finding to propose a new unsupervised anomaly detection algorithm, the Weighted Hamming Distance LID Estimator, which incorporates an entropy-weighted Hamming distance into the LID Maximum Likelihood Estimator algorithm. We show that our proposed approach outperforms Autoencoder, KNN, and Isolation Forest baselines on IoT network datasets. We test the algorithm on the ToN IoT, NetFlow Bot-IoT (NF Bot-IoT), and Aposemat IoT-23 (IoT-23) datasets, using leave-one-out validation to compare results.
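To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It combines the standard maximum likelihood LID estimator, LID = -1 / mean(ln(r_i / r_k)) over the k nearest-neighbour distances r_1..r_k, with a Hamming distance whose per-feature mismatches are weighted by Shannon entropy. The function names (`entropy_weights`, `weighted_hamming_matrix`, `lid_mle`, `lid_scores`) are hypothetical, and the weighting scheme (weights proportional to feature entropy, normalised to sum to 1) is an assumption, since the abstract does not specify how the entropy weights are defined or how tied distances are handled.

```python
import numpy as np

def entropy_weights(X):
    """Per-feature Shannon entropy of categorical columns, normalised to sum to 1.
    ASSUMPTION: weights are proportional to entropy; the abstract does not say."""
    n, d = X.shape
    w = np.zeros(d)
    for j in range(d):
        _, counts = np.unique(X[:, j], return_counts=True)
        p = counts / n
        w[j] = -np.sum(p * np.log(p))
    total = w.sum()
    # Fall back to uniform weights if every feature is constant.
    return w / total if total > 0 else np.full(d, 1.0 / d)

def weighted_hamming_matrix(X, w):
    """Pairwise distance: sum of the weights of the features on which two rows differ."""
    diff = X[:, None, :] != X[None, :, :]   # shape (n, n, d), boolean
    return (diff * w).sum(axis=2)

def lid_mle(r):
    """Standard MLE of LID from neighbour distances r (any order).
    Zero distances are dropped; returns NaN when the estimate is undefined
    (no positive distances, or all distances tied)."""
    r = np.sort(np.asarray(r, dtype=float))
    r = r[r > 0]
    if r.size == 0:
        return float("nan")
    mean_log = np.mean(np.log(r / r[-1]))
    if mean_log == 0:
        return float("nan")
    return -1.0 / mean_log

def lid_scores(X, k):
    """LID estimate for every row of X from its k nearest neighbours (self excluded)."""
    D = weighted_hamming_matrix(X, entropy_weights(X))
    scores = np.empty(len(X))
    for i in range(len(X)):
        neighbours = np.sort(np.delete(D[i], i))[:k]
        scores[i] = lid_mle(neighbours)
    return scores
```

Under the finding reported in the abstract, rows with higher scores would be the candidates for anomalies. Because Hamming-style distances are discrete, ties among neighbours are common; this sketch simply returns NaN in the degenerate cases, which a real detector would need to treat more carefully.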