An Empirical Study on Anomaly detection Using Density Based and Representative Based Clustering algorithms

Published: 01 Feb 2023, Last Modified: 13 Feb 2023 | ICLR 2023 Conference Withdrawn Submission | Readers: Everyone
Keywords: Anomaly, Outliers, Noise points, ANN, DBSCAN, DBSCAN++, k-means-- (minus minus)
TL;DR: In this paper, we examine existing anomaly detection approaches by empirically studying the performance of unsupervised anomaly detection techniques.
Abstract: In data mining and statistics, anomaly detection is the process of finding data patterns (outcomes, values, or observations) that deviate from the rest of the observations or outcomes. Anomaly detection is heavily used to solve real-world problems in many application domains, including but not limited to medicine, finance, cybersecurity, banking, networking, transportation, and military surveillance of enemy activities. In this paper, we present an empirical study of unsupervised anomaly detection techniques, namely DBSCAN, DBSCAN++ (with uniform initialization, k-center initialization, uniform with approximate neighbor initialization, and k-center with approximate neighbor initialization), and k-means-- (minus minus), on six imbalanced benchmark datasets. Findings from our in-depth empirical study show that k-means-- is more robust than DBSCAN and DBSCAN++ in terms of the different evaluation measures (F1 score, false alarm rate, adjusted Rand index, and Jaccard coefficient) and running time. We also observe that DBSCAN performs very well on datasets with a smaller number of data points.
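Illustration (not from the paper): three of the evaluation measures named in the abstract can be sketched from binary labels as below. This assumes the textbook confusion-matrix definitions, with 1 marking an anomaly and 0 a normal point. The abstract does not give the paper's exact definitions, and its Jaccard coefficient may instead be the pair-counting variant used for clusterings. The function name `anomaly_metrics` and the example labels are hypothetical.

```python
def anomaly_metrics(y_true, y_pred):
    """Compute F1, false alarm rate, and Jaccard for binary anomaly labels.

    Assumption: 1 = anomaly, 0 = normal; standard confusion-matrix
    definitions, which may differ from the ones used in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normals kept

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # False alarm rate: share of normal points flagged as anomalies.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    # Jaccard on the anomaly class: overlap of true and predicted anomalies.
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"f1": f1, "false_alarm_rate": false_alarm_rate, "jaccard": jaccard}


# Hypothetical imbalanced example: 6 normal points, 2 anomalies.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]
scores = anomaly_metrics(y_true, y_pred)
print(scores)
```

On imbalanced data such as this, plain accuracy would look high even for a detector that flags nothing, which is one reason measures focused on the anomaly class are commonly reported.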
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Machine Learning for Sciences (e.g., biology, physics, health sciences, social sciences, climate/sustainability)