A Classifier-Based Approach to Multi-Class Anomaly Detection Applied to Astronomical Time-Series

Published: 17 Jun 2024, Last Modified: 26 Jul 2024, ICML2024-AI4Science Poster, CC BY 4.0
Keywords: Astrophysics, Astronomy, Anomaly Detection, Classification, Recurrent Neural Networks, Time-series, Isolation Forests
TL;DR: We introduce a novel anomaly detection method called Multi-Class Isolation Forests (MCIF) that repurposes neural network classifiers for real-time anomaly detection in astronomical time-series data, outperforming standard isolation forests.
Abstract: Automating anomaly detection is an open problem in many scientific fields, particularly in time-domain astronomy, where modern telescopes generate millions of alerts per night. Currently, most anomaly detection algorithms for astronomical time-series rely either on hand-crafted features or on features generated through unsupervised representation learning, coupled with standard anomaly detection algorithms. In this work, we introduce a novel approach that leverages the latent space of a neural network classifier for anomaly detection. We then propose a new method called Multi-Class Isolation Forests (MCIF), which trains separate isolation forests for each class to derive an anomaly score for an object based on its latent space representation. This approach significantly outperforms a standard isolation forest when distinct clusters exist in the latent space. Using a simulated dataset emulating the Zwicky Transient Facility (54 anomalous and 12,040 common objects), our anomaly detection pipeline discovered $46\pm3$ anomalies ($\sim 85\%$ recall) after following up the top 2,000 ($\sim 15\%$) ranked objects. Furthermore, our classifier-based approach outperforms or approaches the performance of other state-of-the-art anomaly detection pipelines when applied to the dataset used in Perez-Carrasco et al. (2023). Our novel method demonstrates that existing and new classifiers can be effectively repurposed for real-time anomaly detection. The code used in this work, including a Python package, is publicly available.
Submission Number: 159
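
The MCIF idea described in the abstract (one isolation forest per class, fitted on the classifier's latent representations) can be illustrated with a minimal sketch. This is not the authors' released package: it uses scikit-learn's `IsolationForest`, the function names `fit_mcif` and `mcif_score` are hypothetical, and the latent vectors are synthetic 2-D points standing in for real classifier outputs. The abstract does not say how the per-class scores are combined, so taking the minimum across classes (an object is anomalous only if no class explains it well) is an assumption.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def fit_mcif(latents, labels, random_state=0):
    """Fit one isolation forest per class on that class's latent vectors."""
    forests = {}
    for cls in np.unique(labels):
        forest = IsolationForest(n_estimators=100, random_state=random_state)
        forests[cls] = forest.fit(latents[labels == cls])
    return forests


def mcif_score(forests, latents):
    """Return one anomaly score per object; larger means more anomalous."""
    # scikit-learn's score_samples is higher for normal points, so negate it.
    per_class = np.stack(
        [-forest.score_samples(latents) for forest in forests.values()], axis=1
    )
    # ASSUMPTION: combine per-class scores with a minimum, so an object is
    # flagged only when it looks anomalous to every class's forest.
    return per_class.min(axis=1)


# Synthetic stand-in for classifier latents: two well-separated clusters.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.repeat([0, 1], 200)
latents = centers[labels] + rng.normal(scale=0.5, size=(400, 2))

forests = fit_mcif(latents, labels)

inlier = np.array([[0.1, -0.2]])   # near the class-0 cluster
outlier = np.array([[5.0, 5.0]])   # between the clusters, far from both
scores = mcif_score(forests, np.vstack([inlier, outlier]))
print(scores)
```

In this toy setup the point lying between the two clusters receives a higher score than the point near a cluster centre. Ranking objects by this score and following up the top of the list mirrors the evaluation protocol described in the abstract.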