Towards a Unified Framework of Clustering-based Anomaly Detection

Published: 01 May 2025, Last Modified: 18 Jun 2025ICML 2025 posterEveryoneRevisionsBibTeXCC BY 4.0
TL;DR: A unified theoretical framework models the intrinsic connections among representation learning, clustering, and anomaly detection.
Abstract: Unsupervised Anomaly Detection (UAD) plays a crucial role in identifying abnormal patterns within data without labeled examples, holding significant practical implications across various domains. Although the individual contributions of representation learning and clustering to anomaly detection are well-established, their interdependencies remain under-explored due to the absence of a unified theoretical framework. Consequently, their collective potential to enhance anomaly detection performance remains largely untapped. To bridge this gap, in this paper, we propose a novel probabilistic mixture model for anomaly detection to establish a theoretical connection among representation learning, clustering, and anomaly detection. By maximizing a novel anomaly-aware data likelihood, representation learning and clustering can effectively reduce the adverse impact of anomalous data and collaboratively benefit anomaly detection. Meanwhile, a theoretically substantiated anomaly score is naturally derived from this framework. Lastly, drawing inspiration from gravitational analysis in physics, we have devised an improved anomaly score that more effectively harnesses the combined power of representation learning and clustering. Extensive experiments, involving 17 baseline methods across 30 diverse datasets, validate the effectiveness and generalization capability of the proposed method, surpassing state-of-the-art methods.
Lay Summary: In many real-world applications like detecting financial fraud or identifying unusual medical conditions, we need computers to automatically spot abnormal patterns in data—but often we don't have examples of what "abnormal" looks like to teach them. We developed a new mathematical framework that helps computers detect these anomalies by combining three key abilities: understanding data patterns, grouping similar things together, and identifying outliers—all working in harmony rather than separately. Our approach is inspired by how gravity works in physics, where objects influence each other based on their mass and distance. This unified method significantly improves the accuracy of anomaly detection across 30 different datasets, outperforming existing approaches. This matters because better anomaly detection can help prevent credit card fraud, catch manufacturing defects early, identify potential health issues before they become serious, and protect computer systems from cyberattacks.
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Anomaly Detection, Clustering
Submission Number: 9292
Loading