SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch

Jinsung Yoon; Kihyuk Sohn; Chun-Liang Li; Sercan O Arik; Tomas Pfister

Published: 13 Feb 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Event Certifications: iclr.cc/ICLR/2024/Journal_Track

Abstract: Semi-supervised anomaly detection is a common problem, because datasets containing anomalies are often only partially labeled. We propose a canonical framework, Semi-supervised Pseudo-labeler Anomaly Detection with Ensembling (SPADE), that is not limited by the assumption that labeled and unlabeled data come from the same distribution. This assumption is violated in many applications: for example, the labeled data may contain only anomalies unlike the unlabeled data, the unlabeled data may contain different types of anomalies, or the labeled data may contain only 'easy-to-label' samples. SPADE uses an ensemble of one-class classifiers as the pseudo-labeler to make pseudo-labeling more robust to distribution mismatch. Partial matching is proposed to automatically select the critical hyper-parameters for pseudo-labeling without validation data, which is crucial when labeled data is limited. SPADE shows state-of-the-art semi-supervised anomaly detection performance across a wide range of scenarios with distribution mismatch in both tabular and image domains. In some common real-world settings, such as a model facing new types of unlabeled anomalies, SPADE outperforms the state-of-the-art alternatives by 5% AUC on average.
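
To make the idea of pseudo-labeling with an ensemble of one-class classifiers concrete, the following is a minimal sketch and not the authors' implementation. Several details are assumptions that the abstract does not state: each ensemble member is trained on a disjoint subset of the unlabeled data, a sample receives a pseudo-label only when all members agree, and the thresholds are fixed quantiles (`normal_q`, `anomaly_q`) standing in for the values that partial matching would select. The base model is a toy centroid-distance scorer, and all function and parameter names are hypothetical.

```python
import random
import statistics


def fit_centroid_scorer(points):
    """Toy one-class model (assumption): score = Euclidean distance to the training centroid."""
    dim = len(points[0])
    centroid = [statistics.fmean(p[d] for p in points) for d in range(dim)]

    def score(x):
        return sum((x[d] - centroid[d]) ** 2 for d in range(dim)) ** 0.5

    return score


def quantile(values, q):
    """Simple empirical quantile: the value at position floor(q * n) in sorted order."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


def pseudo_label(unlabeled, n_models=3, normal_q=0.5, anomaly_q=0.95, seed=0):
    """Return one label per sample: 1 = anomalous, 0 = normal, -1 = left unlabeled.

    Fixed quantile thresholds are a stand-in (assumption) for the
    hyper-parameters that the paper selects via partial matching.
    """
    rng = random.Random(seed)
    order = list(range(len(unlabeled)))
    rng.shuffle(order)
    # Disjoint training subsets, one per ensemble member (assumption).
    folds = [order[k::n_models] for k in range(n_models)]

    votes = [[] for _ in unlabeled]
    for fold in folds:
        train = [unlabeled[i] for i in fold]
        score = fit_centroid_scorer(train)
        train_scores = [score(x) for x in train]
        low = quantile(train_scores, normal_q)
        high = quantile(train_scores, anomaly_q)
        for i, x in enumerate(unlabeled):
            s = score(x)
            votes[i].append(1 if s > high else (0 if s <= low else -1))

    labels = []
    for member_votes in votes:
        if all(v == 1 for v in member_votes):
            labels.append(1)   # every member flags the sample as anomalous
        elif all(v == 0 for v in member_votes):
            labels.append(0)   # every member considers the sample normal
        else:
            labels.append(-1)  # disagreement: no pseudo-label is assigned
    return labels
```

Requiring unanimous agreement trades coverage for precision: fewer samples receive pseudo-labels, but a single member whose training subset is contaminated cannot assign a label on its own. In practice the toy scorer would be replaced by a real one-class model.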

Certifications: Featured Certification

Submission Length: Regular submission (no more than 12 pages of main content)

Video: https://drive.google.com/file/d/1UvqBpvcPw3XQISI5ZI-upYdPlOUNqcvi/view?usp=sharing

Assigned Action Editor: ~Tongliang_Liu1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 610
