Keywords: Graph anomaly detection, Graph neural network, Knowledge distillation
Abstract: Graph anomaly detection (GAD) is crucial in high-stakes domains. Generalist GAD, which trains a single detector that transfers to new graphs, has recently attracted attention. However, existing methods often rely on scarce and costly annotations for training, and some even require few-shot support at inference, which limits their robustness to diverse and unseen anomaly patterns. To address this limitation, we introduce ProMoS, the first unsupervised generalist GAD framework, which detects anomalies by modeling the abundant normality in unlabeled data. First, we introduce a knowledge-distillation (KD) architecture that distills normality representations from a frozen self-supervised graph neural network (GNN) teacher into a mixture-of-students (MoS) model. The MoS employs a shared branch to capture global patterns and a lightweight personalized branch to extract local normality from the teacher, avoiding learning normality from scratch while improving both expressiveness and efficiency. Second, we propose prototype-guided soft-label distillation, which aligns the student with the teacher in a shared prototype space and thereby improves cross-graph transferability and generalizability. At inference, ProMoS performs zero-shot anomaly detection on unseen graphs by combining the teacher-student distillation bias with the geometric deviation from the prototypes. Extensive experiments on eleven zero-shot GAD tasks show that ProMoS consistently outperforms state-of-the-art supervised, unsupervised, and generalist baselines while reducing computational overhead, charting a practical path toward label-free, zero-shot generalist GAD.
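To make the inference step concrete, below is a minimal Python/PyTorch sketch of the zero-shot scoring idea the abstract describes: a per-node anomaly score combining (i) the teacher-student distillation bias with (ii) the deviation of the student embedding from shared normality prototypes. All names (score_nodes, alpha, the prototype tensor) and the specific distance measures are illustrative assumptions; the paper's actual architecture, losses, and weighting are not given in the abstract.

import torch
import torch.nn.functional as F

def score_nodes(
    teacher_emb: torch.Tensor,   # (N, d) embeddings from the frozen self-supervised teacher
    student_emb: torch.Tensor,   # (N, d) embeddings from the mixture-of-students model
    prototypes: torch.Tensor,    # (K, d) shared normality prototypes (hypothetical)
    alpha: float = 0.5,          # hypothetical trade-off between the two score terms
) -> torch.Tensor:
    """Return a per-node anomaly score; higher means more anomalous."""
    # (i) Distillation bias: the student is trained to mimic the teacher on
    # normal data, so a large teacher-student gap suggests an anomaly.
    bias = 1.0 - F.cosine_similarity(teacher_emb, student_emb, dim=-1)

    # (ii) Prototype geometric deviation: distance of each student embedding
    # to its nearest normality prototype in the shared prototype space.
    z = F.normalize(student_emb, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    sim = z @ p.T                          # (N, K) cosine similarity to prototypes
    deviation = 1.0 - sim.max(dim=-1).values

    return alpha * bias + (1.0 - alpha) * deviation

# Toy usage on random tensors.
if __name__ == "__main__":
    N, d, K = 8, 16, 4
    scores = score_nodes(torch.randn(N, d), torch.randn(N, d), torch.randn(K, d))
    print(scores.shape)  # torch.Size([8])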
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 12620