Abstract: Developing reliable fraud detection models is increasingly vital amid the digitalization of financial institutions and activities. While graph-based fraud detection models have shown promising performance in recent years, the fraud definitions and evaluation processes they use rely heavily on a limited set of labeled datasets. This raises concerns about their vulnerability to the distributional shifts and adversarial strategies often exhibited by real-life fraudsters. In response, we propose a novel scenario for graph fraud detection called Multi-round Adversarial Fraud Detection, in which the fraud detection model is trained and evaluated iteratively on an adversarially evolving graph. This scenario more closely mimics real-life fraud detection while relying less on the underlying datasets, since the graph evolution is induced by generator functions rooted in common fraud behavior. We show that existing models struggle to achieve good multi-round performance under the proposed scenario, with F1 scores that consistently stay below 56 percent on subsequent rounds. To improve this performance, we propose the Temporally Pre-trained Node Embedder (TPNE), a module that leverages a self-supervised pre-training approach, explicitly separating and enhancing temporal information across multiple rounds. TPNE is both model- and label-agnostic, improving the best-performing baseline by up to 4.6 percent in on-round F1 score and up to 32.9 percent in final recall.