DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection

Published: 17 Sept 2022, Last Modified: 12 Mar 2024
Venue: NeurIPS 2022 Datasets and Benchmarks
Readers: Everyone
Keywords: Graph Anomaly Detection, Dynamic Graph, Financial Fraudster Detection
Abstract: Graph Anomaly Detection (GAD) has recently become an active research area because of its practical and theoretical value. Because GAD is application-driven and anomalous samples are rare, enriching the variety of its datasets is fundamental. This paper therefore presents DGraph, a real-world dynamic graph in the finance domain. DGraph overcomes many limitations of current GAD datasets. It contains about 3M nodes, 4M dynamic edges, and 1M ground-truth nodes. We provide a comprehensive observation of DGraph, revealing that anomalous and normal nodes generally differ in structure, neighbor distribution, and temporal dynamics. Moreover, the observation suggests that the 2M background nodes are also essential for detecting fraudsters. Furthermore, we conduct extensive experiments on DGraph. The observations and experiments demonstrate that DGraph can advance GAD research and enable in-depth exploration of anomalous nodes.
Author Statement: Yes
TL;DR: This paper presents DGraph, a real-world dynamic graph in the finance domain.
URL: https://dgraph.xinye.com
Dataset Url: https://dgraph.xinye.com/ (registering an account on https://dgraph.xinye.com is required to download the datasets)
License: The dataset in this paper is licensed under a Custom (non-commercial) license. See the official instructions at https://dgraph.xinye.com/clause.
Supplementary Material: pdf
Contribution Process Agreement: Yes
In Person Attendance: Yes
Community Implementations: [6 code implementations](https://www.catalyzex.com/paper/arxiv:2207.03579/code) (CatalyzeX)