Abstract: Network fraud detection, specifically the identification of abnormal users on rating platforms, has attracted considerable research interest due to its wide applicability. However, the performance of existing detection systems suffers from several challenging problems, such as class imbalance, a lack of annotated data, and network sparsity. To address these challenges, we propose FD-SpaN, a novel unsupervised fraud detection algorithm based on network structure exploration, which ranks users by their computed probability of being fraudulent and identifies abnormal users on sparse networks. First, we model rating networks mathematically as graphs and introduce the associated metrics. Then, we add variable smoothing terms when inferring the quality of each item and the trustworthiness of each rating, to tackle network sparsity at the entity level. Meanwhile, for active users, we integrate their rating patterns into our formulations as a critical term to avoid overfitting. In addition, FD-SpaN scales to large real-world rating networks because its time complexity is linear in the size of the network. Extensive experiments on two real-world datasets demonstrate the effectiveness of FD-SpaN under extreme class imbalance and network sparsity: it outperforms state-of-the-art baselines on all evaluation metrics.
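To make the idea of smoothed, iterative scoring on a rating graph concrete, the following is a minimal illustrative sketch and not the FD-SpaN formulation itself, which the abstract does not spell out. It assumes ratings scaled to [-1, 1], at most one rating per (user, item) pair, and simple additive smoothing. The function name `rank_users`, the parameters `alpha`, `beta`, and `prior`, and the update rules are all hypothetical choices made for illustration. The rating-pattern term for active users is omitted.

```python
from collections import defaultdict


def rank_users(ratings, alpha=1.0, beta=1.0, prior=0.5, iters=20):
    """Rank users by a fraud-suspicion score in [0, 1], highest first.

    ratings: list of (user, item, score) with score in [-1, 1] and at most
    one rating per (user, item) pair. alpha, beta and prior are hypothetical
    smoothing constants, not values taken from the paper.
    """
    fairness = {u: prior for u, _, _ in ratings}          # per user, in [0, 1]
    goodness = {i: 0.0 for _, i, _ in ratings}            # per item, in [-1, 1]
    reliability = {(u, i): prior for u, i, _ in ratings}  # per rating, in [0, 1]

    for _ in range(iters):
        # Item quality: reliability-weighted sum of scores. Adding alpha to the
        # denominator shrinks items with few ratings toward a neutral 0.
        weighted = defaultdict(float)
        count = defaultdict(int)
        for u, i, s in ratings:
            weighted[i] += reliability[(u, i)] * s
            count[i] += 1
        for i in goodness:
            goodness[i] = weighted[i] / (alpha + count[i])

        # Rating trustworthiness: average of the rater's fairness and the
        # rating's agreement with the current item quality.
        for u, i, s in ratings:
            agreement = 1.0 - abs(s - goodness[i]) / 2.0
            reliability[(u, i)] = 0.5 * (fairness[u] + agreement)

        # User fairness: mean reliability of the user's ratings, smoothed
        # toward the prior so that sparse users are not judged on few ratings.
        total = defaultdict(float)
        n = defaultdict(int)
        for u, i, _ in ratings:
            total[u] += reliability[(u, i)]
            n[u] += 1
        for u in fairness:
            fairness[u] = (beta * prior + total[u]) / (beta + n[u])

    suspicion = [(u, 1.0 - f) for u, f in fairness.items()]
    return sorted(suspicion, key=lambda pair: pair[1], reverse=True)
```

Each iteration makes a constant number of passes over the rating list, so the cost per iteration in this sketch is linear in the number of ratings, in the spirit of the linear complexity claimed above.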