Constrained Optimization to Improve Critical Rare Classes Performance Within the Top-Ranking Part

Published: 2025, Last Modified: 15 Jan 2026. Venue: ECML/PKDD (1) 2025. License: CC BY-SA 4.0
Abstract: The widespread application of deep learning methods has drawn attention to the challenge of improving prediction performance within the highest-scoring segment of model predictions. In critical domains such as insurance fraud detection and bank cash-out detection, the focus is predominantly on the highest predicted scores, which correspond to high-risk users who need to be intercepted. However, most existing work still optimizes AUC globally, which often does not yield the best performance within the top-ranking part. Moreover, these scenarios often face extreme data imbalance, where the positive samples of interest are in the minority. In this paper, we define the top-ranking optimization problem and propose an approach based on the Augmented Lagrangian Multiplier method (ALM) to solve it. Specifically, we modify the Discounted Cumulative Gain (DCG) metric to serve as the constraint on the top-ranking part and add it as a regularization term to the optimization objective. In addition, to keep the regularization term effective and avoid overfitting, we design a dynamically updated cache mechanism that stores hard samples. Experimental results on three real-world datasets validate the effectiveness of the proposed method, demonstrating its potential to improve top-ranking prediction performance in imbalanced data settings.
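For readers unfamiliar with the metric the abstract builds on, the following is a minimal sketch of the standard DCG computed over the top-k scored items with binary labels. It is not the modified DCG constraint proposed in the paper, whose exact form the abstract does not give; the function name `dcg_at_k` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def dcg_at_k(scores, labels, k):
    """Standard DCG over the k highest-scored items (illustrative only).

    Assumption: labels are binary (1 = positive, 0 = negative) and are used
    directly as gains. This is the textbook metric, not the paper's variant.
    """
    # Rank item indices by predicted score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # The item at 1-based rank r contributes label / log2(r + 1),
    # so positives placed nearer the top count for more.
    return sum(
        labels[i] / math.log2(rank + 2)
        for rank, i in enumerate(order[:k])
    )


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.1, 0.7]
    labels = [1, 0, 1, 1]
    print(dcg_at_k(scores, labels, k=3))
```

Because the discount grows with rank, moving a positive sample from a low rank into the top few positions raises this value more than swaps further down the list, which is why a DCG-style quantity is a natural candidate for constraining the top-ranking part rather than the whole ranking.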