Imbalanced Classification via a Tabular Translation GAN

TMLR Paper743 Authors

01 Jan 2023 (modified: 02 Apr 2023), Rejected by TMLR
Abstract: When a classification problem exhibits severe class imbalance, most standard predictive methods can fail to model the minority class accurately. We present a novel framework based on Generative Adversarial Networks (GANs) that introduces a direct translation loss, in conjunction with optional cyclic and identity losses, to map majority samples to corresponding synthetic minority samples. We demonstrate that this translation mechanism encourages the synthesized samples to lie close to the class boundary. Furthermore, we explore a selection criterion that retains the most useful of the synthesized samples. We conduct extensive experiments on class-imbalanced tabular data, including very large datasets, using several downstream classifiers. The empirical results show that the proposed method improves average precision compared with alternative re-weighting and oversampling techniques.
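As a rough illustration of how a translation term might be combined with optional cyclic and identity terms, the sketch below computes a single generator objective on toy data. The abstract does not give the exact loss definitions, so everything here is an assumption: the names `generator_loss`, `G`, `F`, `D`, and the weights `lam_trans`, `lam_cyc`, `lam_id` are hypothetical, and the L1 distances and the non-saturating adversarial term are stand-ins, not the paper's formulation.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays of the same shape."""
    return float(np.mean(np.abs(a - b)))

def generator_loss(x_maj, x_min, G, F, D,
                   lam_trans=1.0, lam_cyc=0.0, lam_id=0.0):
    """Hypothetical combined objective for a majority-to-minority generator.

    G: maps majority samples to synthetic minority samples (assumed).
    F: maps minority samples back to the majority class (assumed).
    D: returns, per sample, a probability of being a real minority sample.
    Setting lam_cyc or lam_id to 0 switches off the optional terms.
    """
    fake_min = G(x_maj)
    # Adversarial term (assumed non-saturating form): fool D on synthetic samples.
    adv = float(-np.mean(np.log(D(fake_min) + 1e-8)))
    # Translation term (assumed L1): keep each synthetic sample near its source.
    trans = l1(fake_min, x_maj)
    # Cyclic term: translating back should recover the original majority sample.
    cyc = l1(F(fake_min), x_maj)
    # Identity term: real minority samples should pass through G unchanged.
    ident = l1(G(x_min), x_min)
    return adv + lam_trans * trans + lam_cyc * cyc + lam_id * ident

# Toy stand-ins for the networks: a fixed shift and an uninformative critic.
G = lambda x: x + 0.5
F = lambda x: x - 0.5
D = lambda x: np.full(len(x), 0.5)

x_maj = np.zeros((4, 3))  # four majority rows, three features
x_min = np.ones((2, 3))   # two minority rows

total = generator_loss(x_maj, x_min, G, F, D,
                       lam_trans=1.0, lam_cyc=1.0, lam_id=1.0)
print(total)
```

Under these assumptions, a translation term that penalizes the distance between a majority sample and its synthetic counterpart is one plausible way to keep synthesized samples near the class boundary, which is the behavior the abstract describes.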
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yingzhen_Li1
Submission Number: 743