BESTI: BAYESIAN CLASS-WISE TRUST-WEIGHTED ENSEMBLE WITH STRUCTURED SAMPLING FOR IMBALANCED MULTI-CLASS CLASSIFICATION
Keywords: Data Imbalance, Class Imbalance, Ensemble, Classification
Abstract: Class imbalance arises for both practical and structural reasons, such as variance in occurrence frequency, biases in collection environments, differences in labeling costs, and imbalances in conceptual definitions. Such imbalance introduces diverse problems, including under-representation of minority classes, distortion of metrics, and deterioration of model fairness and generalization. To address this issue, we propose a Bayesian Class-wise Trust-Weighted Ensemble with Structured Sampling for Imbalanced Multi-Class Classification, named BESTI. BESTI starts by constructing multiple sub-training sets from the original dataset, each representing a different degree of data imbalance. Clipping thresholds are initialized in a structured manner; classes with more samples than a threshold are downsampled to it, and the others are retained. This creates a series of training sets, each reflecting a different class distribution. An independent model is then trained on each training set, yielding multiple models, each specialized for a particular degree of imbalance. We aggregate these models by taking their trustworthiness into account. By Bayes’ theorem, this trustworthiness is equivalent to the class-wise precision of the model. Using this precision as a weight, BESTI ensembles the models to make the final decision. Our test results show that BESTI improves the overall performance of the model, including on the minority classes. In addition, BESTI is competitive with state-of-the-art methods and, in certain domains, outperforms them significantly.
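The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the explicit threshold list, the uniform random downsampling, and the hard-vote aggregation rule are all assumptions made for illustration, since the abstract does not specify how thresholds are initialized or how the precision weights enter the final decision.

```python
import numpy as np

def build_subsets(labels, thresholds, seed=0):
    """Return one sorted index array per clipping threshold.

    Classes with more samples than the threshold are randomly
    downsampled to it; smaller classes are kept whole.
    (Uniform random downsampling is an assumption.)
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    subsets = []
    for t in thresholds:
        keep = []
        for c in np.unique(labels):
            idx = np.flatnonzero(labels == c)
            if len(idx) > t:
                idx = rng.choice(idx, size=t, replace=False)
            keep.append(idx)
        subsets.append(np.sort(np.concatenate(keep)))
    return subsets

def class_wise_precision(y_true, y_pred, n_classes):
    """Precision per class: P(true = c | predicted = c), 0 if c is never predicted."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    prec = np.zeros(n_classes)
    for c in range(n_classes):
        predicted_c = y_pred == c
        if predicted_c.any():
            prec[c] = np.mean(y_true[predicted_c] == c)
    return prec

def trust_weighted_vote(preds, precisions, n_classes):
    """Combine hard predictions of several models.

    preds:      (n_models, n_samples) integer class predictions
    precisions: (n_models, n_classes) class-wise precision of each model
    Each model adds its precision for the class it predicts to that
    class's score; the highest-scoring class wins.
    (This hard-vote rule is an assumed aggregation scheme.)
    """
    preds = np.asarray(preds)
    precisions = np.asarray(precisions)
    n_samples = preds.shape[1]
    rows = np.arange(n_samples)
    scores = np.zeros((n_samples, n_classes))
    for m in range(preds.shape[0]):
        scores[rows, preds[m]] += precisions[m, preds[m]]
    return scores.argmax(axis=1)
```

In this sketch, one model would be trained on each index set returned by `build_subsets`, its class-wise precision would be estimated on held-out data with `class_wise_precision`, and `trust_weighted_vote` would combine the models' predictions at inference time.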
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 17529