An improved deep forest for alleviating the data imbalance problem

Jie Gao; KunHong Liu; Beizhan Wang; Dong Wang; Qingqi Hong

Published: 01 Jan 2021, Last Modified: 30 Oct 2024 | Soft Comput. 2021 | CC BY-SA 4.0

Abstract: Most deep learning methods have inherent defects and are rarely applied to classification tasks on small, imbalanced datasets. On the one hand, data imbalance biases the model's classification results toward the majority class. On the other hand, limited training data leads to over-fitting. Deep forest (DF) is a deep learning model that works well on small datasets, and its performance is highly competitive with deep neural networks. In the present study, a variant of DF called the imbalanced deep forest (IMDF) is proposed to improve classification performance on the minority class and to explore the application of deep learning to small imbalanced datasets. The IMDF is a cascade of multiple layers, where each layer is an ensemble of multiple units. The main idea behind the proposed method is to enable each unit of the IMDF to handle imbalanced data, so that the classification results of the entire IMDF are biased toward the minority class. Experiments demonstrate the effectiveness of the proposed method.
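The cascade-of-ensembles structure described in the abstract can be illustrated with a small sketch. The abstract does not say how each unit handles imbalance, so the code below is not the IMDF itself. It assumes, purely for illustration, that every unit is a decision stump trained on a balanced bootstrap (each class sampled down to the minority size), that labels are binary with 1 as the minority class, and that each layer appends its averaged minority probability to the features passed to the next layer. All function names are hypothetical, and the cross-validated class vectors used by the original deep forest are omitted for brevity.

```python
import random

def balanced_bootstrap(X, y, rng):
    """Sample each class with replacement down to the smallest class size."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for _ in range(n):
            Xb.append(rng.choice(rows))
            yb.append(label)
    return Xb, yb

def fit_stump(X, y, rng):
    """One-split unit on a random feature: (feature, threshold, p_left, p_right).
    p_left / p_right are the fractions of label 1 on each side of the split."""
    f = rng.randrange(len(X[0]))
    vals = sorted({x[f] for x in X})
    best = None
    for a, b in zip(vals, vals[1:]):
        t = (a + b) / 2  # midpoint between consecutive observed values
        left = [yi for x, yi in zip(X, y) if x[f] <= t]
        right = [yi for x, yi in zip(X, y) if x[f] > t]
        pl, pr = sum(left) / len(left), sum(right) / len(right)
        # Size-weighted Gini impurity of the two sides (up to a constant factor)
        gini = len(left) * pl * (1 - pl) + len(right) * pr * (1 - pr)
        if best is None or gini < best[0]:
            best = (gini, t, pl, pr)
    if best is None:  # feature is constant in this sample: no split possible
        p = sum(y) / len(y)
        return (f, vals[0], p, p)
    return (f, best[1], best[2], best[3])

def layer_proba(layer, x):
    """Average minority-class probability over all units of one layer."""
    total = sum(pl if x[f] <= t else pr for f, t, pl, pr in layer)
    return total / len(layer)

def fit_cascade(X, y, n_layers=2, n_units=15, seed=0):
    """Train layers in sequence; each layer sees the previous layers' outputs."""
    rng = random.Random(seed)
    layers, feats = [], [list(x) for x in X]
    for _ in range(n_layers):
        layer = [fit_stump(*balanced_bootstrap(feats, y, rng), rng)
                 for _ in range(n_units)]
        layers.append(layer)
        feats = [x + [layer_proba(layer, x)] for x in feats]
    return layers

def predict_proba(layers, x):
    """Pass one sample through the cascade; return the last layer's output."""
    x = list(x)
    for layer in layers[:-1]:
        x = x + [layer_proba(layer, x)]
    return layer_proba(layers[-1], x)

# Toy data (made up for this sketch): 40 majority samples, 5 minority samples.
X = [[i * 0.05, 0.0] for i in range(40)] + [[3.0 + i * 0.1, 1.0] for i in range(5)]
y = [0] * 40 + [1] * 5
layers = fit_cascade(X, y)
p_minority = predict_proba(layers, [3.2, 1.0])
p_majority = predict_proba(layers, [0.5, 0.0])
```

Because every unit trains on an equal number of samples from each class, no single unit is dominated by the majority class, which is one plausible reading of "enable each unit to handle imbalanced data". Other choices, such as class-weighted trees or cost-sensitive splitting, would fit the same cascade skeleton.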
