Small margin ensembles can be robust to class-label noise

27 Sept 2020, OpenReview Archive Direct Upload, Readers: Everyone
Abstract: Subsampling is used to generate bagging ensembles that are accurate and robust to class-label noise. The effect of using smaller bootstrap samples to train the base learners is to make the ensemble more diverse. As a result, the classification margins tend to decrease. In spite of having small margins, these ensembles can be robust to class-label noise. The validity of these observations is illustrated in a wide range of synthetic and real-world classification tasks. In the problems investigated, subsampling significantly outperforms standard bagging for different amounts of class-label noise. By contrast, the effectiveness of subsampling in random forest is problem dependent. In these types of ensembles the best overall accuracy is obtained when the random trees are built on bootstrap samples of the same size as the original training data. Nevertheless, subsampling becomes more effective as the amount of class-label noise increases.
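
The quantities named in the abstract (bootstrap sample size, class-label noise, classification margin) can be made concrete with a small sketch. The code below is not the authors' implementation or experimental setup: the synthetic two-class data, the decision-stump base learner (a stand-in for the trees used in bagging), the 20% noise rate, the sampling ratios, and every function name are assumptions chosen only for illustration. The margin of an example is taken here as the fraction of votes for its true class minus the fraction for the other class.

```python
import random


def make_data(n, rng):
    """Hypothetical two-class problem: two overlapping 2-D Gaussian blobs."""
    X, y = [], []
    for _ in range(n):
        label = rng.randrange(2)
        centre = 1.0 if label else -1.0
        X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
        y.append(label)
    return X, y


def flip_labels(y, noise_rate, rng):
    """Inject class-label noise by flipping each label with probability noise_rate."""
    return [1 - yi if rng.random() < noise_rate else yi for yi in y]


def fit_stump(X, y):
    """Fit a one-feature threshold classifier by exhaustive search (stand-in base learner)."""
    best = None  # (training errors, feature index, threshold, label predicted at or below threshold)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for left in (0, 1):
                errors = sum((left if x[f] <= t else 1 - left) != yi
                             for x, yi in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left)
    _, f, t, left = best
    return lambda x: left if x[f] <= t else 1 - left


def bagging(X, y, n_estimators, sampling_ratio, rng):
    """Train each base learner on a bootstrap sample of size sampling_ratio * len(X).

    Assumption: samples are drawn with replacement; a ratio of 1.0 corresponds to
    standard bagging, smaller ratios to subsampling.
    """
    n = len(X)
    m = max(1, round(sampling_ratio * n))
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(m)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble


def margin(ensemble, x, true_label):
    """Fraction of votes for the true class minus fraction for the other class, in [-1, 1]."""
    votes_true = sum(h(x) == true_label for h in ensemble)
    return (2 * votes_true - len(ensemble)) / len(ensemble)


rng = random.Random(0)
X_train, y_clean = make_data(100, rng)
y_noisy = flip_labels(y_clean, 0.2, rng)   # train on noisy labels
X_test, y_test = make_data(200, rng)       # evaluate on clean labels

results = {}
for ratio in (1.0, 0.2):
    ens = bagging(X_train, y_noisy, 25, ratio, rng)  # odd size, so votes never tie
    margins = [margin(ens, x, t) for x, t in zip(X_test, y_test)]
    accuracy = sum(m > 0 for m in margins) / len(margins)
    results[ratio] = (accuracy, sum(margins) / len(margins))
    print(f"ratio={ratio}: accuracy={accuracy:.3f}, mean margin={results[ratio][1]:.3f}")
```

Running the script prints test accuracy and mean margin for the full-size and the reduced bootstrap sample. On a toy problem like this the numbers need not reproduce the trends reported in the abstract; the sketch only shows how the sampling ratio enters the training loop and how the margin is computed from the votes.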