Learning Deep ResNet Blocks Sequentially using Boosting Theory

Furong Huang; Jordan T. Ash; John Langford; Robert E. Schapire

Learning Deep ResNet Blocks Sequentially using Boosting Theory

Furong Huang, Jordan T. Ash, John Langford, Robert E. Schapire

15 Feb 2018 (modified: 15 Feb 2018)ICLR 2018 Conference Blind SubmissionReaders: Everyone

Abstract: We prove a multiclass boosting theory for the ResNet architectures which simultaneously creates a new technique for multiclass boosting and provides a new algorithm for ResNet-style architectures. Our proposed training algorithm, BoostResNet, is particularly suitable in non-differentiable architectures. Our method only requires the relatively inexpensive sequential training of T "shallow ResNets". We prove that the training error decays exponentially with the depth T if the weak module classifiers that we train perform slightly better than some weak baseline. In other words, we propose a weak learning condition and prove a boosting theory for ResNet under the weak learning condition. A generalization error bound based on margin theory is proved and suggests that ResNet could be resistant to overfitting using a network with l_1 norm bounded weights.

TL;DR: We prove a multiclass boosting theory for the ResNet architectures which simultaneously creates a new technique for multiclass boosting and provides a new algorithm for ResNet-style architectures.

Keywords: residual network, boosting theory, training error guarantee

11 Replies

Loading