The Best Deep Ensembles Sacrifice Predictive Diversity

Taiga Abe; E. Kelly Buchanan; Geoff Pleiss; John Patrick Cunningham

02 Oct 2022 (modified: 05 May 2023) | ICBINB talk | Readers: Everyone

Keywords: ensemble, deep ensemble, predictive diversity

TL;DR: In deep ensembles, predictive diversity and ensemble performance appear to be negatively correlated, contrary to popular belief.

Abstract: Ensembling remains a hugely popular method for increasing the performance of a given class of models. In deep learning, the benefits of ensembling are often attributed to the diverse predictions of the individual ensemble members. Here we investigate a tradeoff between diversity and individual model performance, and find that, surprisingly, encouraging diversity during training almost always yields worse ensembles. We show that this tradeoff arises from the Jensen gap between the single-model and ensemble losses, and that the Jensen gap is a natural measure of diversity for both the mean squared error and cross-entropy loss functions. Our results suggest that, to reduce ensemble error, we should move away from efforts to increase predictive diversity and instead construct ensembles from less diverse (but more accurate) component models.
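
The Jensen gap mentioned in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: the function names, the toy predictions, and the choice to form the ensemble by averaging member outputs (raw predictions for regression, class probabilities for classification) are all assumptions made here for illustration. Under those assumptions, the ensemble loss equals the average single-model loss minus the Jensen gap, so raising diversity only helps if it does not raise the average single-model loss by more.

```python
import math


def mse_jensen_gap(preds, target):
    """Return (average single-model loss, ensemble loss, Jensen gap) for squared error.

    preds: scalar predictions, one per ensemble member (hypothetical toy inputs).
    Assumption: the ensemble prediction is the plain mean of the members.
    """
    m = len(preds)
    mean_pred = sum(preds) / m
    avg_single_loss = sum((p - target) ** 2 for p in preds) / m
    ensemble_loss = (mean_pred - target) ** 2
    return avg_single_loss, ensemble_loss, avg_single_loss - ensemble_loss


def ce_jensen_gap(probs, label):
    """Return the same three quantities for cross entropy.

    probs: one class-probability list per member; label: index of the true class.
    Assumption: the ensemble averages member probabilities (not logits).
    """
    m = len(probs)
    avg_single_loss = sum(-math.log(p[label]) for p in probs) / m
    ensemble_prob = sum(p[label] for p in probs) / m
    ensemble_loss = -math.log(ensemble_prob)
    return avg_single_loss, ensemble_loss, avg_single_loss - ensemble_loss


# Toy regression example: four members predicting a target of 1.0.
preds = [0.8, 1.1, 1.4, 0.9]
mse_avg, mse_ens, mse_gap = mse_jensen_gap(preds, 1.0)

# For squared error, the gap coincides with the variance of the member predictions.
mean_pred = sum(preds) / len(preds)
pred_variance = sum((p - mean_pred) ** 2 for p in preds) / len(preds)

# Toy 3-class example: three members, true class is index 0.
probs = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.9, 0.05, 0.05]]
ce_avg, ce_ens, ce_gap = ce_jensen_gap(probs, 0)

print(f"MSE: avg single={mse_avg:.4f} ensemble={mse_ens:.4f} gap={mse_gap:.4f}")
print(f"CE:  avg single={ce_avg:.4f} ensemble={ce_ens:.4f} gap={ce_gap:.4f}")
```

In both cases the gap is non-negative because the loss is convex in the averaged quantity. For squared error it is exactly the variance of the member predictions, which is one way to see why it can serve as a diversity measure.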
