Keywords: uncertainty estimation, deep ensemble, dataset shift, robustness, uncertainty calibration, Bayesian neural networks
TL;DR: We propose methods for constructing ensembles of neural networks with varying architectures, demonstrating that they outperform deep and anchored ensembles, in terms of uncertainty calibration, predictive performance and robustness to dataset shift.
Abstract: Ensembles of neural networks achieve superior performance compared to standalone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift. Deep ensembles, a state-of-the-art method for uncertainty estimation, only ensemble random initializations of a fixed architecture. Instead, we propose two methods for automatically constructing ensembles with varying architectures, which implicitly trade-off individual architectures’ strengths against the ensemble’s diversity and exploit architectural variation as a source of diversity. On a variety of classification tasks and modern architecture search spaces, we show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift. Our further analysis and ablation studies provide evidence of higher ensemble diversity due to architectural variation, resulting in ensembles that can outperform deep ensembles, even when having weaker average base learners. To foster reproducibility, our code is available: https://github.com/automl/nes
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.