Fast and Accurate Inference with Adaptive Ensemble Prediction for Deep Networks


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: Ensembling multiple predictions is a widely-used technique to improve the accuracy of various machine learning tasks. In image classification tasks, for example, averaging the predictions for multiple patches extracted from the input image significantly improves accuracy. Using multiple networks trained independently to make predictions improves accuracy further. One obvious drawback of the ensembling technique is its higher execution cost during inference. This higher cost limits the real-world use of ensembling. In this paper, we describe a new technique called adaptive ensemble prediction, which achieves the benefits of ensembling with much smaller additional execution costs. Our observation behind this technique is that ensembling does not reduce mispredictions for inputs predicted with a high probability, i.e. the outputs from the softmax. Hence, we calculate the confidence level of the prediction for each input from the probabilities of the local predictions during the ensembling computation. If the prediction for an input reaches a high enough probability on the basis of the confidence level, we stop ensembling for this input to avoid wasting computation power. We evaluated the adaptive ensembling by using various datasets and showed that it reduces the computation cost significantly while achieving similar accuracy to the naive ensembling. We also showed that our statistically rigorous confidence-level-based termination condition reduces the burden of the task-dependent parameter tuning compared to the naive termination based on the pre-defined threshold in addition to yielding a better accuracy with the same cost.
  • Keywords: ensemble, confidence level