Anytime Neural Network: a Versatile Trade-off Between Computation and Accuracy

Hanzhang Hu; Debadeepta Dey; Martial Hebert; J. Andrew Bagnell

15 Feb 2018 (modified: 10 Feb 2022) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: We present an approach for anytime predictions in deep neural networks (DNNs). For each test sample, an anytime predictor produces a coarse result quickly, and then continues to refine it until the test-time computational budget is depleted. Such predictors can address the growing computational problem of DNNs by automatically adjusting to varying test-time budgets. In this work, we study a general augmentation to feed-forward networks to form anytime neural networks (ANNs) via auxiliary predictions and losses. Specifically, we point out a blind spot in recent studies of such ANNs: the importance of high final accuracy. In fact, we show on multiple recognition datasets and architectures that by having near-optimal final predictions in small anytime models, we can effectively double the speed of large ones in reaching the corresponding accuracy level. We achieve this speed-up with a simple weighting of the anytime losses that oscillates during training. We also assemble a sequence of exponentially deepening ANNs to achieve both theoretically and practically near-optimal anytime results at any budget, at the cost of a constant fraction of additional consumed budget.
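The "constant fraction of additional consumed budget" claim for a sequence of exponentially deepening models can be illustrated with simple doubling arithmetic. The sketch below is not the authors' construction: it assumes, for illustration only, that each model costs exactly twice the previous one and that each model is run from scratch. The function names `doubling_schedule` and `best_finished_model` are hypothetical. Under these assumptions, the total computation spent by the time a model finishes is always less than twice the cost of that model alone.

```python
def doubling_schedule(base_cost, num_models):
    """Costs of a hypothetical model sequence, each twice as expensive as the last."""
    return [base_cost * 2 ** i for i in range(num_models)]


def best_finished_model(costs, budget):
    """Run the models in order, restarting from scratch each time.

    Returns (index, spent) for the last model that finishes within `budget`,
    or (None, 0) if even the first model does not fit.
    """
    spent = 0
    best = None
    for i, cost in enumerate(costs):
        if spent + cost > budget:
            break
        spent += cost
        best = i
    return best, spent


costs = doubling_schedule(base_cost=1, num_models=6)  # [1, 2, 4, 8, 16, 32]
for budget in (1, 3, 10, 31, 63):
    idx, spent = best_finished_model(costs, budget)
    # Geometric series: 1 + 2 + ... + 2^k = 2^(k+1) - 1 < 2 * 2^k,
    # so the cumulative cost stays below twice the last finished model's cost.
    assert spent < 2 * costs[idx]
    print(budget, idx, costs[idx], spent)
```

The factor of two here follows from the geometric series and from the assumed growth rate; a different growth rate would give a different constant.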

TL;DR: By focusing more on the final predictions in anytime predictors (such as the very recent Multi-Scale-DenseNets), we make small anytime models outperform large ones that lack such focus.

Keywords: anytime, neural network, adaptive prediction, budgeted prediction

Data: [CIFAR-100](https://paperswithcode.com/dataset/cifar-100)
