- Abstract: Deep Neural Networks (DNNs) have advanced the state-of-the-art on a variety of machine learning tasks and are deployed widely in many real-world products. However, the compute and data requirements demanded by large-scale DNNs remains a significant challenge. In this work, we address this challenge in the context of DNN inference. We propose Dynamic Variable Effort Deep Neural Networks (DyVEDeep), which exploit the heterogeneity in the characteristics of inputs to DNNs to improve their compute efficiency while maintaining the same classification accuracy. DyVEDeep equips DNNs with dynamic effort knobs, which in course of processing an input, identify how critical a group of computations are to classify the input. DyVEDeep dynamically focuses its compute effort only on the critical computations, while the skipping/approximating the rest. We propose 3 effort knobs that operate at different levels of granularity viz. neuron, feature and layer levels. We build DyVEDeep versions for 5 popular image recognition benchmarks on 3 image datasets---MNIST, CIFAR and ImageNet. Across all benchmarks, DyVEDeep achieves 2.1X-2.6X reduction in number of scalar operations, which translates to 1.9X-2.3X performance improvement over a Caffe-based sequential software implementation, for negligible loss in accuracy.
- Conflicts: purdue.edu, iitm.ac.in