Hardware-aware Exponential Approximation for Deep Neural Network
Xue Geng, Jie Lin, Bin Zhao, Zhe Wang, Mohamed M. Sabry Aly, Vijay Chandrasekhar
Feb 12, 2018 (modified: Jun 04, 2018), ICLR 2018 Workshop Submission, readers: everyone
Abstract: In this paper, we address the problem of cost-efficient inference for non-linear operations in deep neural networks (DNNs), in particular the exponential function e^x in the softmax layer of DNNs for object detection. The goal is to minimize the hardware cost in terms of energy and area while maintaining application accuracy. To this end, we introduce a Piecewise Linear Function (PLF) for approximating e^x. First, we derive a theoretical upper bound on the number of pieces required to retain the detection accuracy. Moreover, we constrain the PLF to a bounded domain in order to minimize the bitwidths of the lookup table of pieces, resulting in lower energy and area cost. The non-differentiable bounded PLF layer can be optimized via the straight-through estimator. ASIC synthesis demonstrates that the hardware-oriented softmax costs 4x less energy and area than the direct lookup table of e^x, while achieving comparable detection accuracy on benchmark datasets.
Keywords: Hardware, Softmax, Exponential, Deep Learning
TL;DR: Hardware-aware approximation for exponential function.
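
To make the idea in the abstract concrete, the following is a minimal sketch of a bounded piecewise linear approximation of e^x used inside a softmax. It is an illustration only, not the authors' implementation: the uniform segment spacing, the endpoint interpolation, the domain [-8, 0], the 16 pieces, and the function names `build_plf_table`, `plf_exp`, and `plf_softmax` are all assumptions chosen for the example. It also uses floating-point values and does not model the reduced-bitwidth lookup table or the straight-through estimator mentioned in the abstract.

```python
import math

def build_plf_table(x_min=-8.0, x_max=0.0, num_pieces=16):
    """Build (slope, intercept) pairs for uniform segments on [x_min, x_max].

    Each segment is the chord through exp() at its two endpoints.
    Domain and piece count are illustrative assumptions.
    """
    width = (x_max - x_min) / num_pieces
    table = []
    for i in range(num_pieces):
        left = x_min + i * width
        right = left + width
        slope = (math.exp(right) - math.exp(left)) / width
        intercept = math.exp(left) - slope * left
        table.append((slope, intercept))
    return table

def plf_exp(x, table, x_min=-8.0, x_max=0.0):
    """Approximate exp(x) by clamping x to the bounded domain and
    evaluating the linear piece that covers it."""
    x = min(max(x, x_min), x_max)
    width = (x_max - x_min) / len(table)
    # The upper boundary x_max belongs to the last segment.
    idx = min(int((x - x_min) / width), len(table) - 1)
    slope, intercept = table[idx]
    return slope * x + intercept

def plf_softmax(logits, table, x_min=-8.0, x_max=0.0):
    """Softmax with exp() replaced by the bounded PLF.

    Subtracting the maximum logit keeps every input to plf_exp at or
    below zero, so it falls inside (or is clamped to) the domain.
    """
    m = max(logits)
    exps = [plf_exp(v - m, table, x_min, x_max) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

table = build_plf_table()
probs = plf_softmax([1.0, 3.0, 2.0], table)
```

Subtracting the maximum logit before the exponential is what makes a bounded domain workable in this sketch: all inputs are non-positive, and values below the lower bound are clamped to a small constant rather than needing further table entries.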