Hardware-aware Exponential Approximation for Deep Neural Network

Xue Geng, Jie Lin, Bin Zhao, Zhe Wang, Mohamed M. Sabry Aly, Vijay Chandrasekhar

Feb 12, 2018 (modified: Feb 12, 2018) ICLR 2018 Workshop Submission
  • Abstract: In this paper, we address the problem of cost-efficient inference for non-linear operations in deep neural networks (DNNs), in particular the exponential function e^x in the softmax layer of DNNs for object detection. The goal is to minimize the hardware cost in terms of energy and area while maintaining application accuracy. To this end, we introduce a Piecewise Linear Function (PLF) for approximating e^x. First, we derive a theoretical upper bound on the number of pieces required to retain detection accuracy. Moreover, we constrain the PLF to a bounded domain in order to minimize the bitwidths of the lookup table of pieces, resulting in lower energy and area cost. The non-differentiable bounded PLF layer can be optimized via the straight-through estimator. ASIC synthesis demonstrates that the hardware-oriented softmax costs 4x less energy and area than a direct lookup table of e^x, while achieving comparable detection accuracy on benchmark datasets.
  • TL;DR: Hardware-aware approximation of the exponential function.
  • Keywords: Hardware, Softmax, Exponential, Deep Learning
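
The bounded piecewise linear approximation of e^x described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the equal-width pieces, the chord (endpoint) interpolation, the domain [-8, 0], the 16 pieces, the max-subtraction step, and all function names are assumptions chosen for illustration. The abstract does not specify any of them, and the sketch uses floating point rather than reduced bitwidths.

```python
import math

def build_plf_table(lo, hi, pieces):
    """Return (slope, intercept) pairs for equal-width chords of e^x on [lo, hi]."""
    width = (hi - lo) / pieces
    table = []
    for i in range(pieces):
        x0 = lo + i * width
        y0, y1 = math.exp(x0), math.exp(x0 + width)
        slope = (y1 - y0) / width
        table.append((slope, y0 - slope * x0))
    return table

def plf_exp(x, table, lo, hi):
    """Approximate e^x: clamp x to the bounded domain, then evaluate one linear piece."""
    x = min(max(x, lo), hi)
    width = (hi - lo) / len(table)
    idx = min(int((x - lo) / width), len(table) - 1)
    slope, intercept = table[idx]
    return slope * x + intercept

def plf_softmax(logits, table, lo, hi):
    """Softmax with e^x replaced by the bounded PLF.

    Assumption: logits are shifted so the maximum is 0, which keeps every
    input inside a domain of the form [lo, 0].
    """
    m = max(logits)
    exps = [plf_exp(v - m, table, lo, hi) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical settings, not taken from the paper.
LO, HI, PIECES = -8.0, 0.0, 16
TABLE = build_plf_table(LO, HI, PIECES)
probs = plf_softmax([2.0, 1.0, 0.1], TABLE, LO, HI)
```

In this sketch the lookup table holds only 16 slope and intercept pairs instead of one entry per representable input, which is the kind of saving a bounded PLF is meant to provide. Because each piece is a chord of an increasing function, the approximation is monotone, so the ranking of class scores is preserved.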