Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

Zhe Li; Shuo Wang; Caiwen Ding; Qinru Qiu; Yanzhi Wang; Yun Liang

12 Feb 2018 (modified: 05 May 2023) | ICLR 2018 Workshop Submission | Readers: Everyone

Abstract: Recurrent Neural Networks (RNNs) are becoming increasingly important for time-series applications, which require efficient, real-time implementations. The recent pruning-based work ESE (Han et al., 2017) suffers from degraded performance and energy efficiency because pruning leaves an irregular network structure. We propose block-circulant matrices for weight matrix representation in RNNs, thereby achieving simultaneous model compression and acceleration. We aim to implement RNNs on FPGAs with the highest performance and energy efficiency, subject to an accuracy requirement (negligible accuracy degradation). Experimental results on actual FPGA deployments show that the proposed framework achieves a maximum energy efficiency improvement of 35.7× compared with ESE.

Keywords: Deep Learning, Speech Recognition, Model Compression, Hardware Acceleration, Circulant Matrix, FPGA

TL;DR: We propose block-circulant matrices for weight matrix representation in RNNs, thereby achieving simultaneous model compression and acceleration.
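
To make the idea concrete, the following is a minimal NumPy sketch of a block-circulant matrix-vector product. It is not the authors' FPGA implementation. It assumes each k x k block is a circulant matrix defined by its first column, and that the product is computed with FFTs, which is the standard way to exploit circulant structure. The names `circulant`, `block_circulant_matvec`, and the `(p, q, k)` weight layout are illustrative choices, not taken from the paper.

```python
import numpy as np


def circulant(c):
    """Dense k x k circulant matrix whose first column is c (reference only)."""
    k = len(c)
    # Entry (i, j) is c[(i - j) mod k].
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return c[idx]


def block_circulant_matvec(w, x):
    """Compute y = W x for a (p*k) x (q*k) block-circulant matrix W.

    w has shape (p, q, k); w[i, j] is the first column of block (i, j).
    x has shape (q*k,).
    """
    p, q, k = w.shape
    # Transform each length-k slice of the input and each block's defining vector.
    xf = np.fft.rfft(x.reshape(q, k), axis=1)
    wf = np.fft.rfft(w, axis=2)
    # A circulant product is an elementwise product in the frequency domain;
    # summing over q accumulates the contributions of the column blocks.
    yf = np.einsum("pqf,qf->pf", wf, xf)
    return np.fft.irfft(yf, n=k, axis=1).reshape(p * k)
```

Under these assumptions, the layer stores p*q*k weights instead of p*q*k*k, and each block product costs O(k log k) instead of O(k^2) operations. Because every block has the same regular structure, the computation avoids the irregular indexing that pruning introduces.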
