Hardware Execution Time Prediction for Neural Network Layers

Adrian Osterwind, Julian Droste-Rehling, Manoj Rohit Vemparala, Domenik Helms

Published: 01 Jan 2022, Last Modified: 06 Nov 2023PKDD/ECML Workshops (1) 2022Readers: Everyone

Abstract: We present an estimation methodology, accurately predicting the execution time for a given embedded Artificial Intelligence (AI) accelerator and a neural network (NN) under analysis. The timing prediction is implemented as a python library called (MONNET) and is able to perform its predictions analyzing the Keras description of an NN under test within milliseconds. This enables several techniques to design NNs for embedded hardware. Designers can avoid training networks which could be functionally sufficient but will likely fail the timing requirements. The technique can also be included into automated network architecture search algorithms, enabling exact hardware execution times to become one contributor to the search’s target function. In order to perform precise estimations for a target hardware, each new hardware needs to undergo an initial automatic characterization process, using tens of thousands of different small NNs. This process may need several days, depending on the hardware. We tested our methodology for the Intel Neural Compute Stick 2, where we could achieve an (RMSPE) below 21% for a large range of industry relevant NNs from vision processing.

0 Replies