Approximate LSTMs for Time-Constrained Inference: Enabling Fast Reaction in Self-Driving Cars

Alexandros Kouris, Stylianos I. Venieris, Michail Rizakis, Christos-Savvas Bouganis

2020 (modified: 16 Jul 2021) | IEEE Consumer Electron. Mag. 2020 | Readers: Everyone

Abstract: The need to recognize long-term dependencies in sequential data, such as video streams, has made long short-term memory (LSTM) networks a prominent artificial intelligence model for many emerging applications. However, the high computational and memory demands of LSTMs make them challenging to deploy on latency-critical systems such as self-driving cars, which carry limited onboard computational resources. In this article, we introduce a progressive inference computing scheme that combines model pruning and computation restructuring to produce the best possible approximation of the result within the latency budget of the target application. The proposed methodology enables mission-critical systems to make informed decisions even at early stages of the computation, based on approximate LSTM inference, while meeting their safety and robustness specifications. Our experiments on a state-of-the-art driving model for autonomous vehicle navigation demonstrate that the proposed approach yields outputs of similar quality to a faithful LSTM baseline while running up to 415× faster (198× on average, 76× geometric mean).
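
The idea of progressive inference under a latency budget can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes, purely for illustration, that a single matrix-vector product (the dominant operation in an LSTM gate) is restructured so that the most important weight columns are processed first, and that the budget is expressed as a number of refinement steps rather than wall-clock time. The function name `progressive_matvec`, the column-norm importance score, and the `chunk` parameter are all hypothetical.

```python
import math


def progressive_matvec(W, x, budget_steps, chunk=2):
    """Yield successively refined approximations of the product W @ x.

    W is a list of rows, x is a list of floats. Each step adds the
    contribution of `chunk` more weight columns, so a caller that runs
    out of budget can still use the latest (approximate) result.
    """
    n_rows, n_cols = len(W), len(x)

    # Hypothetical importance score: L2 norm of each weight column.
    def column_norm(j):
        return math.sqrt(sum(W[i][j] ** 2 for i in range(n_rows)))

    # Restructure the computation: most important columns come first.
    order = sorted(range(n_cols), key=column_norm, reverse=True)

    y = [0.0] * n_rows
    for step, start in enumerate(range(0, n_cols, chunk)):
        if step >= budget_steps:
            break  # budget exhausted: remaining columns are effectively pruned
        for j in order[start:start + chunk]:
            for i in range(n_rows):
                y[i] += W[i][j] * x[j]
        yield list(y)  # snapshot of the current approximation


W = [[1.0, 0.1, 3.0],
     [0.5, 0.2, -2.0]]
x = [1.0, 2.0, 1.0]

# With a budget of one step, only the two largest-norm columns are used.
early = list(progressive_matvec(W, x, budget_steps=1))
# With enough budget, the final snapshot matches the exact product.
full = list(progressive_matvec(W, x, budget_steps=10))
```

In this sketch a small budget behaves like aggressive pruning, while a larger budget recovers the exact result; a real system would replace the step counter with a deadline check against the application's latency budget.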
