SOI: Scaling down computational complexity by estimating partial states of the model

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Convolutional neural networks, Time series data, Inference at the edge, Computational complexity reduction, Causality, Real-Time results
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Consumer electronics used to follow the miniaturization trend described by Moore’s Law. Despite the continuous growth in the processing power of Microcontroller Units (MCUs), the MCUs used in the smallest appliances are still incapable of running even moderately large, state-of-the-art artificial neural networks (ANNs). Deploying ANNs on this class of devices becomes even more challenging when they must operate in a time-sensitive manner, as the model’s inference cannot be distributed over time. In this work, we present a novel method called Scattered Online Inference (SOI) that reduces the computational complexity of ANNs. SOI builds on the premise that time-series data is continuous and/or seasonal, and so are the model’s predictions; this extrapolation yields processing-speed improvements, especially in the deeper layers of the model. Applying strides forces the ANN to produce more general inner partial states, as these are based on a larger number of input samples that lie further apart from one another. As a result, SOI skips full model recalculation at each inference by performing only the strictly necessary operations. We present two inference patterns achievable with SOI: Partially Predictive (PP) and Fully Predictive (FP). For the audio separation task, the PP variant achieves a 64.4% reduction in computational complexity at the cost of a 9.8% drop in SI-SNRi, while the FP variant achieves a 41.9% reduction at the cost of a 7.7% drop in SI-SNRi; moreover, the FP variant reduces inference time by an additional 28.7%. We also present similar results for the acoustic scene classification task with a model based on the GhostNet architecture.
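The core idea of the abstract — that a strided causal layer only needs to produce a new output every few input frames, so most inference steps can skip recomputation of deeper layers — can be illustrated with a minimal sketch. This is a hypothetical toy implementation for a single 1-D causal layer (not the authors' code): the layer caches a short input history as its partial state and emits an output only once per `stride` frames, matching the strided subsampling of an offline causal convolution.

```python
import numpy as np

def causal_conv1d(x, w):
    """Offline causal 1-D convolution: output[t] depends only on
    x[t-k+1 : t+1], with zero padding on the left."""
    k = len(w)
    xp = np.concatenate([np.zeros(k - 1), x])
    return np.array([xp[t:t + k] @ w for t in range(len(x))])

class StridedStreamingLayer:
    """Toy streaming causal conv with stride (illustrative sketch only).

    Keeps the last k input frames as a cached partial state and
    computes an output only every `stride` frames, so anything
    downstream of this layer runs at 1/stride of the input rate.
    """
    def __init__(self, w, stride):
        self.w = np.asarray(w, dtype=float)
        self.stride = stride
        self.buf = np.zeros(len(w))   # cached partial state (input history)
        self.count = 0

    def push(self, frame):
        # Shift the history buffer and append the newest frame.
        self.buf = np.roll(self.buf, -1)
        self.buf[-1] = frame
        self.count += 1
        if self.count % self.stride == 0:
            return float(self.buf @ self.w)   # compute only when due
        return None                           # skip recomputation this step
```

Streaming outputs then coincide with every `stride`-th sample of the offline causal convolution, while roughly `(stride - 1) / stride` of the per-frame work in this layer (and everything after it) is skipped.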
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7827