Advancing On-Device Neural Network Training with TinyPropv2: Dynamic, Sparse, and Efficient Backpropagation

Published: 01 Jan 2024, Last Modified: 23 Feb 2025 · IJCNN 2024 · CC BY-SA 4.0
Abstract: This study introduces TinyPropv2, an algorithm optimized for on-device learning in deep neural networks, specifically designed for low-power microcontroller units. TinyPropv2 refines sparse backpropagation by dynamically adjusting the level of sparsity, including the ability to selectively skip training steps. This feature significantly lowers computational effort without substantially compromising accuracy. Our comprehensive evaluation across diverse datasets (CIFAR-10, CIFAR-100, Flower, Food, Speech Commands, MNIST, HAR, and DCASE2020) shows that TinyPropv2 achieves near-parity with full training, with an average accuracy drop of only about 1% in most cases: for example, 0.82% on CIFAR-10 and 1.07% on CIFAR-100. In terms of computational effort, TinyPropv2 requires as little as 10% of the effort needed for full training in some scenarios and consistently outperforms other sparse training methods. These findings underscore TinyPropv2's capacity to manage computational resources efficiently while maintaining high accuracy, positioning it as an advantageous solution for advanced embedded applications in the IoT ecosystem.
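The abstract does not specify how the sparsity level is chosen or when a step is skipped, but the general mechanism (propagating only the largest error components and occasionally skipping the backward pass entirely) can be sketched. The snippet below is a minimal illustration for a single linear layer; the names `skip_threshold` and `base_ratio`, and the rules tying sparsity to the current loss and skipping to the mean error magnitude, are illustrative assumptions, not the paper's actual criteria.

```python
import numpy as np

def sparse_backward_step(x, w, y_true, lr=0.01,
                         skip_threshold=0.05, base_ratio=0.1):
    """One training step with dynamic sparse backprop (illustrative sketch)."""
    # Forward pass through one linear layer.
    y = x @ w
    err = y - y_true                      # output error signal
    loss = 0.5 * np.mean(err ** 2)

    # Hypothetical skip rule: if the mean error magnitude is small,
    # skip the backward pass entirely for this step.
    if np.mean(np.abs(err)) < skip_threshold:
        return w, loss, 0                 # no gradient computed

    # Hypothetical dynamic sparsity: propagate a larger fraction of the
    # error when the loss is high, a smaller fraction when it is low.
    ratio = min(1.0, base_ratio * (1.0 + loss))
    k = max(1, int(ratio * err.size))

    # Keep only the k largest-magnitude error elements (top-k selection).
    flat = np.abs(err).ravel()
    idx = np.argpartition(flat, -k)[-k:]
    mask = np.zeros_like(flat)
    mask[idx] = 1.0
    sparse_err = err * mask.reshape(err.shape)

    # Weight gradient is computed from the sparsified error only,
    # so the cost of the backward pass scales with k.
    grad_w = x.T @ sparse_err / x.shape[0]
    return w - lr * grad_w, loss, k

# Usage: train on a toy regression problem.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))
w = rng.normal(size=(8, 4)) * 0.1
y_true = rng.normal(size=(32, 4))
for step in range(100):
    w, loss, k = sparse_backward_step(x, w, y_true)
```

In this sketch the computational saving comes from two sources that mirror the abstract's claims: steps whose backward pass is skipped cost only a forward pass, and the remaining steps update weights from a top-k subset of the error signal rather than the full gradient.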