HADS: Hardware-Aware Deep Subnetworks

Published: 05 Mar 2024, Last Modified: 12 May 2024, PML4LRS Poster, CC BY 4.0
Keywords: Embedded Deep Neural Networks, Model Pruning, Model Dynamic Adaptation
TL;DR: We propose Hardware-Aware Deep Subnetworks (HADS) to tackle model adaptation to dynamic resource constraints. In contrast to the state-of-the-art, HADS use structured sparsity constructively by exploiting permutation invariance of neurons.
Abstract: We propose Hardware-Aware Deep Subnetworks (HADS) to tackle model adaptation to dynamic resource constraints. In contrast to the state-of-the-art, HADS use structured sparsity constructively by exploiting permutation invariance of neurons, which allows for hardware-specific optimizations. HADS achieve computational efficiency by skipping sequential computational blocks identified by a novel iterative knapsack optimizer. HADS support conventional deep networks frequently deployed on low-resource edge devices and provide computational benefits even for small and simple networks. We evaluate HADS on six benchmark architectures trained on the Google Speech Commands, Fashion-MNIST, and CIFAR10 datasets, and test on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence for HADS' outstanding performance in terms of submodels' test set accuracy, and demonstrate an adaptation time of under 40$\mu$s in response to dynamic resource constraints, using a 2-layer fully-connected network on an Arduino Nano 33 BLE Sense.
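The following is a minimal, illustrative sketch (not the authors' released implementation) of the two ideas named in the abstract: permutation invariance of neurons, which lets important neurons be packed into a contiguous prefix that a submodel keeps, and a 0/1 knapsack over per-block costs and utilities to choose which computational blocks a submodel executes under a resource budget. The helper names `reorder_by_importance` and `knapsack_select`, the L2-norm importance score, and all sizes are assumptions made for the example.

```python
# Hypothetical sketch of HADS-style submodel construction, assuming a
# 2-layer fully-connected network and L2-norm neuron importance.
import numpy as np

def reorder_by_importance(W1, b1, W2):
    """Permute hidden neurons (rows of W1) by descending L2 norm.

    Applying the same permutation to the columns of W2 leaves the network's
    function unchanged (permutation invariance), so the most important
    neurons end up in a contiguous prefix that a submodel can keep.
    """
    order = np.argsort(-np.linalg.norm(W1, axis=1))
    return W1[order], b1[order], W2[:, order]

def knapsack_select(costs, utilities, budget):
    """Standard dynamic-programming 0/1 knapsack over computational blocks."""
    n = len(costs)
    dp = np.zeros((n + 1, budget + 1))
    for i in range(1, n + 1):
        for c in range(budget + 1):
            dp[i, c] = dp[i - 1, c]
            if costs[i - 1] <= c:
                dp[i, c] = max(dp[i, c],
                               dp[i - 1, c - costs[i - 1]] + utilities[i - 1])
    # Backtrack to recover which blocks are kept.
    keep, c = [], budget
    for i in range(n, 0, -1):
        if dp[i, c] != dp[i - 1, c]:
            keep.append(i - 1)
            c -= costs[i - 1]
    return sorted(keep)

# Toy example with made-up layer sizes, block costs, and utilities.
rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(16, 8)), rng.normal(size=16), rng.normal(size=(4, 16))
W1, b1, W2 = reorder_by_importance(W1, b1, W2)

# Submodel = the blocks selected by the knapsack under the current budget.
blocks = knapsack_select(costs=[4, 4, 4, 4], utilities=[9, 6, 3, 1], budget=8)
print("blocks kept:", blocks)  # e.g. [0, 1] -> first 8 of 16 hidden neurons
```

Because the neurons are already sorted by importance, switching to a smaller submodel at run time amounts to changing which contiguous blocks are executed, which is consistent with the microsecond-scale adaptation times reported in the abstract.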
Submission Number: 52