Faster and Accurate Neural Networks with Semantic Inference

22 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Semantic Inference, Semantic Pruning, Deep Learning, Efficient Neural Networks
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a class-semantics-based approach for fast, accurate, and flexible inference for neural networks.
Abstract: Deep neural networks (DNNs) usually come with a significant computational and data-labeling burden. While approaches such as structured pruning and mobile-specific DNNs have been proposed, they incur drastic accuracy loss. In contrast to prior work, in this paper we leverage the intrinsic redundancy in latent representations to drastically reduce the computational load with very limited loss in performance. Specifically, we show that semantically similar inputs share a significant number of filter activations, especially in the earlier layers. As such, semantically similar classes can be “clustered” so as to create cluster-specific subgraphs. These may be “turned on” when an input belonging to a semantic cluster is presented to the DNN, while the rest of the DNN can be “turned off”. To this end, we propose a new framework called Semantic Inference (SINF). In short, SINF (i) identifies the semantic cluster the object belongs to using a small additional classifier, and then (ii) executes the subgraph extracted from the base DNN for that semantic cluster to perform the inference. To extract each cluster-specific subgraph, we propose a new metric named Discriminative Capability Score (DCS) that effectively finds the subgraph best able to discriminate among the members of a specific semantic cluster. Importantly, DCS is independent of SINF, as it is a general-purpose quantity that can be applied to any DNN. We benchmark DCS on the VGG16, VGG19, and ResNet50 DNNs trained on the CIFAR100 dataset against six state-of-the-art pruning approaches. Our results show that (i) SINF reduces the inference time of VGG19, VGG16, and ResNet50 by up to 35%, 29%, and 15%, respectively, with only 0.17%, 3.75%, and 6.75% accuracy loss; (ii) DCS achieves up to 3.65%, 4.25%, and 2.36% better accuracy with VGG16, VGG19, and ResNet50, respectively, than existing discriminative scores; (iii) when used as a pruning criterion, DCS achieves up to 8.13% accuracy gain with 5.82% fewer parameters than state-of-the-art work published at ICLR 2023; (iv) in terms of per-cluster accuracy, SINF performs on average 5.73%, 8.38%, and 6.36% better than the base VGG16, VGG19, and ResNet50. We share our code for reproducibility.
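A minimal sketch of the two-step inference flow described in the abstract, written in a PyTorch style. The `SINF` wrapper, the `tiny_net` helper, and the 4-cluster split are illustrative assumptions, not the authors' implementation; the subgraph modules stand in for DCS-extracted portions of a trained base DNN.

```python
import torch
import torch.nn as nn

class SINF(nn.Module):
    """Sketch of Semantic Inference: (i) a small classifier predicts the
    semantic cluster of each input, then (ii) only the cluster-specific
    subgraph of the base DNN is executed to produce the prediction."""

    def __init__(self, cluster_classifier: nn.Module, subgraphs: list[nn.Module]):
        super().__init__()
        self.cluster_classifier = cluster_classifier
        self.subgraphs = nn.ModuleList(subgraphs)

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Step (i): identify each input's semantic cluster.
        cluster_ids = self.cluster_classifier(x).argmax(dim=1)
        # Step (ii): run only the matching subgraph; the rest of the
        # base DNN stays "turned off" for that input.
        outputs = [self.subgraphs[int(c)](xi.unsqueeze(0))
                   for xi, c in zip(x, cluster_ids)]
        return torch.cat(outputs, dim=0)

# Hypothetical toy instantiation: 4 semantic clusters of 25 classes each
# (a CIFAR100-like split). Real subgraphs would instead be extracted from
# the base DNN by keeping the filters ranked highest by DCS.
def tiny_net(num_out: int) -> nn.Module:
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(8, num_out))

model = SINF(cluster_classifier=tiny_net(4),
             subgraphs=[tiny_net(25) for _ in range(4)])
logits = model(torch.randn(2, 3, 32, 32))  # per-cluster logits, shape (2, 25)
```

Under these assumptions, the compute saving comes from the fact that each cluster-specific subgraph is substantially smaller than the base DNN, while the added cluster classifier is kept small enough not to offset that gain.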
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6252