EXPLAINABLE AI-BASED DYNAMIC FILTER PRUNING OF CONVOLUTIONAL NEURAL NETWORKS

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 · ICLR 2022 Submission
Keywords: XAI, Pruning, CNN, Explainable-AI
Abstract: Filter pruning is one of the most effective ways to accelerate Convolutional Neural Networks (CNNs). Most existing works focus on static pruning of CNN filters. Existing works on dynamic pruning of CNN filters are based on the idea of switching between different branches of a CNN or exiting early depending on the difficulty of a sample. These approaches can reduce the average inference latency, but they cannot reduce the longest-path inference latency. In contrast, we present a novel dynamic filter pruning approach that utilizes explainable AI together with early coarse prediction in the intermediate layers of a CNN. This coarse prediction is performed by a simple branch trained for top-k classification. The branch either predicts the output class with high confidence, in which case the remaining computations are skipped, or it predicts the output class to lie within a subset of possible output classes. After this coarse prediction, only those filters that are important for this subset of classes are used in further computations. The importance of each filter for each output class is obtained using explainable AI. With this dynamic pruning architecture, we reduce not only the average inference latency but also the longest-path inference latency. Our proposed architecture can be deployed on different hardware platforms. We evaluate our approach on commonly used image classification models and datasets on CPU and GPU platforms and demonstrate speedup without significant overhead.
One-sentence Summary: This paper proposes a dynamic pruning method that utilizes early exit along with early coarse prediction based on explainable AI to reduce both the average and the longest-path inference latency of CNNs.
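The control flow described in the abstract — a lightweight intermediate branch that either exits early with a confident top-1 prediction or restricts the remaining layers to filters important for the top-k candidate classes — can be sketched as follows. All names, the per-class importance-matrix format, and the confidence threshold are illustrative assumptions for the sketch (in PyTorch), not the authors' implementation:

```python
import torch
import torch.nn as nn


class CoarseExitBranch(nn.Module):
    """Hypothetical lightweight branch: makes a coarse class prediction
    from an intermediate feature map via global pooling + one linear layer."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))


def dynamic_forward(features, branch, tail_layers, filter_importance,
                    k=5, exit_threshold=0.9):
    """If the branch is confident, exit early with its logits; otherwise
    run the remaining layers with only the filters important for the
    top-k candidate classes (batch size 1 assumed for clarity).

    filter_importance: one [num_classes, num_filters] score matrix per
    remaining layer, assumed precomputed offline with an XAI method.
    """
    logits = branch(features)
    probs = torch.softmax(logits, dim=1)
    conf, top_classes = probs.topk(k, dim=1)

    if conf[0, 0] >= exit_threshold:
        return logits  # early exit: skip all remaining computation

    x = features
    for layer, imp in zip(tail_layers, filter_importance):
        # union of filters important for any of the k candidate classes
        mask = (imp[top_classes[0]] > 0).any(dim=0).float()
        # zero out unimportant output channels of this layer
        x = layer(x) * mask.view(1, -1, 1, 1)
    return x
```

In a real deployment the mask would be used to skip the pruned filters' computation entirely rather than zeroing their outputs, which is where the latency savings come from; the masking above only illustrates the selection logic.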