Keywords: Model Pruning, Explainable AI (XAI), Object Detection, Deep Neural Networks, Model Compression, SHAP, Layer-wise Pruning, Efficient Inference
TL;DR: We propose an explainability-driven, SHAP-based layer-wise pruning method that achieves a superior accuracy-efficiency trade-off for object detection models compared to traditional magnitude-based techniques.
Abstract: Deep neural networks (DNNs) have achieved remarkable success in object detection tasks, but their increasing complexity poses significant challenges for deployment on resource-constrained platforms. While model compression techniques like pruning have emerged as essential tools, traditional magnitude-based pruning methods do not necessarily align with the true contribution of network components to task-specific performance. In this work, we present a novel explainability-driven layer-wise pruning framework specifically tailored for efficient object detection. Our approach leverages SHAP-based contribution analysis to quantify layer importance through gradient-activation products, providing a data-driven measure of functional contribution rather than relying solely on static weight magnitudes. We conduct comprehensive experiments across diverse object detection architectures, including ResNet-50, MobileNetV2, ShuffleNetV2, Faster R-CNN, RetinaNet, and YOLOv8, evaluating performance on the Microsoft COCO 2017 validation set. Our results demonstrate that SHAP-based pruning consistently identifies a different set of least-important layers than L1-norm methods do, leading to superior accuracy-efficiency trade-offs. Notably, for ShuffleNetV2, our method achieves a 10% increase in inference speed, whereas L1 pruning degrades performance by 13.7%. For RetinaNet, SHAP pruning maintains the baseline mAP exactly (0.151) with negligible impact on inference speed, while L1 pruning sacrifices 1.3% mAP for a 6.2% speed increase. These findings highlight the importance of data-driven layer importance assessment and demonstrate that explainability-guided compression offers new directions for deploying advanced DNN solutions on edge and resource-constrained platforms while preserving both performance and model interpretability.
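To make the gradient-activation scoring described in the abstract concrete, here is a minimal PyTorch sketch that ranks layers by the mean absolute product of their activations and activation gradients over a calibration set. It is an illustration under our own assumptions, not the authors' implementation: the function name `score_layers`, the restriction to `nn.Conv2d` modules, and the classification-style loss in the calibration loop are all hypothetical choices.

```python
# Hypothetical sketch of gradient-activation layer scoring, assuming a
# PyTorch model and a calibration DataLoader. Illustrative only.
import torch
import torch.nn as nn

def score_layers(model: nn.Module, loader, loss_fn, device="cpu"):
    """Rank layers by mean |activation * gradient|, a first-order,
    SHAP-style estimate of each layer's contribution to the loss."""
    model.to(device).eval()
    acts, scores, handles = {}, {}, []

    def save_act(name):
        def hook(_, __, out):
            out.retain_grad()          # keep the activation's gradient
            acts[name] = out
        return hook

    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):   # which layers to score: an assumption
            handles.append(m.register_forward_hook(save_act(name)))

    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        loss = loss_fn(model(x), y)    # classification-style loss: an assumption
        loss.backward()
        for name, a in acts.items():
            if a.grad is not None:
                # Accumulate mean |activation * gradient| per layer.
                scores[name] = scores.get(name, 0.0) + (a * a.grad).abs().mean().item()

    for h in handles:
        h.remove()
    # Lower score = smaller estimated functional contribution.
    return sorted(scores.items(), key=lambda kv: kv[1])
```

Layers at the head of the returned list would be the pruning candidates; in a layer-wise scheme like the one described, one would presumably prune the lowest-scoring layer(s) and re-evaluate, rather than prune in a single pass.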
Primary Area: interpretability and explainable AI
Submission Number: 16518