Filter Independence-Aware Pruning: Efficient Neural Networks for On-Device AI

Jiali Wang, Hongxia Bie, Zhao Jing, Yichen Zhi, Yongkai Fan, Wentao Ma

Published: 01 Feb 2026, Last Modified: 07 Apr 2026 · Electronics · CC BY-SA 4.0
Abstract: Filter pruning is an effective approach for improving the inference efficiency of neural networks and is particularly attractive for on-device artificial intelligence (AI) applications. However, many existing methods fail to accurately identify redundant filters due to limited modeling of inter-filter dependencies. A filter pruning method based on nuclear norm analysis is proposed to quantify filter independence and guide structured pruning. By analyzing the layer-wise distribution of independence scores, a principled trade-off between pruning rate and accuracy preservation is achieved. In most evaluation scenarios, the proposed method achieves 75–95% parameter reduction and 70–80% FLOPs reduction, while substantially higher compression ratios (up to 99%) can be obtained for more redundant network architectures, with consistent performance trends observed across multiple accuracy-related metrics. Furthermore, deployment on an RK3588 neural processing unit (NPU) demonstrates substantial reductions in memory consumption and inference latency, confirming the practical effectiveness of the method for mobile and edge AI applications.
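The abstract describes scoring filter independence with the nuclear norm and pruning the least independent filters. The paper does not give the exact formulation here, so the following is only a plausible sketch of one such criterion: flatten each convolutional filter into a row of a matrix, score filter `i` by how much the matrix's nuclear norm drops when row `i` is removed (a small drop suggests the filter is nearly a combination of the others), and prune the lowest-scoring fraction. The function names (`independence_scores`, `prune_mask`) and the leave-one-out scoring rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

def nuclear_norm(mat):
    # Nuclear norm = sum of singular values.
    return np.linalg.svd(mat, compute_uv=False).sum()

def independence_scores(weight):
    # weight: conv kernel of shape (out_channels, in_channels, k, k).
    # Flatten each filter into one row, then score each filter by the
    # drop in nuclear norm when that row is deleted (leave-one-out).
    # NOTE: illustrative criterion, not necessarily the paper's exact one.
    W = weight.reshape(weight.shape[0], -1)
    full = nuclear_norm(W)
    scores = np.empty(W.shape[0])
    for i in range(W.shape[0]):
        scores[i] = full - nuclear_norm(np.delete(W, i, axis=0))
    return scores  # small score -> filter is nearly dependent on the rest

def prune_mask(weight, prune_ratio=0.5):
    # Boolean mask over output channels: True = keep, False = prune.
    scores = independence_scores(weight)
    n_prune = int(len(scores) * prune_ratio)
    keep = np.argsort(scores)[n_prune:]  # drop the least independent filters
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    return mask
```

In a structured-pruning pipeline, the mask would be applied layer by layer, removing the corresponding output channels (and the matching input channels of the next layer), with the per-layer `prune_ratio` chosen from the layer-wise distribution of independence scores as the abstract suggests.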