Abstract: In this paper, we propose a new cross-domain model compression technique that yields a compact target model. We use a Cooperative Context-Aware Pruning (CCAP) module to produce sparse attention maps, which are then used to transfer the source models' parameters to the target model precisely. We also leverage a weight-regular loss to minimize the difference between the source models' and the target model's parameters. Our quantitative empirical evaluation shows that the CCAP module combined with the weight-regular loss achieves lower model complexity without serious performance degradation.
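The abstract does not give the exact form of the weight-regular loss; a common choice for penalizing the gap between two parameter sets is the squared L2 distance. The sketch below is an illustrative assumption, not the paper's actual formulation, and the function name `weight_regular_loss` is hypothetical:

```python
import numpy as np

def weight_regular_loss(source_weights, target_weights):
    """Illustrative weight-regularization loss (assumption): the summed
    squared L2 distance between corresponding source and target
    parameter tensors. The paper's exact loss may differ."""
    return sum(
        float(np.sum((ws - wt) ** 2))
        for ws, wt in zip(source_weights, target_weights)
    )

# Toy example: identical parameters yield zero loss.
ws = [np.ones((2, 2))]
wt = [np.ones((2, 2))]
print(weight_regular_loss(ws, wt))  # → 0.0
```

Minimizing such a term during training pulls the target model's parameters toward the source models', complementing the attention-guided parameter transfer described above.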