Merging Well-Trained Deep CNN Models for Efficient Inference

APSIPA 2020
Abstract: In signal processing applications, more than one task often has to be integrated into a single system, so deep learning models (such as convolutional neural networks) serving different purposes must be executed simultaneously. When multiple well-trained models are deployed to an application system, running them all at once is inefficient because of their collective computational load. Merging the models into a more compact one is therefore often required so that they can be executed efficiently on resource-limited devices. We introduce an approach that fuses two or more well-trained deep neural-network models into a single condensed model for the inference stage. The proposed approach consists of three phases: Filter Alignment, Shared-weight Initialization, and Model Calibration. It merges well-trained feed-forward neural networks of the same architecture into a single network, reducing online storage and inference time. Experimental results show that our approach improves the run-time memory compression ratio and increases execution speed.
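The abstract names the three phases but does not spell out their mechanics. The sketch below is an assumption-laden illustration, not the authors' implementation: it approximates Filter Alignment for one convolutional layer by Hungarian matching on cosine similarity between filters, and Shared-weight Initialization by averaging the matched filters; the Model Calibration (fine-tuning) phase is omitted. The function names align_filters and merge_layer are hypothetical.

# Sketch only: one possible reading of Filter Alignment + Shared-weight
# Initialization for a single conv layer of two same-architecture models.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_filters(w_a: np.ndarray, w_b: np.ndarray) -> np.ndarray:
    """Permutation of model B's filters that best matches model A's.

    w_a, w_b: conv weights of shape (out_channels, in_channels, k, k).
    """
    flat_a = w_a.reshape(w_a.shape[0], -1)
    flat_b = w_b.reshape(w_b.shape[0], -1)
    # Cosine similarity between every pair of filters.
    norm_a = flat_a / (np.linalg.norm(flat_a, axis=1, keepdims=True) + 1e-8)
    norm_b = flat_b / (np.linalg.norm(flat_b, axis=1, keepdims=True) + 1e-8)
    similarity = norm_a @ norm_b.T
    # Hungarian matching maximizes total similarity (minimize the negative).
    _, col_idx = linear_sum_assignment(-similarity)
    return col_idx  # col_idx[i] pairs filter i of A with a filter of B

def merge_layer(w_a: np.ndarray, w_b: np.ndarray) -> np.ndarray:
    """Shared-weight initialization: average filters after alignment."""
    perm = align_filters(w_a, w_b)
    return 0.5 * (w_a + w_b[perm])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_a = rng.normal(size=(16, 3, 3, 3))                      # layer from model A
    w_b = w_a[rng.permutation(16)] + 0.01 * rng.normal(size=(16, 3, 3, 3))
    w_shared = merge_layer(w_a, w_b)                          # start of calibration
    print(w_shared.shape)                                     # (16, 3, 3, 3)

In a full network, permuting the output filters of one layer also requires permuting the corresponding input channels of the next layer, and the merged weights would then be refined in the calibration phase; those steps are left out of this single-layer sketch.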