Model Compression for IoT Applications in Industry 4.0 via Multiscale Knowledge Transfer

Shipeng Fu, Zhen Li, Kai Liu, Sadia Din, Muhammad Imran, Xiaomin Yang

Published: 01 Jan 2020, Last Modified: 15 May 2023 | IEEE Trans. Ind. Informatics 2020 | Readers: Everyone

Abstract: Industry 4.0 has recently attracted much attention and is closely related to the Internet of Things (IoT). Meanwhile, convolutional neural networks (CNNs) have shown promising performance in many foundational services of IoT applications. IoT applications with high-speed data streams and time-sensitive actions demand fast processing on small-scale platforms or even on the IoT devices themselves. It is therefore inappropriate to employ cumbersome CNNs in IoT applications, which makes the study of model compression necessary. In knowledge transfer, it is common to employ a deep, well-trained network, called the teacher, to guide a shallow, untrained network, called the student, toward better performance. Previous works have made many attempts to transfer single-scale knowledge from teacher to student, which degrades generalization ability. In this article, we introduce multiscale representations to knowledge transfer, which improves the generalization ability of the student. We divide the student and the teacher into several stages. The student learns from multiscale knowledge provided by the teacher at the end of each stage. Extensive experiments demonstrate the effectiveness of our proposed method on both image classification and single image super-resolution. The large performance gap between student and teacher is significantly narrowed by our proposed method, making the student suitable for IoT applications.
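
The stage-wise idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual multiscale representation or loss, so everything below is an assumption made for illustration: average pooling at a few window sizes stands in for "multiscale", mean squared error stands in for the transfer loss, and the names `avg_pool`, `mse`, and `multiscale_stage_loss` are hypothetical. Feature maps are plain 2-D lists, and the student and teacher maps at each stage are assumed to have the same shape.

```python
# Illustrative sketch only: the pooling scales, the MSE loss, and all names
# below are assumptions, not the authors' published formulation.

def avg_pool(fmap, k):
    """Average-pool a 2-D feature map (list of rows) with a k x k window, stride k."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w - k + 1, k)
        ]
        for i in range(0, h - k + 1, k)
    ]


def mse(a, b):
    """Mean squared error between two equally sized 2-D maps."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def multiscale_stage_loss(student_stages, teacher_stages, scales=(1, 2, 4)):
    """Sum, over stages and scales, the MSE between pooled student and teacher maps.

    Each element of student_stages / teacher_stages is the feature map taken
    at the end of one stage (assumed to share a shape across the two networks).
    """
    total = 0.0
    for s_map, t_map in zip(student_stages, teacher_stages):
        for k in scales:
            if k > len(s_map) or k > len(s_map[0]):
                continue  # skip scales larger than the feature map
            total += mse(avg_pool(s_map, k), avg_pool(t_map, k))
    return total


if __name__ == "__main__":
    teacher = [[[1.0, 2.0], [3.0, 4.0]]]  # one stage, one 2x2 map
    student = [[[0.0, 0.0], [0.0, 0.0]]]
    print(multiscale_stage_loss(student, teacher))
```

In a real setting the student and teacher usually differ in channel count, so some adapter layer would be needed before the maps can be compared; that step is omitted here, and the resulting term would typically be added to the ordinary task loss during student training.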
