CNN Compression and Search Using Set Transformations with Width Modifiers on Network Architectures

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone
Keywords: cnn, compression, efficient search, sets, embedded systems
TL;DR: Convnet compression that is fast, is not resource hungry, and applies width modifiers with a new twist.
Abstract: We propose a new approach, based on discrete filter pruning, for adapting off-the-shelf models to an embedded environment. Importantly, we circumvent the usually prohibitive costs of model compression. Our method, Structured Coarse Block Pruning (SCBP), prunes whole CNN kernels using width modifiers applied to a novel transformation of convolutional layers into superblocks. SCBP uses set representations to construct a rudimentary search that provides candidate networks. To test our approach, the original ResNet architectures serve as the baseline and also provide the 'seeds' for our candidate search. The search produces a configurable number of compressed (derived) models. These derived models are often ~20% faster and ~50% smaller than their unmodified counterparts. At the expense of accuracy, the size can be reduced and the inference latency lowered even further. The unique SCBP transformations yield many new model variants, each with its own trade-offs, and do not require GPU clusters or human experts for training and design.
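The general idea of applying width modifiers per coarse block and collecting candidate configurations in a set can be sketched as follows. This is only an illustration under stated assumptions, not the SCBP procedure itself: the abstract does not specify how superblocks are formed or how the search is run, so the names `BASE_WIDTHS`, `apply_width_modifiers`, `candidate_widths`, and `conv_params`, the modifier choices, and the one-convolution-per-block parameter count are all hypothetical.

```python
from itertools import product

# Hypothetical baseline: output channels of each coarse block.
# The values match the four ResNet-18 stage widths; treating each
# stage as one block is an assumption made for this sketch.
BASE_WIDTHS = (64, 128, 256, 512)


def apply_width_modifiers(base_widths, modifiers):
    """Scale each block's filter count by its modifier (at least one filter kept)."""
    return tuple(max(1, int(round(w * m))) for w, m in zip(base_widths, modifiers))


def candidate_widths(base_widths, choices=(0.5, 0.75, 1.0)):
    """Enumerate distinct pruned width configurations as a set of tuples."""
    combos = product(choices, repeat=len(base_widths))
    return {apply_width_modifiers(base_widths, c) for c in combos}


def conv_params(widths, in_channels=3, kernel=3):
    """Rough weight count for a plain chain with one k x k conv per block."""
    total, prev = 0, in_channels
    for w in widths:
        total += prev * w * kernel * kernel
        prev = w
    return total


if __name__ == "__main__":
    candidates = candidate_widths(BASE_WIDTHS)
    # Sort candidates from smallest to largest by the rough parameter count.
    for widths in sorted(candidates, key=conv_params)[:3]:
        print(widths, conv_params(widths))
```

Storing configurations in a set removes duplicates that arise when different modifier combinations round to the same filter counts. A real search would also have to measure accuracy and latency for each candidate, which this sketch leaves out.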
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)