Keywords: model compression, hardware-aware
TL;DR: efficient model compression using parameter sharing tuned to underlying hardware and algorithm implementations.
Abstract: Advancements in deep learning are often associated with increasing model sizes.
Training and deploying large models require sophisticated hardware and incur
significantly higher costs. Model compression is therefore a widely explored approach
to this problem. However, state-of-the-art techniques fall short in one or more
desirable aspects of compression: for instance, pruning does not reduce memory
during training, quantization provides at most $32\times$ compression, and HashedNet
is cache-inefficient. This paper proposes a model-agnostic, cache-friendly,
and hardware-aware model compression approach: Random Operation Access
Specific Tile (ROAST) hashing. ROAST collapses the parameters by clubbing them
through a lightweight mapping. While clubbing these parameters, ROAST utilizes
cache hierarchies by aligning the memory access pattern with the parameter access
pattern. ROAST is up to $\sim 25 \times$ faster to train and $\sim 50 \times$ faster to infer than the
popular parameter-sharing method HashedNet. Additionally, ROAST introduces
global weight sharing, which is empirically and theoretically superior to the local
weight sharing in HashedNet and can be of independent interest. With ROAST, we
can efficiently train and deploy models with a much smaller memory footprint
($\sim 10\times$ to $\sim 100\times$ smaller) on text and image classification tasks.
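A minimal sketch of the idea the abstract describes, contrasting per-parameter hashing (HashedNet-style scattered lookups) with tile-wise contiguous lookups into one shared parameter array (the cache-friendly flavor ROAST advocates). This is not the authors' implementation: the names, tile size, and the use of random indices as a stand-in for hash functions are assumptions for illustration only.

```python
# Illustrative sketch only (assumed names and hash stand-ins, not the ROAST code).
import numpy as np

M = 1 << 14              # size of the shared (compressed) parameter array
TILE = 64                # contiguous tile length, e.g. a cache-line multiple
shared = np.random.randn(M).astype(np.float32)

def hashednet_weights(shape, seed=0):
    """Per-parameter hashing: every weight indexes shared[] independently,
    so reads scatter across memory (cache-unfriendly)."""
    rng = np.random.default_rng(seed)            # stand-in for a hash function
    idx = rng.integers(0, M, size=shape)
    return shared[idx]

def tiled_weights(shape, seed=0):
    """Tile-wise lookup: hash once per tile, then read a contiguous block,
    aligning the memory access pattern with the parameter access pattern."""
    n = int(np.prod(shape))
    n_tiles = (n + TILE - 1) // TILE
    rng = np.random.default_rng(seed)            # stand-in for a hash function
    starts = rng.integers(0, M - TILE, size=n_tiles)  # one index per tile
    flat = np.concatenate([shared[s:s + TILE] for s in starts])[:n]
    return flat.reshape(shape)

W_hashed = hashednet_weights((256, 128))  # scattered reads from shared[]
W_tiled = tiled_weights((256, 128))       # contiguous, cache-friendly reads
```

Both variants draw all weights from the same small array, so the memory footprint is set by $M$ rather than the model size; the tiled version simply reduces the number of hash computations and random memory accesses.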
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (i.e., none of the above)
Supplementary Material: zip