Efficient and Quantization-Friendly Ternary Fourier Convolution Algorithms

21 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Deep Learning; Fast Convolution Algorithm; CNN; Winograd; Fourier Transform
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Fast convolution algorithms such as Winograd and the Fourier transform are well known for substantially reducing the multiplication complexity of Convolutional Neural Networks. However, when these methods are combined with model quantization, their inherently complex transformation matrices can introduce significant numerical errors, degrading network accuracy. To address this challenge, we present a novel fast convolution algorithm that uses ternary matrices (with coefficients restricted to ±1 and 0) for the input and weight transformations preceding multiplication, thereby minimizing quantization errors. The algorithm is derived by applying symbolic arithmetic to the Fourier transform so that irrational numbers are eliminated from the transformations. We then incorporate correction terms that convert otherwise invalid circular convolution outputs into valid ones, further improving the algorithm's efficiency. Additionally, we propose a corresponding post-training quantization method that requires only a few samples for calibrating network parameters and restoring accuracy, avoiding the heavy cost of retraining. Our algorithms achieve 3.68x, 4.89x, and 4.54x theoretical multiplication complexity reductions for 3x3, 5x5, and 7x7 convolutions, respectively. For models trained on the ImageNet dataset, our algorithms combined with the post-training method demonstrate an accuracy drop of less than 0.2% under Int8 quantization, surpassing other approaches with similar multiplication reduction ratios.
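
For orientation, the identity underlying this family of methods is the convolution theorem: circular convolution becomes a pointwise product after a transform. The sketch below is a minimal illustration of that pattern using the standard complex DFT via numpy; it is not the paper's algorithm. The paper's ternary transform matrices and its correction terms are not given in this abstract, so they are not reproduced here, but the comments mark where they would act (the wrapped circular outputs are exactly the "invalid" results the correction terms recover).

import numpy as np

def circular_conv_direct(x, w):
    # Reference definition: (x circ w)[i] = sum_j x[(i - j) mod n] * w[j]
    n = len(x)
    return np.array([sum(x[(i - j) % n] * w[j] for j in range(n))
                     for i in range(n)])

def circular_conv_fourier(x, w):
    # Transform both operands, multiply elementwise, transform back.
    # The paper replaces these complex DFT transforms with ternary
    # (0, +1, -1) matrices to make the transforms quantization-friendly.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w)))

n, k = 6, 3
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
w = np.zeros(n)
w[:k] = [1.0, -2.0, 1.0]  # k-tap filter, zero-padded to length n

# The two computations agree (convolution theorem).
assert np.allclose(circular_conv_fourier(x, w), circular_conv_direct(x, w))

# Of the n circular outputs, only the last n - k + 1 equal the valid
# linear-convolution outputs; the first k - 1 are wrapped (aliased).
# Those wrapped entries are the otherwise-invalid results that the
# paper's correction terms turn into usable outputs.

The design point this illustrates is the trade the abstract describes: the DFT makes every circular output valid but requires irrational (complex) transform coefficients, whereas simple rational transforms keep arithmetic exact under quantization at the cost of some invalid outputs, which correction terms then recover.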
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3779