Large Neural Networks at a Fraction

Published: 03 Nov 2023, Last Modified: 28 Dec 2023, NLDL 2024
Keywords: image classification, lottery ticket hypothesis, neural networks, pruning
TL;DR: Experiments on the performance of quaternion models under extreme pruning, using larger models and larger datasets.
Abstract: Large-scale deep learning models are known for their large number of parameters, which weigh heavily on computational resources. The Lottery Ticket Hypothesis showed the potential of pruning to reduce those parameters without a significant drop in accuracy. Quaternion neural networks achieve accuracy comparable to equivalent real-valued networks on multi-dimensional prediction tasks. In this work, we apply pruning to real- and quaternion-valued implementations of large-scale networks for image recognition. For instance, with the ResNet-101 architecture on the CIFAR-100 and ImageNet64x64 datasets, pruned quaternion models outperform their real-valued counterparts by 4% and 7% in accuracy at sparsities of about 6% and 0.4%, respectively. We also obtain stable lottery tickets for quaternion implementations of ResNet-101 and ResNet-152 on CIFAR-100, whereas the real-valued counterparts fail to train at the same sparsity. Our experiments show that pruned quaternion implementations perform better at higher sparsity than their real-valued counterparts, even in some larger neural networks.
Git: https://github.com/smlab-niser/quatLT23
Submission Number: 49
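For readers who want a feel for the pruning procedure described in the abstract, below is a minimal sketch of lottery-ticket-style iterative magnitude pruning in PyTorch. It is not the paper's implementation (that lives in the Git repository above): the real-valued ResNet-101, the per-round pruning rate, the rewind-to-init strategy, and the `train` stub are illustrative assumptions, and the quaternion-valued layers are omitted.

```python
# Minimal sketch of lottery-ticket-style iterative magnitude pruning in PyTorch.
# NOT the authors' implementation (see the Git link above); the model choice,
# pruning rate, rewind strategy, and the train() stub are illustrative
# assumptions, and the quaternion-valued layers are not reproduced here.
import copy

import torch
import torch.nn.utils.prune as prune
import torchvision


def train(model):
    # Placeholder for a full training run; the paper's experiments train on
    # CIFAR-100 and ImageNet64x64.
    pass


def iterative_magnitude_pruning(rounds=5, amount_per_round=0.2):
    model = torchvision.models.resnet101(num_classes=100)   # real-valued baseline
    initial_state = copy.deepcopy(model.state_dict())       # init kept for rewinding

    # Conv and linear weights are pruned globally by magnitude.
    prunable = [
        (name, module) for name, module in model.named_modules()
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear))
    ]
    params_to_prune = [(module, "weight") for _, module in prunable]

    for _ in range(rounds):
        train(model)
        # Remove the smallest-magnitude surviving weights across all layers.
        prune.global_unstructured(
            params_to_prune,
            pruning_method=prune.L1Unstructured,
            amount=amount_per_round,
        )
        # Rewind surviving weights to their initial values; the masks persist.
        with torch.no_grad():
            for name, module in prunable:
                module.weight_orig.copy_(initial_state[name + ".weight"])

    total = sum(module.weight_mask.numel() for _, module in prunable)
    kept = sum(int(module.weight_mask.sum()) for _, module in prunable)
    print(f"Surviving weights: {kept / total:.2%}")
    return model


if __name__ == "__main__":
    iterative_magnitude_pruning()
```

Pruning 20% of the surviving weights per round for five rounds leaves roughly 0.8^5 ≈ 33% of the weights; reaching the extreme sparsities reported in the abstract would require many more rounds or a higher per-round rate.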