Abstract: In this paper, we propose a novel, highly parallel deep ensemble learning method that yields highly compact and parallel deep neural networks. The main idea is to \textit{split data into spectral subsets, train subnetworks separately, and ensemble the outputs at the inference stage}. The proposed method consists of parallel branches, each an independent neural network trained on one spectral subset of the training data. It ensembles the outputs of the parallel branches to produce an overall network with substantially stronger generalization capability, and it scales to large-scale datasets under limited memory. The joint data/model parallel scheme is amenable to GPU implementation. Due to the reduced size of the inputs, the proposed spectral tensor network exhibits inherent network compression, which accelerates training. We evaluate the proposed spectral tensor networks on the MNIST, CIFAR-10 and ImageNet datasets, showing that they simultaneously achieve network compression, reduced computation, and parallel speedup. Specifically, on both the ImageNet-1K and ImageNet-21K datasets, our proposed AlexNet-spectral, VGG-16-spectral, ResNet-34-spectral, CycleMLP-spectral and MobileVit-spectral networks achieve performance comparable to the vanilla ones, while enjoying up to a $4\times$ compression ratio and a $1.5\times$ speedup.
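The abstract's "split, train separately, ensemble" pipeline can be illustrated with a minimal sketch. This is not the authors' implementation: the FFT band split, the branch architecture, and all names (spectral_subsets, BranchNet, SpectralEnsemble, num_bands) are illustrative assumptions used only to show how spectral subsets could feed independent branches whose logits are averaged at inference.

```python
# Minimal sketch (not the paper's code): split inputs into frequency bands via a
# 2-D FFT, run one compact branch per band, and average the branch logits.
import torch
import torch.nn as nn


def spectral_subsets(x, num_bands=4):
    """Split a (B, C, H, W) batch into `num_bands` frequency bands (low to high),
    each reconstructed from one ring of 2-D FFT coefficients."""
    X = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    _, _, H, W = x.shape
    yy, xx = torch.meshgrid(
        torch.arange(H, dtype=torch.float32),
        torch.arange(W, dtype=torch.float32),
        indexing="ij",
    )
    radius = torch.sqrt((yy - H // 2) ** 2 + (xx - W // 2) ** 2)
    r_max = radius.max()
    subsets = []
    for b in range(num_bands):
        lo, hi = b * r_max / num_bands, (b + 1) * r_max / num_bands
        mask = ((radius >= lo) & (radius <= hi)).to(x.dtype)  # ring-shaped band mask
        band = torch.fft.ifft2(torch.fft.ifftshift(X * mask, dim=(-2, -1))).real
        subsets.append(band)
    return subsets


class BranchNet(nn.Module):
    """One independent (compact) subnetwork trained on a single spectral subset."""

    def __init__(self, in_ch=1, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))


class SpectralEnsemble(nn.Module):
    """Inference-time ensemble: average the logits of the parallel branches."""

    def __init__(self, branches, num_bands=4):
        super().__init__()
        self.branches = nn.ModuleList(branches)
        self.num_bands = num_bands

    def forward(self, x):
        logits = [net(sub) for net, sub in
                  zip(self.branches, spectral_subsets(x, self.num_bands))]
        return torch.stack(logits, dim=0).mean(dim=0)


if __name__ == "__main__":
    # Each branch would normally be trained separately (and in parallel) on its
    # own spectral subset; here we only check output shapes with random weights.
    x = torch.randn(8, 1, 28, 28)                 # an MNIST-sized batch
    model = SpectralEnsemble([BranchNet() for _ in range(4)], num_bands=4)
    print(model(x).shape)                         # torch.Size([8, 10])
```

Because each branch sees only one spectral subset and never exchanges gradients with the others, the branches can be trained on separate GPUs or in separate processes, which is the data/model parallelism the abstract refers to.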
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip