Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Neural networks, Local error signals, BP-free learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a BP-free architecture that can be embedded in ResNet and is trained with local error signals; it achieves significantly lower error rates than previous BP-free algorithms on MNIST, CIFAR-10, and ImageNet, and even surpasses BP-enabled models.
Abstract: The collective behavior of a network with heterogeneous, resource-limited information processing units (e.g., group of fish, flock of birds, or network of neurons) demonstrates high self-organization and complexity. These emergent properties arise from simple interaction rules where certain individuals can exhibit leadership-like behavior and influence the collective activity of the group.
Inspired by these natural collective ensembles, we introduce a \textit{worker} concept to artificial neural networks (NNs).
This NN structure contains workers that encompass one or more information processing units (e.g., neurons, filters, layers, or blocks of layers). Workers are either leaders or followers, and we train a leader-follower neural network (LFNN) by leveraging local error signals.
LFNN does not require backpropagation (BP) or a global loss to achieve the best performance (we denote LFNN trained without BP as LFNN-$\ell$).
We investigate worker behavior and evaluate LFNN and LFNN-$\ell$ through extensive experimentation.
On small datasets such as MNIST and CIFAR-10, LFNN-$\ell$, trained with local error signals, achieves lower error rates than previous BP-free algorithms and even surpasses BP-enabled baselines.
On ImageNet, LFNN-$\ell$ demonstrates superior scalability. It achieves higher accuracy than previous BP-free algorithms by a significant margin.
Furthermore, LFNN-$\ell$ can be conveniently embedded in classic convolutional NNs such as VGG and ResNet architectures.
Our experimental results show that LFNN-$\ell$ achieves up to a 2x speedup compared to BP, and significantly outperforms models trained with end-to-end BP and other state-of-the-art BP-free methods in terms of accuracy on CIFAR-10, Tiny-ImageNet, and ImageNet.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4542