An old dog can learn (some) new tricks: A tale of a three-decade-old architecture

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Neural network architecture, network design
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We attempt to rejuvenate a 30-year-old architecture, improving its performance to bring it closer to that of modern deep neural networks
Abstract: Designing novel architectures often involves combining or extending familiar components such as convolutions and attention modules. However, this approach can obscure the fundamental design principles, since the focus usually falls on the architecture as a whole. Instead, this paper takes an unconventional approach and attempts to rejuvenate an old architecture with modern tools and techniques. Our primary objective is to explore whether a 30-year-old architecture can compete with contemporary models when suitably modernized. Through experiments spanning several image recognition datasets, we aim to understand which aspects of the architecture contribute to its performance. We find that although a combination of ingredients matters for achieving strong performance, only a few pivotal components have a large impact. We contend that our findings offer valuable insights for designing cutting-edge architectures.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7045