An old dog can learn (some) new tricks: A tale of a three-decade-old architecture

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Neural network architecture, network design
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We attempt to rejuvenate a 30-year-old architecture, improving its performance to bring it closer to that of modern deep neural networks
Abstract: Designing novel architectures often involves combining or extending familiar components such as convolutions and attention modules. However, this approach can obscure the fundamental design principles, since the focus usually falls on the architecture as a whole. Instead, this paper takes an unconventional approach and attempts to rejuvenate an old architecture with modern tools and techniques. Our primary objective is to explore whether a 30-year-old architecture can compete with contemporary models when suitably modernized. Through experiments spanning several image recognition datasets, we aim to understand which aspects of the architecture contribute to its performance. We find that although a combination of ingredients matters for achieving strong performance, only a few pivotal components have a large impact. We contend that our findings offer valuable insights for designing cutting-edge architectures.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7045