MLPs Learn In-Context on Regression and Classification Tasks

Published: 22 Jan 2025, Last Modified: 25 Feb 2025ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: In-context learning, relational reasoning, synthetic tasks, MLP, MLP-Mixer, Transformer
TL;DR: On a range of widely studied synthetic in-context learning tasks, we find that MLPs perform comparably with Transformers under the same compute budget.
Abstract: In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, is often assumed to be a unique hallmark of Transformer models. By examining commonly employed synthetic ICL tasks, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, MLPs, and the closely related MLP-Mixer models, learn in-context comparably with Transformers under the same compute budget in this setting. We further show that MLPs outperform Transformers on a series of classical tasks from psychology designed to test relational reasoning, which are closely related to in-context classification. These results underscore a need for studying in-context learning beyond attention-based architectures, while also challenging prior arguments against MLPs' ability to solve relational tasks. Altogether, our results highlight the unexpected competence of MLPs in a synthetic setting, and support the growing interest in all-MLP alternatives to Transformer architectures. It remains unclear how MLPs perform against Transformers at scale on real-world tasks, and where a performance gap may originate. We encourage further exploration of these architectures in more complex settings to better understand the potential comparative advantage of attention-based schemes.
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1336
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview