How Capable Can a Transformer Become? A Study on Synthetic, Interpretable Tasks

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Transformers, Capabilities, Mechanistic interpretability, Synthetic task
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We study the ability of Transformers to systematically compose capabilities on synthetic tasks.
Abstract: Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing simple logical operations. Given the inherent compositional nature of language, one can expect such a model to learn to compose these capabilities, potentially yielding a combinatorial explosion of the operations it can perform on an input. Motivated by this, we ask in this paper: how capable can a Transformer become? Specifically, we train Transformer models on a data-generating process that involves compositions of a set of well-defined, monolithic capabilities. Through a series of extensive and systematic experiments on this data-generating process, we show that: (1) Transformers can learn the compositional structure of the training data and generalize to exponentially, or even combinatorially, many functions; (2) composing functions by generating intermediate outputs generalizes better to unseen compositions than directly generating the final output; (3) the training data has a significant impact on the model's ability to compose unseen combinations of functions; and (4) the attention layers in the latter half of the Transformer appear critical for compositionality.
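To make the described setup concrete, below is a minimal Python sketch of a compositional data-generating process of this kind. The specific capabilities (reverse, shift, inc), token alphabet, and example format are illustrative assumptions, not the paper's exact task; the sketch only shows how a small set of monolithic functions can be composed, and how "step-by-step" supervision (with intermediate outputs) differs from "direct" supervision (final output only).

```python
import random

# Hypothetical monolithic capabilities: simple maps over digit-token sequences.
# These specific functions are illustrative assumptions, not the paper's task set.
CAPABILITIES = {
    "reverse": lambda seq: seq[::-1],                    # reverse the sequence
    "shift":   lambda seq: seq[1:] + seq[:1],            # cyclic left shift
    "inc":     lambda seq: [(t + 1) % 10 for t in seq],  # increment each token mod 10
}

def sample_example(num_steps, seq_len=6, step_by_step=True):
    """Sample one training example for a random composition of capabilities."""
    names = random.choices(list(CAPABILITIES), k=num_steps)
    seq = [random.randrange(10) for _ in range(seq_len)]
    trace = [seq]
    for name in names:
        seq = CAPABILITIES[name](seq)
        trace.append(seq)
    # "Step-by-step" supervision exposes intermediate outputs (finding 2 above);
    # the "direct" format supervises only the final output.
    targets = trace[1:] if step_by_step else [trace[-1]]
    return {"functions": names, "input": trace[0], "targets": targets}

if __name__ == "__main__":
    print(sample_example(num_steps=3))
```

Note that with K monolithic capabilities composed over L steps, there are on the order of K^L distinct composed functions, which is the sense in which generalization to "exponentially or even combinatorially many functions" is measured.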
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8467