TL;DR: We argue that Transformers have the potential to achieve artificial general intelligence (AGI).
Abstract: As large language models (LLMs) based on the Transformer architecture continue to achieve impressive performance across diverse tasks, this paper explores whether Transformers can ultimately achieve artificial general intelligence (AGI). We argue that Transformers have significant potential to achieve AGI, supported by the following insights and arguments. (1) A Transformer is expressive enough to simulate a programmable computer equipped with random number generators and, in particular, to execute programs for meta-tasks such as algorithm design. (2) By the Extended Church-Turing thesis, if some realistic intelligent system (say, a human with pencil and paper) achieves AGI, then in principle a single Transformer can replicate this capability. Moreover, we suggest that Transformers are well suited to approximating human intelligence, because they effectively integrate knowledge and functions represented in network form (e.g., pattern recognition) with logical reasoning abilities. (3) We argue that Transformers offer a promising practical approximation of Hutter's AIXI agent, an ideal construction for achieving AGI that is, however, uncomputable.
Primary Area: Model Understanding, Explainability, Interpretability, and Trust
Keywords: Transformer, Turing Machine, Artificial General Intelligence, Extended Church-Turing Thesis, Universal Search, Universal Induction, AIXI
Submission Number: 250