Emergent Abilities of Large Language Models

Jason Wei; Yi Tay; Rishi Bommasani; Colin Raffel; Barret Zoph; Sebastian Borgeaud; Dani Yogatama; Maarten Bosma; Denny Zhou; Donald Metzler; Ed H. Chi; Tatsunori Hashimoto; Oriol Vinyals; Percy Liang; Jeff Dean; William Fedus

Emergent Abilities of Large Language Models

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus

Published: 31 Aug 2022, Last Modified: 17 Sept 2024Accepted by TMLREveryoneRevisionsBibTeXCC BY 4.0

Abstract: Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.

Certifications: Survey Certification

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: Camera ready

Assigned Action Editor: ~Karthik_R_Narasimhan1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 209

Loading