Multitask Prompted Training Enables Zero-Shot Task Generalization

Victor Sanh; Albert Webson; Colin Raffel; Stephen Bach; Lintang Sutawika; Zaid Alyafeai; Antoine Chaffin; Arnaud Stiegler; Arun Raja; Manan Dey; M Saiful Bari; Canwen Xu; Urmish Thakker; Shanya Sharma Sharma; Eliza Szczechla; Taewoon Kim; Gunjan Chhablani; Nihal Nayak; Debajyoti Datta; Jonathan Chang; Mike Tian-Jian Jiang; Han Wang; Matteo Manica; Sheng Shen; Zheng Xin Yong; Harshit Pandey; Rachel Bawden; Thomas Wang; Trishala Neeraj; Jos Rozen; Abheesht Sharma; Andrea Santilli; Thibault Fevry; Jason Alan Fries; Ryan Teehan; Teven Le Scao; Stella Biderman; Leo Gao; Thomas Wolf; Alexander M Rush

Multitask Prompted Training Enables Zero-Shot Task Generalization

Published: 28 Jan 2022, Last Modified: 26 May 2025ICLR 2022 SpotlightReaders: Everyone

Abstract: Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models’ pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several datasets, often outperforming models 16× its size. Further, our model attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models 6× its size. All trained models are available at https://github.com/bigscience-workshop/t-zero, and all prompts are available at https://github.com/bigscience-workshop/promptsource.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 6 code implementations](https://www.catalyzex.com/paper/multitask-prompted-training-enables-zero-shot/code)

16 Replies

Loading