Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners

Published: 10 Jul 2024 · Last Modified: 26 Aug 2024 · COLM · CC BY 4.0
Research Area: Science of LMs, Learning algorithms for LMs, Inference algorithms for LMs
Keywords: Encoder-Decoder Model, In-context Learning, Few-shot Learning
TL;DR: Seq2seq models, with objective-aligned prompting and fusion-based methods, exhibit promising few-shot learning capabilities across diverse tasks, surpassing larger decoder-only models.
Abstract: In-context learning, which offers substantial advantages over fine-tuning, is predominantly observed in decoder-only models, while encoder-decoder (i.e., seq2seq) models excel in methods that rely on weight updates. Recently, a few studies have demonstrated the feasibility of few-shot learning with seq2seq models; however, this has been limited to tasks that align well with the seq2seq architecture, such as summarization and translation. Inspired by these initial studies, we present the first extensive experimental comparison of the in-context few-shot learning capabilities of decoder-only and encoder-decoder models on a broad range of tasks. Furthermore, we propose two methods that more effectively elicit the in-context learning ability of seq2seq models: objective-aligned prompting and a fusion-based approach. Remarkably, our approach outperforms a decoder-only model that is six times larger and achieves significant performance improvements over conventional seq2seq models across a variety of settings. We posit that, with the right configuration and prompt design, seq2seq models can be highly effective few-shot learners for a wide spectrum of applications.
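To make the setting concrete, below is a minimal sketch of in-context few-shot prompting with an encoder-decoder model, followed by a fusion-in-decoder style variant that encodes each demonstration separately and concatenates the encoder states before decoding. The checkpoint name ("google/flan-t5-base"), prompt template, verbalizers, and fusion details are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: the model, template, and fusion variant below are
# assumptions for demonstration, not the paper's exact methods.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers.modeling_outputs import BaseModelOutput

model_name = "google/flan-t5-base"  # any seq2seq checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

demonstrations = [
    ("the movie was a delight from start to finish", "positive"),
    ("a dull, lifeless script with no redeeming qualities", "negative"),
]
query = "an unexpectedly moving and well-acted drama"

# (1) Plain concatenation: all demonstrations plus the query go into the
# encoder as one prompt; the decoder generates the label token(s).
prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demonstrations)
prompt += f"Review: {query}\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))

# (2) A fusion-in-decoder style variant: encode each demonstration (and the
# query) separately, concatenate the encoder hidden states along the sequence
# axis, and decode once over the fused representation. This illustrates the
# general idea of fusing demonstrations; the paper's fusion-based approach
# may differ in its details.
contexts = [f"Review: {x}\nSentiment: {y}" for x, y in demonstrations]
contexts.append(f"Review: {query}\nSentiment:")
encoded = [tokenizer(c, return_tensors="pt") for c in contexts]
with torch.no_grad():
    states = torch.cat(
        [model.get_encoder()(**e).last_hidden_state for e in encoded], dim=1
    )
mask = torch.cat([e["attention_mask"] for e in encoded], dim=1)
out = model.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=states),
    attention_mask=mask,
    max_new_tokens=4,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Variant (2) reuses the standard Transformers pattern of passing precomputed `encoder_outputs` to `generate`, so no model surgery is needed; only the way demonstrations enter the encoder changes between the two variants.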
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 376