Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation

Xuanli He; Islam Nassar; Jamie Ryan Kiros; Gholamreza Haffari; Mohammad Norouzi

Published: 28 Jan 2022, Last Modified: 22 Jun 2025, ICLR 2022 Submitted, Readers: Everyone

Keywords: deep generative models, semi-supervised learning, knowledge distillation, large language models

Abstract: Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often relies on the availability of task-specific unlabeled data. Knowledge distillation (KD) has enabled compressing deep networks, achieving the best results when distilling knowledge on fresh task-specific unlabeled examples. However, task-specific unlabeled data can be challenging to find, especially for NLP problems. We present a simple framework called "generate, annotate, and learn (GAL)" that uses unconditional language models to synthesize in-domain unlabeled data, helping advance SSL and KD on NLP and tabular tasks. To obtain strong task-specific generative models, we either fine-tune a large language model (LLM) on inputs from specific tasks, or prompt an LLM with a few input examples to generate more unlabeled examples. Then, we use existing classifiers to annotate generated unlabeled examples with pseudo labels, which are used as additional training data or as additional prompts. GAL improves prompt-based few-shot learning on several NLP tasks. It also yields a new state-of-the-art for 6-layer transformers on the GLUE leaderboard. Finally, self-training with GAL offers large gains on four tabular tasks from the UCI repository.
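
The three steps named in the abstract (generate, annotate, learn) can be sketched in a few lines. This is only a toy illustration, not the authors' implementation: a unigram sampler stands in for the fine-tuned or prompted LLM, a word-count classifier stands in for the existing classifier, and every function name here (`fit_unigram_generator`, `train_classifier`, `gal`) is hypothetical.

```python
import random
from collections import Counter


def fit_unigram_generator(texts):
    """Toy stand-in for a task-specific LM: a unigram model over task inputs."""
    counts = Counter(word for t in texts for word in t.split())
    vocab = sorted(counts)
    weights = [counts[w] for w in vocab]

    def generate(rng, length=4):
        # Sample words independently, in proportion to their training frequency.
        return " ".join(rng.choices(vocab, weights=weights, k=length))

    return generate


def train_classifier(examples):
    """Toy classifier: score each label by how often it co-occurred with the words."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    labels = sorted(counts)

    def predict(text):
        scores = {y: sum(counts[y][w] for w in text.split()) for y in labels}
        return max(labels, key=lambda y: scores[y])

    return predict


def gal(labeled, num_synthetic=20, seed=0):
    """Generate synthetic inputs, annotate them with a teacher, train a student."""
    rng = random.Random(seed)
    # 1. Generate: fit the generator on task inputs only (labels are ignored).
    generate = fit_unigram_generator([text for text, _ in labeled])
    synthetic = [generate(rng) for _ in range(num_synthetic)]
    # 2. Annotate: an existing classifier assigns pseudo labels.
    teacher = train_classifier(labeled)
    pseudo = [(text, teacher(text)) for text in synthetic]
    # 3. Learn: train the student on labeled plus pseudo-labeled data.
    student = train_classifier(labeled + pseudo)
    return student, pseudo


labeled = [("great fun movie", "pos"), ("boring dull movie", "neg")]
student, pseudo = gal(labeled)
```

In this sketch the generator never sees labels, which mirrors the abstract's use of unconditional language models; the labels come only from the annotating classifier. In a KD setting the student would be a smaller model than the teacher, which the toy classifier does not capture.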

One-sentence Summary: We propose a framework called GAL to advance self-training, knowledge distillation, and few-shot learning on NLP and tabular datasets.

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/generate-annotate-and-learn-generative-models/code)
