Abstract: The ability of language models to learn a task from a few examples in context has generated substantial
interest. Here, we provide a perspective that situates this type of supervised few-shot learning within a
much broader spectrum of meta-learned in-context learning. Indeed, we suggest that any distribution of
sequences in which context non-trivially decreases loss on subsequent predictions can be interpreted
as eliciting a kind of in-context learning. We suggest that this perspective helps to unify the broad set
of in-context abilities that language models exhibit—such as adapting to tasks from instructions or
role play, or extrapolating time series. This perspective also sheds light on potential roots of in-context
learning in lower-level processing of linguistic dependencies (e.g. coreference or parallel structures).
Finally, taking this perspective highlights the importance of generalization, which we suggest can be
studied along several dimensions: not only the ability to learn something novel, but also flexibility in
learning from different presentations, and in applying what is learned. We discuss broader connections
to past literature in meta-learning and goal-conditioned agents, and other perspectives on learning and
adaptation. We close by suggesting that research on in-context learning should consider this broader
spectrum of in-context capabilities and types of generalization.