Keywords: in-context learning, large language models, representations
TL;DR: Learning from demonstrations and the choice of label set representation independently affect in-context learning performance.
Abstract: In-context learning (ICL) is the ability of a large language model (LLM) to learn a new task from a few demonstrations presented as part of the context. Prior work has attributed much of ICL’s success to the representation of in-context demonstrations, particularly to the choice of labels in classification tasks. At the same time, evidence for ICL’s learning capacity, i.e., whether additional demonstrations improve performance, has been mixed, and ICL is often thought to occur only under specific conditions. The interaction between representation and learning in ICL remains underexplored.
We hypothesize that these two aspects influence ICL performance in distinct ways: the representation of demonstrations determines the baseline accuracy of ICL, while learning from additional demonstrations improves performance on top of this baseline. We test this hypothesis by developing an optimization algorithm that enumerates label sets with varying semantic relevance, and performing ICL with varying numbers of demonstrations for each label set. We observe that learning occurs regardless of label set quality, although its efficiency, measured by the slope of improvement over demonstrations, depends on both label set quality and the parameter count of the underlying language model. Despite the emergence of learning, the relative accuracy of different label sets is largely preserved throughout learning, confirming our hypothesis.
Our results reveal a previously underexplored aspect of ICL: the distinct roles of representation and learning in determining ICL performance.
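As an illustration of the two quantities the abstract refers to, the sketch below computes a learning slope (the least-squares slope of accuracy against the number of demonstrations) for each label set, and checks whether the accuracy ranking of label sets is the same at the smallest and largest demonstration counts. This is only a minimal sketch under stated assumptions: the function names, the label set names, and all accuracy values are hypothetical placeholders, not results or code from the paper, and the paper may define slope and rank preservation differently.

```python
from statistics import mean


def learning_slope(num_demos, accuracies):
    """Least-squares slope of accuracy against number of demonstrations."""
    if len(num_demos) != len(accuracies) or len(num_demos) < 2:
        raise ValueError("need two or more paired observations")
    x_bar = mean(num_demos)
    y_bar = mean(accuracies)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(num_demos, accuracies))
    var = sum((x - x_bar) ** 2 for x in num_demos)
    if var == 0:
        raise ValueError("demonstration counts must not all be equal")
    return cov / var


def ranking(acc_by_label_set, index):
    """Label set names ordered from highest to lowest accuracy at one demo count."""
    return sorted(
        acc_by_label_set,
        key=lambda name: acc_by_label_set[name][index],
        reverse=True,
    )


# Hypothetical demonstration counts and accuracies, for illustration only.
NUM_DEMOS = [0, 4, 8, 16]
ACCURACY = {
    "relevant": [0.70, 0.78, 0.82, 0.90],
    "neutral": [0.50, 0.55, 0.60, 0.68],
    "irrelevant": [0.30, 0.33, 0.35, 0.40],
}

# Slope per label set: a positive value means accuracy improves with more demos.
SLOPES = {name: learning_slope(NUM_DEMOS, acc) for name, acc in ACCURACY.items()}

# Rank preservation: same ordering of label sets at the first and last demo count.
RANK_PRESERVED = ranking(ACCURACY, 0) == ranking(ACCURACY, -1)

if __name__ == "__main__":
    for name, slope in SLOPES.items():
        print(f"{name}: slope = {slope:.4f} accuracy per demonstration")
    print("ranking preserved:", RANK_PRESERVED)
```

In this toy data every label set has a positive slope (learning occurs) while the ordering of label sets stays fixed, which is the pattern the hypothesis predicts; real measurements would replace the placeholder values.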
Submission Number: 128