Keywords: In-Context Learning, Privileged Information, Copyright, Sample Efficiency, Knowledge Internalization, Multi-Step Reasoning
TL;DR: We formalize context-enhanced learning, a paradigm where LLMs leverage in-context material on which no gradients are computed; when the model is capable of ICL, this yields exponential sample-efficiency gains over standard autoregressive training.
Abstract: We formalize a new concept for LLMs, **context-enhanced learning**. It involves standard gradient-based learning on text, except that the context is enhanced with additional data on which no autoregressive gradients are computed. This setting is a gradient-based analog of usual in-context learning (ICL) and appears implicitly in some recent works.
Using a multi-step reasoning task, we prove in a simplified setting that context-enhanced learning can be **exponentially more sample-efficient** than standard learning when the model is capable of ICL. At a mechanistic level, we find that the benefit of context enhancement arises from a more accurate gradient learning signal. We also demonstrate experimentally that **it appears hard to detect or recover learning materials that were used in the context during training**. This may have implications for data security as well as copyright.
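To make the training setup concrete, here is a minimal sketch of the loss described in the abstract: standard next-token cross-entropy, except that positions belonging to the enhanced context are excluded, so context tokens produce no autoregressive gradient signal. The function name, the `context_len` parameter, and the list-based representation of logits are illustrative assumptions, not the authors' implementation.

```python
import math

def masked_next_token_nll(logits, targets, context_len):
    """Average next-token negative log-likelihood over the continuation only.

    `logits[i]` are the unnormalized scores for predicting `targets[i]`.
    The first `context_len` positions are treated as the enhanced context:
    they are skipped entirely, so they contribute no loss and hence no
    gradient. (Toy sketch; real implementations would mask labels, e.g.
    by setting them to an ignore index, rather than loop in Python.)
    """
    total, count = 0.0, 0
    for pos, (row, tgt) in enumerate(zip(logits, targets)):
        if pos < context_len:   # enhanced context: no loss computed here
            continue
        m = max(row)            # stable log-sum-exp for the normalizer
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[tgt]   # -log softmax(row)[tgt]
        count += 1
    return total / count
```

With uniform logits over a vocabulary of 4, every continuation position contributes `log 4` of loss, and perturbing logits at a context position leaves the loss unchanged, which is the point of the construction.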
Submission Number: 70