Meta- (out-of-context) learning in neural networks

Dmitrii Krasheninnikov; Egor Krasheninnikov; Bruno Mlodozeniec; David Krueger

Published: 01 Nov 2023, Last Modified: 12 Dec 2023

Venue: R0-FoMo Poster

Keywords: LLMs, large language models, in-context learning, meta-learning, world models, internalization, consistency, learning factual associations

TL;DR: Our experiments suggest that large language models may internalize true-seeming statements, or text from authoritative sources, more readily than text that appears to come from an unreliable source.

Abstract: Brown et al. (2020) famously introduced the phenomenon of in-context learning in large language models (LLMs). We establish the existence of a phenomenon we call **meta-out-of-context learning (meta-OCL)** via carefully designed synthetic experiments with LLMs. Our results suggest that meta-OCL leads LLMs to more readily “internalize” the semantic content of text that is, *or appears to be*, broadly useful (such as true statements, or text from authoritative sources) and use it in appropriate circumstances. We further demonstrate meta-OCL in a synthetic computer vision setting, and propose two hypotheses for the emergence of meta-OCL: one relying on the way models store knowledge in their parameters, and another suggesting that the implicit gradient alignment bias of gradient-descent-based optimizers may be responsible. Finally, we reflect on what our results might imply about capabilities of future AI systems, and discuss potential risks. Our code is available at https://github.com/krasheninnikov/internalization.

Submission Number: 29
