Understanding in-context learning in transformers

Published: 16 Feb 2024, Last Modified: 28 Mar 2024 · BT@ICLR2024 · CC BY 4.0
Keywords: transformers, in-context learning
Blogpost Url: https://iclr-blogposts.github.io/2024/blog/understanding-icl/
Abstract: We present a critical review of the phenomenon of In-Context Learning (ICL) in transformer architectures. Focusing on the article *Transformers Learn In-Context by Gradient Descent* by J. von Oswald et al., published at ICML 2023, we provide detailed explanations and illustrations of the mechanisms involved. We also contribute novel analyses of ICL, discuss recent developments, and point to open questions in this area of research.
Ref Papers: https://openreview.net/forum?id=tHvXrFQma5
Id Of The Authors Of The Papers: ~Johannes_Von_Oswald1
Conflict Of Interest: No conflict of interest to report with authors of the paper under analysis.
Submission Number: 9