Understanding in-context learning in transformers

Published: 16 Feb 2024, Last Modified: 28 Mar 2024 · BT@ICLR2024 · CC BY 4.0
Keywords: transformers, in-context learning
Blogpost Url: https://iclr-blogposts.github.io/2024/blog/understanding-icl/
Abstract: We present a critical review of the phenomenon of In-Context Learning (ICL) in transformer architectures. Focusing on the article *Transformers Learn In-Context by Gradient Descent* by J. von Oswald et al., published at ICML 2023, we provide detailed explanations and illustrations of the mechanisms involved. We also contribute novel analyses of ICL, discuss recent developments, and point to open questions in this area of research.
Ref Papers: https://openreview.net/forum?id=tHvXrFQma5
Id Of The Authors Of The Papers: ~Johannes_Von_Oswald1
Conflict Of Interest: No conflict of interest to report with authors of the paper under analysis.
Submission Number: 9