Keywords: in-context learning, inductive biases, transformer
TL;DR: We modify the standard transformer architecture for in-context learning so that it explicitly learns the latent variables relevant to different tasks, yielding more interpretable solutions and enabling interventions.
Abstract: Transformer models have shown considerable success in modeling predictive problems across diverse domains. They have been shown to learn efficiently in-context, i.e., to solve new tasks without any further training when provided a few examples as context (ICL). While first observed in language, subsequent studies show that ICL also generalizes to a variety of algorithmic tasks. Recent research suggests that transformers may be implicitly modeling the posterior predictive distribution over the latent variables needed to solve different tasks. However, ICL diverges from standard Bayesian methods: it forgoes defining an explicit latent variable model and performing inference over it, in favor of an implicit mechanism. This raises a natural question: is there any benefit to explicitly factorizing knowledge, or are we better off letting the model implicitly decide an appropriate solution space? We conduct a thorough analysis to uncover both the advantages and limitations of the current ICL setting, and show that models which explicitly factorize knowledge can be more readily augmented with inductive biases that significantly boost performance when domain knowledge is available.
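A minimal sketch of the contrast the abstract draws, in standard Bayesian notation (the symbols here are illustrative, not taken from the paper): an explicit latent variable model answers a query $x_{n+1}$ given context $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{n}$ by inferring a posterior over a latent $z$ and marginalizing,

$$p(y_{n+1} \mid x_{n+1}, \mathcal{D}) = \int p(y_{n+1} \mid x_{n+1}, z)\, p(z \mid \mathcal{D})\, dz,$$

whereas standard ICL trains the transformer to approximate the left-hand side directly, without ever representing $p(z \mid \mathcal{D})$. Making $z$ explicit is what allows it to carry inductive biases and support interventions.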
Submission Number: 61