Implicit Bayesian Inference Is an Insufficient Explanation of Language Model Behaviour in Compositional Tasks
Track: long paper (up to 8 pages)
Keywords: language models, OOD generalization, implicit Bayesian inference, compositional generalization
TL;DR: We demonstrate experimentally that Transformers pre-trained for implicit Bayesian inference can often transcend this behaviour in OOD settings, especially in compositional tasks.
Abstract: Apparently rational behaviors of autoregressive LLMs, such as in-context learning, have been attributed to implicit Bayesian inference (IBI): since the training data is best explained as a mixture, the optimal next-token predictor learns to implicitly infer latent concepts and completes prompts in a manner consistent with Bayesian inference.
While it is the optimal strategy in-distribution, Bayesian inference is generally suboptimal on out-of-distribution (OOD) prompts due to model misspecification. Since model behavior on OOD prompts is only weakly constrained by pretraining, there is no guarantee that Bayesian behavior extrapolates to OOD prompts.
Using small-scale experiments, we investigate the degree to which Bayesian inference remains a good model of LM behavior on OOD prompts.
We report two findings: (1) Transformers are less prone than Bayesian inference to collapsing onto a single mixture component; this behavior resembles tempered Bayesian inference and may be advantageous under model misspecification.
(2) Transformers can generalize compositionally, even when the Bayes posterior is undefined.
We conclude that autoregressive LMs can display rational-looking behavior that cannot be explained as any form of generalized Bayesian inference using only the training data.
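For concreteness, the implicit Bayesian inference view sketched in the abstract can be written as a posterior predictive over latent concepts θ, with the tempered variant mentioned in finding (1) raising the likelihood to a power β. This is a minimal illustration in our own notation, not an equation taken from the submission itself:

\[
p(x_{t+1} \mid x_{1:t}) \;=\; \int p(x_{t+1} \mid x_{1:t}, \theta)\, p(\theta \mid x_{1:t})\, d\theta,
\qquad
p(\theta \mid x_{1:t}) \;\propto\; p(x_{1:t} \mid \theta)\, p(\theta).
\]

\[
p_\beta(\theta \mid x_{1:t}) \;\propto\; p(x_{1:t} \mid \theta)^{\beta}\, p(\theta),
\qquad 0 < \beta < 1.
\]

With β < 1 the posterior concentrates on a single mixture component more slowly than the standard Bayes posterior (β = 1), which is the sense in which reduced collapse can mimic tempering.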
Submission Number: 49