Lessons from Identifiability for Understanding Large Language Models

Published: 04 Mar 2024, Last Modified: 04 May 2024
Venue: ICLR 2024 Workshop MEFoMo (Withdrawn Submission)
License: CC BY 4.0
Keywords: Large Language Models, Autoregressive Models, Identifiability, Inductive Bias, Saturation Regime
TL;DR: We advocate for studying emergent properties in LLMs under the assumption of statistical generalization combined with inductive biases
Abstract: Many interesting properties emerge in LLMs, including rule extrapolation, in-context learning, and data-efficient fine-tunability. We demonstrate that good statistical generalization alone cannot explain these phenomena, due to the inherent non-identifiability of autoregressive (AR) probabilistic models. Indeed, models that are zero or near-zero KL divergence apart (and thus have equivalent test loss) can exhibit markedly different behaviours. We illustrate the practical implications for AR LLMs through three types of non-identifiability: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability. We hypothesize that these properties of LLMs are instead induced by inductive biases.
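To make the central claim concrete, here is a minimal sketch (our own toy construction, not from the paper): two autoregressive models over a three-token vocabulary that agree on every prefix reachable under the training distribution, and are therefore zero KL divergence apart on it, yet extrapolate differently from an out-of-distribution prompt.

```python
# Minimal sketch (illustrative assumption, not the paper's construction):
# two AR next-token predictors P and Q that agree on all prefixes that
# occur under the training distribution (hence identical test loss /
# zero KL on it), yet diverge on a prefix the training data never produces.
import math

VOCAB = ["a", "b", "<eos>"]

def shared(prefix):
    # Training distribution: strings "a", "aa", "aaa", ... ending in <eos>;
    # the token "b" never occurs, so prefixes containing "b" are off-support.
    return {"a": 0.5, "b": 0.0, "<eos>": 0.5}

def model_p(prefix):
    """Next-token distribution of model P."""
    if prefix.startswith("b"):                 # off-support: P always emits "a"
        return {"a": 1.0, "b": 0.0, "<eos>": 0.0}
    return shared(prefix)                      # on-support behaviour shared with Q

def model_q(prefix):
    """Next-token distribution of model Q."""
    if prefix.startswith("b"):                 # off-support: Q always emits "b"
        return {"a": 0.0, "b": 1.0, "<eos>": 0.0}
    return shared(prefix)

def seq_logprob(model, tokens):
    """Log-probability of a token sequence under an AR model."""
    lp, prefix = 0.0, ""
    for t in tokens:
        lp += math.log(model(prefix)[t])
        prefix += t
    return lp

# Identical likelihood on every training sequence, so the KL divergence
# between P and Q on the training distribution is exactly zero ...
for seq in [["a", "<eos>"], ["a", "a", "<eos>"], ["a", "a", "a", "<eos>"]]:
    assert seq_logprob(model_p, seq) == seq_logprob(model_q, seq)

# ... yet the two models extrapolate differently from the unseen prompt "b".
print(model_p("b"))   # P deterministically continues with "a"
print(model_q("b"))   # Q deterministically continues with "b"
```

Since the two models only differ on prefixes of measure zero under the training distribution, no amount of statistical generalization distinguishes them; which extrapolation behaviour a trained LLM exhibits must come from somewhere else, i.e., its inductive biases.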
Submission Number: 81