Lessons from Identifiability for Understanding Large Language Models

Published: 04 Mar 2024, Last Modified: 04 May 2024
Venue: ICLR 2024 Workshop MEFoMo (Withdrawn Submission)
License: CC BY 4.0
Keywords: Large Language Models, Autoregressive Models, Identifiability, Inductive Bias, Saturation Regime
TL;DR: We advocate for studying emergent properties in LLMs under the assumption of statistical generalization combined with inductive biases
Abstract: Many interesting properties emerge in LLMs, including rule extrapolation, in-context learning, and data-efficient fine-tunability. We demonstrate that good statistical generalization alone cannot explain these phenomena, due to the inherent non-identifiability of autoregressive (AR) probabilistic models. Indeed, models that are zero or near-zero KL divergence apart (and thus have equivalent test loss) can exhibit markedly different behaviours. We illustrate the practical implications for AR LLMs through three types of non-identifiability: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability. We hypothesize that these properties of LLMs are instead induced by inductive biases.
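To make the central claim concrete, here is a minimal sketch (our own toy construction, not from the paper): two autoregressive models over a three-token vocabulary that agree on every prefix reachable under the training distribution, and are therefore zero KL divergence apart on it, yet extrapolate differently from an out-of-distribution prompt.

```python
# Minimal sketch (illustrative assumption, not the paper's construction):
# two AR next-token predictors P and Q that agree on all prefixes that
# occur under the training distribution (hence identical test loss /
# zero KL on it), yet diverge on a prefix the training data never produces.
import math

VOCAB = ["a", "b", "<eos>"]

def shared(prefix):
    # Training distribution: strings "a", "aa", "aaa", ... ending in <eos>;
    # the token "b" never occurs, so prefixes containing "b" are off-support.
    return {"a": 0.5, "b": 0.0, "<eos>": 0.5}

def model_p(prefix):
    """Next-token distribution of model P."""
    if prefix.startswith("b"):                 # off-support: P always emits "a"
        return {"a": 1.0, "b": 0.0, "<eos>": 0.0}
    return shared(prefix)                      # on-support behaviour shared with Q

def model_q(prefix):
    """Next-token distribution of model Q."""
    if prefix.startswith("b"):                 # off-support: Q always emits "b"
        return {"a": 0.0, "b": 1.0, "<eos>": 0.0}
    return shared(prefix)

def seq_logprob(model, tokens):
    """Log-probability of a token sequence under an AR model."""
    lp, prefix = 0.0, ""
    for t in tokens:
        lp += math.log(model(prefix)[t])
        prefix += t
    return lp

# Identical likelihood on every training sequence, so the KL divergence
# between P and Q on the training distribution is exactly zero ...
for seq in [["a", "<eos>"], ["a", "a", "<eos>"], ["a", "a", "a", "<eos>"]]:
    assert seq_logprob(model_p, seq) == seq_logprob(model_q, seq)

# ... yet the two models extrapolate differently from the unseen prompt "b".
print(model_p("b"))   # P deterministically continues with "a"
print(model_q("b"))   # Q deterministically continues with "b"
```

Since the two models only differ on prefixes of measure zero under the training distribution, no amount of statistical generalization distinguishes them; which extrapolation behaviour a trained LLM exhibits must come from somewhere else, i.e., its inductive biases.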
Submission Number: 81