In-context Learning for Mixture of Linear Regression: Existence, Generalization and Training Dynamics

Yanhao Jin; Krishna Balasubramanian; Lifeng Lai

Published: 01 Nov 2025, Last Modified: 01 Nov 2025. Accepted by TMLR. CC BY 4.0

Abstract: We investigate the in-context learning capabilities of transformers for the $d$-dimensional mixture of linear regression model, providing theoretical results on existence, generalization bounds, and training dynamics. Specifically, we prove that, in the high signal-to-noise ratio (SNR) regime, there exists a transformer that achieves a prediction error of order $\mathcal{O}(\sqrt{d/n})$ with high probability, where $n$ is the training prompt size. Moreover, we derive in-context excess risk bounds of order $\mathcal{O}(L/\sqrt{B})$ for the case of two mixture components, where $B$ denotes the number of training prompts and $L$ the number of attention layers. The dependence of $L$ on the SNR is explicitly characterized and differs between the low and high SNR settings. We further analyze the training dynamics of transformers with a single linear self-attention layer, showing that, with appropriately initialized parameters, gradient flow on the population mean squared loss converges to a global optimum. Extensive simulations suggest that transformers perform well on this task, potentially outperforming other baselines such as the Expectation-Maximization algorithm.

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: We fixed some typos and added comments and remarks based on the reviewers' suggestions. The camera-ready package option is now used (\usepackage[accepted]{tmlr}). Acknowledgments were added on page 13.

Supplementary Material: zip

Assigned Action Editor: ~Meisam_Razaviyayn1

Submission Number: 4807
