Understanding Generalization in Transformers: Error Bounds and Training Dynamics Under Benign and Harmful Overfitting
Keywords: Transformer, Benign overfitting, Feature learning theory, Generalization error bounds, Signal-to-noise ratio
TL;DR: We present generalization error bounds for a two-layer Transformer under both benign and harmful overfitting.
Abstract: Transformers serve as the foundational architecture for many successful large-scale models, demonstrating the ability to overfit the training data while maintaining strong generalization on unseen data, a phenomenon known as benign overfitting. However, existing research has not sufficiently explored the generalization and training dynamics of transformers under benign overfitting. This paper addresses this gap by analyzing a two-layer transformer's training dynamics, convergence, and generalization under label noise. Specifically, we present generalization error bounds for benign and harmful overfitting under varying signal-to-noise ratios (SNR), where the training dynamics are categorized into three distinct stages, each with its corresponding error bounds. Additionally, we conduct extensive experiments to identify key factors in transformers that influence test loss. Our experimental results align closely with the theoretical predictions, validating our findings.
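The abstract does not spell out the formal setup, so the following is a minimal PyTorch sketch of the kind of signal-plus-noise data model with label flipping, and the kind of two-layer (attention + linear head) architecture, commonly used in feature-learning analyses of benign overfitting. All names (`make_snr_dataset`, `TwoLayerTransformer`), the SNR scaling, and the training loop are illustrative assumptions, not the paper's actual construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_snr_dataset(n, d, snr, label_noise_p, seed=0):
    """Hypothetical data model: each example has one signal token (+/- mu)
    and one pure-noise token; labels are flipped with probability p."""
    g = torch.Generator().manual_seed(seed)
    mu = torch.zeros(d)
    mu[0] = snr * d ** 0.5                       # assumed scaling: ||mu|| = snr * sqrt(d), unit noise
    y = torch.randint(0, 2, (n,), generator=g) * 2 - 1   # clean labels in {-1, +1}
    signal = y.float().unsqueeze(1) * mu                  # signal token carries y * mu
    noise = torch.randn(n, d, generator=g)                # noise token ~ N(0, I)
    X = torch.stack([signal, noise], dim=1)               # shape (n, 2 tokens, d)
    flip = torch.rand(n, generator=g) < label_noise_p     # independent label noise
    return X, torch.where(flip, -y, y).float()

class TwoLayerTransformer(nn.Module):
    """Stand-in two-layer model: one self-attention layer + linear head."""
    def __init__(self, d):
        super().__init__()
        self.qk = nn.Linear(d, d, bias=False)    # merged query-key map
        self.v = nn.Linear(d, d, bias=False)     # value map
        self.head = nn.Linear(d, 1, bias=False)  # second layer: linear readout

    def forward(self, X):                                     # X: (n, tokens, d)
        scores = X @ self.qk(X).transpose(1, 2) / X.shape[-1] ** 0.5
        attn = scores.softmax(dim=-1)                         # token-to-token attention
        h = (attn @ self.v(X)).mean(dim=1)                    # pooled attention output
        return self.head(h).squeeze(-1)                       # scalar logit per example

# Usage: sweep `snr` to probe the regimes the paper's bounds distinguish;
# intuitively, high SNR tends toward benign overfitting (train error -> 0,
# low test error despite label noise), low SNR toward harmful overfitting.
X_tr, y_tr = make_snr_dataset(200, 64, snr=1.0, label_noise_p=0.1)
X_te, y_te = make_snr_dataset(1000, 64, snr=1.0, label_noise_p=0.0, seed=1)
model = TwoLayerTransformer(64)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = F.softplus(-y_tr * model(X_tr)).mean()  # logistic loss for +/-1 labels
    loss.backward()
    opt.step()
test_err = (model(X_te).sign() != y_te).float().mean()
```

The three training stages the abstract refers to would show up in such a run as distinct phases of the train/test loss curves; the sketch only provides the scaffolding to observe them, not the paper's analysis.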
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 18976