Adaptive Regularization for Large-Scale Sparse Feature Embedding Models

Mang Li; Wei Lyu

Adaptive Regularization for Large-Scale Sparse Feature Embedding Models

Mang Li, Wei Lyu

Published: 26 Jan 2026, Last Modified: 11 Apr 2026ICLR 2026 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: adaptive regularization, CTR estimation, large-scale sparse feature, optimization, one-epoch overfitting

Abstract: The one-epoch overfitting problem has drawn widespread attention, especially in CTR and CVR estimation models in search, advertising, and recommendation domains. These models which rely heavily on large-scale sparse categorical features, often suffer a significant decline in performance when trained for multiple epochs. Although recent studies have proposed heuristic solutions, the fundamental cause of this phenomenon remains unclear. In this work, we present a theoretical explanation grounded in Rademacher complexity, supported by empirical experiments, to explain why overfitting occurs in models with large-scale sparse categorical features. Based on this analysis, we propose a regularization method that constrains the norm budget of embedding layers adaptively. Our approach not only prevents the severe performance degradation observed during multi-epoch training, but also improves model performance within a single epoch. This method has already been deployed in online production systems.

Primary Area: optimization

Submission Number: 9479

Loading