Keywords: ridge regression, ensembling methods
TL;DR: We derive learning curves for feature-subsampled ridge ensembles and show that structural heterogeneity acts as an implicit regularizer.
Abstract: Feature bagging is a well-established ensembling method that aims to reduce
prediction variance by combining the predictions of many estimators trained on subsets
or projections of the features. Here, we develop a theory of feature bagging in noisy
least-squares ridge ensembles and simplify the resulting learning curves in the special
case of equicorrelated data. Using these analytical learning curves, we demonstrate
that subsampling shifts the double-descent peak of a linear predictor. This leads
us to introduce heterogeneous feature ensembling, with estimators built on varying
numbers of feature dimensions, as a computationally efficient method to mitigate
double descent. We then compare the performance of a feature-subsampling
ensemble to that of a single linear predictor, describing a trade-off between noise amplification
due to subsampling and noise reduction due to ensembling. Our qualitative
insights carry over to linear classifiers applied to image classification tasks on
realistic datasets constructed using a state-of-the-art deep learning feature map.
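
To make the setup concrete, here is a minimal numpy sketch of a feature-subsampled ridge ensemble with heterogeneous subset sizes, in the spirit of the abstract. It is an illustration under assumed details, not the authors' exact construction: the helper names (`ridge_fit`, `feature_subsampled_ridge_ensemble`), the regularization strength, the subset sizes, and the toy equicorrelated-Gaussian teacher data are all choices made here for demonstration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def feature_subsampled_ridge_ensemble(X, y, subset_sizes, lam=1e-2, seed=0):
    """Fit one ridge predictor per random feature subset; average predictions.

    subset_sizes may be heterogeneous (e.g. [20, 40, 60, 80]), so the
    ensemble members' double-descent peaks occur at different sample sizes.
    """
    rng = np.random.default_rng(seed)
    members = []
    for k in subset_sizes:
        idx = rng.choice(X.shape[1], size=k, replace=False)  # sample k features
        members.append((idx, ridge_fit(X[:, idx], y, lam)))

    def predict(X_new):
        preds = np.stack([X_new[:, idx] @ w for idx, w in members])
        return preds.mean(axis=0)  # ensemble average reduces variance

    return predict

# Toy usage: noisy linear teacher on equicorrelated Gaussian features
# (the special case analyzed in the paper; parameters here are assumptions).
n, d, rho, noise = 200, 100, 0.3, 0.5
rng = np.random.default_rng(1)
C = (1 - rho) * np.eye(d) + rho * np.ones((d, d))  # equicorrelated covariance
X = rng.multivariate_normal(np.zeros(d), C, size=n)
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + noise * rng.standard_normal(n)

predict = feature_subsampled_ridge_ensemble(X, y, subset_sizes=[20, 40, 60, 80])
X_test = rng.multivariate_normal(np.zeros(d), C, size=50)
print(predict(X_test).shape)  # (50,)
```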
Submission Number: 9010