Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

Yehuda Dar; Richard Baraniuk

28 Sept 2020 (modified: 05 May 2023)

ICLR 2021 Conference Withdrawn Submission

Readers: Everyone

Keywords: theory of overparameterized learning, statistics, double descent, transfer learning, regression

Abstract: We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution is constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to the target task, and the correlation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial.
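The parameter transfer mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code, and it simplifies the setting: the dimensions, the noise level, the choice of transferred coordinates, and the helper names `min_norm_fit` and `transfer_fit` are all assumptions made for illustration. The source task is solved by minimum-norm least squares, a subset of its learned parameters is copied into the target solution, and the remaining (free) target parameters are fit to the residual.

```python
import numpy as np


def min_norm_fit(X, y):
    """Least-squares fit via the pseudoinverse.

    When X has more columns than rows (overparameterized) and full row rank,
    this is the minimum-norm solution that interpolates the training data.
    """
    return np.linalg.pinv(X) @ y


def transfer_fit(X_tgt, y_tgt, beta_src, transferred_idx):
    """Fit the target task with some parameters fixed to source-task values.

    Hypothetical helper: coordinates in `transferred_idx` are constrained to
    the values in `beta_src`; all other coordinates are free and are fit to
    what the transferred part leaves unexplained.
    """
    d = X_tgt.shape[1]
    transferred_idx = np.asarray(transferred_idx)
    free_idx = np.setdiff1d(np.arange(d), transferred_idx)

    beta = np.zeros(d)
    beta[transferred_idx] = beta_src[transferred_idx]

    # Residual after accounting for the contribution of transferred parameters.
    residual = y_tgt - X_tgt[:, transferred_idx] @ beta[transferred_idx]
    beta[free_idx] = min_norm_fit(X_tgt[:, free_idx], residual)
    return beta


# Synthetic, illustrative data (all sizes and noise levels are assumptions).
rng = np.random.default_rng(0)
d, n_src, n_tgt = 40, 60, 15

beta_true_src = rng.normal(size=d)
# The target task is a perturbed copy of the source task, so the two are correlated.
beta_true_tgt = beta_true_src + 0.1 * rng.normal(size=d)

X_src = rng.normal(size=(n_src, d))
y_src = X_src @ beta_true_src + 0.1 * rng.normal(size=n_src)
X_tgt = rng.normal(size=(n_tgt, d))
y_tgt = X_tgt @ beta_true_tgt + 0.1 * rng.normal(size=n_tgt)

beta_src = min_norm_fit(X_src, y_src)
transferred_idx = np.arange(10)  # transfer the first 10 parameters
beta_tgt = transfer_fit(X_tgt, y_tgt, beta_src, transferred_idx)

# Squared parameter error, a proxy for generalization error under isotropic inputs.
target_error = float(np.sum((beta_tgt - beta_true_tgt) ** 2))
```

Here the target task has 30 free parameters and only 15 examples, so the target solution interpolates its training data. Varying `len(transferred_idx)`, `n_src`, and `n_tgt` in such a sketch is one way to explore empirically how the target error depends on the factors listed in the abstract.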

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: A new theoretical framework for transfer learning in overparameterized linear regression settings.

Reviewed Version (pdf): https://openreview.net/references/pdf?id=NzRCA1id7r
