Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach

Published: 12 Jan 2021, Last Modified: 05 May 2023, ICLR 2021 Spotlight
Keywords: Transfer Learning, Multi-Task Learning, Random Matrix Theory
Abstract: This article provides theoretical insights into the inner workings of multi-task and transfer learning methods by studying the tractable least-squares support vector machine multi-task learning (LS-SVM MTL) method, in the limit of large ($p$) and numerous ($n$) data. Through a random matrix analysis applied to a Gaussian mixture data model, the performance of LS-SVM MTL is shown to converge, as $n,p\to\infty$, to a deterministic limit involving simple (small-dimensional) statistics of the data. We prove (i) that the standard LS-SVM MTL algorithm is, in general, strongly biased and may dramatically fail, to the point that individual single-task LS-SVMs may outperform the MTL approach even for quite similar tasks; our analysis provides a simple method to correct these biases. We further reveal (ii) the sufficient statistics at play in the method, which can be efficiently estimated even for quite small datasets. The latter result is exploited to automatically optimize the hyperparameters without resorting to any cross-validation procedure. Experiments on popular datasets demonstrate that our improved LS-SVM MTL method is computationally efficient and outperforms state-of-the-art multi-task and transfer learning techniques that are sometimes much more elaborate.
One-sentence Summary: This paper provides a theoretical analysis of multi-task learning schemes for large-dimensional data.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Data: [Caltech-256](https://paperswithcode.com/dataset/caltech-256)
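For readers unfamiliar with the LS-SVM base learner referenced in the abstract, below is a minimal, purely illustrative sketch of a standard single-task LS-SVM classifier solved in closed form. The linear kernel, the regularization parameter `gamma`, and the toy Gaussian data are assumptions of this example; it is not the paper's bias-corrected MTL variant, which is described in the PDF.

```python
import numpy as np

def lssvm_fit(X, y, gamma=1.0):
    """Closed-form dual solution of a standard single-task LS-SVM classifier.

    Solves the linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    with a linear kernel K = X X^T (illustrative choice only).
    """
    n = X.shape[0]
    K = X @ X.T                              # linear kernel (assumption)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma        # regularized kernel block
    rhs = np.concatenate(([0.0], y.astype(float)))
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    return alpha, b

def lssvm_predict(X_train, alpha, b, X_test):
    """Class predictions sign(sum_i alpha_i <x_i, x> + b)."""
    return np.sign(X_test @ X_train.T @ alpha + b)

# Toy usage: two Gaussian classes in p = 50 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (30, 50)), rng.normal(1.0, 1.0, (30, 50))])
y = np.concatenate([-np.ones(30), np.ones(30)])
alpha, b = lssvm_fit(X, y, gamma=1.0)
print(lssvm_predict(X, alpha, b, X[:5]))
```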