On the Power of Multitask Representation Learning with Gradient Descent

24 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: representation learning, multi-task learning, gradient descent, generalization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Representation learning, particularly multi-task representation learning, has gained widespread popularity in various deep learning applications, ranging from computer vision to natural language processing, due to its remarkable generalization performance. Despite its growing use, our understanding of the underlying mechanisms remains limited. In this paper, we provide a theoretical analysis elucidating why multi-task representation learning outperforms its single-task counterpart in scenarios involving over-parameterized two-layer convolutional neural networks trained by gradient descent. Our analysis is based on a data model that encompasses both task-shared and task-specific features, a setting commonly encountered in real-world applications. We also present experiments on synthetic and real-world data to illustrate and validate our theoretical findings.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8738