RotoGrad: Gradient Homogenization in Multi-Task Learning

21 May 2021 (modified: 05 May 2023) · NeurIPS 2021 Submitted · Readers: Everyone
Keywords: multi-task learning, gradient conflict, negative transfer
Abstract: Multi-task learning is increasingly adopted in application domains such as computer vision and reinforcement learning. However, optimally exploiting its advantages remains a major challenge due to the effect of negative transfer. Previous works have traced this issue to disparities in gradient magnitudes and directions across tasks when optimizing the shared network parameters. While recent work has acknowledged that negative transfer is a two-fold problem, existing approaches fall short: they either focus solely on homogenizing gradient magnitudes across tasks, or greedily change gradient directions while overlooking future conflicts. In this work, we introduce RotoGrad, an algorithm that tackles negative transfer as a whole: it jointly homogenizes gradient magnitudes and directions, while ensuring training convergence. We show that RotoGrad outperforms competing methods on complex problems, including multi-label classification on CelebA and computer vision tasks on the NYUv2 dataset.
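For intuition, the sketch below illustrates only the magnitude side of the idea described in the abstract: per-task gradients at the shared representation are rescaled to a common norm before being combined and backpropagated through the shared backbone. This is an assumption-based PyTorch sketch, not the RotoGrad algorithm or its released implementation; the direction (rotation) component is omitted, and the toy encoder, task heads, and mean-norm target are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

encoder = nn.Linear(10, 16)                    # shared backbone (toy, assumed)
heads = [nn.Linear(16, 1), nn.Linear(16, 1)]   # two task-specific heads (toy, assumed)

x = torch.randn(8, 10)
z = encoder(x)                                 # shared representation
losses = [head(z).pow(2).mean() for head in heads]   # toy per-task losses

# Per-task gradients with respect to the shared representation.
grads = [torch.autograd.grad(l, z, retain_graph=True)[0] for l in losses]

# Homogenize magnitudes: rescale every task gradient to the mean norm
# (the mean-norm target is one possible choice, assumed here for simplicity).
norms = torch.stack([g.norm() for g in grads])
target = norms.mean()
combined = sum(g * (target / (n + 1e-12)) for g, n in zip(grads, norms))

# Backpropagate the magnitude-homogenized gradient into the shared encoder;
# task-head parameters would be updated from their own losses separately.
z.backward(combined)
```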
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: We propose an algorithm to simultaneously homogenize gradient magnitudes and directions across tasks in MTL.
Supplementary Material: zip