Towards Impartial Multi-task Learning

Liyang Liu; Yi Li; Zhanghui Kuang; Jing-Hao Xue; Yimin Chen; Wenming Yang; Qingmin Liao; Wayne Zhang

Towards Impartial Multi-task Learning

Liyang Liu, Yi Li, Zhanghui Kuang, Jing-Hao Xue, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang

Published: 12 Jan 2021, Last Modified: 05 May 2023ICLR 2021 PosterReaders: Everyone

Keywords: Multi-task Learning, Impartial Learning, Scene Understanding

Abstract: Multi-task learning (MTL) has been widely used in representation learning. However, naively training all tasks simultaneously may lead to the partial training issue, where specific tasks are trained more adequately than others. In this paper, we propose to learn multiple tasks impartially. Specifically, for the task-shared parameters, we optimize the scaling factors via a closed-form solution, such that the aggregated gradient (sum of raw gradients weighted by the scaling factors) has equal projections onto individual tasks. For the task-specific parameters, we dynamically weigh the task losses so that all of them are kept at a comparable scale. Further, we find the above gradient balance and loss balance are complementary and thus propose a hybrid balance method to further improve the performance. Our impartial multi-task learning (IMTL) can be end-to-end trained without any heuristic hyper-parameter tuning, and is general to be applied on all kinds of losses without any distribution assumption. Moreover, our IMTL can converge to similar results even when the task losses are designed to have different scales, and thus it is scale-invariant. We extensively evaluate our IMTL on the standard MTL benchmarks including Cityscapes, NYUv2 and CelebA. It outperforms existing loss weighting methods under the same experimental settings.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We propose an impartial multi-task learning method that treats all tasks equally without bias towards any task.

Data: [CelebA](https://paperswithcode.com/dataset/celeba), [Cityscapes](https://paperswithcode.com/dataset/cityscapes), [NYUv2](https://paperswithcode.com/dataset/nyuv2)

Code: [![Papers with Code](/images/pwc_icon.svg) 2 community implementations](https://paperswithcode.com/paper/?openreview=IMPnRXEWpvr)

8 Replies

Loading