Online Knowledge Distillation for Multi-task Learning

Published: 01 Jan 2023, Last Modified: 12 May 2023. WACV 2023.
Abstract: Multi-task learning (MTL) has found wide application in computer vision tasks. We train a backbone network to learn a shared representation for different tasks such as semantic segmentation, depth estimation, and normal estimation. In many cases negative transfer, i.e., impaired performance in the target domain, causes MTL accuracy to fall below that of the corresponding single-task networks. To mitigate this issue, we propose an online knowledge distillation method in which single-task networks are trained simultaneously with the MTL network to guide the optimization process. We selectively train layers for each task using an adaptive feature distillation (AFD) loss with an online task weighting (OTW) scheme. This task-wise feature distillation enables the MTL network to be trained in a manner similar to the single-task networks. On the NYUv2 and Cityscapes datasets we improve over a baseline MTL model by 6.22% and 9.19%, respectively, outperforming recent MTL methods. We validate our design choices in ablative experiments, including the use of online task weighting and the adaptive feature distillation loss.
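
As a rough illustration of how the adaptive feature distillation loss and the online task weighting described in the abstract might fit together, here is a minimal PyTorch sketch. The function names (adaptive_feature_distillation_loss, online_task_weights), the choice of an L2 distance between normalized features, the detached teacher features, and the softmax-over-loss-ratio weighting (in the spirit of dynamic weight averaging) are all assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def adaptive_feature_distillation_loss(mtl_feats, st_feats, task_weights):
    """Hypothetical sketch of a task-wise feature distillation loss.

    mtl_feats: dict mapping task name -> list of feature maps from the shared MTL backbone
    st_feats:  dict mapping task name -> list of matching feature maps from the
               single-task (teacher) network trained alongside the MTL network
    task_weights: dict mapping task name -> scalar online task weight
    """
    loss = 0.0
    for task, weight in task_weights.items():
        for f_mtl, f_st in zip(mtl_feats[task], st_feats[task]):
            # L2 distance between channel-normalized student (MTL) and teacher features;
            # the teacher features are detached so this term only updates the MTL network.
            loss = loss + weight * F.mse_loss(
                F.normalize(f_mtl, dim=1), F.normalize(f_st.detach(), dim=1)
            )
    return loss

def online_task_weights(curr_losses, prev_losses, temperature=2.0):
    """Hypothetical online task weighting: tasks whose loss decreases more slowly
    get larger weights via a softmax over loss ratios (assumed scheme, similar in
    spirit to dynamic weight averaging)."""
    tasks = sorted(curr_losses)
    ratios = torch.tensor(
        [curr_losses[t] / (prev_losses[t] + 1e-8) for t in tasks]
    )
    weights = torch.softmax(ratios / temperature, dim=0) * len(tasks)
    return {t: w.item() for t, w in zip(tasks, weights)}
```

In this sketch the distillation term would be added to the usual per-task supervised losses each iteration, with the weights refreshed from the running task losses of the previous iteration or epoch.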