LA-MTL: Latency-Aware Automated Multi-Task Learning

Published: 2025, Last Modified: 13 Nov 2025, Venue: DAC 2025, License: CC BY-SA 4.0
Abstract: Multi-Task Learning (MTL) aims to unify a variety of tasks into a single network for improved training and inference efficiency. This is particularly attractive for real-time applications that require simultaneous execution of multiple workloads in resource-constrained embedded environments. However, most MTL approaches focus on enhancing parameter efficiency and overall task metrics, and often lack explicit inference-latency awareness in the optimization loop. Design space exploration should not compromise parameter efficiency or task accuracy in order to meet latency requirements. To address this, we propose LA-MTL, an automated layer-level MTL policy search that incorporates a novel analytical latency factor (ALF). By accounting for local and global latencies during the MTL policy search, we derive solutions that balance task metrics, parameter efficiency, and latency constraints. LA-MTL search on ResNet34 yields solutions with up to 50% lower latency on the Jetson AGX Orin while keeping semantic segmentation and depth estimation metrics competitive, within ±2 p.p., on the CityScapes dataset. Additionally, we achieve superior parameter efficiency, surpassing the state-of-the-art MTL parameter reduction by over 20 p.p. Experiments on benchmark datasets (CityScapes, NYUv2) demonstrate the effectiveness of our approach across various backbones, including ResNet34, MobileNetV2, and MobileOne in its expanded form. Code is available at https://github.com/shamvbs/LA-MTL.