Capacity and Redundancy Trade-offs in Multi-Task Learning

Published: 28 Feb 2026, Last Modified: 04 Apr 2026, CAO Poster, CC BY 4.0
Keywords: Multi-task learning; Parameter-efficient fine-tuning; Low-rank adapters; Information theory
Abstract: In multi-task learning (MTL), jointly training on multiple tasks often produces negative transfer, where adding tasks degrades performance instead of improving it. We introduce a Capacity–Redundancy (CR) inequality showing that the total predictive information extractable from a shared latent is upper-bounded by the capacity of the representation plus the redundancy among tasks. For related tasks, redundancy allows the same shared information to support multiple predictions. For unrelated tasks, the bound certifies an unavoidable trade-off, which explains negative transfer and motivates private, task-specific features. We extend the CR framework to shared–private architectures in parameter-efficient fine-tuning and derive the resulting lower bounds on error, which can inform practical design decisions.
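The inequality described in the abstract can be illustrated with a short derivation. This is a sketch under stated assumptions and not necessarily the authors' exact formulation: the task targets Y_1, ..., Y_T are taken to be discrete, the shared latent Z is assumed to satisfy H(Z) <= C for a capacity C, and redundancy is measured by the total correlation TC. The symbols Z, Y_t, C, and TC are introduced here for illustration only.

```latex
% Assumed notation (not from the source): Z shared latent, Y_t task targets,
% C capacity with H(Z) \le C, and
% \mathrm{TC}(Y_{1:T}) = \sum_t H(Y_t) - H(Y_{1:T}) as the redundancy measure.
\begin{align}
\sum_{t=1}^{T} I(Z; Y_t)
  &= \sum_{t=1}^{T} H(Y_t) - \sum_{t=1}^{T} H(Y_t \mid Z) \\
  &\le \sum_{t=1}^{T} H(Y_t) - H(Y_{1:T} \mid Z)
     && \text{(subadditivity of conditional entropy)} \\
  &= I(Z; Y_{1:T}) + \mathrm{TC}(Y_{1:T}) \\
  &\le C + \mathrm{TC}(Y_{1:T})
     && \text{(since } I(Z; Y_{1:T}) \le H(Z) \le C\text{)}
\end{align}
```

Under these assumptions, independent tasks have TC = 0, so the per-task informations must share a single budget of C, which matches the trade-off the abstract attributes to unrelated tasks. Strongly related tasks have large TC, so the same latent content can count toward several predictions at once.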
Submission Number: 98