- Keywords: multi-task learning, neural architecture search, multi-task architecture search
- TL;DR: automatic search for multi-task architectures that reduce per-task feature use
- Abstract: Multi-task learning promises to use less data, parameters, and time than training separate single-task models. But realizing these benefits in practice is challenging. In particular, it is difficult to define a suitable architecture that has enough capacity to support many tasks while not requiring excessive compute for each individual task. There are difficult trade-offs when deciding how to allocate parameters and layers across a large set of tasks. To address this, we propose a method for automatically searching over multi-task architectures that accounts for resource constraints. We define a parameterization of feature sharing strategies for effective coverage and sampling of architectures. We also present a method for quick evaluation of such architectures with feature distillation. Together these contributions allow us to quickly optimize for parameter-efficient multi-task models. We benchmark on Visual Decathlon, demonstrating that we can automatically search for and identify architectures that effectively make trade-offs between task resource requirements while maintaining a high level of final performance.