Abstract: Neural-based multi-task learning has been successfully used in many real-world large-scale applications such as recommendation systems. For example, in movie recommendations, beyond providing users with movies that they tend to purchase and watch, the system might also optimize for users liking the movies afterwards. With multi-task learning, we aim to build a single model that learns these multiple goals and tasks simultaneously. However, the prediction quality of commonly used multi-task models is often sensitive to the relationships between tasks. It is therefore important to study the modeling tradeoffs between task-specific objectives and inter-task relationships.
In this work, we propose a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data. We adapt the Mixture-of-Experts (MoE) structure to multi-task learning by sharing the expert submodels across all tasks, while also having a gating network trained to optimize each task. To validate our approach on data with different levels of task relatedness, we first apply it to a synthetic dataset where we control the task relatedness. We show that the proposed approach performs better than baseline methods when the tasks are less related. We also show that the MMoE structure results in an additional trainability benefit, depending on different levels of randomness in the training data and model initialization. Furthermore, we demonstrate the performance improvements by MMoE on real tasks including a binary classification benchmark, and a large-scale content recommendation system at Google.
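To make the described structure concrete, below is a minimal sketch (not the paper's reference implementation) of the MMoE idea in PyTorch: a set of expert networks shared by all tasks, a softmax gating network per task that mixes the experts, and a small output tower per task. All layer sizes and the use of single-layer experts and towers are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Minimal Multi-gate Mixture-of-Experts sketch: shared experts, one gate per task."""

    def __init__(self, input_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        # Expert submodels are shared across all tasks.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(input_dim, expert_dim), nn.ReLU())
            for _ in range(num_experts)
        )
        # One gating network per task, producing a softmax over experts.
        self.gates = nn.ModuleList(
            nn.Linear(input_dim, num_experts) for _ in range(num_tasks)
        )
        # One small tower (output head) per task.
        self.towers = nn.ModuleList(
            nn.Linear(expert_dim, 1) for _ in range(num_tasks)
        )

    def forward(self, x):
        # expert_out: (batch, num_experts, expert_dim)
        expert_out = torch.stack([expert(x) for expert in self.experts], dim=1)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            # Each task mixes the shared experts with its own gate weights.
            weights = torch.softmax(gate(x), dim=-1)              # (batch, num_experts)
            mixed = (weights.unsqueeze(-1) * expert_out).sum(dim=1)  # (batch, expert_dim)
            outputs.append(tower(mixed))
        return outputs  # one prediction per task

# Example usage with illustrative sizes: 2 tasks, 4 shared experts.
model = MMoE(input_dim=16, expert_dim=8, num_experts=4, num_tasks=2)
preds = model(torch.randn(32, 16))  # list of 2 tensors, each of shape (32, 1)
```

Because each task has its own gate, tasks that are less related can learn to rely on different subsets of experts, while related tasks can share them, which is the tradeoff the abstract refers to.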