All Birds with One Stone: Multi-task Learning for Inference with One Forward Pass

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission · Readers: Everyone
Abstract: Task-specific fine-tuning of pre-trained language models such as Transformers has proven effective on a wide range of NLP tasks. To achieve better storage efficiency and model performance, Multi-task Learning (MTL) has been studied as a way to share model parameters and exploit knowledge transfer between tasks. However, in real applications where an enormous number of tasks (e.g., large sets of labels to be classified) must be run over a large corpus, inference efficiency is still limited by the number of tasks. For a document with N sets of labels to be predicted, recent MTL methods based on adaptive modules or prompts need to encode the input N times to extract the hidden representations required by the tasks. Note that these hidden representations are not sharable between tasks, because task-specific features are extracted in the very bottom layers of the Transformer. In this paper, we seek to retain the computational efficiency of a single forward pass per document that yields a generalized feature for all N tasks, without sacrificing overall model performance. We design a prompt-sharing module that lets the model take all tasks into consideration and output N heads simultaneously. We also design a dynamic task scheduling module that samples tasks according to their training progress. In our evaluation, we show that our method outperforms previous MTL state-of-the-art methods and single-task fine-tuning by 0.4-1.5% on the GLUE benchmark. We also perform a comprehensive module analysis to demonstrate the effectiveness and robustness of our method.
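
To make the single-forward-pass idea in the abstract concrete, here is a minimal, hypothetical PyTorch sketch: shared prompt embeddings are prepended to the input, the encoder runs once, and N lightweight task heads read their predictions off the same hidden states, while a toy scheduler samples tasks by their training progress. All module names, dimensions, pooling choices, and the progress-based sampling rule are illustrative assumptions, not the paper's actual prompt-sharing or scheduling design.

    # Minimal sketch (not the authors' code) of one shared encoder pass feeding N task heads.
    import random
    import torch
    import torch.nn as nn

    class SharedEncoderMultiHead(nn.Module):
        """One encoder pass over [shared prompts ; input tokens]; N heads on the same states."""
        def __init__(self, vocab_size, d_model, n_prompt_tokens, n_classes_per_task):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            # Prompt embeddings shared across tasks, so the encoder runs once per document.
            self.shared_prompts = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)
            layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)
            # One classification head per task (n_classes_per_task is a list of label-set sizes).
            self.heads = nn.ModuleList([nn.Linear(d_model, c) for c in n_classes_per_task])

        def forward(self, input_ids):
            b = input_ids.size(0)
            tokens = self.embed(input_ids)                               # (b, L, d)
            prompts = self.shared_prompts.unsqueeze(0).expand(b, -1, -1) # (b, P, d)
            hidden = self.encoder(torch.cat([prompts, tokens], dim=1))   # single forward pass
            pooled = hidden[:, 0]                                        # pool at first prompt position
            return [head(pooled) for head in self.heads]                 # N predictions at once

    def sample_task(progress):
        """Toy dynamic task scheduling: tasks with lower recent progress are sampled more often."""
        weights = [max(1e-3, 1.0 - p) for p in progress]
        return random.choices(range(len(progress)), weights=weights, k=1)[0]

As a usage example, a batch of token ids passed through SharedEncoderMultiHead yields one logit tensor per task from a single encoder call, and sample_task([0.9, 0.2, 0.5]) would tend to pick the second (least-trained) task during training.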