Approximate Clustering for Extracting Task Relationships in Multi-Instruction Tuning

Published: 28 Oct 2023, Last Modified: 10 Jun 2024
Venue: Instruction Workshop @ NeurIPS 2023
Keywords: Multitask learning; Clustering; Instruction fine-tuning
TL;DR: We develop an efficient and robust algorithm for multi-instruction tuning, along with an extensive collection of evaluation cases for assessing task grouping algorithms.
Abstract: The development of language models involves evaluation across a broad range of learning tasks. Recent work has shown that a large transformer model taught with carefully designed instructions can be fine-tuned on a wide range of downstream tasks. However, as the number of instructions grows, instructions trained together can interfere negatively with one another. Existing works have relied on domain expertise and manual inspection to construct multi-instruction sets, which is time-consuming and difficult to scale. To address this challenge, this paper develops a clustering algorithm that finds groups of similar tasks from a given set of task affinity scores. This problem is NP-hard, and conventional algorithms such as spectral and Lloyd's clustering are sensitive to variations in the scale of task losses. Our algorithm instead uses a semidefinite relaxation to maximize the average density of clusters and then rounds the solution with a threshold. We build the clusters adaptively by gradually adding tasks, so that affinities only need to be computed within the existing clusters. We then construct an evaluation benchmark with verified group structures to assess task grouping algorithms. The evaluation set includes 63 cases, spanning multitask instruction tuning, multi-instruction tuning, and in-context learning of multiple functions. We validate our algorithm on this evaluation set by showing that it recovers the group structure found by an exhaustive search. We also show that our approach improves performance over multi-instruction and soft-prompt tuning by up to 6% on several sentence classification and structure-to-text generation tasks.
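To make the clustering step concrete, below is a minimal Python sketch of an SDP-relaxation-plus-threshold-rounding pipeline of the kind the abstract describes. It assumes a standard density-maximizing form (maximize the inner product of the affinity matrix with a positive-semidefinite matrix under nonnegativity, unit row sums, and a trace budget equal to the number of clusters, in the style of k-means-type SDP relaxations), and rounds by thresholding and taking connected components. The function name sdp_cluster, the threshold parameter tau, and the connected-components rounding are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
import cvxpy as cp
from scipy.sparse.csgraph import connected_components

def sdp_cluster(affinity: np.ndarray, k: int, tau: float = 0.5) -> np.ndarray:
    """Cluster tasks from a symmetric affinity matrix via an SDP relaxation.

    Solves  max <A, X>  s.t.  X PSD, X >= 0, X @ 1 = 1, trace(X) = k,
    a standard relaxation of density-maximizing k-way clustering, then
    rounds X by thresholding and reading off connected components.
    """
    n = affinity.shape[0]
    X = cp.Variable((n, n), PSD=True)
    constraints = [
        X >= 0,                   # entrywise nonnegative
        cp.sum(X, axis=1) == 1,   # each row sums to one
        cp.trace(X) == k,         # trace equals the number of clusters
    ]
    objective = cp.Maximize(cp.sum(cp.multiply(affinity, X)))  # <A, X>
    cp.Problem(objective, constraints).solve()

    # Threshold rounding: link tasks i, j when X_ij is large relative to
    # the largest entry, then treat connected components as clusters.
    X_val = X.value
    linked = X_val > tau * X_val.max()
    _, labels = connected_components(linked, directed=False)
    return labels

# Toy example: three mutually similar tasks plus two loosely related
# ones should split into two groups.
A = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.0, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 1.0, 0.6],
    [0.0, 0.1, 0.0, 0.6, 1.0],
])
print(sdp_cluster(A, k=2))
```

The adaptive variant described in the abstract would call such a routine repeatedly as tasks arrive, computing new affinity entries only against tasks already placed in existing clusters rather than against all pairs.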
Submission Number: 48