Customizable Combination of Parameter-Efficient Modules for Multi-Task Learning

Published: 16 Jan 2024 · Last Modified: 16 Apr 2024 · ICLR 2024 poster
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Modular skill learning, Multi-task learning, Parameter-Efficient, Fine-Tuning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel paradigm of Parameter Efficient Fine-Tuning (PEFT) for multi-task learning, harnessing specialized and shared domain skills.
Abstract: Modular and composable transfer learning is an emerging direction in the field of Parameter-Efficient Fine-Tuning, as it enables neural networks to better organize various aspects of knowledge, leading to improved cross-task generalization. In this paper, we introduce a novel approach, Customized Polytropon ($\texttt{C-Poly}$), that combines task-common skills and task-specific skills, with the skill parameters parameterized efficiently via low-rank techniques. Each task is associated with a customizable number of exclusive specialized skills and also benefits from skills shared with peer tasks. A skill assignment matrix is jointly learned. To evaluate our approach, we conducted extensive experiments on the Super-NaturalInstructions and SuperGLUE benchmarks. Our findings demonstrate that $\texttt{C-Poly}$ outperforms fully-shared, task-specific, and skill-indistinguishable baselines, significantly enhancing sample efficiency in multi-task learning scenarios.
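The abstract describes combining a pool of task-common low-rank skills (mixed via a jointly learned assignment matrix) with exclusive task-specific low-rank skills. Below is a minimal PyTorch sketch of that idea under stated assumptions; the class and parameter names (`CPolyLinear`, `n_shared`, `n_specific`, `rank`) are illustrative and not the authors' implementation.

```python
import torch
import torch.nn as nn


class CPolyLinear(nn.Module):
    """Sketch: a frozen linear layer augmented with low-rank 'skill' adapters.

    Each task mixes a pool of task-common skills through a jointly learned
    assignment matrix and additionally applies its own exclusive skills.
    """

    def __init__(self, d_in, d_out, n_tasks, n_shared=4, n_specific=1, rank=8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)          # pretrained weight, kept frozen
        self.base.weight.requires_grad_(False)

        # Low-rank factors for the shared skill pool: one (A, B) pair per skill.
        self.shared_A = nn.Parameter(torch.randn(n_shared, d_in, rank) * 0.01)
        self.shared_B = nn.Parameter(torch.zeros(n_shared, rank, d_out))

        # Exclusive low-rank skills, n_specific per task.
        self.spec_A = nn.Parameter(torch.randn(n_tasks, n_specific, d_in, rank) * 0.01)
        self.spec_B = nn.Parameter(torch.zeros(n_tasks, n_specific, rank, d_out))

        # Jointly learned task-to-shared-skill assignment logits.
        self.assign_logits = nn.Parameter(torch.zeros(n_tasks, n_shared))

    def forward(self, x, task_id):
        out = self.base(x)

        # Mix shared skill factors according to the task's assignment row.
        alpha = torch.softmax(self.assign_logits[task_id], dim=-1)   # (n_shared,)
        A = torch.einsum("s,sir->ir", alpha, self.shared_A)          # (d_in, rank)
        B = torch.einsum("s,sro->ro", alpha, self.shared_B)          # (rank, d_out)
        out = out + x @ A @ B

        # Add the task's exclusive specialized skills.
        for A_t, B_t in zip(self.spec_A[task_id], self.spec_B[task_id]):
            out = out + x @ A_t @ B_t
        return out
```

Usage would replace selected linear layers of a pretrained model with `CPolyLinear` and pass the current task index at each forward call; only the skill factors and assignment logits are trained.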
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: generative models
Submission Number: 7467