Dynamic Token Modulation and Expansion for Multi-Task Learning

25 Sept 2024 (modified: 14 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0
Keywords: Multi-Task Learning, Token Modulation and Expansion, Conflicting Gradients
Abstract: Multi-Task Learning (MTL) aims to minimize negative transfer within a shared network. Common strategies separate task-generic from task-specific representations and coordinate the two within an MTL framework. However, there is no clear rule for deciding which network components should be task-specific, which makes efficient MTL architectures hard to design. Our method tackles negative transfer through token-based network expansion and modulation without directly modifying the predefined architecture, making it adaptable to any transformer-based MTL architecture. To evaluate negative transfer, we treat tokens as parameters and assess gradient conflicts between tasks during backpropagation. We analyze these conflicts with respect to the tokens' range space and null space, and adapt the network according to the type of conflict. If task-specific gradients conflict in the tokens' range space, we modulate the existing tokens to align their task gradients. Conversely, if the gradients conflict in the tokens' null space, we add new task-specific tokens that span a new feature space. Integrated into previous state-of-the-art multi-task architectures, our approach improves multi-task performance across various datasets.
Primary Area: learning theory
Submission Number: 4892
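
The decision rule described in the abstract (modulate on a range-space conflict, expand on a null-space conflict) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that tokens are stacked as rows of a matrix, that the "range space" is the span of those token vectors, that the "null space" is its orthogonal complement, and that a conflict means a negative inner product between the two tasks' gradient components in that subspace. The function names `split_gradient` and `classify_conflict` and the thresholds `tol` and `eps` are hypothetical.

```python
import numpy as np


def split_gradient(tokens, grad, tol=1e-10):
    """Split `grad` into its part inside span(token rows) and the remainder.

    Assumption: "range space" = span of the token vectors,
    "null space" = orthogonal complement of that span.
    """
    # Rows of vt with non-negligible singular values form an orthonormal
    # basis of the span of the token vectors.
    _, s, vt = np.linalg.svd(tokens, full_matrices=False)
    rank = int(np.sum(s > tol * s.max())) if s.size and s.max() > 0 else 0
    basis = vt[:rank]                      # shape (rank, d)
    in_range = basis.T @ (basis @ grad)    # orthogonal projection onto the span
    in_null = grad - in_range              # component orthogonal to all tokens
    return in_range, in_null


def classify_conflict(tokens, grad_a, grad_b, eps=1e-8):
    """Return the hypothetical actions suggested by two task gradients."""
    range_a, null_a = split_gradient(tokens, grad_a)
    range_b, null_b = split_gradient(tokens, grad_b)
    actions = []
    if float(range_a @ range_b) < -eps:    # gradients oppose each other in the span
        actions.append("modulate")         # align gradients via existing tokens
    if float(null_a @ null_b) < -eps:      # gradients oppose each other outside the span
        actions.append("expand")           # add new task-specific tokens
    return actions


# Two tokens spanning the first two axes of a 3-dimensional feature space.
tokens = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
task_a = np.array([1.0, 0.0, 1.0])
task_b = np.array([-1.0, 0.0, 1.0])
print(classify_conflict(tokens, task_a, task_b))  # ['modulate']
```

In this toy example the two gradients oppose each other only along the first axis, which lies inside the span of the tokens, so the sketch suggests modulation. Flipping the sign of the third coordinate instead would place the conflict outside the span and suggest expansion.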