Weighted Training for Cross-Task Learning

Shuxiao Chen; Koby Crammer; Hangfeng He; Dan Roth; Weijie J Su

Published: 28 Jan 2022, Last Modified: 04 May 2025, ICLR 2022 Oral, Readers: Everyone

Keywords: Cross-task learning, Natural language processing, Representation learning

Abstract: In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks. We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees. The effectiveness of TAWT is corroborated through extensive experiments with BERT on four sequence tagging tasks in natural language processing (NLP), including part-of-speech (PoS) tagging, chunking, predicate detection, and named entity recognition (NER). As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning, such as the choice of the source data and the impact of fine-tuning.

One-sentence Summary: We introduce a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks.
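
The abstract describes TAWT only at a high level, so the following is a minimal sketch of the general idea of weighted training over source tasks, not the authors' implementation. It assumes a toy setting in which every task is a linear regression sharing one parameter vector, keeps the source weights on the probability simplex, and updates them with an exponentiated-gradient step. The cosine similarity between source and target gradients is used here as a hypothetical stand-in for the representation-based task distance, which the abstract does not define. All function names (`mse_loss_and_grad`, `cosine`, `weighted_training`, `make_task`) and hyperparameters are illustrative assumptions.

```python
import numpy as np


def mse_loss_and_grad(theta, X, y):
    """Return the mean squared error of X @ theta and its gradient w.r.t. theta."""
    residual = X @ theta - y
    return float(np.mean(residual ** 2)), 2.0 * X.T @ residual / len(y)


def cosine(u, v, eps=1e-12):
    """Cosine similarity; eps guards against zero-norm gradients."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)


def weighted_training(sources, target, steps=100, lr=0.05, weight_lr=1.0):
    """Alternate between a weight update and a model update (toy sketch).

    sources: list of (X, y) pairs, one per source task.
    target:  a single (X, y) pair for the target task.
    """
    theta = np.zeros(target[0].shape[1])
    weights = np.full(len(sources), 1.0 / len(sources))  # uniform start on the simplex
    for _ in range(steps):
        source_grads = [mse_loss_and_grad(theta, X, y)[1] for X, y in sources]
        target_grad = mse_loss_and_grad(theta, *target)[1]
        # Assumed proxy: upweight sources whose gradient aligns with the target's.
        alignment = np.array([cosine(g, target_grad) for g in source_grads])
        weights = weights * np.exp(weight_lr * alignment)
        weights = weights / weights.sum()  # renormalize onto the simplex
        # Gradient step on the weighted sum of source losses.
        theta = theta - lr * sum(w * g for w, g in zip(weights, source_grads))
    return theta, weights


# Toy data: source 0 shares the target's parameters, source 1 has the opposite sign.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])


def make_task(task_theta, n=200):
    X = rng.normal(size=(n, 3))
    return X, X @ task_theta


sources = [make_task(true_theta), make_task(-true_theta)]
target = make_task(true_theta, n=50)
theta, weights = weighted_training(sources, target)
print("learned source weights:", weights)
```

In this toy run the weight mass should concentrate on the source that matches the target, which is the qualitative behavior a target-aware weighting scheme aims for. The actual TAWT update rule and task distance are given in the paper and may differ substantially from this proxy.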

Supplementary Material: zip

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/weighted-training-for-cross-task-learning/code)
