EncT5: Fine-tuning T5 Encoder for Discriminative Tasks

Anonymous

17 Aug 2021 (modified: 05 May 2023), ACL ARR 2021 August Blind Submission
Abstract: Encoder-decoder transformer architectures have become popular recently with the advent of T5 models. While they demonstrate impressive performance on benchmarks such as GLUE (Wang et al., 2019), it is not clear whether the encoder-decoder architecture is the most efficient choice for fine-tuning on downstream discriminative tasks. In this work, we study fine-tuning pre-trained encoder-decoder models such as T5. In particular, we propose EncT5 as a way to efficiently fine-tune pre-trained encoder-decoder T5 models for classification and regression tasks by using only the encoder layers. Our experimental results show that EncT5, with less than half of the parameters of T5, performs similarly to T5 models on the GLUE benchmark. We believe our proposed approach can be easily applied to any pre-trained encoder-decoder model.
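For illustration only, the sketch below shows the general idea of fine-tuning just the encoder of a pre-trained T5 model with a small classification head, assuming the HuggingFace transformers library (T5EncoderModel discards the decoder weights). The mean-pooling and linear head are illustrative choices, not necessarily the exact head or pooling used by EncT5.

```python
import torch
import torch.nn as nn
from transformers import T5EncoderModel, T5Tokenizer

class EncoderOnlyClassifier(nn.Module):
    """Illustrative classifier built on a pre-trained T5 encoder (decoder dropped)."""
    def __init__(self, model_name="t5-base", num_labels=2):
        super().__init__()
        # Loads only the encoder stack; roughly half of the full T5 parameters.
        self.encoder = T5EncoderModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Mean-pool over non-padded tokens to obtain a sequence representation
        # (an illustrative pooling choice, not necessarily the paper's).
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = EncoderOnlyClassifier()
batch = tokenizer(["a great movie", "a dull movie"],
                  padding=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```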