Contrastively-Trained Cross-Attention Improves Zero-Shot Natural Language Understanding

Anonymous

16 Dec 2023, ACL ARR 2023 December Blind Submission, Readers: Everyone
TL;DR: An encoder-decoder trained on a contrastive NLI-with-instruction dataset improves zero-shot text classification on GLUE and closed-set datasets.
Abstract: Developing a general-purpose model that can tackle many different Natural Language Understanding (NLU) tasks without requiring manually annotated data has become an ambitious yet desirable goal for the NLP research community. A simple and prominent approach to zero-shot text classification is to train a model on a generic language understanding task such as Natural Language Inference (NLI) and perform inference on NLU classification tasks using instructions or candidate templates. These methods jointly encode the input document and the instruction into a single sequence, leveraging self-attention layers and the next-sentence-prediction (NSP) pre-training task. We hypothesize that this joint encoding limits the capabilities of large pre-trained encoders and is sub-optimal in many practical applications. To address these issues, we propose a novel approach that encodes the input document separately and uses it as a ground reference to enhance the encoding of the instruction through cross-attention in an encoder-decoder architecture. We further propose a simple transformation of traditional NLI datasets that focuses the learning on these cross-attention layers using contrastive data. Finally, we show that this approach does not need a full-sized decoder to reach its best performance. Our experiments show that the proposed approach outperforms similar approaches by a large margin and sometimes achieves results comparable to fully fine-tuned methods.
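
To make the separate-encoding idea concrete, below is a minimal inference sketch using a generic Hugging Face encoder-decoder: the document is run through the encoder once, and each candidate instruction is scored through the decoder's cross-attention by its label log-likelihood. The checkpoint name, the score_candidates helper, and the log-likelihood scoring rule are illustrative assumptions for this sketch, not the paper's released model or code.

```python
# Minimal sketch: zero-shot classification with a separately encoded document
# and candidate instructions scored through the decoder's cross-attention.
# The checkpoint name is a placeholder (assumption), not the paper's model.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-base"  # placeholder checkpoint (assumption)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()

def score_candidates(document: str, candidates: list[str]) -> list[float]:
    """Encode the document once, then score each candidate instruction
    by its token-level log-likelihood under the decoder."""
    enc = tokenizer(document, return_tensors="pt")
    # The document is encoded a single time; the decoder attends to this
    # representation through cross-attention for every candidate.
    encoder_outputs = model.get_encoder()(**enc)
    scores = []
    for cand in candidates:
        labels = tokenizer(cand, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(
                encoder_outputs=encoder_outputs,
                attention_mask=enc.attention_mask,
                labels=labels,
            )
        # out.loss is the mean cross-entropy over label tokens;
        # negate it so that higher means more likely.
        scores.append(-out.loss.item())
    return scores

doc = "The movie was a complete waste of time."
templates = ["This review is positive.", "This review is negative."]
scores = score_candidates(doc, templates)
print(templates[int(torch.tensor(scores).argmax())])
```

Because the encoder pass over the document is shared across candidates, the per-label cost reduces to a decoder pass, which is consistent with the abstract's claim that a full-sized decoder is not required for good performance.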
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings / efficiency
Languages Studied: English