Manual Verbalizer Enrichment for Few-Shot Text Classification

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: With the emergence and continuous development of pre-trained language models, prompt-based training has become a widely adopted paradigm that substantially improves how these models are exploited for many NLP tasks. Prompting also performs strongly compared to traditional fine-tuning when adapted to zero-shot or few-shot scenarios, where the amount of annotated data is limited. In this framework, verbalizers play an important role in mapping the masked-word distributions produced by language models to output predictions. In this work, we propose MaVEN, a new approach to verbalizer construction that enriches class labels using neighborhood relations in the word embedding space. In addition, we develop a benchmarking procedure to evaluate typical verbalizer baselines for document classification in few-shot learning settings. Our model achieves state-of-the-art results while using significantly fewer resources, and we show that our approach is particularly effective when supervision data is extremely limited. Our code is available at https://anonymous.4open.science/r/verbalizer_benchmark-66E6.
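The sketch below illustrates the general idea of embedding-based verbalizer enrichment described in the abstract: each class label is expanded with its nearest neighbors in a static word-embedding space, and a document is classified by summing the masked-word probabilities of a prompted language model over the enriched label-word set. The function names (enrich_label, classify), the prompt template, and the model choices (GloVe vectors via gensim, bert-base-uncased) are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of verbalizer enrichment via embedding-space neighborhoods.
# Assumptions: GloVe vectors for the neighborhood search, BERT as the masked LM,
# and single-token label words; the paper's actual setup may differ.
import gensim.downloader as api
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

word_vectors = api.load("glove-wiki-gigaword-100")  # static embedding space
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def enrich_label(label, k=5):
    """Return the label word plus its k nearest neighbors in the embedding space."""
    neighbors = [w for w, _ in word_vectors.most_similar(label, topn=k)]
    return [label] + neighbors

def classify(text, labels, template="{text} This topic is about [MASK]."):
    """Score each class by summing masked-word probabilities over its enriched word set."""
    prompt = template.format(text=text).replace("[MASK]", tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        probs = mlm(**inputs).logits[0, mask_pos].softmax(-1)
    scores = {}
    for label in labels:
        ids = [tokenizer.convert_tokens_to_ids(w) for w in enrich_label(label)]
        ids = [i for i in ids if i != tokenizer.unk_token_id]  # drop out-of-vocab neighbors
        scores[label] = probs[ids].sum().item()
    return max(scores, key=scores.get)

print(classify("The team scored twice in the final minutes.", ["sports", "politics"]))
```

In this zero-shot reading, no gradient updates are needed; few-shot variants would additionally fine-tune the prompted model on the handful of labeled examples.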
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English, French