Manual Verbalizer Enrichment for Few-Shot Text Classification

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: With the emergence and continuous development of pre-trained language models, prompt-based training has become a widely adopted paradigm that substantially improves how these models are exploited for many NLP tasks. Prompting also performs strongly compared to traditional fine-tuning when adapted to zero-shot or few-shot scenarios, where the amount of annotated data is limited. In this framework, verbalizers play an important role in mapping the masked-word distributions produced by language models to output predictions. In this work, we propose MaVEN, a new approach to verbalizer construction that enriches class labels using neighborhood relations in the word embedding space. In addition, we develop a benchmarking procedure to evaluate typical verbalizer baselines for document classification in few-shot learning settings. Our model achieves state-of-the-art results while using significantly fewer resources, and we show that our approach is particularly effective when supervision data is extremely limited. Our code is available at https://anonymous.4open.science/r/verbalizer_benchmark-66E6.
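The sketch below illustrates the general idea of embedding-based verbalizer enrichment described in the abstract: each class label is expanded with its nearest neighbors in a static word-embedding space, and a document is classified by summing the masked-word probabilities of a prompted language model over the enriched label-word set. The function names (enrich_label, classify), the prompt template, and the model choices (GloVe vectors via gensim, bert-base-uncased) are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of verbalizer enrichment via embedding-space neighborhoods.
# Assumptions: GloVe vectors for the neighborhood search, BERT as the masked LM,
# and single-token label words; the paper's actual setup may differ.
import gensim.downloader as api
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

word_vectors = api.load("glove-wiki-gigaword-100")  # static embedding space
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def enrich_label(label, k=5):
    """Return the label word plus its k nearest neighbors in the embedding space."""
    neighbors = [w for w, _ in word_vectors.most_similar(label, topn=k)]
    return [label] + neighbors

def classify(text, labels, template="{text} This topic is about [MASK]."):
    """Score each class by summing masked-word probabilities over its enriched word set."""
    prompt = template.format(text=text).replace("[MASK]", tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        probs = mlm(**inputs).logits[0, mask_pos].softmax(-1)
    scores = {}
    for label in labels:
        ids = [tokenizer.convert_tokens_to_ids(w) for w in enrich_label(label)]
        ids = [i for i in ids if i != tokenizer.unk_token_id]  # drop out-of-vocab neighbors
        scores[label] = probs[ids].sum().item()
    return max(scores, key=scores.get)

print(classify("The team scored twice in the final minutes.", ["sports", "politics"]))
```

In this zero-shot reading, no gradient updates are needed; few-shot variants would additionally fine-tune the prompted model on the handful of labeled examples.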
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English, French