Retrieval-Augmented Few-shot Text Classification

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Submission Track 2: Information Retrieval and Text Mining
Keywords: Retrieval-augmented methods, Few-shot text classification
TL;DR: We propose two new training objectives to adapt retrieval-augmented methods to few-shot scenarios.
Abstract: Retrieval-augmented methods succeed in the standard scenario where the retrieval space is sufficient, but this paper shows that putting them into practice is non-trivial in the few-shot scenario, where the retrieval space is limited. First, it is impossible to retrieve semantically similar examples with an off-the-shelf metric, so it is crucial to learn a task-specific retrieval metric. Second, our preliminary experiments demonstrate that it is difficult to optimize a plausible metric by minimizing the standard cross-entropy loss. In-depth analyses show quantitatively that minimizing the cross-entropy loss suffers from weak supervision signals and severe gradient vanishing during optimization. To address these issues, we introduce two novel training objectives, EM-L and R-L, which provide more task-specific guidance to the retrieval metric through the EM algorithm and a ranking-based loss, respectively. Extensive experiments on 10 datasets demonstrate the superiority of the proposed retrieval-augmented methods.
Submission Number: 3059