Abstract: Few-shot learning has attracted considerable interest and achieved remarkable performance in limited-data scenarios. However, existing few-shot text classification methods typically target only a small number of classes, usually 5 to 10, which falls short for the many real-world tasks that require few-shot classification over many classes. Few-shot text classification with many classes has rarely been studied, and it is a challenging problem: distinguishing among many classes is considerably harder than distinguishing among only a few. To address this issue, we propose MICD (More Intra-Class Diversity in few-shot text classification with many classes), a new few-shot text classification model for many classes. Our model comprises two crucial components: Intra-Class Diversity Contrastive Learning (ICDCL) and Intra-Class Augmentation (ICA). ICDCL trains an encoder to enhance feature discriminability by maintaining both intra-class diversity and inter-class specificity, effectively improving generalization performance even when data is limited. ICA addresses data scarcity by selecting diverse support samples and applying intra-class mix-up, enabling robust generalization to out-of-distribution data, an essential consideration in many-class few-shot learning scenarios. Experimental results on four real-world datasets show that MICD significantly outperforms state-of-the-art approaches.
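To make the ICA component concrete, below is a minimal sketch of intra-class mix-up, assuming the standard mixup formulation (interpolating two samples of the same class with a Beta-distributed coefficient) applied in embedding space; the function name, embedding dimension, and alpha value are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of intra-class mix-up (assumed formulation, not the
# authors' exact procedure): interpolate two support embeddings drawn
# from the SAME class, so the class label is preserved.
import numpy as np

def intra_class_mixup(embeddings, alpha=0.4, rng=None):
    """Return one synthetic embedding built from two same-class samples.

    embeddings: (n_shots, dim) array of support embeddings for one class.
    alpha:      Beta(alpha, alpha) parameter controlling the mix strength.
    """
    rng = rng or np.random.default_rng()
    i, j = rng.choice(len(embeddings), size=2, replace=False)
    lam = rng.beta(alpha, alpha)  # mixing coefficient in [0, 1]
    return lam * embeddings[i] + (1.0 - lam) * embeddings[j]

# Usage: augment a 5-shot support set of 128-d embeddings for one class.
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 128))            # hypothetical support set
augmented = intra_class_mixup(support, rng=rng)
```

Because both parents share a class, the synthetic point stays on that class's manifold while adding intra-class diversity, which is the property the abstract credits for out-of-distribution robustness.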