Visual semantic alignment network based on pre-trained ViT for few-shot image classification

Published: 2024, Last Modified: 08 Apr 2025, Venue: APSIPA 2024, License: CC BY-SA 4.0
Abstract: Images often contain a wealth of information, but not all of it is relevant to classification. Content that is unrelated to the target class can act as interference and degrade the model's classification ability. To address this problem, we propose a class-aware feature alignment adaptive module (CAFASA). The key idea of CAFASA is to first combine the patch embeddings with a class-aware embedding and then add a feature alignment module on top of this combination to improve classification performance. The method significantly outperforms existing state-of-the-art methods on CIFAR-FS, miniImageNet, tieredImageNet, and FC-100.
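The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of weighting patch embeddings by their agreement with a class-aware embedding. It is not the CAFASA module itself. The function names (`align_features`, `classify`), the cosine-similarity weighting, the `temperature` parameter, the additive fusion, and the nearest-prototype classifier are all assumptions introduced here for illustration.

```python
import math


def normalize(v, eps=1e-8):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v)) + eps
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def align_features(patch_emb, class_emb, temperature=0.1):
    """Hypothetical alignment step (an assumption, not the paper's design).

    patch_emb: list of patch vectors, e.g. ViT patch tokens.
    class_emb: one class-aware vector, e.g. a ViT class token.
    Returns a fused unit-norm feature and the per-patch weights.
    """
    cls_unit = normalize(class_emb)
    # Cosine similarity between every patch and the class-aware embedding.
    sims = [dot(normalize(p), cls_unit) for p in patch_emb]
    # Numerically stable softmax: class-irrelevant patches get small weights.
    peak = max(sims)
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted pooling of the patches, then additive fusion with the class embedding.
    dim = len(class_emb)
    pooled = [sum(w * p[d] for w, p in zip(weights, patch_emb)) for d in range(dim)]
    fused = [pooled[d] + class_emb[d] for d in range(dim)]
    return normalize(fused), weights


def classify(query_feat, prototypes):
    """Return the index of the prototype with the highest dot-product score."""
    scores = [dot(query_feat, proto) for proto in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)
```

In a few-shot setting, one plausible use of such a sketch would be to average the fused features of each class's support images into a prototype and pass a query image's fused feature to `classify`. Whether the paper follows this protocol is not stated in the abstract.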