everyone
since 04 Oct 2024">EveryoneRevisionsBibTeXCC BY 4.0
The key issue of zero-shot image recognition (ZIR) is how to infer the relationship between visual space and semantic space from seen classes, and then effectively transfer the relationship to unseen classes. Recently, most methods have focused on how to use images and class semantic vectors or class names to learn the relationship between visual space and semantic space. The relationship established by these two methods is class-level and coarse-grained. The differences between images of the same class are ignored, which leads to insufficiently tight relationships and affects the accurate recognition of unseen classes.To tackle such problem, we propose Common Feature learning for Zero-shot Image Recognition (CF-ZIR) method to learn fine-grained visual semantic relationships at the image-level. Based on the inter class association information provided by class semantic vectors, guide the extraction of common visual features between classes to obtain image semantic vectors. Experiments on three widely used benchmark datasets show the effectiveness of the proposed approach.