Zero-Shot Image Classification Based on Deep Feature Extraction

Xuesong Wang, Chen Chen, Yuhu Cheng, Z. Jane Wang

Published: 2018, Last Modified: 05 Nov 2023. IEEE Trans. Cogn. Dev. Syst. 2018. Readers: Everyone

Abstract: Attribute-based zero-shot learning methods generally train attribute classifiers on low-level image features, so their classification accuracy depends heavily on which low-level features are chosen. Because deep networks can automatically extract features from original unlabeled images, and the extracted features can better represent the nature of those images, we propose a zero-shot image classification method based on deep feature extraction. In the image preprocessing step, image patch extraction and zero-phase component analysis (ZCA) whitening are performed to reduce the computational complexity and the correlations between pixels. Compressed feature representations of the unlabeled image patches are then learned through a stacked sparse autoencoder, which yields a feature mapping matrix. Further, we use the feature mapping matrix as a convolution kernel to convolve with image patches. Since the convolution operation produces a feature vector of very high dimensionality, the convolution features are pooled to reduce the number of network parameters and the spatial resolution of the network, which helps prevent over-fitting. Finally, the extracted image features are used to train the conventional indirect attribute prediction model, which predicts image attributes and classifies images under the zero-shot setting. Experimental results on the Shoes, Outdoor Scene Recognition, and a-Yahoo datasets show that, compared with several popular zero-shot learning methods, the proposed method yields more accurate attribute prediction and better zero-shot image classification.
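The preprocessing and pooling steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the random patch sampling, the regularizer `eps`, and the choice of non-overlapping mean pooling are all assumptions made for illustration, since the abstract does not specify them. The stacked sparse autoencoder and the indirect attribute prediction model are not shown.

```python
import numpy as np

def extract_patches(image, patch_size, num_patches, rng):
    """Sample random square patches from a 2-D grayscale image and flatten them.
    (Random sampling is an assumption; the abstract does not say how patches are chosen.)"""
    h, w = image.shape
    rows = rng.integers(0, h - patch_size + 1, size=num_patches)
    cols = rng.integers(0, w - patch_size + 1, size=num_patches)
    return np.stack([
        image[r:r + patch_size, c:c + patch_size].ravel()
        for r, c in zip(rows, cols)
    ])

def zca_whiten(patches, eps=1e-5):
    """ZCA-whiten the rows of `patches`.
    Returns the whitened data, the per-dimension mean, and the whitening matrix."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    cov = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)  # cov is symmetric
    # Rotate into the eigenbasis, rescale each direction, rotate back.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ W, mean, W

def mean_pool(feature_map, pool_size):
    """Non-overlapping mean pooling over a 2-D feature map.
    Rows and columns that do not fill a whole pooling window are cropped."""
    h, w = feature_map.shape
    h_out, w_out = h // pool_size, w // pool_size
    cropped = feature_map[:h_out * pool_size, :w_out * pool_size]
    return cropped.reshape(h_out, pool_size, w_out, pool_size).mean(axis=(1, 3))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))             # stand-in for a real grayscale image
    patches = extract_patches(image, 8, 500, rng)
    whitened, mean, W = zca_whiten(patches)
    print(patches.shape, whitened.shape, W.shape)
```

ZCA whitening rotates the data back into the original pixel basis after rescaling, so whitened patches still look like image patches, which is one reason it is a common choice ahead of convolutional feature learning.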
