Abstract: This paper studies generalized zero-shot learning, in which a model is trained on image-label pairs from a set of seen classes and tested on classifying new images from both seen and unseen classes. We propose a novel model that provides a unified framework for three different approaches: visual → semantic mapping, semantic → visual mapping, and deep metric learning. Specifically, the proposed model consists of a feature generator that synthesizes varied visual features given class embeddings as input, a regressor that maps each visual feature back to its corresponding class embedding, and a discriminator that learns to evaluate the closeness of an image feature and a class embedding. All three components are trained under a combination of cyclic consistency loss and dual adversarial loss. Experimental results show that our model not only preserves higher accuracy in classifying images from seen classes, but also outperforms existing state-of-the-art models in classifying images from unseen classes.
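To make the three components concrete, the sketch below gives a minimal PyTorch-style rendering of a feature generator, a regressor, and a discriminator, together with a cycle-consistency term in which a generated feature regressed back should recover its class embedding. All dimensions, layer choices, the noise input, and the L1 form of the cycle loss are illustrative assumptions; the abstract does not specify the actual architectures or loss weights.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions chosen for illustration only.
VIS_DIM, SEM_DIM, NOISE_DIM, HID = 2048, 312, 128, 1024

class Generator(nn.Module):
    """Maps a class embedding (plus noise) to a synthetic visual feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SEM_DIM + NOISE_DIM, HID), nn.LeakyReLU(0.2),
            nn.Linear(HID, VIS_DIM), nn.ReLU())

    def forward(self, sem, z):
        return self.net(torch.cat([sem, z], dim=1))

class Regressor(nn.Module):
    """Maps a visual feature back to its class embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VIS_DIM, HID), nn.LeakyReLU(0.2),
            nn.Linear(HID, SEM_DIM))

    def forward(self, vis):
        return self.net(vis)

class Discriminator(nn.Module):
    """Scores how well a visual feature matches a class embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(VIS_DIM + SEM_DIM, HID), nn.LeakyReLU(0.2),
            nn.Linear(HID, 1))

    def forward(self, vis, sem):
        return self.net(torch.cat([vis, sem], dim=1))

def cycle_loss(G, R, sem, z):
    """Cycle consistency: regress a generated feature back to its embedding."""
    fake_vis = G(sem, z)
    return nn.functional.l1_loss(R(fake_vis), sem)
```

In this reading, the discriminator would supply the adversarial signal on (feature, embedding) pairs while the cycle term ties the generator and regressor together; how the two adversarial directions are weighted is left to the full paper.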