[Re] Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification

Anonymous

05 Feb 2022 (modified: 05 May 2023) · ML Reproducibility Challenge 2021 Fall Blind Submission
Keywords: Zero-shot Classification, GANs, VAE, Reconstruction
Abstract: In this study, we show our results and experience during replicating the paper titled "Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification". The paper proposes incorporating a feedback loop for generating refined features during both the training and feature synthesis stages which leads to reduced ambiguity among classes. We have updated the model for the recent PyTorch version. We were able to reproduce both the quantitative and qualitative results, as reported in the paper which includes inductive, finetuning and reconstruction of the original images from synthesized features. The authors have open-sourced their code for inductive setting. We have implemented the codes for the finetuning setting and reconstruction of the images. Scope of Reproducibility TF-VAEGAN proposes to enforce a semantic embedding decoder (SED) at training, feature synthesis and classification stages of (generalized) zero-shot learning. They introduce a feedback loop, from SED for iteratively refining the synthesized features during both the training and feature synthesis stages. The synthesized features, along with their corresponding latent embeddings from the SED are then transformed into the discriminative features and utilized during the classification stage to reduce ambiguities among the categories. Methodology As the TF-VAEGAN method was available in PyTorch 0.3.1, we ported the entire pipeline to PyTorch 1.6.0 along with implementing the finetuning and reconstruction codes from scratch. Our implementation is based on the original code and on the discussions with the authors. Our implementation can be found at https://anonymous.4open.science/r/ZSL_Generative-D397/. Total training times for each method ranged from 2-8 hours on Caltech-UCSD-Birds (CUB), Oxford Flowers (FLO), SUN Attribute (SUN), and Animals with Attributes2 (AWA2) on a single NVIDIA Tesla V100 GPU. Further details are presented in Table 3 of the report. Results We were able to reproduce the results quantitatively on all the four datasets as reported in the original paper as well as reconstruct the original images from the generated features. What was easy The authors' code was well written and documented, and we were able to reproduce the preliminary results using the documentation provided with the code. The authors were also extremely responsive and helpful via email. What was difficult The feature reconstruction codes from previous baselines are not available in PyTorch. Therefore, we had to implement it in PyTorch along with a hyperparameter search to get the images. We also performed a hyperparameter search for getting the finetuning results. Communication with original authors We reached out to the authors a few times via email to ask for clarifications and additional implementation details.
Paper Url: https://openreview.net/forum?id=tEgBJaAgs0Uh