[Re] Distilling Knowledge via Knowledge Review

Anonymous

05 Feb 2022 (modified: 05 May 2023) · ML Reproducibility Challenge 2021 Fall Blind Submission · Readers: Everyone
Keywords: knowledge distillation, review mechanism
Abstract:

Scope of Reproducibility: This effort aims to reproduce the results of the experiments and analyze the robustness of the review framework for knowledge distillation introduced by Chen et al. We consistently verify the reported improvement in test accuracy across student models and study the effectiveness of the novel modules introduced by the authors by conducting ablation studies and new experiments.

Methodology: We start the reproduction effort by using the code open-sourced by the authors, and reproduce Tables 1 and 2 from the original paper with it. As we proceed further, we refactor and re-implement the code for a specific architecture (ResNets), referring to the authors' code for specific implementation details (further discussed in Section 3.2). We implement the ablation studies mentioned in the original paper and design experiments to verify the claims made by the authors. We release our code as open source.

Results: We reproduce the results of the review mechanism on the CIFAR-100 dataset to within 0.8% of the reported values. The claim of achieving state-of-the-art (SOTA) performance on the image classification task is verified consistently across different student models. The ablation studies help us understand the significance of the novel modules proposed by the authors. The experiments conducted on the framework's components further strengthen the authors' claims and provide additional insight.

What was easy: The authors open-sourced the code for the paper. This made it easy to verify many of the results reported in the paper (specifically Tables 1 and 2 in the original paper). The framework of the review mechanism was well described mathematically in the paper, which made its implementation easier. The writing was simple and the diagrams were self-explanatory, which aided our conceptual understanding of the paper.

What was difficult: While the framework of the review mechanism was well described, further specifications of the architectural components could have been provided: ABF, attention based fusion (residual output and ABF output, as mentioned in Section 4.2.3), and HCL, hierarchical context loss (number, sizes, and weights of levels, as mentioned in Section 4.2.2). These details would have made it easier to translate the architecture into code. The most challenging part was the limited compute and time available to run our experiments: each run took around 4-5 hours, making it difficult to report results averaged over multiple runs.

Communication with original authors: During the course of this study, we tried to contact the original authors more than once via e-mail. Unfortunately, we did not receive any response.
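Since the HCL configuration was the part we found hardest to pin down from the paper, the following is a minimal sketch of how such a hierarchical context loss can be written in PyTorch. The function name `hcl_loss`, the pyramid levels (4, 2, 1), and the halving level weights are our assumptions for illustration, not a confirmed specification of the authors' implementation.

```python
import torch.nn.functional as F

def hcl_loss(student_feats, teacher_feats, levels=(4, 2, 1)):
    """Hierarchical context loss between fused student features and teacher features.

    For each feature pair, the full-resolution MSE is combined with MSE over
    adaptively average-pooled maps at the given spatial levels, each successive
    level weighted half as much as the previous one. The pyramid levels and the
    halving weights are assumptions made for this sketch.
    """
    total = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        _, _, h, _ = fs.shape
        loss = F.mse_loss(fs, ft)           # full-resolution term, weight 1
        weight, weight_sum = 1.0, 1.0
        for level in levels:
            if level >= h:                  # skip levels no smaller than the map itself
                continue
            weight /= 2.0
            loss = loss + weight * F.mse_loss(
                F.adaptive_avg_pool2d(fs, (level, level)),
                F.adaptive_avg_pool2d(ft, (level, level)),
            )
            weight_sum += weight
        total = total + loss / weight_sum   # normalise so each stage contributes comparably
    return total
```

Here `student_feats` and `teacher_feats` would be lists of feature maps of matching shapes (the student maps taken after ABF fusion), and the returned scalar would be added to the usual cross-entropy training loss with a distillation weight.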
Paper URL: https://openaccess.thecvf.com/content/CVPR2021/papers/Chen_Distilling_Knowledge_via_Knowledge_Review_CVPR_2021_paper.pdf
Paper Venue: CVPR 2021
Supplementary Material: zip