Keywords: Hypergraph, Deep Metric Learning, Semantic Relations
TL;DR: Reproducibility report for paper "Hypergraph-Induced Semantic Tuplet Loss for Deep Metric Learning"
Abstract: Reproducibility Summary

Scope of Reproducibility — Our work aims to reproduce the critical findings of the paper "Hypergraph-Induced Semantic Tuplet (HIST) Loss for Deep Metric Learning" and to investigate the effectiveness and robustness of the HIST loss with respect to the following five claims: (i) the HIST loss performs consistently regardless of batch size; (ii) the HIST loss performs consistently regardless of the number of hypergraph neural network (HGNN) layers L; (iii) a positive value of the scaling factor α of the semantic tuplets yields reliable performance when modeling the semantic relations of samples; (iv) a large temperature parameter τ is effective, and for τ > 16 the HIST loss is insensitive to this scaling parameter; (v) the HIST loss achieves state-of-the-art (SOTA) performance under the standard evaluation settings.

Methodology — To verify these claims, we partially reimplement and extend the experiments of the original paper based on the existing repositories, and evaluate performance on the CARS196, CUB-200-2011, and Stanford Online Products (SOP) datasets under the same standard settings as the original paper. Our study consists of three parts: (a) reproducing the performance of the HIST loss under the standard evaluation settings and hyperparameters proposed in the original paper; (b) searching for the best performance of the HIST loss via Bayesian optimization and examining the results on the above datasets; (c) investigating the impact and robustness of the three key modules (HGNN, prototypical distributions, and semantic tuplets) under distinct parameter settings. All experiments were performed on 2 NVIDIA V100 GPUs and took approximately 1,108 GPU hours.

Results — Overall, this study reveals that our reproduced and improved results are strongly consistent with three (iii, iv, and v) of the five primary claims of the original paper.
However, the other two claims (i and ii) cannot be fully supported by our reproduced results. Using the hyperparameters and configurations given in the original paper, we obtained performance comparable to the reported results on the CARS196 dataset; however, we observed large deviations on the CUB-200-2011 and SOP datasets, where R@1 dropped by 1.5% and 1%, respectively, with ResNet50 as the backbone. We therefore used Bayesian optimization for hyperparameter search and surpassed the results reported in the original paper on CARS196 and CUB-200-2011 with ResNet50, improving R@1 by 0.7% and 0.4%, respectively. Moreover, we narrowed the performance gap on the SOP dataset from -1% to -0.6% relative to the results reported in the original paper with ResNet50 as the backbone.

What was easy — The original paper (OP) explains the proposed methods in depth, e.g. semantic tuplets, the HIST loss, and learnable prototypical distributions. Together with the well-structured and well-written presentation, this allowed us to grasp the paper's primary concepts clearly. Thanks to these factors and the existing codebase, our implementation and extension of the experiments were highly efficient.

What was difficult — The first challenge is that the original authors (OA) did not clearly describe the joint contributions of various modules, such as the hidden sizes of the HGNN and the embedding sizes of the backbone. We therefore extended the codebase and retrained the model to verify the impact of these two factors. In addition, the performance of the HIST loss could not be reproduced on the CUB-200-2011 and SOP datasets with the hyperparameters and experimental setup reported by the OA. To address this, an additional hyperparameter search had to be performed, which is time-consuming. Finally, the OA did not provide details for the multi-layered HGNN.
Communication with original authors — We attempted to contact the OA for more details on the hyperparameter settings for each dataset, especially for the CUB-200-2011 and SOP datasets, and to ask about design decisions not addressed in the original paper, such as the implementation of the multi-layered HGNN and the choice of distance metrics. Unfortunately, we did not receive a response from the OA before completing this report.
Paper Url: https://openaccess.thecvf.com/content/CVPR2022/papers/Lim_Hypergraph-Induced_Semantic_Tuplet_Loss_for_Deep_Metric_Learning_CVPR_2022_paper.pdf
Paper Venue: CVPR 2022
Supplementary Material: zip
Confirmation: The report follows the ReScience LaTeX style guide as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration). The report contains the Reproducibility Summary on the first page.
Journal: ReScience Volume 9 Issue 2 Article 8