Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection - A Reproducibility Study

Bradley Ezard

Published: 11 Apr 2022, Last Modified: 05 May 2023, RC2021, Readers: Everyone

Keywords: attention, weakly-supervised, object detection, computer vision, vision, distillation, self-learning

Abstract:

# Reproducibility Summary

## Scope of Reproducibility

We perform extensive ablation studies reproducing the results of the paper "Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection." In this paper, the authors propose a regularisation method that aggregates attention maps to improve the accuracy of weakly supervised object detectors. They aggregate attention maps from different layers of the network and across different views of the same image, and use these maps to encourage features with better object coverage. The paper claims that this allows an object detector to be trained in a weakly supervised manner and to achieve state-of-the-art results.

## Methodology

Using the official code released by the authors, we re-ran the ablation experiments on the Pascal VOC 2007 dataset using our own GPU compute resources. We made changes to the released code to further investigate the effects of other model components, extending the ablation studies. We also performed an in-depth analysis of the code itself to assess its quality with respect to ease of modification and reuse.

## Results

We were able to meet or exceed the published results. However, further ablation studies showed that we could achieve the same results without the proposed attention regularisation. Our results therefore do not agree with the original paper's claim that the novel regularisation provides an improved object detector. We performed further ablation studies to identify the source of the improvement, attributing it to the ContextLocNet-style head, Inverted Attention module, regression branch, and stronger data augmentation.

## What was easy

The fundamental concept is simple and intuitive. The figures in the paper make it even clearer and provide strong motivation for why the approach might be useful. Some parts of the codebase are reused from other public weakly-supervised object detection codebases, so anyone familiar with those works will have an easier time following this one.

## What was difficult

Working with the code was significantly difficult due to poor coding practices, poorly documented requirements, and errors in the code. Experimentation was limited by the slow training process, which is due both to the computational cost of this approach and to some inefficient implementation choices.

## Communication with original authors

We had no direct communication with the original authors. We browsed public communications via GitHub issues posted to the public code repository.
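
The aggregation idea summarised under Scope of Reproducibility (combining attention maps from different views of one image and pulling each map towards the combined one) can be illustrated with a toy sketch. This is not the authors' implementation: the element-wise maximum used for aggregation, the mean squared error used as the penalty, and all names (`hflip`, `aggregate`, `distillation_loss`, the toy maps) are assumptions chosen for illustration, and plain Python lists stand in for feature tensors.

```python
from typing import List

Map = List[List[float]]  # a 2-D attention map with values in [0, 1]


def hflip(m: Map) -> Map:
    """Horizontally flip a map, e.g. to undo a mirrored view."""
    return [row[::-1] for row in m]


def aggregate(maps: List[Map]) -> Map:
    """Element-wise maximum over maps of identical shape (assumed rule)."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[max(m[r][c] for m in maps) for c in range(cols)]
            for r in range(rows)]


def distillation_loss(maps: List[Map], target: Map) -> float:
    """Mean squared difference between every map and the aggregated target."""
    total, count = 0.0, 0
    for m in maps:
        for row_m, row_t in zip(m, target):
            for a, t in zip(row_m, row_t):
                total += (a - t) ** 2
                count += 1
    return total / count


# Hypothetical attention maps for the original image and its mirrored view.
att_original = [[0.9, 0.1], [0.2, 0.0]]
att_flipped = [[0.8, 0.0], [0.0, 0.1]]  # computed on the mirrored image

# Align the mirrored view back to the original frame before aggregating.
aligned = [att_original, hflip(att_flipped)]
target = aggregate(aligned)
loss = distillation_loss(aligned, target)
```

In this toy case each view attends to a different part of the image, so the aggregated map covers more of it than either view alone, and the loss penalises each view for the regions it missed. In a real training loop the aggregated target would normally be treated as a constant so that gradients flow only through the individual maps; the same pattern would apply to maps taken from different layers once they are resized to a common shape.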

Paper Url: https://openreview.net/forum?id=Fc9wMprCcF0

Supplementary Material: zip
