Fully Cascade Consistency Learning for One-Stage Object Detection

Hao Wang, Tong Jia, Bowen Ma, Qilong Wang, Wangmeng Zuo

Published: 01 Jan 2023, Last Modified: 31 Oct 2023, IEEE Trans. Circuits Syst. Video Technol. 2023, Readers: Everyone

Abstract: Object detection is usually solved by deploying a single prediction head, comprising classification and localization branches, to obtain the final results. Recent works use several prediction heads in a cascade learning manner to improve detection performance. Despite achieving promising performance, existing cascade learning methods still suffer from two inconsistency issues. First, most of them refine the bounding boxes in different prediction heads based only on localization accuracy (i.e., IoU), while ignoring the inconsistency between classification confidence and localization accuracy. Moreover, simply increasing the IoU threshold empirically to select positive samples makes this inconsistency even worse. Second, little attention has been paid to the feature inconsistency between the detection-specific features from different prediction heads and the detection-generalized ones from the backbone model. The features extracted by the backbone model contain a general representation of the whole image, whereas prediction heads need to be carefully designed to provide more discriminative, task-specific representations for the two sub-tasks of classification and regression. The different contextual representations in the output features of these two parts lead to feature inconsistency between the backbone model and the prediction heads in a cascade learning architecture. To solve these two inconsistency issues, this paper proposes a novel cascade consistency learning method for one-stage detectors. Specifically, a feature adaptation module is first developed to calibrate features from the different prediction heads and the backbone model, addressing the feature inconsistency. Then, we design an automatic positive sample threshold selection strategy to further resolve the inconsistency between the classification and localization predictions.
Moreover, the quality of bounding boxes in the cascade is evaluated by taking both classification confidence and localization accuracy into consideration. Experiments on MS COCO show that our proposed cascade consistency learning method (dubbed $\text{C}^{2}\text{L}$) achieves clear improvements over its counterparts when built on several different one-stage detectors, while performing favorably against state-of-the-art methods.
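
The abstract does not give the formulas behind the box-quality measure or the automatic threshold, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a geometric blend of classification confidence and IoU as the joint quality score, and a mean-plus-standard-deviation statistic (in the spirit of adaptive sample selection schemes) as the automatic positive threshold. The names `joint_quality`, `auto_threshold`, `select_positives`, and the parameter `alpha` are hypothetical.

```python
from statistics import mean, pstdev


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def joint_quality(cls_conf, loc_iou, alpha=0.5):
    """Assumed quality score: geometric blend of confidence and IoU.

    alpha weights localization; a box must do well on BOTH terms to score high.
    """
    return (cls_conf ** (1.0 - alpha)) * (loc_iou ** alpha)


def auto_threshold(qualities):
    """Assumed data-driven threshold: mean + population std of the scores."""
    return mean(qualities) + pstdev(qualities)


def select_positives(pred_boxes, cls_confs, gt_box, alpha=0.5):
    """Return indices of predictions whose joint quality reaches the threshold."""
    qualities = [
        joint_quality(conf, iou(box, gt_box), alpha)
        for box, conf in zip(pred_boxes, cls_confs)
    ]
    thr = auto_threshold(qualities)
    positives = [i for i, q in enumerate(qualities) if q >= thr]
    return positives, thr


# Usage: one ground-truth box and three candidate predictions.
gt = (0.0, 0.0, 10.0, 10.0)
boxes = [(0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 15.0, 15.0), (20.0, 20.0, 30.0, 30.0)]
confs = [0.9, 0.8, 0.9]
positives, thr = select_positives(boxes, confs, gt)
```

In this sketch the third candidate has high confidence but zero overlap, so its joint quality is zero and it cannot be selected as a positive, which is the kind of classification/localization mismatch the abstract describes. A fixed, hand-raised IoU threshold would ignore the confidence term entirely.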
