A multi-task mean teacher with two stage decoder for semi-supervised crack detection

Published: 01 Jan 2024, Last Modified: 16 Jun 2025. Venue: Multim. Tools Appl. 2024. License: CC BY-SA 4.0
Abstract: Crack detection is a simple but highly practical computer vision task. Existing crack detection methods supervise only the cracks themselves, and only on limited annotated data, which restricts their detection effectiveness. This paper instead learns cracks and crack-related information simultaneously, and uses unlabeled data to train a multi-task mean teacher. Specifically, considering the characteristics of cracks, we define the multi-task setting from three perspectives: the crack region, edge classification for both crack and noise, and the crack count. All tasks are explicitly supervised. We then build a two-stage decoder on top of a powerful backbone that serves as the encoder. The first-stage decoder consists of a short connection and a first-stage prediction head; the head enhances the representation of the crack region in multiple ways and provides earlier and stronger feedback for network optimization. The second-stage decoder is mainly composed of a unified cross interaction module, which facilitates interaction between the crack and edge categories. Finally, we use this encoder and decoder in both the student and teacher networks. On multiple crack benchmark datasets, our method outperforms other state-of-the-art (SOTA) methods on all metrics. For example, it achieves an AIU of 0.2243 on the GAPS384 dataset and 0.2862 on the CFD dataset. Furthermore, extensive ablation experiments confirm the rationality and effectiveness of our multi-task and decoder design.
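
The abstract does not spell out how the student and teacher networks interact. As a rough illustration only, the sketch below shows the generic mean teacher recipe, in which the teacher's weights track an exponential moving average (EMA) of the student's weights and a consistency term compares their predictions on unlabeled data. The function names `ema_update` and `consistency_loss`, the `decay` value, and the choice of a mean squared error are assumptions made for this example and are not taken from the paper.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` are dicts mapping a (hypothetical) parameter
    name to a float; `decay` is an assumed smoothing factor.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def consistency_loss(student_pred, teacher_pred):
    """Mean squared difference between two equally long prediction lists.

    Mean squared error is one common choice for the consistency term;
    the paper may use a different one.
    """
    squared = [(s - t) ** 2 for s, t in zip(student_pred, teacher_pred)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Toy weights: the teacher moves a small step toward the student.
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
    print(teacher)
    # Toy per-pixel crack probabilities on one unlabeled sample.
    print(consistency_loss([1.0, 0.0], [0.0, 0.0]))
```

In this generic setup only the student is updated by gradient descent, on the supervised losses plus the consistency term, while the teacher is refreshed with `ema_update` after each step.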