Multiscale Adaptive Prototype Transformer Network for Few-Shot Strip Steel Surface Defect Segmentation
Abstract: Few-shot semantic segmentation (FSS) of strip steel surface defects (S3D) remains challenging, primarily because of the wide diversity of scenarios and the inherent complexity of the defects, and existing semantic segmentation methods have not effectively addressed these issues. In this article, we propose a few-shot segmentation method, the multiscale adaptive prototype transformer network (MAPTNet), which integrates multiscale feature aggregation and aims to enhance the adaptability of defect detection across diverse and complex defect scenarios. Specifically, we introduce an adaptive prototype transformer (APT) module to locate defects more accurately and generate more refined feature representations. This module explores the relationships between the support and query sets and transforms them into optimal prototypes. Additionally, MAPTNet uses hierarchical feature fusion to capture more detailed and contextually relevant S3D information, improving the discriminability of defects at multiple feature scales. To further strengthen learning across different feature layers, MAPTNet incorporates a deep dual supervision mechanism that facilitates effective optimization at both intermediate and final stages. Extensive experiments on the FSSD-12, Surface Defects-4i, and ESDIs-SOD datasets demonstrate that our network achieves state-of-the-art performance, significantly outperforming existing approaches. The code is available at https://github.com/hhhjjc/MAPTNet.
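For readers unfamiliar with prototype-based few-shot segmentation, the general idea behind turning support and query relationships into prototypes can be illustrated with a much simpler baseline. The sketch below is not the APT module or any part of MAPTNet. It assumes the common masked-average-pooling scheme, in which a class prototype is the mean of the support features under the mask and each query pixel is labeled by cosine similarity to the foreground and background prototypes. All function names and the toy two-dimensional features are hypothetical.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors of pixels whose mask value is 1.

    features: list of per-pixel feature vectors (lists of floats).
    mask:     list of 0/1 labels, one per pixel.
    """
    dim = len(features[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(features, mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment_query(query_features, fg_proto, bg_proto):
    """Label a query pixel 1 (defect) if it is closer to the foreground prototype."""
    return [
        1 if cosine(q, fg_proto) > cosine(q, bg_proto) else 0
        for q in query_features
    ]


# Toy support image: four pixels with 2-D features (assumed values).
support_features = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
support_mask = [1, 1, 0, 0]  # first two pixels are defect

fg_proto = masked_average_prototype(support_features, support_mask)
bg_proto = masked_average_prototype(
    support_features, [1 - m for m in support_mask]
)

# Toy query image: three pixels to be labeled.
query_features = [[0.8, 0.2], [0.2, 0.9], [1.0, 0.0]]
prediction = segment_query(query_features, fg_proto, bg_proto)
print(prediction)
```

In a real network the per-pixel features would come from a learned backbone at several scales, and a method such as the one described in the abstract would replace the fixed averaging step with learned prototype refinement.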
External IDs: doi:10.1109/tim.2025.3550227