LGGFormer: A dual-branch local-guided global self-attention network for surface defect segmentation

Published: 01 Jan 2025, Last Modified: 28 Jul 2025, Adv. Eng. Informatics 2025, CC BY-SA 4.0
Abstract: In industrial manufacturing, efficient and accurate surface defect detection is paramount. CNN-based defect segmentation networks have recently achieved significant success, but they are limited in capturing global contextual information. Transformer models excel at global modeling, yet they often pay insufficient attention to local details. To combine the advantages of CNNs and Transformers, this paper proposes a dual-branch local-guided global self-attention network (LGGFormer) for surface defect segmentation. Considering the distinct characteristics and computational differences of CNNs and Transformers, we propose Local-Guided Global Self-Attention (LGGSA) to extract both global and local information. LGGSA first computes localized attention through a sliding window to capture rich contextual details. These local features are then aggregated for global attention computation, enabling the model to focus on the areas that local information marks as important. To address tiny defects and low contrast with the background, we add supervision to the CNN branch, forcing it to learn detailed boundary information. In addition, to take full advantage of the different modeling capabilities of CNNs and Transformers, we design the Cross-Branch Feature Interaction Module (CBFI), which achieves deep interaction between the features of the two branches through correlation-weighted integration to optimize feature extraction and representation. Finally, the edge-guided decoder (EGD) uses the boundary information extracted by the CNN branch to guide feature fusion and compensate for the loss of detail. Experimental results on three public defect datasets demonstrate that our method exhibits promising performance.
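The local-then-global attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' LGGSA implementation: it assumes a 1-D token sequence, identity query/key/value projections, mean pooling as the aggregation step, and hypothetical function names and parameters (`local_attention`, `pool_windows`, `radius`, `pool_size`) chosen only for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def local_attention(tokens, radius=1):
    # Assumption: each token attends only to neighbours within `radius`
    # positions, which stands in for the sliding-window step.
    out = []
    for i, tok in enumerate(tokens):
        window = tokens[max(0, i - radius): i + radius + 1]
        out.append(attend(tok, window, window))
    return out

def pool_windows(features, size=2):
    # Assumption: local features are aggregated by mean pooling over
    # non-overlapping windows; the paper may aggregate differently.
    pooled = []
    for start in range(0, len(features), size):
        chunk = features[start:start + size]
        dim = len(chunk[0])
        pooled.append([sum(v[j] for v in chunk) / len(chunk) for j in range(dim)])
    return pooled

def local_guided_global_attention(tokens, radius=1, pool_size=2):
    # Local pass first, then every token attends globally to the
    # pooled local summaries.
    local = local_attention(tokens, radius)
    summaries = pool_windows(local, pool_size)
    return [attend(tok, summaries, summaries) for tok in local]

# Toy input: four 2-D tokens forming two visually distinct groups.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
output = local_guided_global_attention(tokens)
```

In this sketch the global step sees only the pooled summaries, so its cost grows with the number of windows rather than the number of tokens; whether LGGSA makes the same trade-off is not stated in the abstract.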