DBMF-Net: A Dual-Branch Multimodal Fusion Network for Multi-label Sewer Defect Classification

Published: 01 Jan 2024, Last Modified: 20 Feb 2025, Venue: PRCV (9) 2024, License: CC BY-SA 4.0
Abstract: Sewer pipes are a key component of urban infrastructure, and their maintenance and inspection are crucial to the normal operation of cities. However, existing methods for multi-label sewer pipe defect classification struggle to exploit multi-scale and contextual information within images and often overlook the intrinsic associations between labels. To address these challenges, we propose a novel Dual-Branch Multimodal Fusion Network (DBMF-Net). In DBMF-Net, feature maps extracted by the backbone are forwarded to two parallel streams: the Grouped Multi-scale Attention Residual Block (GMARB) branch and the Correlative Information Embedding (CIE) branch. The GMARB branch captures multi-scale and contextual information within images, thereby enhancing the features extracted by the backbone network. In parallel, the CIE branch fuses visual features with the semantic information of labels and enhances feature representations by mining and incorporating the correlations between defect labels. The enhanced features from both branches are fused and used for the final predictions. Comprehensive experiments on a mainstream benchmark dataset and on our privately established dataset demonstrate the effectiveness and superiority of the proposed architecture.
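The dual-branch data flow described in the abstract can be illustrated with a toy, dependency-free sketch. It is not the authors' implementation: the moving-average smoothing stands in for GMARB, the correlation-weighted label embeddings stand in for CIE, and additive logit fusion is an assumed fusion rule. All function names, window sizes, and parameters below are hypothetical.

```python
import math


def smooth(feat, window):
    """Centered moving average over a 1-D feature vector (clipped at borders)."""
    half = window // 2
    out = []
    for i in range(len(feat)):
        lo, hi = max(0, i - half), min(len(feat), i + half + 1)
        out.append(sum(feat[lo:hi]) / (hi - lo))
    return out


def multi_scale_branch(feat, windows=(1, 3, 5)):
    """Assumed stand-in for the GMARB branch: average several smoothing
    scales, then add the result back to the input as a residual."""
    views = [smooth(feat, w) for w in windows]
    mixed = [sum(col) / len(views) for col in zip(*views)]
    return [f + m for f, m in zip(feat, mixed)]


def label_branch(feat, label_emb, corr):
    """Assumed stand-in for the CIE branch: mix label embeddings through a
    row-normalized label correlation matrix, then score them against the
    visual feature with a dot product (one logit per label)."""
    num_labels, dim = len(label_emb), len(feat)
    mixed = [
        [sum(corr[i][j] * label_emb[j][d] for j in range(num_labels))
         for d in range(dim)]
        for i in range(num_labels)
    ]
    return [sum(e * f for e, f in zip(row, feat)) for row in mixed]


def predict(feat, classifier, label_emb, corr):
    """Fuse both branches by adding their logits (an assumption), then apply
    an independent sigmoid per label, as is usual for multi-label outputs."""
    enhanced = multi_scale_branch(feat)
    visual_logits = [sum(w * x for w, x in zip(row, enhanced))
                     for row in classifier]
    semantic_logits = label_branch(feat, label_emb, corr)
    return [1.0 / (1.0 + math.exp(-(v + s)))
            for v, s in zip(visual_logits, semantic_logits)]
```

Calling `predict` with a backbone feature vector, a per-label classifier matrix, label embeddings, and a correlation matrix whose rows sum to 1 returns one probability per defect label. Because each label has its own sigmoid, several defects can be predicted for the same image.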