Defect detection on multi-type rail surfaces via IoU decoupling and multi-information alignment

Published: 2024, Last Modified: 15 Nov 2025, Venue: Adv. Eng. Informatics 2024, License: CC BY-SA 4.0
Abstract: Vision techniques can significantly improve the efficiency of rail surface inspection and reduce the cost of labor-intensive manual inspection. Rails in turnout areas differ considerably from single straight rails, which are far more commonly studied. In particular, the analysis of turnout rails faces several difficulties: diverse rail configurations, localization of small defects in complex backgrounds, data imbalance, and label ambiguity caused by the diversity of box-level defect localization styles. To address these problems, we focus on effective defect feature representation through a novel detection network structure combined with balanced learning strategies. The result is the proposed hybrid Transformer-CNN-based Defect Detection on Multi-Type Rail Surfaces via Multi-information Alignment and Intersection-over-Union (IoU) Decoupling (IDMA), which consists of three components: (1) a context-injected pyramid Transformer-CNN-based feature extractor for hierarchical context understanding; (2) a multi-scale defect detector with decision-making enhanced by multi-information alignment, which comprehensively leverages key information from different tasks and scales; and (3) balanced learning strategies based on IoU decoupling and task alignment, which address sample imbalance and label ambiguity and also yield new, more credible criteria for evaluating dense defect detection. Experiments indicate that IDMA outperforms related state-of-the-art algorithms on two rail surface datasets and significantly extends the range of applicable scenarios. Code is publicly available online: https://github.com/Xuefeng-Ni/IDMA.
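For readers unfamiliar with the Intersection-over-Union (IoU) measure named in the abstract, the following minimal sketch computes the standard IoU between two axis-aligned boxes. It is not the IoU decoupling strategy of IDMA, which the abstract does not specify; the function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions made here for illustration only.

```python
from typing import Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Standard IoU of two axis-aligned boxes (hypothetical helper, not from IDMA)."""
    # Corners of the overlap rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap; return 0.
    return inter / union if union > 0 else 0.0
```

Because the measure is a ratio of areas, a small absolute localization offset on a small defect box lowers IoU much more than the same offset on a large box, which is one reason small-defect localization and differing box-annotation styles are hard to evaluate fairly with a single IoU threshold.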