Multi-Scale CNN-Transformer Hybrid Network for Rail Fastener Defect Detection

Published: 23 Feb 2025, Last Modified: 15 May 2025, Venue: OpenReview Archive Direct Upload, Readers: Everyone, License: CC BY 4.0
Abstract: Defect detection in rail fasteners is crucial for train safety, as defective fasteners can cause derailments and other severe safety incidents. However, existing algorithms often struggle in real-world scenarios because of obscured fasteners, motion blur, varying camera angles, and fasteners submerged in water. To address these challenges, we propose a Multi-scale CNN-Transformer Hybrid Network for Rail Fastener Defect Detection (MCHNet-RF2D), specifically designed to identify fastener defects in complex environments. Our approach constructs an efficient CNN block and a multi-scale Vision Transformer block that alternately extract local detail features and global semantic features of the fasteners. These features are integrated through multi-scale fusion to make defect recognition more robust. By combining comprehensive global recognition with detailed local defect detection, MCHNet-RF2D outperforms existing CNN-Transformer hybrid networks by 2.8% and surpasses current fastener defect detection algorithms by 2.9%. In practical deployment on over 40 trains, our model detected more than 2,000 fastener defects, demonstrating its effectiveness in diverse and challenging conditions.