SCDet: Scale-aware and Context-rich Feature Fusion Network for Traffic Sign Detection

Xin Li, Yan Ke, Wendong Zhang, Bo Wang

Published: 23 Mar 2025, Last Modified: 10 May 2026 | OpenReview Archive Direct Upload | Everyone | arXiv.org perpetual, non-exclusive license

Abstract: With the rise of deep learning, traffic sign detection has made great progress. However, because the collected high-resolution images are diverse and complex, detecting small, multi-scale, and easily occluded traffic signs in real-world scenarios remains a persistent challenge. To address this problem, we propose a new traffic sign detection network, the Scale-aware and Context-rich Traffic Sign Detection Network (SCDet), which learns scale-aware and context-rich features. Specifically, we first construct a backbone network, the Multi-Scale Deep Residual Network (MSNet), in which the ordinary convolutions are replaced with a dilated convolution module. The different receptive fields obtained at different layers capture multi-scale contextual information, which facilitates the detection of traffic signs at multiple scales. Secondly, to suppress the effect of scale variation, we replace the nearest-neighbour interpolation in the Feature Pyramid Network (FPN) with Content-Aware ReAssembly of Features (CARAFE), which performs multi-scale fusion based on scale-aware and context-rich representations. Moreover, to address the data imbalance between classes, we propose a method called configurable dynamic data expansion. Extensive experiments on two public traffic sign datasets demonstrate the effectiveness of our approach and its superiority over several state-of-the-art approaches.
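As a rough illustration of why dilated convolutions enlarge the receptive field (the motivation given for the backbone above), the following minimal sketch compares a stack of ordinary 3x3 convolutions with a stack of dilated ones. The layer counts, kernel sizes, and dilation rates (1, 2, 4) are illustrative assumptions and are not taken from the paper; the helper names `effective_kernel` and `receptive_field` are hypothetical.

```python
# Sketch only: the configurations below are assumed for illustration,
# they are NOT the MSNet settings from the paper.

def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers) -> int:
    """Receptive field (one axis) of a stack of conv layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input towards the output.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent output positions
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three ordinary 3x3 convolutions, stride 1.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
# Same depth and parameter count, but with assumed dilation rates 1, 2, 4.
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

With the same number of layers and weights, the dilated stack covers roughly twice the spatial context per axis, which is the general property the abstract relies on when it says different layers acquire different receptive fields.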