A Coarse to Fine Detection Method for Prohibited Object in X-ray Images Based on Progressive Transformer Decoder

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 | MM2024 Poster | CC BY 4.0
Abstract: Transformer-based methods for detecting prohibited objects in X-ray images are appearing steadily, but they still have shortcomings, such as poor performance on heavily occluded prohibited objects and high computational complexity. This paper therefore proposes a coarse-to-fine detection method for prohibited objects in X-ray images based on a progressive Transformer decoder. First, a coarse-to-fine framework is proposed that comprises two stages, coarse detection and fine detection; staged adaptive inference effectively improves the computational efficiency of the model. Second, a position and class object queries method is proposed, which improves the model's convergence speed and detection accuracy by fusing the position and class information of prohibited objects with the object queries. Finally, a progressive Transformer decoder is proposed, which separates high-score from low-score queries using increasing confidence thresholds, so that high-score queries are not affected by low-score queries during decoding and the model can focus on decoding the low-score queries, which usually correspond to severely occluded prohibited objects. Experimental results on three public benchmark datasets (SIXray, OPIXray, HiXray) demonstrate that, compared with the baseline DETR, the proposed method achieves state-of-the-art detection accuracy with a 21.6% reduction in model computational complexity. In particular, it accurately detects heavily occluded prohibited objects.
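The progressive decoding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `progressive_decode`, the `refine` callback (a hypothetical stand-in for one Transformer decoder stage), and the example scores and thresholds are all assumptions made for illustration. The sketch only shows the control flow, in which queries that reach the current stage's confidence threshold are frozen and the remaining low-score queries continue to be decoded.

```python
from typing import Callable, List, Sequence, Tuple


def progressive_decode(
    scores: Sequence[float],
    refine: Callable[[float], float],
    thresholds: Sequence[float],
) -> Tuple[List[float], List[bool], List[int]]:
    """Toy control flow for threshold-based progressive decoding.

    scores     -- initial confidence of each object query (assumed input)
    refine     -- hypothetical stand-in for one decoder stage acting on a query
    thresholds -- per-stage confidence thresholds, assumed to be increasing
    """
    current = list(scores)
    frozen = [False] * len(current)
    active_per_stage: List[int] = []

    for threshold in thresholds:
        # Queries that reach the stage threshold are frozen: later stages
        # no longer update them.
        for i, score in enumerate(current):
            if not frozen[i] and score >= threshold:
                frozen[i] = True

        # Only the remaining low-score queries are decoded at this stage.
        active = [i for i, is_frozen in enumerate(frozen) if not is_frozen]
        active_per_stage.append(len(active))
        for i in active:
            current[i] = refine(current[i])

    return current, frozen, active_per_stage


if __name__ == "__main__":
    # Hypothetical refinement: each stage raises a query's confidence by 0.25.
    final_scores, frozen, active = progressive_decode(
        scores=[0.9, 0.5, 0.125],
        refine=lambda s: min(1.0, s + 0.25),
        thresholds=[0.6, 0.7, 0.8],
    )
    print(final_scores, frozen, active)
```

In this toy run, the number of queries processed per stage shrinks as confident queries are frozen, which is the intuition behind focusing later decoding effort on low-score queries.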
Primary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Occlusion is a common problem in multimedia processing, especially in prohibited object detection in X-ray images. Because of the characteristics of X-rays, occlusion appears in a form different from that in natural images: different substances overlap and occlude each other. This makes algorithm design very challenging. Thanks to its powerful global attention mechanism, the Transformer has natural advantages in handling occlusion. However, Transformer-based research on prohibited object detection in X-ray images is still at an exploratory stage, and many problems remain, such as slow detection speed and detection accuracy that does not meet practical application requirements. To address these problems, this paper proposes a coarse-to-fine detection method for prohibited objects in X-ray images based on a progressive Transformer decoder. Through two-stage adaptive inference, the method addresses the slow detection speed of the traditional Transformer and greatly improves the detection accuracy for occluded prohibited objects. The method not only improves the performance of X-ray image analysis but can also be applied to multimedia processing tasks such as object tracking, video salient object detection, and video content understanding and analysis.
Supplementary Material: zip
Submission Number: 3670