Abstract: In densely packed scenes, the high density of objects and their varying sizes make it considerably harder to detect every instance accurately and without duplication than in conventional object detection settings. In this paper, we propose a YOLOv5-based object detection approach equipped with a Transformer-based head and an EM-Merger unit specifically designed for densely packed scenes. We incorporate the Transformer architecture into the prediction heads to provide a self-attention mechanism that captures long-range dependencies between densely packed objects. In addition, we introduce an EM-Merger unit to resolve redundant detections. Experimental results on the RebarDSC and SKU110K datasets demonstrate that our method significantly outperforms the baseline approach, achieving new state-of-the-art detection performance.
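To make the core idea concrete, the following is a minimal sketch of a prediction head whose feature map is refined by a standard Transformer encoder layer (self-attention over spatial positions) before the final 1x1 prediction convolution. The class name, layer choices, and hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class TransformerPredictionHead(nn.Module):
    """Illustrative sketch (not the authors' code): a YOLO-style prediction head
    whose features pass through one Transformer encoder layer, so every spatial
    position can attend to every other position before predictions are made."""

    def __init__(self, channels: int, num_outputs: int, num_heads: int = 4):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True
        )
        self.pred = nn.Conv2d(channels, num_outputs, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten the spatial grid into a token sequence: (B, H*W, C).
        tokens = x.flatten(2).transpose(1, 2)
        # Self-attention across all grid positions captures long-range dependencies.
        tokens = self.encoder(tokens)
        x = tokens.transpose(1, 2).reshape(b, c, h, w)
        # Per-cell box / objectness / class predictions, as in a YOLO head.
        return self.pred(x)

# Example usage with assumed sizes: 256-channel neck features, 3 anchors x 85 outputs.
head = TransformerPredictionHead(channels=256, num_outputs=3 * 85)
out = head(torch.randn(2, 256, 20, 20))
print(out.shape)  # torch.Size([2, 255, 20, 20])
```

The EM-Merger stage would then operate on the resulting dense detections, clustering overlapping boxes instead of relying solely on non-maximum suppression; its details are not sketched here.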