HCA-YOLO: a non-salient object detection method based on hierarchical attention mechanism

Published: 01 Jan 2024, Last Modified: 15 May 2025, Clust. Comput. 2024, CC BY-SA 4.0
Abstract: The objective of deep learning-based object detection is to accurately localize and recognize objects of interest in images or videos using neural networks. However, non-salient objects are hard to detect and localize because they occupy a small proportion of the image, have low contrast, or are occluded. To address this, we propose an improved object detection method, hierarchical coordinate attention (HCA)-YOLO, based on the YOLOv8 architecture. Specifically, building upon the optimized YOLOv8 baseline, we introduce HCA to strengthen the model's attention to non-salient objects. Additionally, we propose a novel object regression loss metric, β-VIoU, to improve YOLOv8's perception of non-salient object positions. Our method achieves competitive results on multiple metrics on two widely adopted open-source datasets, MS COCO 2017 and CrowdHuman. Compared to the YOLOv8x baseline model, HCA-YOLO improves the mean average precision (mAP) by 3.3% and 3.7% on these two datasets, respectively.
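The abstract names β-VIoU as a box regression loss but does not define it. The name suggests a member of the IoU family of losses, so as background only, the following minimal sketch shows plain intersection over union (IoU) between two axis-aligned boxes and the basic `1 - IoU` loss. This is not the paper's β-VIoU; the function names `iou` and `iou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Background illustration only: this is standard IoU, NOT the
    beta-VIoU metric proposed in the paper (its definition is not
    given in the abstract).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Basic IoU regression loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)
```

For small objects, a shift of a few pixels changes plain IoU sharply, which is one commonly cited reason IoU-style losses are modified for small or occluded targets. How β-VIoU addresses this is described in the paper itself, not here.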