ECF-YOLOv7-Tiny: Improving Feature Fusion and the Receptive Field for Lightweight Object Detectors

Published: 01 Jan 2025, Last Modified: 04 Nov 2025, Venue: WACV 2025, License: CC BY-SA 4.0
Abstract: In this work, we aim to increase the efficiency and detection performance of lightweight object detectors, with a focus on feature fusion and the receptive field of the models. For improved feature fusion, we introduce the Convolutional Squeeze-and-Excitation (CSE) module, which requires only minimal additional computation. To improve the receptive field and feature extraction capabilities in a resource-efficient manner, we introduce the Cross-Stage Partial Context Augmentation Module (CSP-CAM). Furthermore, to improve real-time performance, we apply two model scaling techniques with minimal impact on detection performance. We demonstrate the effectiveness of the proposed modules by inserting them into YOLOv7-tiny and YOLOv9-t. We build a new network architecture, ECF-YOLOv7-tiny, which we train on the MS COCO dataset and whose inference speed we evaluate on an NVIDIA Jetson Nano. ECF-YOLOv7-tiny achieves 37.8% mAP @ [0.5:0.95] on the test set while reaching 9.3 FPS on the Jetson Nano, outperforming other state-of-the-art lightweight object detectors at the 416 input image resolution. Code and models are released at: https://github.com/dbacea/ecf-yolov7-tiny.
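The two headline numbers in the abstract can be unpacked with a little arithmetic. The sketch below is not taken from the paper or its repository. It assumes the standard COCO convention that mAP @ [0.5:0.95] is the mean of the class-averaged AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, and it converts the reported 9.3 FPS into a per-frame latency. The function names and the per-threshold AP values are hypothetical and serve only as illustration.

```python
def coco_iou_thresholds(start=0.50, stop=0.95, step=0.05):
    """Return the IoU thresholds used by the COCO-style mAP @ [0.5:0.95] metric."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(n)]


def mean_ap(ap_per_threshold):
    """Average the (already class-averaged) AP values over all IoU thresholds."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


def latency_ms(fps):
    """Convert throughput in frames per second to average latency per frame in ms."""
    return 1000.0 / fps


thresholds = coco_iou_thresholds()

# Hypothetical per-threshold AP values, for illustration only (NOT from the paper).
example_ap = [0.60, 0.57, 0.54, 0.50, 0.45, 0.40, 0.33, 0.25, 0.15, 0.05]

if __name__ == "__main__":
    print("IoU thresholds:", thresholds)
    print("mAP over hypothetical values:", round(mean_ap(example_ap), 3))
    # 9.3 FPS is the Jetson Nano figure reported in the abstract.
    print("Latency at 9.3 FPS (ms):", round(latency_ms(9.3), 1))
```

Under this reading, 9.3 FPS corresponds to roughly 108 ms per frame, and the reported mAP summarizes detection quality across both loose (IoU 0.50) and strict (IoU 0.95) localization criteria.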