A Parameterisable FPGA-Tailored Architecture for YOLOv3-Tiny

Zhewen Yu; Christos-Savvas Bouganis

Published: 01 Jan 2020, Last Modified: 15 May 2024, Venue: ARC 2020, License: CC BY-SA 4.0

Abstract: Object detection is the task of determining the position of objects in an image or video together with their corresponding classes. The current state-of-the-art approach that achieves the highest performance (i.e. frames per second, fps) without a significant penalty in detection accuracy is the YOLO framework, and more specifically its latest version, YOLOv3. When embedded systems are targeted for deployment, YOLOv3-tiny, a lightweight version of YOLOv3, is usually adopted. The presented work is the first to implement a parameterised FPGA-tailored architecture specifically for YOLOv3-tiny. The architecture is optimised for latency-sensitive applications and can be deployed on low-end devices with stringent resource constraints. Experiments demonstrate that, when a low-end FPGA device is targeted, the proposed architecture achieves a 290x improvement in latency compared to the hard-core processor of the device, at the cost of a 2.5 pp reduction in mAP (30.9% vs 33.4%) compared to the original model. The presented work opens the way for low-latency object detection on low-end FPGA devices.
