Abstract: Field Programmable Gate Arrays (FPGAs) are an attractive choice for accelerating Machine Learning (ML) workloads due to their flexible fabric of configurable logic blocks, interconnects, and embedded memory. However, programming FPGAs is difficult for ML developers because it requires intricate hardware knowledge. Even though high-level synthesis (HLS) tools are available, they come with their own challenges and a steep learning curve. To address this issue, FPGA vendors have raised the level of abstraction by providing ready-to-deploy frameworks for ML. In this paper, we present an evaluation of the out-of-the-box performance of FPGAs using AMD/Xilinx Vitis AI, a development environment for deploying ML models on FPGAs. The study aims to assess the inference performance of Vitis AI on both edge and cloud platforms. We benchmark several popular, standard pre-trained models, focusing on latency, throughput, and power efficiency. Since Google Tensor Processing Units (TPUs) also offer out-of-the-box ML acceleration, we compare these results against cloud TPU and edge TPU in terms of performance, ease of use, and tool support. We discuss our experience working with Vitis AI, including its strengths and limitations as a plug-and-play solution for FPGA-based ML acceleration, and provide insights for developers looking to leverage FPGAs for their inference workloads.