Heterogeneous Parallel Acceleration for Edge Intelligence Systems: Challenges and Solutions

Published: 01 Jan 2025, Last Modified: 16 May 2025. Venue: IEEE Consumer Electron. Mag. 2025. License: CC BY-SA 4.0
Abstract: The rapid advancement of edge artificial intelligence can be attributed to the widespread use of edge consumer devices and improvements in system-on-chip capabilities. Although modern edge consumer devices are equipped with heterogeneous computing units, e.g., CPU, GPU, digital signal processor, and neural network processing unit, effectively harnessing these resources remains a challenge because of the disparate architectures of heterogeneous processors and the performance fragmentation of deep learning (DL) libraries. This article comprehensively analyzes the challenges and bottlenecks in maximizing the utilization of heterogeneous computing resources on edge consumer devices. It examines the complexity of the hardware components and proposes ways to leverage them to optimize performance and energy efficiency. We present a parallel scheduling algorithm based on the affinity between DL libraries and hardware, which aligns each computing unit's strengths with the library's requirements to fully utilize the processors' computing capabilities and reduce system latency. Finally, we summarize future optimization directions for heterogeneous parallel computing on edge consumer devices.
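To make the idea of affinity-based scheduling concrete, the following is a minimal sketch and not the algorithm from the article, whose details the abstract does not give. It assumes a hypothetical affinity table of per-operator latencies for each (library, processor) pair and uses a plain greedy earliest-finish-time rule over independent operators. The library names (`libA`, `libB`, `libC`), the latency values, and the `schedule` function are all invented for illustration.

```python
# Hypothetical affinity table: estimated latency in ms of each operator type
# on each (DL library, processor) backend. All values are made up.
AFFINITY_MS = {
    "conv":    {("libA", "GPU"): 1.5, ("libB", "NPU"): 1.0, ("libC", "CPU"): 6.0},
    "matmul":  {("libA", "GPU"): 1.5, ("libB", "NPU"): 2.5, ("libC", "CPU"): 4.0},
    "softmax": {("libA", "GPU"): 1.0, ("libC", "CPU"): 0.5},
}


def schedule(ops):
    """Greedily assign independent operators to backends.

    ops is a list of (name, kind) pairs. Each operator goes to the backend
    on which it would finish earliest, given the work already queued on
    that processor. Returns (plan, makespan_ms).
    """
    busy_until = {}  # processor -> time (ms) at which it becomes free
    plan = []
    for name, kind in ops:
        best = None
        for (lib, proc), cost in AFFINITY_MS[kind].items():
            finish = busy_until.get(proc, 0.0) + cost
            if best is None or finish < best[0]:
                best = (finish, lib, proc)
        finish, lib, proc = best
        busy_until[proc] = finish
        plan.append((name, lib, proc, finish))
    makespan = max(busy_until.values(), default=0.0)
    return plan, makespan


if __name__ == "__main__":
    demo_ops = [("c1", "conv"), ("c2", "conv"), ("m1", "matmul"), ("s1", "softmax")]
    demo_plan, demo_makespan = schedule(demo_ops)
    for entry in demo_plan:
        print(entry)
    print("makespan (ms):", demo_makespan)
```

The sketch ignores data dependencies between operators, memory transfer costs, and energy, all of which a scheduler for real edge devices would likely need to model.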