Chapter Three - Hardware accelerator systems for artificial intelligence and machine learning

Published: 01 Jan 2021 · Last Modified: 13 Nov 2024 · Adv. Comput. 2021 · CC BY-SA 4.0
Abstract: Recent progress in parallel computing machines, deep neural networks, and training techniques has contributed to significant advances in artificial intelligence (AI) on tasks such as object classification, speech recognition, and natural language processing. The development of such deep learning-based techniques has enabled AI networks to outperform humans at recognizing objects in images. The graphics processing unit (GPU) has been the primary component used for parallel computing during the inference and training phases of deep neural networks. In this study, training is performed on a desktop or server with one or more GPUs, while inference is performed with hardware accelerators on embedded devices. The performance, power consumption, and resource constraints of embedded systems present major obstacles to deploying deep neural network-based systems on embedded controllers such as drones, AI speakers, and autonomous vehicles. In particular, the power consumption of a commercial GPU commonly surpasses the power budget of a stand-alone embedded system. To reduce the power consumption of hardware accelerators, reducing the precision of input data and network weights has become a popular research topic in this field. However, precision and accuracy share a trade-off relationship, so it is essential to optimize precision in a manner that does not degrade the accuracy of the inference process. In this context, the primary issues faced by hardware accelerators are loss of accuracy and high power consumption.
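
The precision/accuracy trade-off mentioned above can be made concrete with a small sketch of symmetric uniform int8 quantization, a standard low-precision technique (not necessarily the scheme used in the chapter itself). The function names quantize_int8 and dequantize are illustrative, and the 8-bit format and NumPy-based setup are assumptions for the example: the scale maps the largest weight magnitude onto the int8 grid, and the rounding step bounds the per-weight error at half a quantization step.

import numpy as np

def quantize_int8(x):
    # Illustrative symmetric uniform quantizer (names and int8 format
    # are assumptions for this sketch, not the chapter's method).
    # One quantization step: largest magnitude maps to +/-127.
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Map int8 codes back to float32 so the rounding error is measurable.
    return q.astype(np.float32) * scale

# Example: quantization error on synthetic "weights"
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())  # bounded by scale / 2

Storing and multiplying int8 values instead of float32 cuts memory traffic and arithmetic energy roughly fourfold, which is the motivation for low-precision accelerators; the printed error illustrates the accuracy cost that must be kept small enough not to degrade inference.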