Amber: A 367 GOPS, 538 GOPS/W 16nm SoC with a Coarse-Grained Reconfigurable Array for Flexible Acceleration of Dense Linear Algebra

Published: 01 Jan 2022, Last Modified: 05 Sept 2024VLSI Technology and Circuits 2022EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Amber is a system-on-chip (SoC) with a coarse-grained reconfigurable array (CGRA) for acceleration of dense linear algebra applications such as machine learning (ML), image processing, and computer vision. It achieves a peak energy efficiency of 538.0 INT16 GOPS/W and 483.3 BFloat16 GFLOPS/W. We maximize CGRA utilization and minimize reconfigurability overhead through (1) dynamic partial reconfiguration of the CGRA that enables higher resource utilization by allowing multiple applications to run at once, (2) efficient streaming memory controllers supporting affine access patterns, and (3) low-overhead transcendental and complex arithmetic operations. Compared to a CPU, a GPU, and an FPGA, Amber achieves up to 3902x, 152x, and 88x better energy-delay product (EDP).
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview