Hardware-Efficient Quantization for Green Custom Foundation Models

Toshiaki Koike-Akino; Chang Meng; Volkan Cevher; Giovanni De Micheli

Published: 21 Jun 2024, Last Modified: 26 Jul 2024, Venue: ES-FoMo-II 2024 Poster, License: CC BY 4.0

Keywords: Green AI, Quantization, Foundation Models, Custom Design, Low Power Hardware

TL;DR: Hardware-aware quantization of large foundation models for low-power full-custom design, showing up to a 20-fold reduction in power consumption.

Abstract: We propose a new hardware-efficient quantization (HEQ) method for low-power full-custom foundation models. HEQ jointly optimizes the multiplier hardware and the weight quantization to minimize total power consumption. By exploiting the power profile of custom multipliers, our method achieves a power reduction of up to 20-fold.
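The abstract does not spell out the optimization itself, so the following is only a toy sketch of the general idea of trading quantization error against multiplier power; it is not the authors' HEQ algorithm. It assumes a hypothetical power proxy (`popcount_power`, the number of nonzero bits in the integer weight, loosely motivated by shift-and-add multipliers) and a hypothetical trade-off parameter `lam`. A real power profile would come from characterizing the custom multiplier hardware.

```python
def popcount_power(q: int) -> int:
    """Hypothetical power proxy (an assumption, not from the paper):
    the number of nonzero bits in |q|."""
    return bin(abs(q)).count("1")


def power_aware_quantize(weights, scale, bits=4, lam=0.0):
    """Map each weight to the signed integer level q that minimizes
    (w - scale * q)**2 + lam * popcount_power(q).

    With lam = 0 this reduces to nearest-level rounding; a larger lam
    favors levels that the proxy considers cheaper to multiply by.
    """
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    levels = range(lo, hi + 1)
    quantized = []
    for w in weights:
        # Secondary key abs(q) breaks exact cost ties toward smaller magnitudes.
        best = min(
            levels,
            key=lambda q: ((w - scale * q) ** 2 + lam * popcount_power(q), abs(q)),
        )
        quantized.append(best)
    return quantized


def total_power(levels):
    """Sum of the hypothetical per-weight power proxy."""
    return sum(popcount_power(q) for q in levels)


if __name__ == "__main__":
    weights = [0.7, -0.3, 0.16, 0.52]
    plain = power_aware_quantize(weights, scale=0.1, bits=4, lam=0.0)
    aware = power_aware_quantize(weights, scale=0.1, bits=4, lam=0.02)
    print("nearest rounding:", plain, "power proxy:", total_power(plain))
    print("power-aware:     ", aware, "power proxy:", total_power(aware))
```

Under this proxy, the power-aware levels never cost more than the nearest-rounding levels, at the price of a larger quantization error. The sketch fixes the power table in advance; the joint optimization described in the abstract would also adapt the multiplier design.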

Submission Number: 17
