An analog-AI chip for energy-efficient speech recognition and transcription

Stefano Ambrogio; Pritish Narayanan; Atsuya Okazaki; Andrea Fasoli; Charles Mackin; Kohji Hosokawa; Akiyo Nomura; Takeo Yasuda; An Chen; Alexander M. Friz; Masatoshi Ishii; Jose Luquin; Yasuteru Kohda; Nicole Saulnier; Kevin Brew; Samuel Choi; Injo Ok; Timothy Philip; Victor Chan; Mary Claire Silvestre; Ishtiaq Ahsan; Vijay Narayanan; Hsinyu Tsai; Geoffrey W. Burr

Published: 01 Jan 2023, Last Modified: 12 May 2025, Venue: Nat. 2023, License: CC BY-SA 4.0

Abstract: Models of artificial intelligence (AI) that have billions of parameters can achieve high accuracy across a range of tasks [1,2], but they exacerbate the poor energy efficiency of conventional general-purpose processors, such as graphics processing units or central processing units. Analog in-memory computing (analog-AI) [3–7] can provide better energy efficiency by performing matrix–vector multiplications in parallel on ‘memory tiles’. However, analog-AI has yet to demonstrate software-equivalent (SWeq) accuracy on models that require many such tiles and efficient communication of neural-network activations between the tiles. Here we present an analog-AI chip that combines 35 million phase-change memory devices across 34 tiles, massively parallel inter-tile communication and analog, low-power peripheral circuitry that can achieve up to 12.4 tera-operations per second per watt (TOPS/W) chip-sustained performance. We demonstrate fully end-to-end SWeq accuracy for a small keyword-spotting network and near-SWeq accuracy on the much larger MLPerf [8] recurrent neural-network transducer (RNNT), with more than 45 million weights mapped onto more than 140 million phase-change memory devices across five chips. A low-power chip that runs AI models using analog rather than digital computation shows comparable accuracy on speech-recognition tasks but is more than 14 times as energy efficient.
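As an illustration of the tiled matrix–vector multiplication the abstract refers to, the following is a minimal sketch in plain Python, not the chip's actual dataflow. The tile dimensions, the function names and the Gaussian noise term are all assumptions introduced here; the noise is only a crude stand-in for analog imprecision and is not the device model used in the paper.

```python
import random

# Hypothetical tile dimensions, chosen small for illustration only.
TILE_ROWS = 4  # inputs handled by one tile
TILE_COLS = 4  # outputs produced by one tile


def split_into_tiles(weights, tile_rows=TILE_ROWS, tile_cols=TILE_COLS):
    """Partition a (rows x cols) weight matrix into tiles keyed by their top-left index."""
    n_rows, n_cols = len(weights), len(weights[0])
    tiles = {}
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            tiles[(r0, c0)] = [row[c0:c0 + tile_cols]
                               for row in weights[r0:r0 + tile_rows]]
    return tiles


def tile_mvm(tile, x_slice, noise_std=0.0, rng=None):
    """Multiply-accumulate within one tile: out[j] = sum_i x[i] * w[i][j].

    The optional Gaussian noise is an assumed placeholder for analog error.
    """
    rng = rng or random
    out = []
    for j in range(len(tile[0])):
        acc = sum(x_slice[i] * tile[i][j] for i in range(len(tile)))
        if noise_std > 0.0:
            acc += rng.gauss(0.0, noise_std)
        out.append(acc)
    return out


def tiled_mvm(weights, x, noise_std=0.0, rng=None):
    """Compute y = x^T W by summing the partial results of every tile."""
    y = [0.0] * len(weights[0])
    for (r0, c0), tile in split_into_tiles(weights).items():
        partial = tile_mvm(tile, x[r0:r0 + len(tile)], noise_std, rng)
        for j, value in enumerate(partial):
            y[c0 + j] += value  # accumulate partial sums across row blocks
    return y


def reference_mvm(weights, x):
    """Untiled reference result, used to check the tiled version."""
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(len(weights[0]))]
```

With `noise_std=0.0` the tiled result matches `reference_mvm` up to floating-point rounding, which is a toy analogue of "software-equivalent" output; raising `noise_std` shows how per-tile error accumulates when partial sums from several tiles are combined.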
