Analog Foundation Models

Julian Büchel; Iason Chalas; Giovanni Acampa; An Chen; Omobayode Fagbohungbe; Hsinyu Tsai; Kaoutar El Maghraoui; Manuel Le Gallo; Abbas Rahimi; Abu Sebastian

Published: 18 Sept 2025, Last Modified: 29 Oct 2025

Venue: NeurIPS 2025 poster

License: CC BY-NC 4.0

Keywords: analog in-memory computing, robustness, large language models, foundation models

TL;DR: We train analog foundation models that are robust to noise present in analog in-memory computing hardware and demonstrate accuracy comparable to models trained with 4-bit weight and 8-bit static input quantization.

Abstract: Analog in-memory computing (AIMC) is a promising compute paradigm for improving the speed and power efficiency of neural network inference beyond the limits of conventional von Neumann-based architectures. However, AIMC introduces fundamental challenges such as noisy computations and strict constraints on input and output quantization. Because of these constraints and imprecisions, off-the-shelf LLMs are not able to achieve 4-bit-level performance when deployed on AIMC-based hardware. While researchers have previously investigated closing this accuracy gap on small, mostly vision-based models, a generic method applicable to LLMs pre-trained on trillions of tokens does not yet exist. In this work, we introduce a general and scalable method to robustly adapt LLMs for execution on noisy, low-precision analog hardware. Our approach enables state-of-the-art models, including Phi-3-mini-4k-instruct and Llama-3.2-1B-Instruct, to retain performance comparable to 4-bit weight, 8-bit activation baselines despite the presence of analog noise and quantization constraints. Additionally, we show that, as a byproduct of our training methodology, analog foundation models can be quantized for inference on low-precision digital hardware. Finally, we show that our models also benefit from test-time compute scaling, exhibiting better scaling behavior than models trained with 4-bit weight and 8-bit static input quantization. Our work bridges the gap between high-capacity LLMs and efficient analog hardware, offering a path toward energy-efficient foundation models. Code is available at [github.com/IBM/analog-foundation-models](https://github.com/IBM/analog-foundation-models).
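To make the constraints named in the abstract concrete, the following is a minimal toy sketch of a single matrix-vector product under static input quantization, noisy weights, and output quantization. It is not the authors' training method or hardware model: the function names (`quantize_symmetric`, `analog_matvec`), the additive Gaussian weight-noise model, and all parameter values are assumptions chosen purely for illustration.

```python
import random


def quantize_symmetric(values, num_bits, max_abs):
    """Clip values to [-max_abs, max_abs] and round them onto a symmetric
    uniform grid with 2**(num_bits - 1) - 1 positive levels."""
    levels = 2 ** (num_bits - 1) - 1
    step = max_abs / levels
    quantized = []
    for v in values:
        clipped = max(-max_abs, min(max_abs, v))
        quantized.append(round(clipped / step) * step)
    return quantized


def analog_matvec(weights, x, input_range, out_range, noise_std, rng,
                  in_bits=8, out_bits=8):
    """Toy model of one noisy, quantized matrix-vector product.

    weights     : list of rows (each a list of floats)
    input_range : fixed (static) clipping range for the inputs
    out_range   : fixed clipping range for the outputs
    noise_std   : std of additive Gaussian weight noise, relative to the
                  largest absolute weight (an assumed noise model)
    """
    # Static input quantization: the range does not depend on x.
    xq = quantize_symmetric(x, in_bits, input_range)

    w_max = max(abs(w) for row in weights for w in row) or 1.0
    outputs = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, xq):
            # Fresh noise sample per weight, per call.
            noisy_w = w + rng.gauss(0.0, noise_std * w_max)
            acc += noisy_w * xi
        outputs.append(acc)

    # Output quantization with a fixed range.
    return quantize_symmetric(outputs, out_bits, out_range)


if __name__ == "__main__":
    W = [[1.0, -0.5], [0.25, 0.75]]
    x = [0.5, -1.0]
    clean = analog_matvec(W, x, 1.0, 2.0, 0.0, random.Random(0))
    noisy = analog_matvec(W, x, 1.0, 2.0, 0.05, random.Random(0))
    print("noise-free:", clean)
    print("noisy:     ", noisy)
```

With `noise_std=0.0` the result differs from the exact product only by quantization error; raising `noise_std` perturbs the outputs further, which is the kind of degradation that robustness-oriented training aims to tolerate.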

Primary Area: Other

Submission Number: 13249
