zkGPT: An Efficient Non-interactive Zero-knowledge Proof Framework for LLM Inference

Published: 2025, Last Modified: 25 Jan 2026IACR Cryptol. ePrint Arch. 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Large Language Models (LLMs) are widely employed for their ability to generate human-like text. However, service providers may deploy smaller models to reduce costs, potentially deceiving users. Zero-Knowledge Proofs (ZKPs) offer a solution by allowing providers to prove LLM inference without compromising the privacy of model parameters. Existing solutions either do not support LLM architectures or suffer from significant inefficiency and tremendous overhead. To address this issue, this paper introduces several new techniques. We propose new methods to efficiently prove linear and non-linear layers in LLMs, reducing computation overhead by orders of magnitude. To further enhance efficiency, we propose constraint fusion to reduce the overhead of proving non-linear layers and circuit squeeze to improve parallelism. We implement our efficient protocol, specifically tailored for popular LLM architectures like GPT-2, and deploy optimizations to enhance performance. Experiments show that our scheme can prove GPT-2 inference in less than 25 seconds. Compared with state-of-the-art systems such as Hao et al. (USENIX Security'24) and ZKML (Eurosys'24), our work achieves nearly $279\times$ and $185\times$ speedup, respectively.
Loading