TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: The paper describes a method to verify LLM inference performed by untrusted compute providers
Abstract: Large language models (LLMs) have proven to be very capable, but access to frontier models currently relies on inference providers. This introduces trust challenges: how can we be sure that the provider is using the model configuration they claim? We propose TOPLOC, a novel method for verifiable inference that addresses this problem. TOPLOC leverages a compact locality-sensitive hashing mechanism for intermediate activations, which can detect unauthorized modifications to models, prompts, or precision with 100\% accuracy, achieving no false positives or negatives in our empirical evaluations. Our approach is robust across diverse hardware configurations, GPU types, and algebraic reorderings, which allows validation to run significantly faster than the original inference. By introducing a polynomial encoding scheme, TOPLOC reduces the memory overhead of the generated proofs by $1000\times$, requiring only 258 bytes of storage per 32 new tokens, compared to the 262 KB needed to store the token embeddings directly for Llama 3.1-8B-Instruct. Our method empowers users to verify LLM inference computations efficiently, fostering greater trust and transparency in open ecosystems and laying a foundation for decentralized, verifiable, and trustless AI services.
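To make the commit-and-verify flow concrete, the sketch below shows one plausible way to fingerprint intermediate activations with a top-k scheme and to validate them with a tolerance for benign numerical differences across hardware. It is a minimal illustration under assumed parameters: `TOP_K`, `MISMATCH_TOLERANCE`, and both function names are hypothetical, not the library's API; see the linked repository for the actual implementation.

```python
import torch

TOP_K = 128              # how many largest-magnitude activations to commit to (assumed value)
MISMATCH_TOLERANCE = 2   # index mismatches tolerated from benign float reordering (assumed value)

def commit(activations: torch.Tensor) -> dict:
    """Prover side: fingerprint the intermediate activations for a chunk of new tokens."""
    flat = activations.flatten()
    _, indices = torch.topk(flat.abs(), TOP_K)
    # Record where the largest activations sit and what their values are.
    # TOPLOC compresses this pair into a short polynomial encoding; kept raw here for clarity.
    return {"indices": indices, "values": flat[indices]}

def verify(proof: dict, recomputed: torch.Tensor) -> bool:
    """Validator side: recompute the activations (possibly on different hardware) and compare."""
    flat = recomputed.flatten()
    _, indices = torch.topk(flat.abs(), TOP_K)
    # Kernel schedules differ across GPUs, so a few near-tied entries may swap rank;
    # accept the proof if the top-k index sets agree up to a small tolerance.
    # (The full scheme also checks the committed values within error bounds.)
    overlap = len(set(proof["indices"].tolist()) & set(indices.tolist()))
    return overlap >= TOP_K - MISMATCH_TOLERANCE
```

One reason validation can outpace generation, as the abstract claims, is that the validator can recompute activations for the entire committed sequence in a single forward pass rather than decoding token by token.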
Lay Summary: Large language models now power many chatbots and writing tools, but these models are resource-intensive and are usually run by companies that benefit from operating at scale. This creates a basic trust problem: how can we be sure that the company used the exact model and settings they claim? Our work introduces TOPLOC, an add-on to model execution that can give users that assurance. As the model generates text, TOPLOC records tiny “digital fingerprints” of its internal calculations. These fingerprints can later be verified by other providers to detect whether the model or prompt has been altered, and this verification costs only a fraction of the original computation. Because the fingerprints are compact, they are also easy to share and store. With TOPLOC, people and companies can trust outsourced AI services without having to take the provider’s word for it, paving the way for open, provably honest language-model ecosystems.
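The compact, easy-to-share fingerprints described above correspond to the abstract's polynomial encoding scheme: instead of storing raw (index, value) pairs, the prover can interpolate a polynomial through them over a prime field and store only its coefficients; a validator then evaluates that polynomial at its own recomputed top-k indices and compares values. The sketch below is purely illustrative: the modulus `P`, the point format, and the function names are assumptions, not the paper's exact parameters (which achieve the 258 bytes per 32 tokens quoted in the abstract).

```python
P = 65521  # a 16-bit prime modulus (assumed; the paper's field parameters may differ)

def _mul_linear(poly: list[int], root: int) -> list[int]:
    """Multiply poly(x) by (x - root) mod P; coefficients are lowest-degree first."""
    out = [0] * (len(poly) + 1)
    for d, c in enumerate(poly):
        out[d] = (out[d] - root * c) % P
        out[d + 1] = (out[d + 1] + c) % P
    return out

def encode(points: list[tuple[int, int]]) -> list[int]:
    """Lagrange-interpolate the unique degree-(k-1) polynomial through k
    (index, value) points mod P; its k coefficients are the stored proof."""
    k = len(points)
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = _mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse of the denominator
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

def decode(coeffs: list[int], index: int) -> int:
    """Evaluate the committed polynomial at a recomputed top-k index (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * index + c) % P
    return acc
```

With k = 128 points and 16-bit coefficients, such a proof occupies roughly 256 bytes, the same order as the 258-byte figure in the abstract; in practice the values would be a quantized field encoding of the activation floats (an assumption here).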
Link To Code: https://github.com/PrimeIntellect-ai/toploc
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: Locality Sensitive Hashing, Large Language Models, Verifiable Computing
Submission Number: 15401