Keywords: Decentralized training, LLMs, Distributed training, Open source, Weight secrecy
TL;DR: We provide a decentralized framework in which collaborators run large-model computation (training/inference) without any single collaborator gaining full access to the model weights.
Abstract: We consider a decentralized setup in which the participants collaboratively train and serve a large neural network, and where each participant only processes a subset of the model.
In this setup, we explore the possibility of unmaterializable weights, where a full weight set is never available to any one participant.
We introduce Unextractable Protocol Models (UPMs): a training and inference framework that leverages the sharded model setup to ensure that the model shards (i.e., weight subsets) held by participants are incompatible across time steps. UPMs periodically inject time-varying, random, invertible transforms at participant boundaries, preserving the overall network function while rendering cross-time assemblies of shards incoherent.
On Qwen-2.5-0.5B and Llama-3.2-1B, 10 000 transforms leave FP32 perplexity unchanged ($\Delta$PPL $< 0.01$; Jensen–Shannon drift $< 4 \times 10^{-5}$), and we show how to control numerical error growth for lower-precision datatypes. Applying a transform every 30 s adds 3% latency, 0.1% bandwidth, and 10% GPU-memory overhead at inference, while training overhead falls to 1.6% time and $< 1\%$ memory. We consider several attacks, showing that direct attacks have impractical requirements and are easy to defend against, and that gradient-based fine-tuning of stitched partitions consumes $\geq 60\%$ of the tokens required to train from scratch. By enabling models to be collaboratively trained yet not extracted, UPMs make it practical to embed programmatic incentive mechanisms in community-driven decentralized training.
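To make the boundary-transform idea concrete, below is a minimal NumPy sketch of how a random invertible transform can be folded into the weights on either side of a shard boundary without changing the end-to-end function. This is an illustration based on our reading of the abstract, not the paper's actual construction: the toy linear layers `W_a`/`W_b`, the transform `T`, and the folding scheme are all illustrative assumptions; the real framework presumably operates on transformer shards.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                      # hidden size at the shard boundary (illustrative)
x = rng.normal(size=d)     # activation leaving participant A

# Toy linear layers standing in for the model pieces held by each participant.
W_a = rng.normal(size=(d, d))   # last layer held by participant A
W_b = rng.normal(size=(d, d))   # first layer held by participant B

def forward_original(x):
    return W_b @ (W_a @ x)

# Time-varying random invertible transform injected at the A/B boundary.
T = rng.normal(size=(d, d)) + d * np.eye(d)   # well-conditioned, invertible
T_inv = np.linalg.inv(T)

# Each participant folds its half of the transform into its own weights:
# A now holds T @ W_a, B now holds W_b @ T_inv. Neither ever sees the other's shard.
W_a_t = T @ W_a
W_b_t = W_b @ T_inv

def forward_transformed(x):
    return W_b_t @ (W_a_t @ x)

# The overall function is preserved (up to floating-point error) ...
assert np.allclose(forward_original(x), forward_transformed(x))

# ... but A's new shard is incompatible with B's shard from an earlier time step,
# so stitching weights captured at different times yields an incoherent model.
stale_mix = W_b @ (W_a_t @ x)
print(np.max(np.abs(forward_original(x) - stale_mix)))  # large mismatch
```

Because `T_inv @ T` cancels at the boundary, the composed function is unchanged, while any assembly that mixes shards from different time steps retains an uncancelled transform and no longer computes the original network.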
Primary Area: Infrastructure (e.g., libraries, improved implementation and scalability, distributed solutions)
Submission Number: 25643