Keywords: LLM, distributed training, incentive mechanisms, pseudo-gradient aggregation, communication-efficient optimization, OpenSkill rating, proof of computation, byzantine robustness, LLM pre-training
TL;DR: Gauntlet is an incentive system for permissionless distributed LLM training: untrusted peers submit pseudo-gradients, a validator scores quality and honest computation, and the model updates using sparse aggregation of top-rated peers' contributions.
Abstract: We describe an incentive system for distributed deep learning of foundation models in which peers are rewarded for their contributions. The incentive system, Gauntlet, has been deployed on the Bittensor blockchain and used to train a 1.2B LLM with completely permissionless contributions of pseudo-gradients: no control over which users can register or what hardware they use. Gauntlet can be applied to any synchronous distributed training scheme that relies on aggregating updates or pseudo-gradients. We rely on a two-stage mechanism: fast filtering of peer uptime, reliability, and synchronization, combined with a core component that estimates the loss before and after each individual pseudo-gradient contribution. We use an OpenSkill rating system to track the competitiveness of pseudo-gradient scores over time. Finally, we introduce a novel mechanism to ensure that peers on the network perform unique computations. Our live 1.2B training run, which has paid out real-valued monetary rewards to participants based on the value of their contributions, yielded a 1.2B model that is competitive on a per-iteration basis, demonstrating the utility of our incentive system.
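The core scoring component described in the abstract can be sketched as follows: the validator estimates the loss before and after applying each peer's pseudo-gradient, and ranks contributions by the resulting improvement. This is a minimal toy illustration, not the paper's implementation; the quadratic loss, learning rate, and function names are all assumptions for the sake of a self-contained example.

```python
def loss(params):
    # Toy quadratic loss standing in for the model's training loss.
    return sum(p * p for p in params)

def score_pseudo_gradient(params, pseudo_grad, lr=0.1):
    """Loss improvement from applying one peer's pseudo-gradient (hypothetical helper).

    A positive score means the contribution reduced the loss; a negative
    score flags a useless or adversarial contribution.
    """
    before = loss(params)
    updated = [p - lr * g for p, g in zip(params, pseudo_grad)]
    after = loss(updated)
    return before - after

# Current model parameters and two candidate peer contributions.
params = [1.0, -2.0, 0.5]
honest = [2.0, -4.0, 1.0]   # points along the true gradient (2*p for this loss)
junk   = [-1.0, 0.3, 5.0]   # a misaligned contribution

scores = {
    "honest": score_pseudo_gradient(params, honest),
    "junk": score_pseudo_gradient(params, junk),
}
# The honest peer scores positive; the junk contribution scores negative,
# so only the top-rated contribution would be aggregated into the update.
```

In the deployed system these loss-delta scores would then feed an OpenSkill rating per peer, so that competitiveness is tracked across time rather than from a single noisy measurement.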
Submission Number: 138