Learning Provably Correct Distributed Protocols Without Human Knowledge

Yujie Hui; Xiaoyi Lu; Andrew Perrault; Yang Wang

18 Sept 2025 (modified: 11 Feb 2026); Submitted to ICLR 2026; CC BY 4.0

Keywords: Multiagent, Distributed System, Reinforcement Learning

Abstract: Provably correct distributed protocols, a critical component of modern distributed systems, are highly challenging to design and have often required decades of human effort. These protocols allow multiple agents to coordinate and reach a common agreement in an environment with uncertainty and failures, e.g., agents observe only their own state, messages can be lost, and agents may crash. We formulate protocol design as a search problem over strategies in a game with imperfect information, with the desired correctness conditions specified in satisfiability modulo theories (SMT). However, standard methods for solving multi-agent games fail to learn correct protocols in this setting, even when the number of agents is small. We develop a learning process, which we call GGMS, that combines a specialized version of Monte Carlo Tree Search with a transformer action encoder, a global depth-first search to break out of local minima, and repeated feedback from a model checker. We show that, under mild assumptions, this process will not miss correct solutions. In experimental validation, we show that GGMS can learn correct protocols for larger settings than existing methods.

Supplementary Material: zip

Primary Area: applications to robotics, autonomy, planning

Submission Number: 10329
