Adaptive Resource Scheduling in Permissionless Sharded-Blockchains: A Decentralized Multiagent Deep Reinforcement Learning Approach
Abstract: Permissionless sharded-Blockchains have begun to emerge. However, the behaviors of individual miners in such systems still lack systematic formulation and experimental study. In this article, we interpret block mining in a permissionless sharded-Blockchain as a repeated $M$-player noncooperative game with finite actions, and propose a new multiagent deep reinforcement learning (MADRL) framework that allows the miners to maximize their profits by scheduling their resources across the shards without centralized coordination. We formulate the rewards and design a two-scale action space for each miner, which shrinks the action space and expedites convergence. We also propose a new MADRL model, named Rainbow-WoLF-PHC, which allows each miner to learn its resource allocation online and converge quickly to a mixed-strategy Nash equilibrium. Extensive experiments show that Rainbow-WoLF-PHC outperforms its alternatives in terms of convergence, stability, and profitability of the chosen actions. This work provides a promising design for an end-user-friendly permissionless sharded-Blockchain.
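To make the learning rule behind the model's name concrete, the following is a minimal sketch of plain tabular WoLF-PHC ("Win or Learn Fast" policy hill-climbing) for a single miner in a stateless repeated game. It is not the Rainbow-WoLF-PHC model proposed in this article, which is a deep variant. The class name, the hyperparameter values, the epsilon-uniform exploration, and the reading of each action as a shard index are all assumptions made for illustration.

```python
import random


class WoLFPHCAgent:
    """Tabular WoLF-PHC for one player in a stateless repeated game (illustrative)."""

    def __init__(self, n_actions, alpha=0.1, delta_win=0.01,
                 delta_lose=0.04, epsilon=0.1, seed=None):
        if n_actions < 2:
            raise ValueError("need at least two actions")
        self.n = n_actions
        self.alpha = alpha            # Q-value learning rate (assumed value)
        self.delta_win = delta_win    # small policy step when "winning"
        self.delta_lose = delta_lose  # larger policy step when "losing"
        self.epsilon = epsilon        # uniform exploration rate (assumption)
        self.q = [0.0] * n_actions
        self.pi = [1.0 / n_actions] * n_actions      # current mixed strategy
        self.avg_pi = [1.0 / n_actions] * n_actions  # running average strategy
        self.count = 0
        self.rng = random.Random(seed)

    def act(self):
        # Explore uniformly with probability epsilon, else sample from pi.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n)
        return self.rng.choices(range(self.n), weights=self.pi)[0]

    def update(self, action, reward):
        # 1) Q-learning update without a next-state term (single-state game).
        self.q[action] += self.alpha * (reward - self.q[action])

        # 2) Incrementally update the average strategy.
        self.count += 1
        for a in range(self.n):
            self.avg_pi[a] += (self.pi[a] - self.avg_pi[a]) / self.count

        # 3) "Winning" means the current strategy has a higher expected
        #    value than the average strategy under the current Q-values.
        v_cur = sum(p * q for p, q in zip(self.pi, self.q))
        v_avg = sum(p * q for p, q in zip(self.avg_pi, self.q))
        delta = self.delta_win if v_cur > v_avg else self.delta_lose

        # 4) Hill-climb: shift probability mass toward the greedy action
        #    while keeping pi a valid probability distribution.
        best = max(range(self.n), key=self.q.__getitem__)
        moved = 0.0
        for a in range(self.n):
            if a != best:
                step = min(self.pi[a], delta / (self.n - 1))
                self.pi[a] -= step
                moved += step
        self.pi[best] += moved
```

In a multi-miner simulation, each miner would hold its own agent, call `act()` to pick an action each round, and call `update()` with the profit it observed. The reward signal itself would come from the mining game and is not modeled here.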