Mobile Networks for Computer Go

Tristan Cazenave

Published: 2022, Last Modified: 30 Sept 2024. IEEE Trans. Games 2022. CC BY-SA 4.0

Abstract: The architecture of the neural networks used in deep reinforcement learning programs such as AlphaZero or Polygames has been shown to have a great impact on the performance of the resulting playing engines. For example, the use of residual networks gave a 600 Elo increase in the strength of AlphaGo. This article evaluates the benefit of mobile networks for the game of Go using supervised learning, as well as the use of a policy head and a value head that differ from the AlphaZero heads. The accuracy of the policy, the mean squared error of the value, the efficiency of the networks relative to their number of parameters, the playing speed, and the strength of the trained networks are evaluated.
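To give a rough sense of why parameter efficiency matters here, the sketch below counts convolution weights in a classic residual block and in a MobileNetV2-style inverted bottleneck block. It is an illustration under stated assumptions, not the article's exact architecture: the block layouts, the helper names (`conv_params`, `residual_block_params`, `mobile_block_params`), and the channel sizes (256, 64, 384) are hypothetical choices, and batch normalization parameters are ignored.

```python
def conv_params(in_ch, out_ch, k, groups=1):
    """Number of weights in a bias-free 2D convolution with a k x k kernel."""
    assert in_ch % groups == 0 and out_ch % groups == 0
    return (in_ch // groups) * out_ch * k * k


def residual_block_params(channels):
    """Classic residual block: two 3x3 convolutions at constant width."""
    return 2 * conv_params(channels, channels, 3)


def mobile_block_params(trunk, expanded):
    """Assumed MobileNetV2-style block: 1x1 expand, 3x3 depthwise, 1x1 project."""
    expand = conv_params(trunk, expanded, 1)
    # Depthwise convolution: one 3x3 filter per channel (groups == channels).
    depthwise = conv_params(expanded, expanded, 3, groups=expanded)
    project = conv_params(expanded, trunk, 1)
    return expand + depthwise + project


if __name__ == "__main__":
    # Channel sizes are illustrative assumptions, not values from the abstract.
    print("residual block, 256 channels:", residual_block_params(256))
    print("mobile block, trunk 64, expanded 384:", mobile_block_params(64, 384))
```

With these assumed sizes, the mobile block uses far fewer weights per block than the residual block. That kind of saving is what makes it natural to compare networks by their number of parameters.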