Keywords: Deep Learning, Game of Life
Abstract: Efforts to improve the learning abilities of neural networks have focused mostly on the role of optimization methods rather than on weight initializations. Recent findings, however, suggest that neural networks rely on lucky random initial weights of subnetworks called "lottery tickets" that converge quickly to a solution. To investigate how weight initializations affect performance, we examine small convolutional networks that are trained to predict $n$ steps of the two-dimensional cellular automaton Conway's Game of Life, whose update rules can be implemented efficiently in a small CNN. We find that networks of this minimal architecture rarely converge on this task; instead, networks require substantially more parameters to converge consistently. Furthermore, we find that the initial parameters from which gradient descent converges to a solution are sensitive to small perturbations, such as a single sign change. Finally, we observe a critical value $d_0$ such that training minimal networks on examples in which cells are alive with probability $d_0$ dramatically increases the chance of convergence to a solution. Our results are consistent with the lottery ticket hypothesis.
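As a minimal sketch of the abstract's claim that one Life step fits in a small CNN (our own illustrative weight construction, assuming PyTorch; it is not necessarily the exact construction used in the paper): with $n$ a cell's live-neighbor count and $s$ its own state, the cell is alive next step exactly when $u = 2n + s - 6 \in \{-1, 0, 1\}$, a condition two small convolutions with ReLUs can test.

```python
import torch
import torch.nn as nn

# Hand-set weights for one Game of Life step (illustrative construction).
# conv1 computes a = ReLU(u - 1) and b = ReLU(-u - 1), where u = 2n + s - 6;
# conv2 computes ReLU(1 - a - b), which is 1 iff u is in {-1, 0, 1}.
conv1 = nn.Conv2d(1, 2, kernel_size=3, padding=1)
conv2 = nn.Conv2d(2, 1, kernel_size=1)

with torch.no_grad():
    k = torch.full((3, 3), 2.0)  # weight 2 on each of the 8 neighbors...
    k[1, 1] = 1.0                # ...and 1 on the center, so k * x = 2n + s = u + 6
    conv1.weight.copy_(torch.stack([k, -k]).unsqueeze(1))
    conv1.bias.copy_(torch.tensor([-7.0, 5.0]))        # gives u - 1 and -u - 1
    conv2.weight.copy_(torch.tensor([[[[-1.0]], [[-1.0]]]]))
    conv2.bias.copy_(torch.tensor([1.0]))              # gives 1 - a - b

def life_step(board):
    """board: (1, 1, H, W) float tensor of 0s and 1s; returns the next state."""
    return torch.relu(conv2(torch.relu(conv1(board))))

# Example: a "blinker" oscillator flips between a horizontal and a vertical bar.
board = torch.zeros(1, 1, 5, 5)
board[0, 0, 2, 1:4] = 1.0
print(life_step(board)[0, 0])
```

The zero padding treats cells beyond the board edge as dead; predicting $n$ steps amounts to composing `life_step` $n$ times, which is the depth the paper's minimal architecture scales with.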
One-sentence Summary: We show that Conway's Game of Life can be represented by a simple neural network, yet find that traditional gradient descent methods do not often converge on a solution without significant overparameterization.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=cjnARGu3LQ