Not All Lotteries Are Made Equal
Keywords: lottery ticket hypothesis, sparsity, pruning
Abstract: The Lottery Ticket Hypothesis (LTH) states that a reasonably sized neural network contains a subnetwork that, when trained from the same initialization, performs at least as well as its dense counterpart. To the best of our knowledge, prior work on the LTH has only investigated overparameterized models, and the emergence of winning tickets is often attributed to the initial model being large, i.e., to a dense sampling of tickets. In this blog post, we present evidence that challenges this notion of the LTH. We investigate the effect of model size on the ease of finding winning tickets. Through this work, we show that winning tickets are easier to find for smaller models.
ICLR Paper: https://arxiv.org/abs/2010.07611, https://arxiv.org/abs/1803.03635