Abstract: We introduce adaptive weight sparsity, an algorithm that allows a neural network to learn a sparse connection pattern during training. We demonstrate that the proposed algorithm yields performance benefits across a wide variety of tasks and network architectures, improving on state-of-the-art results for recurrent networks of comparable size. We show that adaptive weight sparsity outperforms traditional pruning-based approaches to learning sparse configurations on convolutional and recurrent networks. We offer insights into the algorithm's behavior, demonstrating that training-time adaptivity is crucial to the success of the method and uncovering an interpretable evolution toward small-world network structures.
Keywords: deep learning, sparsity, adaptive methods
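The abstract does not specify the update rule, but a minimal sketch of what training-time adaptive connectivity could look like follows. This is an assumption-laden illustration, not the paper's method: it uses a PyTorch layer with a binary mask and a magnitude-prune / random-regrow rewiring step (in the spirit of prune-and-regrow schemes such as Sparse Evolutionary Training); the names `AdaptiveSparseLinear` and `rewire` are hypothetical.

```python
import torch
import torch.nn as nn


class AdaptiveSparseLinear(nn.Module):
    """Linear layer with a binary connectivity mask that is rewired
    during training. Hypothetical sketch: the paper's actual update
    rule is not given in the abstract."""

    def __init__(self, in_features, out_features, density=0.1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Start from a random sparse connectivity pattern at the target density.
        mask = (torch.rand(out_features, in_features) < density).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Only masked-in weights participate in the computation.
        return x @ (self.weight * self.mask).t()

    @torch.no_grad()
    def rewire(self, fraction=0.2):
        """Drop the weakest active connections and regrow the same number
        at random inactive positions, keeping the density fixed."""
        active = self.mask.bool()
        n_rewire = int(fraction * active.sum().item())
        if n_rewire == 0:
            return
        # Prune: zero out the smallest-magnitude active weights
        # (inactive slots are set to +inf so topk never picks them).
        magnitudes = torch.where(
            active, self.weight.abs(), torch.full_like(self.weight, float("inf"))
        )
        drop_idx = torch.topk(magnitudes.view(-1), n_rewire, largest=False).indices
        self.mask.view(-1)[drop_idx] = 0.0
        # Regrow: activate randomly chosen currently-inactive positions.
        inactive_idx = (self.mask.view(-1) == 0).nonzero(as_tuple=True)[0]
        grow_idx = inactive_idx[torch.randperm(len(inactive_idx))[:n_rewire]]
        self.mask.view(-1)[grow_idx] = 1.0
        # Fresh connections start at zero so training can shape them.
        self.weight.view(-1)[grow_idx] = 0.0
```

In a scheme of this kind, `rewire()` would be called periodically during training (e.g. every few hundred optimizer steps), so the connectivity pattern and the weights are learned jointly rather than fixed by a one-shot pruning pass, which is the contrast with pruning-based approaches that the abstract draws.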