- Abstract: Deep learning relies on resource-heavy linear algebra operations which can be prohibitively expensive when deploying to constrained embedded and mobile devices, or even when training large-scale networks. One way to reduce a neural network's resource requirements is to sparsify its weight matrices - a process often referred to as pruning. It is typically achieved by removing least important weights as measured by some salience criterion, with pruning by magnitude being the most popular option. This, however, often makes close to random judgments. In this paper we aim to closely investigate the concept of model weight importance, with a particular focus on the magnitude criterion and its most suitable substitute. To this end we identify a suitable Statistical framework and derive deep model parameter asymptotic theory to use with it. Thus, we derive a statistically-grounded pruning criterion which we compare with the magnitude pruning both qualitatively and quantitatively. We find this criterion to better capture parameter salience, by accounting for its estimation uncertainty. This results in improved performance and easier post-pruned re-training.