Abstract: Our brain executes very sparse computation, allowing for great speed and energy savings. Deep neural networks can also be made to exhibit high levels of sparsity without significant accuracy loss. As their size grows, it is becoming imperative that we use sparsity to improve their efficiency. This is a challenging task because the memory systems and SIMD operations that dominate todays CPUs and GPUs do not lend themselves easily to the irregular data patterns sparsity introduces. This talk will survey the role of sparsity in neural network computation, and the parallel algorithms and hardware features that nevertheless allow us to make effective use of it.
Loading