Automatic Differentiation of Parallelised Convolutional Neural Networks - Lessons from Adjoint PDE Solvers

Jan Hückelheim, Paul Hovland

Oct 28, 2017 (modified: Oct 28, 2017) NIPS 2017 Workshop Autodiff Submission
  • Abstract: Convolutional Neural Networks often consist of several layers whose data access patterns closely resemble those of structured-mesh solvers for partial differential equations (PDEs). Past work by us and others showed that the reverse-mode automatic differentiation of such solvers is challenging if the forward model uses shared-memory parallelisation. This is because communication between threads happens implicitly through shared memory locations and is not always detectable at compile time. This work presents an overview of past results on the relationship between data access in the forward and reverse computations, and of methods to transform shared-memory-parallel forward models into equally scalable reverse models, with a case study on multi-core CPUs and Intel Xeon Phi processors.
  • TL;DR: We summarise previous work on the automatic differentiation of shared-memory parallel code, and point out applications in Convolutional Neural Networks.
  • Keywords: Convolutional Neural Networks, Automatic Differentiation, Parallelisation
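
The data-access relationship described in the abstract can be illustrated with a small sketch. It assumes a 1D "valid" convolution as a stand-in for a stencil or convolutional layer; the function names `forward`, `adjoint_scatter`, and `adjoint_gather` are hypothetical and are not taken from the paper. In the forward loop each iteration writes only its own output entry, so the iterations could be shared among threads. A naive reverse-mode version of the same loop turns every read into an increment, so different iterations update the same adjoint entry and would race if run in parallel. Re-indexing the adjoint loop over the input entries is one possible way to restore exclusive writes; it is shown here only as an illustration of the general idea, and the loops are written serially for clarity.

```python
import numpy as np

def forward(x, w):
    """Gather-style stencil: iteration i reads x[i..i+k-1] and writes only y[i]."""
    n, k = len(x), len(w)
    m = n - k + 1
    y = np.zeros(m)
    for i in range(m):          # each i writes a distinct y[i]: safe to parallelise
        for j in range(k):
            y[i] += w[j] * x[i + j]
    return y

def adjoint_scatter(ybar, w, n):
    """Naive reverse mode: same loop nest, every forward read becomes an increment."""
    k = len(w)
    m = n - k + 1
    xbar = np.zeros(n)
    for i in range(m):          # different i increment the same xbar[i + j]:
        for j in range(k):      # a data race if this loop is run in parallel
            xbar[i + j] += w[j] * ybar[i]
    return xbar

def adjoint_gather(ybar, w, n):
    """Re-indexed adjoint: iteration p reads ybar and writes only xbar[p]."""
    k = len(w)
    m = n - k + 1
    xbar = np.zeros(n)
    for p in range(n):          # each p writes a distinct xbar[p]: safe to parallelise
        for j in range(k):
            i = p - j           # forward iteration that read x[p] through weight w[j]
            if 0 <= i < m:      # skip contributions that fall outside the output range
                xbar[p] += w[j] * ybar[i]
    return xbar
```

Both adjoint variants compute the same values; they differ only in which loop index owns each write. The bounds check in `adjoint_gather` is the price of the re-indexing near the domain boundary.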