Efficient and portable GEMM-based convolution operators for deep neural network training on multicore processors
Abstract: Highlights
• We present DECONVGEMM, an operator integrating the col2im transform and the BLIS GEMM.
• We present a reindex transform and a transposed CONVGEMM for gradient computation.
• We provide implementation descriptions of the novel CONVGEMM/DECONVGEMM operators.
• We integrate our operators in PyDTNN: Python Distributed Training of Neural Networks.
• We evaluate the performance and memory savings of the CONVGEMM/DECONVGEMM operators.
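For readers unfamiliar with the terminology, the following is a minimal pure-Python sketch of the textbook GEMM-based convolution that operators of this kind build on. It is not the authors' implementation: it materializes the explicit im2col matrix, whereas the highlights describe operators that integrate the transform into the BLIS GEMM. All function names, the C x H x W layout, and the restriction to a single image with stride 1 and no padding are assumptions made for illustration.

```python
# Illustrative sketch only: explicit im2col/col2im around a naive GEMM.
# Assumes one image in C x H x W layout, stride 1, no padding.

def im2col(x, kh, kw):
    """Unfold x (C x H x W) into a (C*kh*kw) x (Ho*Wo) matrix."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    ho, wo = h - kh + 1, w - kw + 1
    cols = []
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                cols.append([x[ci][oi + i][oj + j]
                             for oi in range(ho) for oj in range(wo)])
    return cols

def col2im(cols, c, h, w, kh, kw):
    """Adjoint of im2col: scatter-add matrix entries back into C x H x W."""
    ho, wo = h - kh + 1, w - kw + 1
    x = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                for oi in range(ho):
                    for oj in range(wo):
                        x[ci][oi + i][oj + j] += cols[row][oi * wo + oj]
                row += 1
    return x

def matmul(a, b):
    """Naive GEMM stand-in for an optimized library call."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def conv_forward(x, filt, kh, kw):
    """Forward pass: filter matrix (Kn x C*kh*kw) times im2col(x)."""
    return matmul(filt, im2col(x, kh, kw))

def conv_input_grad(dy, filt, c, h, w, kh, kw):
    """Input gradient: col2im applied to filt^T times dy (Kn x Ho*Wo)."""
    return col2im(matmul(transpose(filt), dy), c, h, w, kh, kw)

x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 channel, 3 x 3
filt = [[1, 1, 1, 1]]                     # one 2 x 2 filter of ones
y = conv_forward(x, filt, 2, 2)           # sums over each 2 x 2 window
dx = conv_input_grad([[1, 1, 1, 1]], filt, 1, 3, 3, 2, 2)
```

In this sketch the intermediate matrix holds kh*kw copies of most input elements, which is the memory overhead that an operator fusing the transform with the GEMM would be expected to avoid.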