Abstract: The Preconditioned Conjugate Gradient (PCG) method is a widely used iterative method for solving large linear systems of equations. Pipelined variants of PCG expose independent computations in the PCG method and overlap them with non-blocking allreduces. We have developed a novel pipelined PCG algorithm called PIPE-sCG (Pipelined s-step Conjugate Gradient) that provides a large overlap of global communication and computation at high core counts on distributed-memory CPU systems. Our method achieves this overlap by introducing new recurrence computations. We have also developed a preconditioned version of PIPE-sCG. The advantages of our methods are that they do not introduce any extra preconditioner or sparse matrix-vector product kernels in order to provide the overlap, and that they can work with the preconditioned, unpreconditioned, and natural norms of the residual, in contrast to state-of-the-art methods. We compare our method with other pipelined CG methods on Poisson problems and demonstrate that it gives the lowest runtimes. Our method gives up to 2.9x speedup over the PCG method, 2.15x speedup over the PIPECG method, and 1.2x speedup over the PIPECG-OATI method at large core counts.
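For orientation, the baseline that pipelined variants reorganize can be sketched as follows. This is a minimal, serial, textbook PCG in plain Python, not PIPE-sCG and not the authors' implementation. The function names (`poisson_1d_matvec`, `jacobi`, `pcg`), the 1D Poisson test matrix, and the Jacobi preconditioner are illustrative assumptions. The comments mark the dot products that would become global allreduces in a distributed-memory run, which is the communication that pipelined methods try to overlap with computation.

```python
def poisson_1d_matvec(x):
    """Apply the 1D Poisson matrix tridiag(-1, 2, -1) to x (illustrative test operator)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def jacobi(r):
    """Jacobi preconditioner for the matrix above: divide by the diagonal (2)."""
    return [ri / 2.0 for ri in r]


def dot(u, v):
    # On a distributed system each call here would require a global allreduce.
    return sum(a * b for a, b in zip(u, v))


def pcg(matvec, b, apply_prec, tol=1e-10, max_iter=1000):
    """Textbook PCG starting from x = 0. Returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)                # residual r = b - A x with x = 0
    u = apply_prec(r)          # preconditioned residual u = M^{-1} r
    p = list(u)                # first search direction
    gamma = dot(r, u)          # r^T u, the squared natural norm of the residual
    b_norm = dot(b, b) ** 0.5
    for k in range(max_iter):
        # Convergence test on the unpreconditioned residual norm.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        s = matvec(p)                      # sparse matrix-vector product
        alpha = gamma / dot(s, p)          # reduction; blocks the next update
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        u = apply_prec(r)                  # preconditioner application
        gamma_new = dot(r, u)              # second reduction per iteration
        beta = gamma_new / gamma
        p = [ui + beta * pi for ui, pi in zip(u, p)]
        gamma = gamma_new
    return x, max_iter


if __name__ == "__main__":
    n = 50
    b = poisson_1d_matvec([1.0] * n)       # right-hand side whose exact solution is all ones
    x, iters = pcg(poisson_1d_matvec, b, jacobi)
    print(iters, max(abs(xi - 1.0) for xi in x))
```

In this sketch every vector update waits on the result of a dot product, so on many cores the reductions sit on the critical path. Pipelined formulations rearrange the recurrences so that matrix-vector and preconditioner work can proceed while a non-blocking reduction is still in flight.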