Estimating the parallel performance of IQMR method for unsymmetric large and sparse linear systems

Published: 01 Jan 2000, Last Modified: 06 Feb 2025, Venue: ICPADS Workshops 2000, License: CC BY-SA 4.0
Abstract: For solving linear systems of equations with unsymmetric coefficient matrices, we have proposed an improved version of the quasi-minimal residual (IQMR) method that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent and the communication time required for the inner products can be overlapped efficiently with computation time. In this paper, we mainly present a qualitative analysis of the parallel performance under the Store-and-Forward and Cut-Through routing schemes and on topologies such as the ring, mesh, hypercube, and balanced binary tree. We show theoretically that the hypercube topology gives the best parallel performance with regard to parallel efficiency, speed-up, and runtime. We also study theoretical aspects of the overlapping effect in the algorithm. Some timing results are shown to verify the theoretical studies.
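To make the routing and topology comparison concrete, the following minimal sketch evaluates the standard textbook cost models for sending one message across the network diameter. It is not taken from the paper: the function names, the parameters `t_s` (startup time), `t_h` (per-hop time), and `t_w` (per-word transfer time), and the choice of a square mesh without wraparound and a tree with processors only at the leaves are all assumptions made for illustration.

```python
import math


def diameter(topology: str, p: int) -> int:
    """Worst-case hop count between two of p processors (assumed textbook values)."""
    if topology == "ring":
        return p // 2
    if topology == "mesh":
        # Assumption: square 2-D mesh without wraparound links.
        side = math.isqrt(p)
        if side * side != p:
            raise ValueError("mesh needs a perfect-square processor count")
        return 2 * (side - 1)
    if topology in ("hypercube", "tree"):
        if p < 1 or p & (p - 1):
            raise ValueError("processor count must be a power of two")
        log_p = p.bit_length() - 1
        # Assumption: balanced binary tree with the p processors at the leaves,
        # so a worst-case message climbs to the root and back down.
        return log_p if topology == "hypercube" else 2 * log_p
    raise ValueError(f"unknown topology: {topology}")


def store_and_forward(hops: int, m: int, t_s: float, t_h: float, t_w: float) -> float:
    """Each intermediate node receives the whole m-word message before forwarding it."""
    return t_s + (m * t_w + t_h) * hops


def cut_through(hops: int, m: int, t_s: float, t_h: float, t_w: float) -> float:
    """The message is pipelined through the hops, so m * t_w is paid only once."""
    return t_s + hops * t_h + m * t_w


if __name__ == "__main__":
    p, m = 16, 100                      # hypothetical machine size and message length
    t_s, t_h, t_w = 10.0, 1.0, 0.5      # hypothetical timing constants
    for name in ("ring", "mesh", "hypercube", "tree"):
        d = diameter(name, p)
        print(name, d,
              store_and_forward(d, m, t_s, t_h, t_w),
              cut_through(d, m, t_s, t_h, t_w))
```

Under these assumed models, the hypercube has the smallest diameter of the four topologies for 16 processors, and cut-through routing removes the dependence of the per-word cost on the hop count. The paper's own analysis may use different models and constants.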