Using long vector extensions for MPI reductions

Published: 01 Jan 2022, Last Modified: 14 May 2025. Venue: Parallel Comput. 2022. License: CC BY-SA 4.0
Abstract: Highlights
• Design and investigation of a vector-based reduction operation for MPI reductions.
• Implementation using Intel AVX extensions and Arm SVE to demonstrate the efficiency of our vectorized reduction operation.
• Experiments with MPI benchmarks, a performance tool, and HPC and deep learning applications.
• Experiments on different architectures (x86 and aarch64) and processors, including Intel Xeon Gold, AMD Zen 2, and Arm A64FX.
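To make the idea of a vector-based reduction concrete, the following is a minimal sketch of an element-wise float sum, the kind of kernel that sits behind an MPI_SUM reduction. It is not the authors' implementation: the function name `reduce_sum_float` is hypothetical, only 256-bit AVX is shown (no AVX-512 or SVE path), and the code falls back to a plain scalar loop when the compiler does not define `__AVX__`.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>   /* AVX intrinsics, only when the target supports them */
#endif

/* Hypothetical helper (not from the paper): inout[i] += in[i] for
 * i in [0, count), mirroring the element-wise semantics of MPI_SUM
 * on single-precision floats. */
void reduce_sum_float(const float *in, float *inout, size_t count)
{
    assert(count == 0 || (in != NULL && inout != NULL));

    size_t i = 0;
#if defined(__AVX__)
    /* Main loop: process 8 floats (256 bits) per iteration.
     * Unaligned loads/stores are used, so no alignment is assumed. */
    for (; i + 8 <= count; i += 8) {
        __m256 a = _mm256_loadu_ps(in + i);
        __m256 b = _mm256_loadu_ps(inout + i);
        _mm256_storeu_ps(inout + i, _mm256_add_ps(a, b));
    }
#endif
    /* Scalar tail (or the whole array when AVX is unavailable). */
    for (; i < count; ++i) {
        inout[i] += in[i];
    }
}
```

Building with a flag such as `-mavx` on GCC or Clang enables the vector path; without it the same function still produces the same result through the scalar loop. An SVE variant would typically replace the fixed-width main loop and scalar tail with a single predicated loop, which is omitted here.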