Efficient Parallel Out-of-Core Matrix Transposition

Published: 01 Jan 2003, Last Modified: 13 Nov 2024, Venue: CLUSTER 2003, License: CC BY-SA 4.0
Abstract: This paper addresses the problem of parallel transposition of large out-of-core arrays. Although algorithms for out-of-core matrix transposition have been widely studied, previously proposed algorithms have sought to minimize the number of I/O operations and the in-memory permutation time. We propose an algorithm that directly targets the overall transposition time. The I/O characteristics of the system are used to determine the read, write, and communication block sizes such that the total execution time is minimized. We also provide a solution to the array redistribution problem for arrays on disk. The solutions to the sequential transposition problem and the parallel array redistribution problem are then combined to obtain an algorithm for the parallel out-of-core transposition problem.
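To make the setting concrete, the following is a minimal sequential sketch of tiled out-of-core transposition. It is not the algorithm proposed in the paper, and it does not cover the parallel redistribution step. It assumes a row-major matrix of 8-byte floats stored in a flat binary file; the function name `transpose_out_of_core` and the `tile` parameter are hypothetical. The `tile` parameter only stands in for the kind of block size that the abstract says is derived from the I/O characteristics of the system.

```python
from array import array

ITEM = 8  # assumed element size in bytes (float64, array typecode "d")


def transpose_out_of_core(src_path, dst_path, rows, cols, tile):
    """Transpose a row-major rows x cols float64 matrix stored in src_path.

    Illustrative sketch: at most one tile (tile x tile elements) is held
    in memory at a time; everything else stays on disk.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.truncate(rows * cols * ITEM)  # pre-size the output file
        for r0 in range(0, rows, tile):
            r1 = min(r0 + tile, rows)
            for c0 in range(0, cols, tile):
                c1 = min(c0 + tile, cols)
                # Read the tile: one contiguous read per source row segment.
                block = []
                for r in range(r0, r1):
                    src.seek((r * cols + c0) * ITEM)
                    seg = array("d")
                    seg.frombytes(src.read((c1 - c0) * ITEM))
                    block.append(seg)
                # Write the transposed tile: one contiguous write per
                # segment of an output row (an input column).
                for c in range(c0, c1):
                    out = array("d", (block[r - r0][c - c0] for r in range(r0, r1)))
                    dst.seek((c * rows + r0) * ITEM)
                    dst.write(out.tobytes())
```

In this sketch a larger `tile` means fewer but longer contiguous reads and writes at the cost of more memory, which is the kind of trade-off a block-size selection would have to balance against measured disk behavior.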