Generating Local Address and Communication Sets for Data-Parallel Programs

Published: 1995, Last Modified: 10 Feb 2025J. Parallel Distributed Comput. 1995EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Generating local addresses and communication sets is an important issue in distributed-memory implementations of data-parallel languages such as High Performance Fortran. We demonstrate a storage scheme for an array A affinely aligned to a template that is distributed across p processors with a cyclic(k) distribution that does not waste any storage, and show that, under this storage scheme, the local memory access sequence of any processor for a computation involving the regular section A(ℓ:h:s) is characterized by a finite state machine of at most k states. We present fast algorithms for computing the essential information about these state machines, and we extend the framework to handle multidimensional arrays. We also show how to generate communication sets using the state machine approach. Performance results show that this solution requires very little runtime overhead and acceptable preprocessing time.
Loading