Tiling Multidimensional Iteration Spaces for Multicomputers

Published: 01 Jan 1992, Last Modified: 13 Nov 2024. Venue: J. Parallel Distributed Comput. 1992. License: CC BY-SA 4.0
Abstract: This paper addresses the problem of compiling perfectly nested loops for multicomputers (distributed-memory machines). The relatively high communication start-up costs in these machines render frequent communication very expensive. Motivated by this concern, we present a method of aggregating a number of loop iterations into tiles, where the tiles execute atomically: a processor executing the iterations belonging to a tile receives all the data it needs before executing any one of the iterations in the tile, executes all the iterations in the tile, and then sends the data needed by other processors. Since synchronization is not allowed during the execution of a tile, partitioning the iteration space into tiles must not result in deadlock. We first show the equivalence between the problem of finding partitions and the problem of determining the cone for a given set of dependence vectors. We then present an approach to partitioning the iteration space into deadlock-free tiles so that communication volume is minimized. In addition, we discuss a method for optimizing the size of tiles for nested loops on multicomputers. This work differs from other approaches to tiling in that we present a method of optimizing the grain size of tiles for multicomputers.
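To make the deadlock-freedom requirement concrete, the following is a minimal sketch of a legality test commonly used in the tiling literature: a tiling defined by a set of tile-boundary normals is atomic and deadlock-free when every dependence vector has a non-negative dot product with every normal, so that no two tiles wait on each other. This is an illustration under stated assumptions, not necessarily the paper's own formulation or notation. The function names `is_deadlock_free` and `tile_of`, the integer normals, and the sample dependence vectors are all hypothetical.

```python
from typing import Sequence, Tuple

Vector = Sequence[int]


def dot(a: Vector, b: Vector) -> int:
    """Integer dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_deadlock_free(tile_normals: Sequence[Vector],
                     dependences: Sequence[Vector]) -> bool:
    """Return True if h . d >= 0 for every normal h and dependence d.

    Assumed legality condition: when it holds, data only flows in the
    non-negative direction across each family of tile boundaries, so
    tiles can run atomically without cyclic waits.
    """
    return all(dot(h, d) >= 0 for h in tile_normals for d in dependences)


def tile_of(point: Vector,
            tile_normals: Sequence[Vector],
            tile_sizes: Sequence[int]) -> Tuple[int, ...]:
    """Map an iteration point to its tile coordinates, floor(h . p / size)."""
    return tuple(dot(h, point) // s for h, s in zip(tile_normals, tile_sizes))


# Hypothetical 2-D loop nest with dependences (1, 0) and (1, -1).
deps = [(1, 0), (1, -1)]

rectangular = [(1, 0), (0, 1)]   # axis-aligned tiles
skewed = [(1, 0), (1, 1)]        # second boundary family is skewed

print(is_deadlock_free(rectangular, deps))
print(is_deadlock_free(skewed, deps))
print(tile_of((5, 3), skewed, (4, 4)))
```

In this example the axis-aligned tiling fails the test because the dependence (1, -1) crosses the second boundary family in the negative direction, while the skewed normals keep both dependences inside the cone they span. Choosing the tile sizes (here 4 by 4) is the separate grain-size question the abstract mentions.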