Abstract: Stencil computations are ubiquitous in modern grid-based physical simulations. In this paper, we present FOURST – a compiler to generate programs computing time iterated linear periodic and aperiodic stencil computations with fast Fourier transform methods. This paper outlines the design and implementation of the code generation approach in FOURST, to automatically generate FFT-based stencil solvers. We present experimental results on the state-of-the-art Ookami supercomputer housing Fujitsu A64FX and Intel Skylake processors, to study the performance of FOURST and a state-of-the-art tiling-based optimized code generator PLuTo on various stencil shapes and varying the number of time iterations. We discuss the performance profiles, and limitations, of both approaches on high-end modern hardware.
0 Replies
Loading