Towards Scalable Unstructured Mesh Computations on Shared Memory Many-Cores

Haozhong Qiu, Chuanfu Xu, Jianbin Fang, Liang Deng, Jian Zhang, Qingsong Wang, Yue Ding, Zhe Dai, Yonggang Che, Shizhao Chen, Jie Liu

Published: 01 Jan 2024, Last Modified: 11 Nov 2025, Venue: PPoPP 2024, License: CC BY-SA 4.0

Abstract: Exploiting shared memory parallelism in unstructured mesh applications is highly challenging because of data conflicts and data dependences. Prior approaches are neither general nor scalable on emerging many-core processors. This paper presents a general and scalable shared memory approach for unstructured mesh computations. We recursively divide and reorder an unstructured mesh to construct a task dependency tree (TDT), which exposes massive parallelism while respecting both data conflicts and data dependences. We propose two recursion strategies for the TDT to support popular programming models on both CPUs and GPUs. We evaluate our approach by applying it to an industrial unstructured Computational Fluid Dynamics (CFD) software package. Experimental results show that our approach significantly outperforms prior shared memory approaches, delivering up to 8.1× performance improvement over engineer-tuned implementations.
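To make the idea of recursively dividing a mesh into a tree of tasks more concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a toy mesh given as a list of cell indices plus a list of edges (pairs of neighbouring cells), splits the cells in half by index order, and assigns every edge that crosses the split to the parent node. The names `Node`, `build_tree`, `execute`, `collect_edges`, and `leaf_size` are hypothetical. Because the two children of a node touch disjoint cells, their edge loops could run as independent tasks, while the parent's crossing edges must wait for both children; the sketch runs everything serially for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Edge = Tuple[int, int]


@dataclass
class Node:
    """One task in the (hypothetical) dependency tree."""
    cells: List[int]                 # cells owned by this subtree
    edges: List[Edge]                # edges processed by this node's own task
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(cells: List[int], edges: List[Edge], leaf_size: int = 2) -> Node:
    """Recursively bisect the cells; edges crossing the cut stay at the parent."""
    if len(cells) <= leaf_size:
        return Node(cells, edges)    # leaf: process all remaining edges here
    mid = len(cells) // 2
    left_cells, right_cells = cells[:mid], cells[mid:]
    left_set = set(left_cells)
    left_edges, right_edges, cut_edges = [], [], []
    for a, b in edges:
        a_left, b_left = a in left_set, b in left_set
        if a_left and b_left:
            left_edges.append((a, b))
        elif not a_left and not b_left:
            right_edges.append((a, b))
        else:
            cut_edges.append((a, b))  # touches both halves: defer to parent
    return Node(
        cells,
        cut_edges,
        build_tree(left_cells, left_edges, leaf_size),
        build_tree(right_cells, right_edges, leaf_size),
    )


def execute(node: Optional[Node], degree: List[int]) -> None:
    """Edge-based loop that updates both end cells (a typical conflict pattern).

    The two child calls write to disjoint cells, so they could be run
    concurrently; the parent's own edges run only after both finish.
    """
    if node is None:
        return
    execute(node.left, degree)
    execute(node.right, degree)
    for a, b in node.edges:
        degree[a] += 1
        degree[b] += 1


def collect_edges(node: Optional[Node]) -> List[Edge]:
    """Gather every edge stored in the tree (used to check full coverage)."""
    if node is None:
        return []
    return collect_edges(node.left) + collect_edges(node.right) + node.edges


if __name__ == "__main__":
    cells = list(range(8))                      # a 1D chain of 8 cells
    edges = [(i, i + 1) for i in range(7)]
    tree = build_tree(cells, edges)
    degree = [0] * len(cells)
    execute(tree, degree)
    print(degree)
```

A real mesh would need a geometric or graph-based partitioner rather than a split by index, and the separator tasks themselves may need further treatment to avoid becoming a serial bottleneck; the sketch only shows why sibling subtrees are conflict-free.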