Abstract: Modern shared-memory platforms embrace the Non-Uniform Memory Access (NUMA) architecture: their shared memory is physically distributed yet cache-coherent. This paper explores the feasibility of a shared-memory graph processing engine for NUMA platforms, inspired by designs that target zero-sharing platforms. The work exploits the characteristics of two processing modes, synchronous and asynchronous, in the context of shared-memory NUMA platforms. Depending on the algorithm, the phase of execution, and the graph topology, each mode holds distinct advantages over the other. We then explore a hybrid solution that combines synchronous and asynchronous processing within the same graph computation task and harnesses the optimizations this enables. An extensive evaluation using graphs with billions of edges, together with empirical comparisons against several state-of-the-art solutions, demonstrates the performance advantages of our design.
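To make the distinction between the two processing modes concrete, the following is a minimal single-threaded sketch, not the engine described in the paper. It assumes a toy problem (minimum-label propagation for connected components) and hypothetical function names (`sync_min_label`, `async_min_label`); it models no NUMA placement or parallelism. In the synchronous variant every vertex reads only values from the previous iteration, while in the asynchronous variant an update is visible to later vertices within the same sweep.

```python
def sync_min_label(adj):
    """Synchronous mode (illustrative): each pass reads only the labels
    produced by the previous pass, then swaps in the new labels at once."""
    labels = {v: v for v in adj}
    passes = 0
    while True:
        passes += 1
        new_labels = {
            v: min([labels[v]] + [labels[u] for u in adj[v]]) for v in adj
        }
        if new_labels == labels:  # fixed point reached
            return labels, passes
        labels = new_labels


def async_min_label(adj):
    """Asynchronous mode (illustrative): a vertex's new label is written
    in place, so vertices visited later in the same pass already see it."""
    labels = {v: v for v in adj}
    passes = 0
    while True:
        passes += 1
        changed = False
        for v in adj:
            best = min([labels[v]] + [labels[u] for u in adj[v]])
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:  # a full pass with no updates
            return labels, passes


if __name__ == "__main__":
    # Hypothetical input: an undirected path graph 0 - 1 - 2 - 3 - 4.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(sync_min_label(path))
    print(async_min_label(path))
```

On this path graph both variants reach the same labeling, but the asynchronous sweep needs fewer passes because label 0 travels down the whole path within a single pass. That favorable outcome depends on the visiting order and the topology, which is in line with the observation that neither mode is uniformly better.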