Abstract: Large graphs with billions of nodes and edges are increasingly common, calling for new kinds of scalable computation frameworks. Although popular, distributed approaches can be expensive to build, or require many resources to manage or tune. State-of-the-art approaches such as GraphChi and TurboGraph recently have demonstrated that a single machine can efficiently perform advanced computation on billion-node graphs. Although fast, they both use sophisticated data structures, memory management, and optimization techniques. We propose a minimalist approach that forgoes such complexities, by leveraging the memory mapping capability found on operating systems. Our experiments on large datasets, such as a 1.5 billion edge Twitter graph, show that our streamlined approach achieves up to 26 times faster than GraphChi, and comparable to TurboGraph. We contribute our crucial insight that by leveraging memory mapping, a fundamental operating system capability, we can outperform the latest graph computation techniques.
0 Replies
Loading