Abstract: We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-obliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log n log log n), which improves on previous bounds for deterministic sample sort. Given a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, our algorithm can be scheduled effectively on these p cores in a cache-oblivious manner.