Abstract: We introduce a novel GPU path rendering method based on scan-line rasterization, which is highly work-efficient but traditionally considered as GPU hostile. Our method is parallelized over boundary fragments, i.e., pixels directly intersecting the path boundary. Non-boundary pixels are processed in bulk as horizontal spans like in CPU scanline rasterizers, which saves a significant amount of winding number computation workload. The distinction also allows the majority of our algorithmic steps to focus on boundary fragments only, which leads to highly balanced workload among the GPU threads. In addition, we develop a ray shooting pattern that minimizes the global data dependency when computing winding numbers at anti-aliasing samples. This allows us to shift the majority of winding-number-related workload to the same kernel that consumes its result, which saves a significant amount of GPU memory bandwidth. Experiments show that our method gives a consistent 2.5X speedup over state-of-the-art alternatives for high-quality rendering at Ultra HD resolution, which can increase to more than 30X in extreme cases. We can also get a consistent 10X speedup on animated input.
0 Replies
Loading