Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation

Ziyu Guo, Xipeng Shen

Published: 2011, Last Modified: 06 Nov 2024, LCPC 2011, CC BY-SA 4.0

Abstract: GPU-to-CPU translation may extend the execution of Graphics Processing Unit (GPU) programs to multi-/many-core CPUs, and hence enable cross-device task migration and promote whole-system synergy. This paper describes some of our findings on the treatment of GPU synchronizations during the translation process. We show that careful dependence analysis may allow a fine-grained treatment of synchronizations and reveal redundant computation at the instruction-instance level. Based on thread-level dependence graphs, we present a method that enables such fine-grained treatment automatically. Experiments demonstrate that, compared to existing translations, the new approach can yield speedups by integer factors.
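
To make the idea of treating a synchronization during translation concrete, the following is a minimal illustrative sketch and not the method presented in the paper. It assumes a hypothetical GPU kernel in which each thread writes `shared[tid]`, waits at a barrier, and then reads only its own `shared[tid]`. A coarse-grained translation might split the serialized thread loop at the barrier (loop fission); if a dependence analysis were to show that no thread reads a value written by another thread, the two loops could be fused and the shared buffer dropped. The names `N_THREADS`, `coarse_translation`, and `fused_translation` are assumptions introduced for this example only.

```python
# Illustrative simulation only: logical GPU threads are serialized into CPU loops.
N_THREADS = 8  # hypothetical thread-block size


def coarse_translation(a):
    """Coarse-grained treatment: split the thread loop at the barrier."""
    shared = [0.0] * N_THREADS
    out = [0.0] * N_THREADS
    # Region before the barrier, executed for every logical thread.
    for tid in range(N_THREADS):
        shared[tid] = a[tid] * 2.0
    # The barrier becomes the boundary between the two loops.
    for tid in range(N_THREADS):
        out[tid] = shared[tid] + 1.0
    return out


def fused_translation(a):
    """Finer-grained treatment, valid here only because thread tid reads
    just shared[tid], so there is no cross-thread dependence."""
    out = [0.0] * N_THREADS
    for tid in range(N_THREADS):
        s = a[tid] * 2.0  # shared[tid] is effectively private to this thread
        out[tid] = s + 1.0
    return out


if __name__ == "__main__":
    data = [float(i) for i in range(N_THREADS)]
    print(coarse_translation(data) == fused_translation(data))
```

If the second region instead read a neighbor's element, such as `shared[(tid + 1) % N_THREADS]`, fusing the loops would no longer be safe, which is the kind of distinction a thread-level dependence analysis would need to draw.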