A benchmark set of highly-efficient CUDA and OpenCL kernels and its dynamic autotuning with Kernel Tuning Toolkit
Abstract: Highlights
• Introduces dynamic autotuning of OpenCL or CUDA kernels with the KTT framework.
• Introduces a set of ten highly-efficient tunable benchmarks.
• Evaluates benchmarks’ performance portability using various GPUs, CPU, and Xeon Phi.
• Demonstrates dynamic autotuning with a real-world application.