Abstract: To address the low accuracy of GPU-based FFT, this paper employs a mixed-precision method. A "precision cache" method is proposed as a supplement to the mixed-precision method for calculating twiddle factors. A work-group splitting method is used to reduce the latency caused by frequent global memory accesses. The mixed-precision FFT achieves a 3-fold accuracy improvement over CUFFT 3.2 and 4 times the peak performance of MKL FFT. Experiments show that the mixed-precision method on a CPU-GPU heterogeneous platform delivers a highly accurate and efficient Fast Fourier Transform.
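The abstract does not define the "precision cache", so the following is only a minimal sketch of the general idea it seems to build on: compute twiddle factors at higher precision, then store them at lower precision for reuse. It is not the authors' implementation. The function names, the pure-Python emulation of single precision via `struct`, and the recurrence used as the low-precision baseline are all assumptions made for illustration.

```python
import cmath
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest IEEE-754 single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def twiddles_double_then_round(n: int) -> list:
    """Hypothetical 'cached' table: each twiddle factor exp(-2*pi*i*k/n)
    is computed in double precision and rounded to single only once."""
    table = []
    for k in range(n // 2):
        w = cmath.exp(-2j * math.pi * k / n)
        table.append(complex(to_f32(w.real), to_f32(w.imag)))
    return table


def twiddles_single_recurrence(n: int) -> list:
    """Baseline (an assumption, not from the paper): generate twiddles by
    repeated complex multiplication, rounding to single after every
    arithmetic operation, so rounding error accumulates with k."""
    step = cmath.exp(-2j * math.pi / n)
    sr, si = to_f32(step.real), to_f32(step.imag)
    wr, wi = 1.0, 0.0
    table = []
    for _ in range(n // 2):
        table.append(complex(wr, wi))
        next_r = to_f32(to_f32(wr * sr) - to_f32(wi * si))
        next_i = to_f32(to_f32(wr * si) + to_f32(wi * sr))
        wr, wi = next_r, next_i
    return table


def max_error(table: list, n: int) -> float:
    """Largest absolute deviation from the double-precision reference."""
    return max(
        abs(w - cmath.exp(-2j * math.pi * k / n)) for k, w in enumerate(table)
    )
```

Calling `max_error(twiddles_double_then_round(n), n)` and `max_error(twiddles_single_recurrence(n), n)` for the same `n` compares the two tables; in a CPU-GPU setting, the first table would presumably be built on the host and uploaded to the device once.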