GPU computing · MPI · parallel computing · Performance

CUDA-aware MPI

CUDA and MPI are two different APIs for parallel programming that target very different parallel architectures. While CUDA makes it possible to use parallel graphics hardware for general-purpose computing, MPI is usually employed to write parallel programs that run on large SMP systems or on cluster computers. In order to improve a cluster’s overall computational capabilities… Continue reading CUDA-aware MPI

GPU computing

Accelerating the Fourier split operator method

In past posts I wrote about fast Fourier transform (FFT) performance and about GPU computing. In a new project, both topics meet: we evaluated FFT performance on GPUs and found performance gains of more than one order of magnitude compared to traditional (non-parallel) CPU codes. The FFT is a core algorithm that finds… Continue reading Accelerating the Fourier split operator method