Targeting GPUs using OpenMP Directives on Summit with GenASiS

Published on Tuesday 18 December 2018

By Ganesan Narayanasamy, senior technical computing solution and client care manager, IBM

In the lead-up to SC18, we held the 3rd OpenPOWER Academic Discussion Group Workshop. It was a perfect opportunity for members of academia working in supercomputing to share recent successes they have had developing on OpenPOWER platforms.

One such session was led by Reuben Budiardja, a computational scientist in the National Center for Computational Sciences at Oak Ridge National Laboratory. He is the lead developer of GenASiS, the General Astrophysics Simulation System, which has been used to study the role of fluid instabilities in supernova dynamics. GenASiS is written entirely in modern Fortran and, until recently, was CPU-only code.

Budiardja and his colleague Christian Cardall identified three potential paths that could be explored to transition to GPUs:

CUDA – would require a rewrite of all computational kernels, a loss of Fortran semantics, and interfacing with the rest of the Fortran code.
CUDA Fortran – would be a non-standard extension to Fortran and would not easily fall back to standard Fortran.
Directives (OpenMP) – would allow retention of Fortran semantics, and OpenMP 4.5 has excellent support for modern Fortran.
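To give a feel for the directive-based path, here is a minimal sketch of an OpenMP 4.5 offloaded loop. It is written in C purely as an illustration; GenASiS itself is Fortran, and the function name scaled_add and the kernel it computes are hypothetical, not taken from the GenASiS source. The point it tries to show is that the loop body stays ordinary host-language code, and a single directive asks the compiler to map the arrays to the device and run the loop there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not from GenASiS): y[i] = a * x[i] + y[i].
 * The loop is plain C; only the directive is GPU-specific. */
void scaled_add(int n, double a, const double *x, double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* OpenMP 4.5 offload: copy x to the device, copy y to the device
     * and back, and spread the iterations over teams of threads. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If the code is compiled without OpenMP support, the directive is ignored and the loop runs serially on the CPU, which is the kind of fallback to standard code that the CUDA-based options do not offer as easily.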

Using OpenMP directives on Summit, the most powerful supercomputer in the world, produced strong results. In testing the 3D scaling of the RiemannProblem, the team realized a speed-up of 3.92X to 6.71X going from 7 CPU threads to the GPU.

Pinned memory was then used to take these results even further. While OpenMP does not yet provide a mechanism for using pinned memory, the team added a Fortran wrapper in GenASiS to optimize data transfers. Doing so yielded an additional speed-up of 1.7X to 2.0X, for an overall speed-up of over 9X relative to 7 CPU threads.

Budiardja concluded that OpenMP allows simple and effective porting of Fortran code to target GPUs, and this work has many implications. It will enable the team to perform higher-fidelity simulations and ensemble studies for trends in observables. In fact, the team is planning to perform ~200 2D grey transport supernova simulations, tens of 3D grey transport simulations, and a handful of 3D spectral transport simulations. Moreover, this is the first step towards full Boltzmann radiation transport with exascale computing.

View Mr. Budiardja's full session video and slides below.

Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and Effective Fortran Experience from Ganesan Narayanasamy
