
The Rise of Portable GPU Programming: Experiences Developing GPU-Based Scientific Simulation Applications for Intel, NVIDIA, and AMD GPUs


If you have a question about this talk, please contact Jo Boyle.

A sandwich lunch will be available from 12:45.

Historically, portability has not been important for GPU programming because NVIDIA has dominated the high performance computing (HPC) GPU market. In this context, it has always made sense to develop scientific HPC apps using NVIDIA’s proprietary CUDA programming model. However, in 2022 both AMD and Intel are releasing HPC GPU products with the intention of competing directly with NVIDIA. In fact, the world’s first exascale supercomputer (Oak Ridge National Laboratory’s Frontier) is powered by AMD GPUs, with another even larger exascale supercomputer (Aurora) powered by Intel GPUs set to arrive at Argonne National Laboratory shortly. These new computers highlight a trend not just from CPU to GPU in HPC, but also from proprietary CUDA toward a number of different portable performance models for GPUs. Thus, scientific application developers are now confronted not only with the difficulty of porting or developing apps for GPU architectures, but also with selecting from a wide variety of portable GPU programming models (for instance, OpenMP offloading, HIP, SYCL/DPC++, OpenCL, Kokkos, and RAJA).

In this talk, I will briefly introduce the newest supercomputing systems and will give an overview of the many different portable performance models now available for GPUs. I will show a few snippets of an example kernel implemented in a variety of different models, and will even compare performance of a scientific mini-app, XSBench, across all major programming models and GPU architectures. Subjective “pros and cons” of each programming model will be discussed along with quantitative performance comparisons. Next, I will use a full scientific GPU application (the OpenMC Monte Carlo particle transport code) as a case study to discuss real-world issues affecting portable scientific GPU applications and how bleeding-edge GPU compiler technology stacks are faring. I will also briefly discuss a few of the algorithmic performance optimizations that were developed for OpenMC to give a feel for what types of changes are required to achieve high performance on modern GPUs.

For further information about this talk, please email Dr Paul Cosgrove: pmc55@cam.ac.uk

This talk is part of the Engineering Department Nuclear Energy Seminars series.

