LCPC 2008 - CUDA-lite: Reducing GPU programming complexity

CUDA-lite: Reducing GPU programming complexity

Sain-Zee Ueng, Melvin Lathara, Sara Baghsorkhi and Wen-mei Hwu

21st Annual Workshop on Languages and Compilers for Parallel Computing (LCPC 2008)
Edmonton, Alberta, Canada, July 31 - August 2, 2008

Summary

The computer industry has transitioned into multi-core and many-core parallel systems. Unfortunately, programming these systems is still a major hurdle. The CUDA programming environment from NVIDIA is an attempt to make programming GPUs more accessible to programmers. However, there are still many burdens placed upon the programmer to maximize performance when using CUDA. One such burden for CUDA programmers is dealing with the complex memory hierarchy. Efficient and correct usage of the various memories is essential and can make a difference of 2-17x in performance. Currently, the task of determining the appropriate memory to use and the coding of data transfer between memories is still left to the programmer. We believe that this task can be better performed by automated tools, leaving the programmer to interact with the easy-to-reason-about, high-level global memory. We present CUDA-lite, an enhancement to CUDA, as one such tool. We leverage programmer knowledge via annotations to perform the transformations and show preliminary results indicating that auto-generated code can have performance comparable to hand-coded versions.
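To make the burden described above concrete, the following is a minimal sketch of the same 3-point averaging stencil written twice in plain CUDA: once against global memory only, and once with hand-written staging through shared memory. It is an illustration only. It is not output of CUDA-lite and does not use CUDA-lite's annotation syntax, which this summary does not show. The kernel names, the `BLOCK` tile size, the helper `max_kernel_diff`, and the stencil itself are all assumptions chosen for the example, and no performance figures are implied.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block; illustrative choice, kernels assume blockDim.x == BLOCK

// Global-memory-only version: easy to reason about, every operand is read
// straight from global memory. Out-of-range neighbours are treated as 0.
__global__ void stencil_global(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float left  = (i > 0)     ? in[i - 1] : 0.0f;
    float right = (i + 1 < n) ? in[i + 1] : 0.0f;
    out[i] = (left + in[i] + right) / 3.0f;
}

// Hand-tiled version: the programmer chooses shared memory, sizes the tile,
// loads halo cells, and places the barrier manually.
__global__ void stencil_shared(const float *in, float *out, int n) {
    __shared__ float tile[BLOCK + 2];           // one halo cell on each side
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int t = threadIdx.x + 1;                    // position inside the tile

    tile[t] = (i < n) ? in[i] : 0.0f;           // one element per thread
    if (threadIdx.x == 0)                       // left halo
        tile[0] = (i > 0 && i - 1 < n) ? in[i - 1] : 0.0f;
    if (threadIdx.x == blockDim.x - 1)          // right halo
        tile[t + 1] = (i + 1 < n) ? in[i + 1] : 0.0f;
    __syncthreads();                            // tile must be complete before use

    if (i < n)
        out[i] = (tile[t - 1] + tile[t] + tile[t + 1]) / 3.0f;
}

// Hypothetical helper: runs both kernels on the same input and returns the
// largest absolute difference between their outputs.
float max_kernel_diff(int n) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> h_in(n), h_a(n), h_b(n);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i % 97);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int blocks = (n + BLOCK - 1) / BLOCK;
    stencil_global<<<blocks, BLOCK>>>(d_in, d_out, n);
    cudaMemcpy(h_a.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    stencil_shared<<<blocks, BLOCK>>>(d_in, d_out, n);
    cudaMemcpy(h_b.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);

    float max_diff = 0.0f;
    for (int i = 0; i < n; ++i)
        max_diff = std::fmax(max_diff, std::fabs(h_a[i] - h_b[i]));
    return max_diff;
}

int main() {
    float diff = max_kernel_diff(1 << 20);
    std::printf("max difference between kernels: %g\n", diff);
    return diff > 1e-6f;  // non-zero exit code if the two versions disagree
}
```

The second kernel computes the same result as the first, but the tile sizing, halo loads, index arithmetic, and barrier placement are all bookkeeping that the programmer must get right by hand. This is the kind of memory-placement and data-transfer code that the summary argues is better generated by a tool from a global-memory-only version.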
