TY - GEN
T1 - MIMD interpretation on a GPU
AU - Dietz, Henry G.
AU - Young, B. Dalton
PY - 2010
Y1 - 2010
AB - Programming heterogeneous parallel computer systems is notoriously difficult, but MIMD models have proven to be portable across multi-core processors, clusters, and massively parallel systems. It would be highly desirable for GPUs (Graphics Processing Units) also to be able to leverage algorithms and programming tools designed for MIMD targets. Unfortunately, most GPU hardware implements a very restrictive multi-threaded SIMD-based execution model. This paper presents a compiler, assembler, and interpreter system that allows a GPU to implement a richly featured MIMD execution model that supports shared-memory communication, recursion, etc. Through a variety of careful design choices and optimizations, reasonable efficiency is obtained on NVIDIA CUDA GPUs. The discussion covers both the methods used and the motivation in terms of the relevant aspects of GPU architecture.
UR - http://www.scopus.com/inward/record.url?scp=77954411660&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954411660&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-13374-9_5
DO - 10.1007/978-3-642-13374-9_5
M3 - Conference contribution
AN - SCOPUS:77954411660
SN - 3642133738
SN - 9783642133732
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 65
EP - 79
BT - Languages and Compilers for Parallel Computing - 22nd International Workshop, LCPC 2009, Revised Selected Papers
T2 - 22nd International Workshop on Languages and Compilers for Parallel Computing, LCPC 2009
Y2 - 8 October 2009 through 10 October 2009
ER -