Slicing and dicing openHPC infrastructure: Virtual clusters in OpenStack

Satrio Husodo, Jacob Chappell, Vikram Gazula, Lowell Pike, James Griffioen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

University research computing centers are increasingly faced with the need to support applications that are better suited to cloud infrastructure than HPC infrastructure. A common approach is to shoehorn cloud-based applications onto the university’s existing HPC system, which has been done with varying levels of success. Another approach has been to create stand-alone HPC systems and private cloud systems, resulting in ineffective use of resources. A more recent approach has been to use hybrid systems in which the HPC system “bursts” excess jobs to private cloud nodes configured as bare-metal nodes built from the same (expensive) hardware as the HPC system. This paper explores another model, namely the use of private cloud infrastructure (built from inexpensive commodity networks and storage systems) to host both HPC jobs and VMs simultaneously. Utilizing VMs allows these emerging applications to leverage cloud frameworks specifically designed for them (e.g., OpenStack, Kubernetes, Mesos, Hadoop, and Spark), while at the same time effectively supporting a growing percentage of HPC jobs (e.g., single-node jobs and embarrassingly parallel jobs). Because the system can be constructed from commodity cloud networks and storage, it makes cost-effective use of resources, in contrast to HPC systems whose expensive resources are wasted by jobs that do not need them. To demonstrate the advantages of using cloud infrastructure for both cloud applications and HPC applications, we describe a system that can dynamically launch OpenHPC systems on commodity OpenStack infrastructure. Moreover, users can use the system to deploy “personal” OpenHPC clusters, customized to their application’s needs (e.g., number of nodes, cores per node, memory per node). We have used the system to effectively run OpenHPC workloads on a cluster of large-memory OpenStack nodes, allowing users to create, for example, a large-memory HPC-style cluster of 500 GB nodes running OpenHPC and a cluster of 1 TB VMs operating simultaneously. Performance degradation due to virtualization has been insignificant, particularly when compared to the advantages of being able to use optimized frameworks running on cost-effective hardware.
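
The abstract's central mechanism, launching a per-user OpenHPC-style cluster on OpenStack with a user-chosen node count and per-node cores/memory, maps naturally onto the OpenStack compute API. The sketch below is not the authors' implementation; it is a minimal illustration using the real openstacksdk Python library, where the cloud name ("mycloud"), image ("openhpc-node"), flavor ("m1.xlarge"), and network ("cluster-net") are hypothetical placeholders, and cores/memory per node are fixed by the chosen flavor.

```python
# Minimal sketch (not the paper's code): boot a "personal" virtual cluster
# on OpenStack with openstacksdk. All names below are hypothetical.
import openstack


def launch_virtual_cluster(name, n_compute, flavor="m1.xlarge",
                           image="openhpc-node", network="cluster-net"):
    # Credentials come from a standard clouds.yaml entry named "mycloud".
    conn = openstack.connect(cloud="mycloud")

    img = conn.compute.find_image(image)
    flv = conn.compute.find_flavor(flavor)   # flavor sets cores/memory per node
    net = conn.network.find_network(network)

    # One head node (e.g., scheduler/provisioning) plus N compute nodes.
    roles = [("head", 0)] + [("compute", i) for i in range(1, n_compute + 1)]
    nodes = []
    for role, idx in roles:
        server = conn.compute.create_server(
            name=f"{name}-{role}-{idx}",
            image_id=img.id,
            flavor_id=flv.id,
            networks=[{"uuid": net.id}],
        )
        nodes.append(conn.compute.wait_for_server(server))
    return nodes


if __name__ == "__main__":
    # Example: a 4-compute-node cluster; per-node size follows the flavor.
    launch_virtual_cluster("demo-cluster", n_compute=4)
```

In this reading, "customized to their application's needs" amounts to choosing the flavor (cores and memory per node) and the compute-node count at launch time; the actual system presumably also handles OpenHPC provisioning (e.g., scheduler setup) on the booted nodes.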

Original language: English
Title of host publication: Proceedings of the Practice and Experience in Advanced Research Computing
Subtitle of host publication: Rise of the Machines (Learning), PEARC 2019
ISBN (Electronic): 9781450372275
DOIs
State: Published - Jul 28, 2019
Event: 2019 Conference on Practice and Experience in Advanced Research Computing: Rise of the Machines (Learning), PEARC 2019 - Chicago, United States
Duration: Jul 28, 2019 – Aug 1, 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2019 Conference on Practice and Experience in Advanced Research Computing: Rise of the Machines (Learning), PEARC 2019
Country/Territory: United States
City: Chicago
Period: 7/28/19 – 8/1/19

Bibliographical note

Publisher Copyright:
© 2019 Association for Computing Machinery.

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
