Use cases of lossy compression for floating-point data in scientific data sets

Franck Cappello, Sheng Di, Sihuan Li, Xin Liang, Ali Murat Gok, Dingwen Tao, Chun Hong Yoon, Xin Chuan Wu, Yuri Alexeev, Frederic T. Chong

Research output: Contribution to journal; Article; peer-reviewed

70 Scopus citations

Abstract

Architectural and technological trends of systems used for scientific computing call for a significant reduction of scientific data sets that are composed mainly of floating-point data. This article surveys and presents experimental results of currently identified use cases of generic lossy compression to address the different limitations of scientific computing systems. The article shows from a collection of experiments run on parallel systems of a leadership facility that lossy data compression not only can reduce the footprint of scientific data sets on storage but also can reduce I/O and checkpoint/restart times, accelerate computation, and even allow significantly larger problems to be run than without lossy compression. These results suggest that lossy compression will become an important technology in many aspects of high performance scientific computing. Because the constraints for each use case are different and often conflicting, this collection of results also indicates the need for more specialization of the compression pipelines.

Original language: English
Pages (from-to): 1201-1220
Number of pages: 20
Journal: International Journal of High Performance Computing Applications
Volume: 33
Issue number: 6
State: Published, Nov 1 2019

Bibliographical note

Funding Information:
We acknowledge the computing resources provided on Bebop, which is operated by the Laboratory Computing Resource Center at Argonne National Laboratory. The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations—the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was based upon work supported by the US Department of Energy, Office of Science, under contract DE-AC02-06CH11357 and supported by the National Science Foundation under Grant No. 1619253.

Publisher Copyright:
© The Author(s) 2019.

Keywords

  • Lossy compression
  • applications
  • floating-point data
  • scientific data set
  • use cases

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
