waveSZ: A hardware-algorithm co-design of efficient lossy compression for scientific data

Jiannan Tian, Sheng Di, Chengming Zhang, Xin Liang, Sian Jin, Dazhao Cheng, Dingwen Tao, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

20 Scopus citations

Abstract

Error-bounded lossy compression is critical to the success of extreme-scale scientific research because of the ever-increasing volumes of data produced by today’s high-performance computing (HPC) applications. Not only can error-controlled lossy compressors significantly reduce the I/O and storage burden, but they can also retain high data fidelity for post-analysis. Existing state-of-the-art lossy compressors, however, generally suffer from relatively low compression and decompression throughput (up to hundreds of megabytes per second on a single CPU core), which considerably restricts the adoption of lossy compression by many HPC applications, especially those with a fairly high data production rate. In this paper, we propose a highly efficient lossy compression approach based on field-programmable gate arrays (FPGAs) under the state-of-the-art lossy compression model SZ. Our contributions are fourfold. (1) We adopt a wavefront memory layout to alleviate the data dependency during the prediction for higher-dimensional predictors, such as the Lorenzo predictor. (2) We propose a co-design framework named waveSZ based on the wavefront memory layout and the characteristics of the SZ algorithm, and we carefully implement it by using high-level synthesis. (3) We propose a hardware-algorithm co-optimization method to improve the performance. (4) We evaluate our proposed waveSZ on three real-world HPC simulation datasets from the Scientific Data Reduction Benchmarks and compare it with other state-of-the-art methods on both CPUs and FPGAs. Experiments show that waveSZ can improve SZ’s compression throughput by 6.9× to 8.7× over the production version running on a state-of-the-art CPU, and can improve the compression ratio and throughput by 2.1× and 5.8× on average, respectively, compared with the state-of-the-art FPGA design.
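To illustrate the idea behind contribution (1), the following is a minimal software sketch, not the authors' FPGA implementation. It assumes a 2D first-order Lorenzo predictor (west + north - northwest, computed from already-reconstructed values) and plain linear quantization with bin width 2·eb; the function names, the zero-padded border, and the omission of SZ's bounded bin count, outlier handling, and entropy coding are all simplifications made here for illustration. The point it shows is that every element on one anti-diagonal (i + j = d) depends only on the two previous anti-diagonals, so a whole wavefront can be processed without intra-wavefront dependencies.

```python
import numpy as np

def wavefront_indices(rows, cols, d):
    """Row/column indices of all points on anti-diagonal i + j = d."""
    i = np.arange(max(0, d - cols + 1), min(rows, d + 1))
    return i, d - i

def lorenzo_wavefront_compress(data, eb):
    """Hypothetical helper: quantize a 2D array under absolute error bound eb."""
    rows, cols = data.shape
    # Padded reconstruction buffer: recon[i + 1, j + 1] holds point (i, j);
    # the extra zero row/column supplies neighbors for border points (assumption).
    recon = np.zeros((rows + 1, cols + 1))
    codes = np.zeros((rows, cols), dtype=np.int64)
    for d in range(rows + cols - 1):
        i, j = wavefront_indices(rows, cols, d)
        # West and north lie on wavefront d-1, northwest on d-2,
        # so all points of wavefront d are predicted independently.
        pred = recon[i + 1, j] + recon[i, j + 1] - recon[i, j]
        codes[i, j] = np.rint((data[i, j] - pred) / (2 * eb)).astype(np.int64)
        recon[i + 1, j + 1] = pred + 2 * eb * codes[i, j]
    return codes, recon[1:, 1:]

def lorenzo_wavefront_decompress(codes, eb):
    """Rebuild the data from quantization codes using the same traversal."""
    rows, cols = codes.shape
    recon = np.zeros((rows + 1, cols + 1))
    for d in range(rows + cols - 1):
        i, j = wavefront_indices(rows, cols, d)
        pred = recon[i + 1, j] + recon[i, j + 1] - recon[i, j]
        recon[i + 1, j + 1] = pred + 2 * eb * codes[i, j]
    return recon[1:, 1:]
```

In this sketch the inner work of each wavefront is a single vectorized step, which stands in for the parallel or pipelined processing that a hardware design could apply to the same dependency-free set of points.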

Original language: English
Title of host publication: PPoPP 2020 - Proceedings of the 2020 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Pages: 74-88
Number of pages: 15
ISBN (Electronic): 9781450368186
State: Published - Feb 19 2020
Event: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2020 - San Diego, United States
Duration: Feb 22 2020 → Feb 26 2020

Publication series

Name: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Conference

Conference: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2020
Country/Territory: United States
City: San Diego
Period: 2/22/20 → 2/26/20

Bibliographical note

Publisher Copyright:
© 2020 Association for Computing Machinery.

Keywords

  • Compression Ratio
  • FPGA
  • Lossy Compression
  • Scientific Data
  • Software-Hardware Co-Design
  • Throughput

ASJC Scopus subject areas

  • Software
