Dynamic Quality Metric Oriented Error Bounded Lossy Compression for Scientific Datasets

Jinyang Liu, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

With the ever-increasing execution scale of high-performance computing (HPC) applications, vast amounts of data are being produced by scientific research every day. Error-bounded lossy compression is considered a very promising solution to the big-data issue for scientific applications, because it can significantly reduce data volume at low time cost while allowing users to control compression errors through a specified error bound. Existing error-bounded lossy compressors, however, are all built on inflexible designs or compression pipelines that cannot adapt to the diverse compression quality requirements and metrics favored by different application users. In this paper, we propose a novel dynamic quality-metric-oriented error-bounded lossy compression framework, namely QoZ. The contribution is threefold. (1) We design a novel, highly parameterized, multi-level interpolation-based data predictor, which can significantly improve the overall compression quality at the same compressed size. (2) We design the error-bounded lossy compression framework QoZ based on the adaptive predictor, which can auto-tune the critical parameters and optimize the compression result according to user-specified quality metrics during online compression. (3) We evaluate QoZ carefully by comparing its compression quality with multiple state-of-the-art compressors on various real-world scientific application datasets. Experiments show that, compared with the second-best lossy compressor, QoZ can achieve up to 70% compression ratio improvement under the same error bound, up to 150% compression ratio improvement under the same PSNR, or up to 270% compression ratio improvement under the same SSIM.
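To make the terms in the abstract concrete, the following is a minimal sketch of error-bounded compression with a multi-level interpolation predictor, plus the PSNR metric. It is not QoZ's implementation: it assumes 1D data, plain linear interpolation, uniform quantization of the prediction residual, and no parameter tuning or entropy coding. All function names (`interpolation_order`, `quantize`, `compress`, `decompress`, `psnr`) are hypothetical and chosen only for illustration.

```python
import math

def interpolation_order(n):
    """Yield (index, left, right) from the coarsest level to the finest.

    right is None when the right neighbor would fall outside the array.
    """
    stride = 1
    while stride * 2 < n:
        stride *= 2
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            right = i + stride if i + stride < n else None
            yield i, i - stride, right
        stride //= 2

def predict(recon, left, right):
    """Linear interpolation of already reconstructed neighbors."""
    if right is None:
        return recon[left]
    return 0.5 * (recon[left] + recon[right])

def quantize(value, prediction, eb):
    """Return (integer code, reconstructed value) for one data point."""
    code = round((value - prediction) / (2 * eb))
    return code, prediction + code * 2 * eb

def compress(data, eb):
    """Turn non-empty data into integer codes under absolute error bound eb."""
    recon = [0.0] * len(data)
    codes = [0] * len(data)
    codes[0], recon[0] = quantize(data[0], 0.0, eb)  # anchor point
    for i, left, right in interpolation_order(len(data)):
        # Predict from reconstructed values so the decoder can repeat the step.
        codes[i], recon[i] = quantize(data[i], predict(recon, left, right), eb)
    return codes

def decompress(codes, eb):
    """Rebuild the data by replaying the same predictions."""
    recon = [0.0] * len(codes)
    recon[0] = 0.0 + codes[0] * 2 * eb
    for i, left, right in interpolation_order(len(codes)):
        recon[i] = predict(recon, left, right) + codes[i] * 2 * eb
    return recon

def psnr(original, decoded):
    """Peak signal-to-noise ratio in dB, based on the value range."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return math.inf if mse == 0 else 20 * math.log10(value_range / math.sqrt(mse))

data = [math.sin(0.1 * i) for i in range(100)]
eb = 1e-2
codes = compress(data, eb)
decoded = decompress(codes, eb)
max_error = max(abs(a - b) for a, b in zip(data, decoded))
print("max error:", max_error, "PSNR (dB):", psnr(data, decoded))
```

Because each point is predicted from reconstructed rather than original neighbors, the decoder reproduces the same predictions and the per-point error stays within the bound, up to floating-point rounding. In a real compressor the integer codes would then be entropy coded; a smooth signal yields many small codes, which is where the size reduction comes from.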

Original language: English
Title of host publication: Proceedings of SC 2022
Subtitle of host publication: International Conference for High Performance Computing, Networking, Storage and Analysis
ISBN (Electronic): 9781665454445
State: Published - 2022
Event: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 - Dallas, United States
Duration: Nov 13 2022 → Nov 18 2022

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 2022-November
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Country/Territory: United States
City: Dallas
Period: 11/13/22 → 11/18/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Funding

This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR), under contract DE-AC02-06CH11357, and supported by the National Science Foundation under Grants OAC-2003709, OAC-2104023 and OAC-2153451. We acknowledge the computing resources provided on Bebop (operated by the Laboratory Computing Resource Center at Argonne) and on Theta and JLSE (operated by the Argonne Leadership Computing Facility).

Funders (with funder number where listed)
National Science Foundation (NSF): OAC-2003709, OAC-2153451, OAC-2104023
Michigan State University-U.S. Department of Energy (MSU-DOE) Plant Research Laboratory
Office of Science Programs
National Nuclear Security Administration
Advanced Scientific Computing Research: DE-AC02-06CH11357

    Keywords

    • error-bounded lossy compression
    • interpolation
    • quality metrics
    • scientific datasets

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Computer Science Applications
    • Hardware and Architecture
    • Software
