Improving Lossy Compression for SZ by Exploring the Best-Fit Lossless Compression Techniques

Jinyang Liu, Sihuan Li, Sheng Di, Xin Liang, Kai Zhao, Dingwen Tao, Zizhong Chen, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Scopus citations

Abstract

Over the past decades, lossy compressors have been studied broadly because of the ever-increasing volume of data produced by today's scientific applications. SZ is one of the best error-bounded lossy compressors proposed to date, and it has a flexible framework with four adjustable steps: prediction, quantization, variable-length encoding, and lossless compression. In this paper, we improve the lossy compression performance of the SZ compression model by exploring existing lossless compression techniques using the Squash data compression benchmark. Specifically, we first characterize the bytes output by the first three steps of SZ, then investigate the best lossless compressor for different datasets and different error bounds. We perform our exploration by testing 8 widely used lossless compressors under different configurations together with SZ on five well-known scientific simulation datasets. Our experiments show that adopting the best-fit lossless compressor selected based on our analysis can improve the overall compression speed by up to 40% compared with the lossless compression technique previously used in SZ, with comparable quality of reconstructed data.
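The four-step framework and the idea of picking a best-fit lossless stage can be illustrated with a minimal Python sketch. This is not the SZ implementation or the paper's method: it assumes a simple 1D previous-value predictor, skips the variable-length (Huffman) encoding step by packing codes as fixed-width integers, substitutes Python's standard-library `zlib`, `bz2`, and `lzma` for the compressors in the Squash benchmark, and selects by output size only. All function names (`quantize`, `dequantize`, `pack_codes`, `best_fit`) are hypothetical.

```python
import bz2
import lzma
import math
import struct
import zlib


def quantize(values, error_bound):
    """Predict each value from the previous reconstructed value and
    quantize the prediction error into integer codes (bin width 2 * eb)."""
    codes = []
    prev = 0.0
    for v in values:
        code = round((v - prev) / (2.0 * error_bound))
        codes.append(code)
        # Track the value the decoder will reconstruct, not the original.
        prev = prev + code * 2.0 * error_bound
    return codes


def dequantize(codes, error_bound):
    """Rebuild the data from quantization codes with the same predictor."""
    out = []
    prev = 0.0
    for code in codes:
        prev = prev + code * 2.0 * error_bound
        out.append(prev)
    return out


def pack_codes(codes):
    """Serialize codes as little-endian int32 (assumes they fit in 32 bits).
    A real pipeline would apply variable-length encoding here instead."""
    return struct.pack(f"<{len(codes)}i", *codes)


# Stand-in lossless back ends (assumption: stdlib codecs, default settings).
LOSSLESS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def best_fit(payload):
    """Return the name of the codec with the smallest output, plus all sizes."""
    sizes = {name: len(fn(payload)) for name, fn in LOSSLESS.items()}
    best = min(sizes, key=sizes.get)
    return best, sizes


if __name__ == "__main__":
    data = [math.sin(0.01 * i) for i in range(10000)]
    eb = 1e-3
    payload = pack_codes(quantize(data, eb))
    name, sizes = best_fit(payload)
    print("best-fit codec:", name, sizes)
```

A fuller comparison would also time each codec, since the abstract reports its gain in overall compression speed rather than in size alone.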

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
Editors: Yixin Chen, Heiko Ludwig, Yicheng Tu, Usama Fayyad, Xingquan Zhu, Xiaohua Tony Hu, Suren Byna, Xiong Liu, Jianping Zhang, Shirui Pan, Vagelis Papalexakis, Jianwu Wang, Alfredo Cuzzocrea, Carlos Ordonez
Pages: 2986-2991
Number of pages: 6
ISBN (Electronic): 9781665439022
DOIs
State: Published - 2021
Event: 2021 IEEE International Conference on Big Data, Big Data 2021 - Virtual, Online, United States
Duration: Dec 15 2021 to Dec 18 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021

Conference

Conference: 2021 IEEE International Conference on Big Data, Big Data 2021
Country/Territory: United States
City: Virtual, Online
Period: 12/15/21 to 12/18/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Funding

This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was supported by the U.S. Department of Energy (DOE), Office of Science and DOE Advanced Scientific Computing Research (ASCR) office, under contract DE-AC02-06CH11357, and supported by the National Science Foundation under Grant OAC-2003709, OAC-2003624/2042084, SHF-1617488, and OAC-2104023. We acknowledge the computing resources provided on Bebop, which is operated by the Laboratory Computing Resource Center at Argonne National Laboratory.

Funders (with funder numbers where listed)
National Science Foundation (NSF): OAC-2003709, SHF-1617488, OAC-2003624/2042084, OAC-2104023
Michigan State University-U.S. Department of Energy (MSU-DOE) Plant Research Laboratory
Office of Science Programs
National Nuclear Security Administration
Advanced Scientific Computing Research: DE-AC02-06CH11357

    ASJC Scopus subject areas

    • Information Systems and Management
    • Artificial Intelligence
    • Computer Vision and Pattern Recognition
    • Information Systems
