Resilient error-bounded lossy compressor for data transfer

Sihuan Li, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations


Today's exascale scientific applications and advanced instruments produce vast volumes of data that must be shared or transferred over networks or devices with relatively low bandwidth (e.g., data sharing over a WAN or transfers from edge devices to supercomputers). Lossy compression is one of the candidate strategies for addressing this big-data issue. However, little work has been done to make it resilient against silent errors, which may occur during compression or data transfer. In this paper, we propose a resilient error-bounded lossy compressor based on the SZ compression framework. Specifically, we design a new independent-block-wise model that decomposes the entire dataset into many independent sub-blocks for compression. We then design and implement a series of error detection/correction strategies tailored to each stage of SZ. Our method is arguably the first algorithm-based fault tolerance (ABFT) solution for lossy compression. Our proposed solution incurs negligible execution overhead in the fault-free situation. When soft errors occur, it ensures that the decompressed data remain strictly bounded within the user's requirement, with very limited degradation of the compression ratio and low overhead.
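The independent-block idea in the abstract can be illustrated with a minimal Python sketch. This is not the SZ algorithm and not the authors' ABFT scheme: it assumes a plain uniform quantizer for the error bound, zlib for the lossless stage, and a CRC32 per block for detection only (no correction). All function names and the `block_size` parameter are hypothetical. It shows only why independent blocks help: a corrupted block is detected and isolated, and the other blocks still decompress within the bound.

```python
import struct
import zlib


def compress_block(values, eb):
    """Quantize one block so each value is off by at most eb, then deflate it."""
    # Uniform quantization with bin width 2*eb keeps |v - code*2*eb| <= eb.
    codes = [round(v / (2 * eb)) for v in values]
    payload = zlib.compress(struct.pack(f"<{len(codes)}q", *codes))
    # Assumption: a CRC32 per block stands in for the paper's detection strategies.
    return {"n": len(codes), "payload": payload, "crc": zlib.crc32(payload)}


def decompress_block(block, eb):
    """Verify the block checksum, then rebuild the approximate values."""
    if zlib.crc32(block["payload"]) != block["crc"]:
        raise ValueError("checksum mismatch")
    raw = zlib.decompress(block["payload"])
    codes = struct.unpack(f"<{block['n']}q", raw)
    return [c * 2 * eb for c in codes]


def compress(data, eb, block_size=4):
    """Split the data into blocks that are compressed independently."""
    return [compress_block(data[i:i + block_size], eb)
            for i in range(0, len(data), block_size)]


def decompress(blocks, eb):
    """Decompress every block; report corrupted blocks instead of failing."""
    out, bad = [], []
    for idx, block in enumerate(blocks):
        try:
            out.extend(decompress_block(block, eb))
        except ValueError:
            bad.append(idx)
            out.extend([None] * block["n"])  # placeholder for lost values
    return out, bad
```

In a transfer setting, the indices returned in `bad` could be used to request retransmission of only the damaged blocks, which is one reason a block-wise layout is attractive; how the paper actually recovers from errors is not described in this record.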

Original language: English
Title of host publication: Proceedings of SC 2021
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond
ISBN (Electronic): 9781450384421
State: Published - Nov 14 2021
Event: 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021 - Virtual, Online, United States
Duration: Nov 14 2021 to Nov 19 2021

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337


Conference: 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021
Country/Territory: United States
City: Virtual, Online

Bibliographical note

Funding Information:
This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract DE-AC02-06CH11357, and supported by the National Science Foundation under Grant No. 1617488, Grant No. 1619253 and Grant No. 2003709. We acknowledge the computing resources provided on Bebop, which is operated by the Laboratory Computing Resource Center at Argonne National Laboratory.

Publisher Copyright:
© 2021 IEEE Computer Society. All rights reserved.


  • Algorithm Based Fault Tolerance
  • Data transfer
  • Lossy compression

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

