Error-controlled Progressive Retrieval of Scientific Data under Derivable Quantities of Interest

Xuan Wu, Qian Gong, Jieyang Chen, Qing Liu, Norbert Podhorszki, Xin Liang, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The unprecedented volume of scientific data places heavy pressure on current data storage and transmission systems. Progressive compression, which offers data access with on-demand precision, has been proposed to mitigate this problem. However, existing approaches only consider precision control on primary data, leaving uncertainties in the quantities of interest (QoIs) derived from it. In this work, we present a progressive data retrieval framework with guaranteed error control on derivable QoIs. Our contributions are threefold. (1) We derive the theory needed to strictly control QoI errors during progressive retrieval. The theory is generic and applies to any QoI that can be composed from the basis of derivable QoIs proved in the paper. (2) We design and develop a generic progressive retrieval framework based on the proposed theory, and optimize it by exploring feasible progressive representations. (3) We evaluate our framework using five real-world datasets with a diverse set of QoIs. Experiments demonstrate that our framework faithfully respects any user-specified QoI error bound in the evaluated applications. This leads to over 2.02× performance gain in data transfer tasks compared to transferring the primary data, while guaranteeing a QoI error of less than 1E-5.

Original language: English
Title of host publication: Proceedings of SC 2024
Subtitle of host publication: International Conference for High Performance Computing, Networking, Storage and Analysis
ISBN (Electronic): 9798350352917
DOIs
State: Published - 2024
Event: 2024 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2024 - Atlanta, United States
Duration: Nov 17 2024 → Nov 22 2024

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2024 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2024
Country/Territory: United States
City: Atlanta
Period: 11/17/24 → 11/22/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Funding

This research was supported by the Exascale Computing Project CODAR, SIRIUS-2 ASCR research project, the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory (ORNL), and the Scientific Discovery through Advanced Computing (SciDAC) program, specifically the RAPIDS-2 SciDAC institute. It was also supported by the National Science Foundation under Grant OAC-2330367, OAC-2311756, OAC-2311757, OAC-2313122, and OIA-2327266. We would like to thank the University of Kentucky Center for Computational Sciences and Information Technology Services Research Computing for its support and use of the Lipscomb Compute Cluster, Morgan Compute Cluster, and associated research computing resources.

Funders and funder numbers
Advanced Scientific Computing Research
Laboratory Directed Research and Development Program of ORNL
Oak Ridge National Laboratory
Kentucky Transportation Center, University of Kentucky
U.S. Department of Energy; Chinese Academy of Sciences; Guangzhou Municipal Science and Technology Project; Oak Ridge National Laboratory; Extreme Science and Engineering Discovery Environment; National Science Foundation; National Energy Research Scientific Computing Center; National Natural Science Foundation of China: OAC-2330367, OAC-2311757, OAC-2311756, OIA-2327266, OAC-2313122

    Keywords

    • data compression
    • error control
    • High-performance computing
    • progressive retrieval
    • scientific data

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Computer Science Applications
    • Hardware and Architecture
    • Software
