Collaborative Research: OAC Core: Topology-Aware Data Compression for Scientific Analysis and Visualization

Grants and Contracts Details

Description

Today’s large-scale simulations are producing vast amounts of data that are revolutionizing scientific thinking and practices. As the disparity between data generation rates and available I/O bandwidths continues to grow, data storage and movement are becoming significant bottlenecks for extreme-scale scientific simulations in terms of in situ and post hoc analysis and visualization. Such a disparity necessitates data compression, where data produced by simulations are compressed in situ and decompressed in situ and post hoc for analysis and exploration. Meanwhile, topological data analysis plays an important role in extracting insights from scientific data regarding feature definition, extraction, and evaluation. However, most of today’s lossy compressors provide global error bounds on the decompressed data, which do not guarantee the preservation of topological features essential to scientific discoveries. This project aims to research and develop advanced lossy compression techniques and software that preserve topological features in data for in situ and post hoc analysis and visualization at extreme scales. The data of interest are scalar fields and vector fields that arise from scientific simulations, with driving applications in cosmology, climate, and fusion simulations. This project has three research thrusts that focus on deriving topological constraints from scalar fields (I) and vector fields (II), and integrating these constraints to develop topology-aware error-controlled and deep-learning-based compressors (III). Topological descriptors for scalar and vector fields play a dual role for data compression: they provide topological constraints for error-controlled compressors in the form of pointwise error bounds and for deep-learning-based compressors in the form of topological loss functions.
The team will work closely with domain scientists from climate, fusion, and cosmology research communities to make a significant impact on computational and data-enabled science and engineering.
Intellectual Merit: This project tackles the data compression, analysis, and visualization needs in extreme-scale scientific simulations by developing a suite of topology-aware data reduction algorithms. Such algorithms effectively reduce the size of data while preserving critical features defined by topological notions. We will demonstrate that topological features can be authentically preserved in decompressed data by defining and enforcing topology-aware constraints over advanced lossy compression algorithms. Such capabilities have not been studied systematically within today’s data compression paradigm, which is mostly topology-agnostic and can lead to significant errors in analyzing and visualizing decompressed data using topological techniques. This project will impact specific fields (computational science, data analysis, data reduction, and visualization) and the broader scientific community. The software deliverable of this project will significantly enhance software infrastructure for upcoming exascale systems. This project will foster novel discoveries in multiple scientific disciplines beyond cosmology, climate, and fusion by enabling efficient and effective compression on a wide range of platforms.
Broader Impact: This project brings together application scientists, visualization experts, and compression researchers to advance computational and data-enabled science and engineering. The PIs will integrate the research results into teaching and recruit talented students to participate in collaborative research initiatives with leading domain scientists. The team will broaden the participation of underrepresented groups and K–12 students through ongoing collaborations with summer camps on university campuses.
Workshops will be organized at visualization and high-performance computing conferences for broad dissemination. In particular, data challenges will be integrated within workshops to help onboard members from simulation and computational communities to engage in joint developmental efforts.
Keywords: Data visualization, data reduction, topological data analysis, large-scale simulations
Status: Active
Effective start/end date: 9/1/23 to 8/31/26

Funding

  • National Science Foundation: $201,765.00
