Finding consistent global checkpoints in a distributed computation

D. Manivannan, Robert H.B. Netzer, Mukesh Singhal

Research output: Contribution to journal › Article › peer-review

51 Scopus citations

Abstract

Consistent global checkpoints have many uses in distributed computations. A central question in applications that use consistent global checkpoints is to determine whether a consistent global checkpoint that includes a given set of local checkpoints can exist. Netzer and Xu [16] presented the necessary and sufficient conditions under which such a consistent global checkpoint can exist, but they did not explore what checkpoints could be constructed. In this paper, we prove exactly which local checkpoints can be used for constructing such consistent global checkpoints. We illustrate the use of our results with a simple and elegant algorithm to enumerate all such consistent global checkpoints.
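
As a rough illustration of the consistency notion the abstract relies on, the sketch below tests whether a complete global checkpoint (one local checkpoint per process) is consistent, meaning that no local checkpoint in it happened before another. It assumes each local checkpoint is tagged with a vector clock; the function names and the sample clock values are hypothetical and are not taken from the paper. It does not implement the Netzer and Xu condition for extending a partial set of checkpoints, nor the enumeration algorithm described in the paper.

```python
from itertools import combinations


def happened_before(a, b):
    """Return True if vector clock a is componentwise <= b and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def is_consistent(global_checkpoint):
    """Return True if no local checkpoint happened before another.

    global_checkpoint: one vector clock (tuple of ints) per process.
    This is an assumed representation, used only for illustration.
    """
    return not any(
        happened_before(a, b) or happened_before(b, a)
        for a, b in combinations(global_checkpoint, 2)
    )


# Hypothetical three-process example.
# Pairwise concurrent checkpoints form a consistent global checkpoint.
concurrent_set = [(2, 0, 0), (1, 3, 0), (0, 0, 1)]

# Here the second checkpoint already reflects an event of process 0 that
# occurred after process 0's checkpoint, so the first happened before it.
ordered_set = [(1, 0, 0), (2, 3, 0), (0, 0, 1)]

print(is_consistent(concurrent_set))
print(is_consistent(ordered_set))
```

The pairwise test is quadratic in the number of processes, which is acceptable for a sketch; it only answers whether a fully specified global checkpoint is consistent, not which other local checkpoints could be combined with a given subset.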

Original language: English
Pages (from-to): 623-627
Number of pages: 5
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 8
Issue number: 6
State: Published - 1997

Keywords

  • Causality
  • Consistent global states
  • Distributed checkpointing
  • Failure recovery
  • Fault tolerance

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics
