An asynchronous recovery algorithm based on a staggered quasi-synchronous checkpointing algorithm

D. Manivannan, Q. Jiang, J. Yang, K. E. Persson, M. Singhal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Checkpointing and rollback recovery are established techniques for handling failures in distributed systems. Under synchronous checkpointing, each process involved in the distributed computation takes a checkpoint almost simultaneously. This causes contention for network stable storage and hence degrades performance. To overcome this problem, checkpoint staggering, under which the processes take their checkpoints in a staggered manner, has been proposed. In this paper, we propose a staggered quasi-synchronous checkpointing algorithm that reduces contention for network stable storage without any synchronization overhead. We also present an asynchronous recovery algorithm based on the checkpointing algorithm.
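The contention argument in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only compares how many checkpoint writes overlap at stable storage when all processes start writing at the same instant versus when their start times are spread out. The function name `peak_concurrent_writes`, the process count, and the write duration are hypothetical values chosen for illustration.

```python
def peak_concurrent_writes(start_times, write_duration):
    """Return the maximum number of checkpoint writes that overlap in time."""
    events = []
    for start in start_times:
        events.append((start, 1))                    # a write begins
        events.append((start + write_duration, -1))  # that write ends
    # Sort by time; at equal times, process ends (-1) before begins (+1),
    # so a write that finishes exactly when another starts does not overlap it.
    events.sort(key=lambda e: (e[0], e[1]))
    active = 0
    peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical parameters, not taken from the paper.
NUM_PROCESSES = 4
WRITE_DURATION = 2  # arbitrary time units

# Synchronous checkpointing: every process starts writing at time 0.
synchronous = [0] * NUM_PROCESSES
# Staggered checkpointing: each process starts after the previous one finishes.
staggered = [i * WRITE_DURATION for i in range(NUM_PROCESSES)]

sync_peak = peak_concurrent_writes(synchronous, WRITE_DURATION)
staggered_peak = peak_concurrent_writes(staggered, WRITE_DURATION)
print(sync_peak, staggered_peak)  # prints "4 1"
```

Under these assumptions the simultaneous schedule has all four writes competing for storage at once, while the staggered schedule never has more than one write in progress. How the paper achieves staggering without synchronization messages is not described in the abstract and is not modeled here.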

Original language: English
Title of host publication: Distributed Computing - IWDC 2005 - 7th International Workshop, Proceedings
Pages: 117-128
Number of pages: 12
State: Published - 2005
Event: 7th International Workshop on Distributed Computing, IWDC 2005 - Kharagpur, India
Duration: Dec 27 2005 → Dec 30 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3741 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Workshop on Distributed Computing, IWDC 2005
Country/Territory: India
City: Kharagpur
Period: 12/27/05 → 12/30/05

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
