TY - GEN
T1 - An asynchronous recovery algorithm based on a staggered quasi-synchronous checkpointing algorithm
AU - Manivannan, D.
AU - Jiang, Q.
AU - Yang, J.
AU - Persson, K. E.
AU - Singhal, M.
PY - 2005
Y1 - 2005
N2 - Checkpointing and rollback recovery are established techniques for handling failures in distributed systems. Under synchronous checkpointing, each process involved in the distributed computation takes a checkpoint almost simultaneously. This causes contention for network stable storage and hence degrades performance. To overcome this problem, checkpoint staggering, under which checkpoints by various processes are taken in a staggered manner, has been proposed. In this paper, we propose a staggered quasi-synchronous checkpointing algorithm that reduces contention for network stable storage without any synchronization overhead. We also present an asynchronous recovery algorithm based on the checkpointing algorithm.
AB - Checkpointing and rollback recovery are established techniques for handling failures in distributed systems. Under synchronous checkpointing, each process involved in the distributed computation takes a checkpoint almost simultaneously. This causes contention for network stable storage and hence degrades performance. To overcome this problem, checkpoint staggering, under which checkpoints by various processes are taken in a staggered manner, has been proposed. In this paper, we propose a staggered quasi-synchronous checkpointing algorithm that reduces contention for network stable storage without any synchronization overhead. We also present an asynchronous recovery algorithm based on the checkpointing algorithm.
UR - http://www.scopus.com/inward/record.url?scp=33745297327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745297327&partnerID=8YFLogxK
U2 - 10.1007/11603771_14
DO - 10.1007/11603771_14
M3 - Conference contribution
AN - SCOPUS:33745297327
SN - 3540309594
SN - 9783540309598
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 117
EP - 128
BT - Distributed Computing - IWDC 2005 - 7th International Workshop, Proceedings
T2 - 7th International Workshop on Distributed Computing, IWDC 2005
Y2 - 27 December 2005 through 30 December 2005
ER -