Abstract
For long-running or large-scale distributed programs, the ability to provide software fault-tolerance via checkpointing is valuable. For scalable systems, multicast communication is becoming a predominant communication paradigm. While some aspects of consistency and channel state are the same for both unicast and multicast protocols, the implementation of checkpointing systems differs. This paper explores the problem of checkpointing in a multicast environment and introduces two checkpointing algorithms for such environments. The first algorithm is closely based on existing checkpointing algorithms. The second employs the multicast protocol to distribute checkpointing information efficiently.
Original language | English |
---|---|
Title of host publication | 1998 IEEE Aerospace Conference, AERO 1998 - Proceedings |
Pages | 467-479 |
Number of pages | 13 |
State | Published - 1998 |
Event | 1998 IEEE Aerospace Conference, AERO 1998 - Snowmass, United States. Duration: Mar 28 1998 → Mar 28 1998 |
Publication series
Name | IEEE Aerospace Conference Proceedings |
---|---|
Volume | 4 |
ISSN (Print) | 1095-323X |
Conference
Conference | 1998 IEEE Aerospace Conference, AERO 1998 |
---|---|
Country/Territory | United States |
City | Snowmass |
Period | 3/28/98 → 3/28/98 |
Bibliographical note
Publisher Copyright: © 1998 IEEE.
ASJC Scopus subject areas
- Aerospace Engineering
- Space and Planetary Science