Accession Number : ADA259257


Title : Lazy Checkpoint Coordination for Bounding Rollback Propagation


Corporate Author : ILLINOIS UNIV AT URBANA COORDINATED SCIENCE LAB


Personal Author(s) : Wang, Yi-Min ; Fuchs, W K


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a259257.pdf


Report Date : 28 May 1993


Pagination or Media Count : 22


Abstract : Independent checkpointing allows maximum process autonomy but suffers from potential domino effects. Coordinated checkpointing eliminates the domino effect by sacrificing a certain degree of process autonomy. In this paper, we propose the technique of lazy checkpoint coordination, which preserves process autonomy while employing communication-induced checkpoint coordination for bounding rollback propagation. The introduction of the notion of laziness allows a flexible tradeoff between the cost for checkpoint coordination and the average rollback distance. Worst-case overhead analysis provides a means for estimating the extra checkpoint overhead. Communication trace-driven simulation for several parallel programs is used to evaluate the benefits of the proposed scheme for real applications. Keywords: Fault tolerance, Independent checkpointing, Checkpoint coordination, Rollback recovery.


Descriptors : *COMPUTER COMMUNICATIONS, *PARALLEL PROCESSING, *MESSAGE PROCESSING, COSTS, BENEFITS, FAULT TOLERANCE


Subject Categories : Computer Systems


Distribution Statement : APPROVED FOR PUBLIC RELEASE