Accession Number:

ADA250303

Title:

Fault-Tolerant Wait-Free Shared Objects

Descriptive Note:

Technical rept.

Corporate Author:

CORNELL UNIV ITHACA NY DEPT OF COMPUTER SCIENCE

Report Date:

1992-04-01

Pagination or Media Count:

35.0

Abstract:

A concurrent system consists of processes communicating via shared objects, such as shared variables, queues, etc. The concept of wait-freedom was introduced to cope with process failures each process that accesses a wait-free object is guaranteed to get a response even if all the other processes crash. But what if these wait-free objects themselves fail For example, if a wait-free object crashes , all the processes that access that object are prevented from making progress. In this paper, we introduce the concept of fault-tolerant wait-free objects, and study the problem of implementing them. We give a universal method to construct fault-tolerant wait-free objects, for all types of responsive failures including one in which faulty objects may lie . In sharp contrast, we prove that many common and interesting object types such as queues, sets, and test and set have no fault-tolerant wait-free implementations even under the most benign of the non-responsive types of failure. We also introduce several concepts and techniques that are central to the design of fault-tolerant concurrent systems the concepts of self-implementation and graceful degradation, and techniques to automatically increase the fault-tolerance of implementations. We prove matching lower bounds on the resource complexity of most of our algorithms.

Subject Categories:

  • Computer Hardware

Distribution Statement:

APPROVED FOR PUBLIC RELEASE