Accession Number:

ADA279896

Title:

Fault-Tolerance in Distributed and Multiprocessor Real-Time Systems

Descriptive Note:

Final rept. 1 Sep 1992-31 Aug 1993

Corporate Author:

TEXAS ENGINEERING EXPERIMENT STATION COLLEGE STATION

Personal Author(s):

Report Date:

1993-08-31

Pagination or Media Count:

41

Abstract:

New schemes for fault tolerance in multiprocessor and distributed systems have been developed in the following areas. We have investigated a number of fault tolerance schemes to evaluate performance, reliability, and availability trade-offs. Fault tolerance schemes are being developed for various fault models (fail-stop model, fail-slow model, and arbitrary failure model) and application areas (applications that are to provide results at the end of computation, and applications that are long-running but should also provide results during computation). In the area of software-implemented fault tolerance, we are studying approaches for providing user-transparent mechanisms for fault tolerance, with the aim of designing and implementing a software library to which the user can link existing application software to achieve the desired level of fault tolerance. We are developing a new tool, the Reliable Architecture Characterization Tool (REACT), for evaluating the reliability and availability of distributed multiprocessor systems using various fault tolerance techniques. This tool will facilitate evaluation of the fault tolerance schemes that we develop.

Subject Categories:

  • Computer Programming and Software
  • Computer Systems Management and Standards

Distribution Statement:

APPROVED FOR PUBLIC RELEASE