Controlling Memory Access Concurrency in Efficient Fault-Tolerant Parallel Algorithms
BROWN UNIV PROVIDENCE RI DEPT OF COMPUTER SCIENCE
The CRCW PRAM under dynamic fail-stop (no restart) processor behavior is a fault-prone multiprocessor model for which it is possible to both guarantee reliability and preserve efficiency. To handle dynamic faults, some redundancy is necessary in the form of many processors concurrently performing a common read or write task. In this paper we show how to significantly decrease this concurrency by bounding it in terms of the number of actual processor faults. We describe a low-concurrency, efficient, and fault-tolerant algorithm for the Write-All primitive: using at most N processors, write 1s into N locations. This primitive can serve as the basis for efficient fault-tolerant simulations of algorithms written for fault-free PRAMs on fault-prone PRAMs. For any dynamic failure pattern F, our algorithm has total write concurrency at most |F| and total read concurrency at most 7|F| log N, where |F| is the number of processor faults. For example, there is no concurrency in a run without failures, whereas previous algorithms used Ω(N log N) concurrency even in the absence of faults. We also describe a technique for limiting the per-step concurrency and present an optimal fault-tolerant EREW PRAM algorithm for Write-All when all processor faults are initial.
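To make the Write-All task and the notion of write concurrency concrete, the following is a minimal Python sketch of a naive baseline. It is not the algorithm of the paper. The function name `write_all_rotating`, the `fail_step` mapping, and the convention of charging one unit of concurrency for every writer beyond the first on a cell in a step are all assumptions made for illustration.

```python
def write_all_rotating(n, p, fail_step):
    """Naive Write-All baseline (illustrative only, not the paper's algorithm).

    At synchronous step t, each live processor i writes 1 into cell (i + t) % n.
    fail_step maps a processor id to the step at which it stops for good
    (fail-stop, no restart); processors absent from the mapping never fail.
    Returns (cells, work, write_concurrency).
    """
    assert 1 <= p <= n
    cells = [0] * n
    work = 0
    write_concurrency = 0
    for t in range(n):
        writers = {}  # cell index -> number of processors writing it this step
        for i in range(p):
            if fail_step.get(i, n) <= t:
                continue  # processor i has already stopped
            target = (i + t) % n
            writers[target] = writers.get(target, 0) + 1
            cells[target] = 1
            work += 1
        # Assumed convention: every writer beyond the first on a cell counts once.
        write_concurrency += sum(k - 1 for k in writers.values())
    return cells, work, write_concurrency


if __name__ == "__main__":
    # Hypothetical failure pattern: processors 0, 1, 2 stop at steps 0, 3, 5.
    cells, work, conc = write_all_rotating(8, 4, {0: 0, 1: 3, 2: 5})
    print(cells, work, conc)
```

Because distinct processors always target distinct cells within a step, this baseline never writes concurrently and completes as long as one processor survives, but a surviving processor must visit every cell itself, so the work can reach N times the number of processors. That inefficiency is the cost the abstract's algorithm is designed to avoid while still keeping concurrency bounded by the number of faults.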
- Numerical Mathematics
- Computer Programming and Software