Compiler Assisted Recovery For Fault-Tolerant Highly Parallel Multiprocessor Architectures
ILLINOIS UNIV AT URBANA COORDINATED SCIENCE LAB
The purpose of this research was to develop and implement compiler-assisted strategies for recovery through multiple-instruction reexecution rollback in highly parallel computer architectures that use hierarchical shared memories. The goal was to facilitate very rapid recovery from high rates of transient and intermittent failures in SDI environments. We worked to achieve this goal with minimal impact on system performance and little hardware overhead by exploiting hardware features already present in recently developed high-performance processor architectures. Our objective was to demonstrate that, through appropriate compilation techniques, these hardware features can be used to perform rapid recovery without significant architectural redesign. Our research effort concentrated on multiprocessor machines with hierarchical memory structures, both because of the architectural trend toward hierarchical-memory, shared-variable multiprocessor architectures and because of the current lack of understanding of how rapid recovery can be accomplished in this class of machines.
- Computer Hardware