A Real-Time Systems Symposium Preprint.
Interim technical rept.
STANFORD UNIV CA CENTER FOR RELIABLE COMPUTING
This paper describes the measurement and analysis of permanent CPU (central processing unit) related errors and system activity at the Stanford Linear Accelerator Center computation facility. Between 13 and 18 percent of all errors affecting the CPU were estimated to be permanent. The manifestation of a permanent error was found to be strongly correlated with the level and type of workload prior to the manifestation of the error. For example, it is shown that the risk of a permanent error increases in a non-linear fashion with the amount of interactive processing. The observed tendency is present in three years of load data. This observation is significant because a load-error relationship found at the CPU level must, in our view, be considered fundamental. In addition, in a majority of the observed errors, the latency between the occurrence and the manifestation of the error was estimated to be insignificant for the purposes of our analysis. Thus the detection of the error also provides an estimate of when the error occurred. (Author)
- Numerical Mathematics
- Computer Hardware