DID YOU KNOW? DTIC has over 3.5 million final reports on DoD funded research, development, test, and evaluation activities available to our registered users. Click HERE
to register or log in.
Fast Sparse Matrix Factorization on Modern Workstations
STANFORD UNIV CA DEPT OF COMPUTER SCIENCE
Pagination or Media Count:
The performance of workstation-class machines has experienced a dramatic increase in the recent past. Relatively inexpensive machines which offer 14 MIPS and 2 MFLOPS performance are now available, and machines with even higher performance are not far off. One important characteristic of these machines is that they rely on a small amount of high-speed cache memory for their high performance. In this paper, we consider the problem of cholesky factorization of a large sparse positive definite system of equations on a high performance workstation. We find that the major factor limiting performance is the cost of moving data between memory and the processor. We use two techniques to address this limitation we decrease the number of memory references and we improve cache behavior to decrease the cost of each reference. When run on benchmarks from the Harwell-Boeing Sparse Matrix Collection, the resulting factorization code is almost three times as fast as SPARSPAK on a DECStation 3100. We believe that the issues brought up in this paper will play an important role in the effective use of high performance workstations on large numerical problems.
APPROVED FOR PUBLIC RELEASE