Report Number: CS-TR-89-1286
Institution: Stanford University, Department of Computer Science
Title: Fast sparse matrix factorization on modern workstations
Author: Rothberg, Edward
Author: Gupta, Anoop
Date: October 1989
Abstract: The performance of workstation-class machines has experienced a dramatic increase in the recent past. Relatively inexpensive machines which offer 14 MIPS and 2 MFLOPS performance are now available, and machines with even higher performance are not far off. One important characteristic of these machines is that they rely on a small amount of high-speed cache memory for their high performance. In this paper, we consider the problem of Cholesky factorization of a large sparse positive definite system of equations on a high performance workstation. We find that the major factor limiting performance is the cost of moving data between memory and the processor. We use two techniques to address this limitation; we decrease the number of memory references and we improve cache behavior to decrease the cost of each reference. When run on benchmarks from the Harwell-Boeing Sparse Matrix Collection, the resulting factorization code is almost three times as fast as SPARSPAK on a DECStation 3100. We believe that the issues brought up in this paper will play an important role in the effective use of high performance workstations on large numerical problems.
http://i.stanford.edu/pub/cstr/reports/cs/tr/89/1286/CS-TR-89-1286.pdf