Report Number: CS-TR-89-1286
Institution: Stanford University, Department of Computer Science
Title: Fast sparse matrix factorization on modern workstations
Author: Rothberg, Edward
Author: Gupta, Anoop
Date: October 1989
Abstract: The performance of workstation-class machines has experienced
a dramatic increase in the recent past. Relatively
inexpensive machines which offer 14 MIPS and 2 MFLOPS
performance are now available, and machines with even higher
performance are not far off. One important characteristic of
these machines is that they rely on a small amount of
high-speed cache memory for their high performance. In this
paper, we consider the problem of Cholesky factorization of a
large sparse positive definite system of equations on a high
performance workstation. We find that the major factor
limiting performance is the cost of moving data between
memory and the processor. We use two techniques to address
this limitation; we decrease the number of memory references
and we improve cache behavior to decrease the cost of each
reference. When run on benchmarks from the Harwell-Boeing
Sparse Matrix Collection, the resulting factorization code is
almost three times as fast as SPARSPAK on a DECStation 3100.
We believe that the issues brought up in this paper will play
an important role in the effective use of high performance
workstations on large numerical problems.
http://i.stanford.edu/pub/cstr/reports/cs/tr/89/1286/CS-TR-89-1286.pdf