Report Number: CSL-TR-97-729
Institution: Stanford University, Computer Systems Laboratory
Title: Remote Memory Access in Workstation Clusters
Author: Verghese, Ben
Author: Rosenblum, Mendel
Date: July 1997
Abstract: Efficient sharing of memory resources in a cluster of workstations has the promise of greatly improving the performance and cost-effectiveness of the cluster when running large memory-intensive jobs. A point of interest is the hardware support required for good memory sharing performance. We evaluate the performance of two models: the software-only model that runs on a traditional distributed system configuration, and requires support from the operating system to access remote memory; and the hardware-intensive model that uses a specialized network interface to extend the memory system to allow direct access to remote memory. Using SimOS, we do a fair comparison of the performance of the two memory-sharing models for a set of interesting compute-server workloads. We find that the software-only model, with current remote page-fault latencies, does not provide acceptable memory-sharing performance. The hardware shared-memory system is able to provide stable performance across a range of latencies. If the remote page-fault latency can be reduced to 100 microseconds, the performance of the software-only model becomes acceptable for many, though not all, workloads. Considering the interconnection bandwidth required to sustain the software-only page-level memory sharing, our experiments show that a gigabit network is necessary for good performance.