Report Number: CSL-TR-81-214
Institution: Stanford University, Computer Systems Laboratory
Title: An exponential failure/load relationship: results of a multi-computer statistical study
Author: Iyer, Ravishankar K.
Author: Butner, Steven E.
Author: McCluskey, Edward J.
Date: July 1981
Abstract: In this paper we present an exponential statistical model
which relates computer failure rates to the level of system
activity. Our analysis reveals a strong statistical
dependency of both hardware and software component failure
rates on several common measures of utilization (specifically
CPU utilization, I/O initiation, paging, and job-step
initiation rates). We establish that this effect is not
dominated by a specific component type, but exists across the
board in the two systems studied. Our data covers three years
of normal operation (including significant upgrades and
reconfigurations) for two large Stanford University computer
complexes. The complexes, which are composed of IBM mainframe
equipment of differing models and vintage, run similar
operating systems and provide the same interface and
capability to their users. The empirical data comes from
identically-structured and maintained failure logs at the two
sites along with IBM OS/VS2 operating system performance/load
records. The statistically strong relationship between
failures and load is evident for many equipment types,
including electronic, mechanical, and software
components. This is in opposition to the commonly-held belief
that systems which are primarily electronic in nature exhibit
no such effect to any significant degree. The exponential
character of our statistical model is significant not only
for its simplicity, but also for its compatibility with
classical reliability techniques.
http://i.stanford.edu/pub/cstr/reports/csl/tr/81/214/CSL-TR-81-214.pdf