Institution: Stanford University, Computer Systems Laboratory

Title: On the computational complexity of finite functions

Author: Spira, Philip M.

Date: May 1968

Abstract: One of the most rapidly expanding fields of applied mathematics and engineering is automata theory. Although the term "automaton" is derived from "self-moving thing," the prime concern of automata theory is the study of information-processing devices. A specific example of information processing is computation, and thus the mathematical properties of devices which perform computations are of interest to automata theorists. In this thesis we investigate the computation by logic circuits of a certain class of functions having finite domain. To a given function f a number of so-called complexity criteria can be assigned relative to that class, e.g., the minimum computation time of or the minimum number of elements contained in any circuit of the class which is capable of computing f . Our prime criterion of interest will be computation time. The type of circuits investigated in this thesis are called (d,r ) circuits. A (d,r ) circuit is composed of logical elements each having at most r inputs and one output. Each input value and output value is an element from the set Z d = {0,1,...,d - 1}, and each element has unit delay in computing its output. Thus a given element computes a function from Z S(k,d) to Z d , for some k 2 r, in unit time. The output of one element can be connected to inputs of any number of elements (including itself) and can also comprise one of the outputs of the circuit and an element receives a given one of its inputs either from the output of some element or from the inputs to the circuit. When individual elements are interconnected to form a (d,r) circuit, we can associate a computation time with the entire circuit. Specifically, let f : X1,...,Xn . Y be any function on finite sets X1,...,Xn. Let C be a (d,r) circuit whose input lines are partitioned into n sets. Let IC,j be the set of configurations of values from Z d on the jth (J = 1,2,...,n) and let OC be the set of output configurations of the circuit. Then C is said to compute f in time t if there are maps gj : Xj . IC,j (j = 1,2,...,n) and a 1 - 1 function h : Y . OC such that, if the input from time 0 through time t - 1 is [g1(x1),...,gn(xn)], then the output of C at time t will be h(f(x1,...,xn)). Winograd has done pioneering work on the time of computation of finite functions by (d,r) circuits. He has derived lower bounds on computation time and has constructed near optimal circuits for many classes of finite functions. A principal contribution of this thesis is a complete determination of the time necessary to compute multiplication in a finite group with a (d,r) circuit. A new group theoretic quantity d(G) is defined whose reciprocal is the proper generalization of Winograd's a(G) to nonabelian groups. Then a novel method of circuit synthesis for group multiplication is given. In contrast to previous procedures, it is valid for any finite group--abelian or not. It is completely algebraic in character and is based upon our result that any finite group has a family of subgroups having a trivial intersection and minimum order d(G). The computation time achieved is, in all cases, at most one unit greater than our lower bound. In particular, if G is abelian our computation time is never greater--and often considerably less--than Winograd's. We then generalize the group multiplication procedure to a method to compute any finite function. For given sets X1, X2 and Y and any family of subsets of Y having a certain property called completeness, a corresponding hierarchy of functions having domain X1 x X2 and range Y is established -- the position of a function depending upon its computation time with our method. For reasons which we explain in the test, this appears to be a very natural classification criterion. At the bottom of the hierarchy are invertible functions such as numerical addition and multiplication, and the position of a function in the hierarchy depends essentially upon how far it is from being invertible. For large |X1| and |X2| almost all functions are near the top, corresponding to the fact that nearly all f : X1 x X2 . Y require computation time equal to the maximum required for any such function. The new method is then applied to the case of finite semigroup multiplication.

CSL-TR-68-1

Report Number: CSL-TR-70-11

Institution: Stanford University, Computer Systems Laboratory

Title: The SPOOF: a new technique for analyzing the effects of faults on logic networks

Author: Clegg, Frederick W.

Date: August 1970

Abstract: In general, one cannot predict the effects of possible failures on the functional characteristics of a logic network without knowledge of the structure of that network. The SPOOF or structure-and parity-observing output function described in this report provides a new and convenient means of characterizing both network structure and output function in a single algebraic expression. A straightforward method for the determination of a SPOOF for any logic network is demonstrated. Similarities between SPOOF's and other means of characterizing network structure are discussed. Examples are presented which illustrate the ease with which the effects of any "stuck-at" fault - single or multiple - on the functional characteristics of a logic network are determined using SPOOF's.

CSL-TR-70-11

Report Number: CSL-TR-71-15

Institution: Stanford University, Computer Systems Laboratory

Title: Fault equivalence in sequential machines

Author: Boute, Raymond

Author: McCluskey, Edward J.

Date: June 1971

Abstract: This paper is concerned with the relationships among faults as they affect sequential machine behavior. Of particular interest are equivalence and dominancy relations. It is shown that for output faults (i.e., faults that do not affect state behavior), fault equivalence is related to the existence of an automorphism of the state table. For the same class of faults, the relation between dominance and equivalence is considered and some properties are pointed out. Another class of possible faults is also considered, namely, memory faults (i.e., faults in the logic feedback lines). These clearly affect the state behavior of the machine, and their influence on machine properties, such as being strongly connected, is discussed. It is proven that there exist classes of machines for which this property of being strongly connected is destroyed by every possible single fault. Further results on both memory and output faults are also presented.

CSL-TR-71-15

Report Number: CSL-TR-71-24

Institution: Stanford University, Computer Systems Laboratory

Title: An improved reliability model for NMR

Author: Siewiorek, Daniel P.

Date: December 1971

Abstract: The classical reliability model for N-modular redundancy (NMR) assumes the network to be failed when a majority of modules which drive the same voter fail. It has long been known that this model is pessimistic since there are instances, termed compensating module failures, where a majority of the modules fail but the network is nonfailed. A different module reliability model based on lead reliability is proposed which has the classical NMR reliability model as a special case. It is shown that the standard procedure for altering the classical model to take compensating module failures into account may predict a network reliability which is too low in some cases and too high in others. It is also demonstrated that the improved model can increase the predicted mission time (the time the system is to operate at or above a given reliability) by 50% over the classical model prediction for a simple network.

CSL-TR-71-24

Report Number: CSL-TR-72-30

Institution: Stanford University, Computer Systems Laboratory

Title: Adaptive design methods for checking sequences

Author: Boute, Raymond T.

Date: July 1972

Abstract: The length of checking sequences for sequential machines can be considerably reduced if, instead of preset distinguishing sequences, one uses so-called distinguishing sets of sequences, which serve the same purpose, but are generally shorter. The design of such a set turns out to be equivalent to the design of an adaptive distinguishing experiment,* though a checking sequence, using a distinguishing set, remains essentially preset. This property also explains the title. All machines having preset distinguishing sequences also have distinguishing sets. In case no preset distinguishing sequences exist, most of the earlier methods call for the use of locating sequences, which result in long checking experiments. However, in many of these cases, a distinguishing set can be found, thus resulting in even more savings in length. Finally, the characterizing sequences used in locating sequences can also be adaptively designed, and thus the basic idea presented below is advantageous even when no distinguishing sets exist. By "experiment" we mean the application of sequence(s) to the machine while observing the output. In some instances, the words "experiment" and "sequence" can be used interchangeably.

CSL-TR-72-30

Report Number: CSL-TR-72-35

Institution: Stanford University, Computer Systems Laboratory

Title: Separate non-homomorphic checking codes for binary addition

Author: Kolupaev, Stephen G.

Date: July 1972

Abstract: In this paper, necessary and sufficient conditions for successful detection of errors in a binary adder by any separate code are developed. We demonstrate the existence of separate checking codes for addition modulo $2^n$ (n >= 4) and modulo $2^n$-1 (n > 5, n even), which are not homomorphic images of the addition being checked. A non-homomorphic code is constructed in a regular fashion from a single check symbol with special properties. Finding all such intial check symbols requires an exhaustive search of a large tree, and results indicate that the number of distinct codes for a particular modulus grows rapidly with n. In an appendix, we examine a modulo $2^n$ adder where the carry out of the high position is also presented to a checker.

CSL-TR-72-35

Report Number: CSL-TR-72-36

Institution: Stanford University, Computer Systems Laboratory

Title: Design of a parallel encoder/decoder for the Hamming code, using ROM

Author: Mitarai, H.

Author: McCluskey, E. J.

Date: June 1972

Abstract: ROM implementation of logic circuits which have a large number of inputs in generally considered unwise. However, in the design of an encoder/decoder for the Hamming code, ROM implementation is found to yield many advantages over SSI and MSI implementation. There is a one-to-one correspondence between the partition of H matrix into submatrices and the partition of the set of the inputs to the encoder into subsets of the inputs to the ROM modules. Hence, several methods of partitioning the H matrix for the Hamming code are devised. The resulting ROM implementation is shown to save package count compared with other implementations. However, at the present state of technology, there is a trade-off between speed and package count. In the applications where speed is of the utmost importance, the SSI implementation using ECL logic is the most attractive. The disadvantage of ROM in speed should diminish in the near future when semiconductor memory technology will progress to the point where the slow DTL/TTL gates in the input buffer, the address decoder, and the output buffer of ROM, can be replaced by faster gates.

CSL-TR-72-36

Report Number: CSL-TR-73-49

Institution: Stanford University, Computer Systems Laboratory

Title: Self-testing residue trees

Author: Kolupaev, Stephen G.

Date: August 1973

Abstract: Error detection and correction in binary adders often require computing the residue modulo A of a binary number. We present here a totally self-checking network which extracts the residue of a binary input number of arbitrary width, with respect to any odd modulus A. This network has the tree structure commonly used for residue extraction: a binary tree of circuit blocks, where each block outputs the residue of its inputs. The network we describe differs from previous designs in that the signals between blocks of the tree are not binary-coded. Instead, the l-out-of-A code is used, where A is the modulus desired. Use of this code permits the network to be free of inverters, giving it an advantage in speed. The network output is also coded l-out-of-A, and with respect to this code, the residue tree is totally self-checking in the sense of Anderson. The residue tree described here requires logic gates with A inputs, when the modulus desired is A . This makes the basic design somewhat impractical for a large modulus, because gates with large fan-in are undesirable. To extend the usefulness of this network, we present a technique which uses several residue trees of this design, each for a different modulus. The outputs of these residue trees are combined by a totally self-checking translator from the code of multiple residues to the l-out-of-A code. Using this multiple residue scheme, the modulus of each residue tree can be made much smaller than the desired modulus A.

CSL-TR-73-49

Report Number: CSL-TR-73-52

Institution: Stanford University, Computer Systems Laboratory

Title: Hazards in asynchronous systems

Author: Chewning, D. R.

Author: Bredt, Thomas H.

Date: September 1972

Abstract: Necessary and sufficient conditions are given for the existence of static and dynamic hazards in combinational circuits that undergo multiple input changes. These theorems are applied in the analysis of modules, such as the wye module, that have been proposed for asynchronous systems. We show that unless internal module delays are strictly less than delays between modules, incorrect operation can occur due to hazards in module implementations.

CSL-TR-73-52

Report Number: CSL-TR-73-56

Institution: Stanford University, Computer Systems Laboratory

Title: Reliability modeling of NMR networks

Author: Abraham, Jacob

Author: Siewiorek, Daniel P.

Date: 1973

Abstract: A survey of the literature in the area of redundant system reliability modeling is presented with special emphasis on Triple Modular Redundancy (TMR). Areas where the classical method of TMR reliability prediction may prove inadequate are identified, like the interdependence of fault patterns at points of network fan-in and fan-out. This is especially true if the assumption of highly reliable subsystems, which is frequently made by the modeling techniques, is dropped. It is also not clear if the methods give an upper or a lower bound to the reliability. As a solution, a method of partitioning an arbitrary network into cells so that the faults in a cell are independent of faults in other cells is proposed. An algorithm is then given to calculate a tight lower bound on the reliability of any such cell, by considering only the structure of the interconnections within the cells. The value of reliability found is exact if TMR is assumed to be a coherent system. An approximation to the algorithm is also described; this can be used to find a lower bound to the reliability without extensive calculation. Modifications to the algorithm to improve it and to take care of special cases are given. Finally, the algorithm is extended to N-Modular Redundant (NMR) networks.

CSL-TR-73-56

Report Number: CSL-TR-73-62

Institution: Stanford University, Computer Systems Laboratory

Title: A highly efficient redundancy scheme: self-purging redundancy

Author: Losq, Jacques

Date: July 1975

Abstract: The goals of this paper are to present an efficient redundancy scheme for highly reliable systems, to give a method to compute the exact reliability of such schemes and to compare this scheme with other redundancy schemes. This redundancy scheme is self-purging redundancy; a scheme that uses a threshold voter and that purges the failed modules. Switches for self-purging systems are extremely simple: there is no replacement of failed modules and module purging is quite simply implemented. Because of switch simplicity, exact reliability calculations are possible. The effects of switch reliability are quantitatively examined. For short mission times, switch reliability is the most important factor: self-purging systems have a probability of failure several times larger than the figure obtained when switches are assumed to be perfect. The influence of the relative frequency of the diverse types of failures (permanent, intermittent, stuck-at,...) are also investigated. Reliability functions, mission time improvements and switch efficiency are displayed. Self-purging systems are compared with ot her redundant systems, like hybrid or NMR, for their relative merits in reliability gain, simplicity, cost and confidence in the reliability estimation. The high confidence in the reliability evaluation of self-purging systems makes them a standard for the validation of several models that have been proposed to take into account switch reliability. The accuracy of models using coverage factors can be evaluated that way.

CSL-TR-73-62

Report Number: CSL-TR-74-66

Institution: Stanford University, Computer Systems Laboratory

Title: Computer system performance measurement: instruction set processor level and microcode level

Author: Svobodova, Liba

Date: June 1974

Abstract: Techniques based on hardware monitoring were developed to measure computer system performance on the instruction set processor level and the microcode level. Knowledge of system behavior and system utilization at these two levels is extremely valuable for design of new processors. The reasons why such information is needed are discussed and applicable measurement techniques for obtaining necessary data are reviewed. A hardware monitor is a preferable measurement tool since it can trace most of the significant events attributed to these two levels without introducing any artifact. Described hardware monitoring techniques were implemented on the S/370 Model 145 at Stanford University. Measurements performed on the instruction set processor level were concerned with determining execution frequencies on individual instructions under normal system workload. The microcode level measurements measured the number and the type of S/370 Model 145 microwords executed in the process of interpretation of an individual S/370 instruction and the average execution time of each such instruction. Implementation of each technique is described and the results based on the outcome of performed measurements are presented. Finally, effectiveness and ease of use of the discussed techniques are considered.

CSL-TR-74-66

Report Number: CSL-TR-74-75

Institution: Stanford University, Computer Systems Laboratory

Title: Influence of fault-detection and switchinig mechanisms on the reliability of stand-by systems

Author: Losq, Jacques

Date: July 1975

Abstract: This paper concerns the reliability of stand-by systems when switch reliability is taken into account. It is assumed that failures obey a Poisson distribution for modules and switches. A very detailed method is given to model stand-by systems. Several cases are investigated: ideal systems, real systems with fault-detection mechanisms that can detect any module error and systems for which the fault-detection mechanisms detect only some of the module errors. The reliability versus time curves are determined for each value of the number of spares. It is shown that the best number of spares increases as the length of the mission increases. Systems with extremely short mission time have the best reliability when they have only one spare. The limit when the number of spares increases is the reliability obtained with simplex systems. Whatever the number of spares is, the reliability of stand-by systems goes to zero as time goes to infinity. For a given mission time, it is possible to determine the best number of spares and the best possible reliability. For a given reliability, it is possible to compute the number of spares that gives the longest mission time. These models can be used to determine whether or not there exists a stand-by system that meets the requirements of a given reliability and a given mission time. If such stand-by system exists, its characteristics (minimum number of spares and reliability) can be derived.

CSL-TR-74-75

Report Number: CSL-TR-74-77

Institution: Stanford University, Computer Systems Laboratory

Title: Parallel solution methods for triangular linear systems of equations

Author: Orcutt, Samuel E.

Date: June 1974

Abstract: In this paper we consider developing parallel solution methods for triangular linear systems of equations. For a system of N equations in N unknowns the serial method requires O(N2) steps, and the straightforward parallel method requires steps and O(N) processors. In this paper we develop methods that require O(log N) time when used with O(N3) processors and O( R(N) log N) time when used with O(N2) processors. We also consider solutions to band triangular systems and develop a method that requires O((log N) (log m)) time and O(Nm2) processors, where m is the bandwidth of the system.

CSL-TR-74-77

Report Number: CSL-TR-74-85

Institution: Stanford University, Computer Systems Laboratory

Title: The solution of large multi-dimensional Poisson problems

Author: Stone, Harold S.

Date: May 1974

Abstract: The Buneman algorithm for solving Poisson problems can be adapted to solve large Poisson problems on computers with a rotating drum memory so that the computation is done with very little time lost due to rotational latency of the drum .

CSL-TR-74-85

Report Number: CSL-TR-75-92

Institution: Stanford University, Computer Systems Laboratory

Title: The complexity of control structures and program validation

Author: Davison, Joseph W.

Date: May 1975

Abstract: A preliminary examination of the influence of control structures on the complexity of the proof of correctness of computer programs. A block structured proof technique is defined and studied. Two parameters affecting the complexity of the proof are defined; the number of exits from a block, and the cycle rank of a block, a measure of loop complexity. Proof complexity classes of flowcharts are defined, with maximum values for these parameters. The question investigated is: How does restricting the complexity affect the class of functions realizable, assuming a given set of primitive actions and predicates: It is found that loop complexity may be traded for exits, and that for a given number of exits there are functions requiring any specific loop complexity. Further, it is shown that blocks with two exits are considerably more powerful than those with only one. In fact, for a given maximal loop complexity, there are functions that cannot be realized with one-exit blocks, but can be realized with two-exit blocks, even if the loop complexity is restricted to essentially one internal loop per block. Looking at it the other way around, the addition of a second exit to a block allows construction of flowcharts with any specified loop complexity. This result appears to be extendable to blocks with more exits, but this has not been completed. The work is primarily of a graph theoretical nature, and may also be interpreted as an examination of sequential control structures from the point of view of feedback loop complexity.

CSL-TR-75-92

Report Number: CSL-TR-75-93

Institution: Stanford University, Computer Systems Laboratory

Title: Sequential circuit output probabilities from regular expressions

Author: Parker, Kenneth P.

Author: McCluskey, Edward J.

Date: June 1975

Abstract: This paper presents a number of methods for finding sequential circuit output probabilities using regular expressions. Various classes of regular expressions, based on their form, are defined and it is shown how to easily find multistep transition probabilities directly from the regular expressions. A new procedure for finding steady state probabilities is given which proceeds either from a regular expression or a state diagram description. This procedure is based on the concept of synchronization of the related machine, and is useful for those problems where synchronization sequences exist. In the cases where these techniques can be utilized, substantial savings in computation can be realized. Further, application to other areas such as multinomial Markov processes is immediate.

CSL-TR-75-93

Report Number: CSL-TR-75-95

Institution: Stanford University, Computer Systems Laboratory

Title: The stack working set: a characterization of spatial locality

Author: Rau, B. Ramakrishna

Date: July 1975

Abstract: Multilevel memory hierarchies are attractive from the point of view of cost-performance. However, they present far greater problems than two-level hierarchies when it comes to analytic performance evaluation. This may be attributed to two factors: firstly, the page size (or the unit of information transfer between two levels) varies with the level in the hierarchy; secondly, the request streams that the lower (slower) levels see are the fault streams out of the immediately higher levels. Therefore, the request stream seen by each level is not necessarily the same as the one generated by the processor. Since the performance depends directly upon the properties of the request stream, this poses a problem. A model for program behavior, which explicitly characterizes the spatial locality of the program, is proposed and validated. It is shown that the spatial locality of a program is an invariant of the hierarchy when characterized in this manner. This invariance is used to solve the first problem stated - that of the varying page sizes. An approximate technique is advanced for the characterization of the fault stream as a function of the request stream and the capacity of the level. A procedure is then outlined for evaluating the performance of a multilevel hierarchy analytically.

CSL-TR-75-95

Report Number: CSL-TR-75-96

Institution: Stanford University, Computer Systems Laboratory

Title: A rollback interval for networks with an imperfect self-checking property

Author: Shedletsky, John J.

Date: December 1975

Abstract: Dynamic self-checking is a technique used in computers to detect a fault quickly before extensive data contamination caused by the fault can occur. When the self-checking properties of the computer circuits are not perfect, as in the case with self-testing only and partially self-checking circuits, the recovery procedure may be required to roll back program execution to a point prior to the first undetected data error caused by the detected fault. This paper presents a method by which the rollback distance required to achieve a given probability of successful data restoration may be calculated. To facilitate this method, operational interpretations are given to familiar network properties such as the self-testing, secureness, and self-checking properties. An arithmetic and logic unit with imperfect self-checking capability is analyzed to determine the minimum required rollback distance for the recovery procedure.

CSL-TR-75-96

Report Number: CSL-TR-75-97

Institution: Stanford University, Computer Systems Laboratory

Title: Deterministic sequential networks under random control

Author: Varszegi, Sandor

Date: September 1975

Abstract: This paper presents a network-oriented approach for the treatment of deterministic sequential networks under random control. Considered are the cases of multinomial, stationary Markov and arbitrary input processes. Probabilities of the state and output processes are directly derived from the primary information of the network and the source. Coded networks are treated using the logic circuits or Boolean functions. The isomorphism between Boolean and event algebras is made use of, and the probabilities of the response processes are obtained in the form of algebraic probability expressions interpreted over the determining (i.e., input and initial state) minterm or signal joint probabilities.

CSL-TR-75-97

Report Number: CSL-TR-75-98

Institution: Stanford University, Computer Systems Laboratory

Title: A tale of three emulators

Author: Hoevel, Lee W.

Author: Wallach, Walter A. Jr.

Date: November 1975

Abstract: This is a preliminary report on the development of emulator code for the Stanford EMMY. Emulation is introduced as an interpretive computing technique. Various classes of emulation and their correlation to the image machine are presented. Functional and structural overviews of three emulators for the Stanford EMMY are presented. These are IBM System/360; CRIL; and DELtran. Performance estimates are included for each of these systems.

CSL-TR-75-98

Report Number: CSL-TR-75-101

Institution: Stanford University, Computer Systems Laboratory

Title: A new philosophy for wire routing

Author: Rau, B. Ramakrishna

Date: November 1975

Abstract: A number of interconnection algorithms exist and have been used quite successfully. However, most of them, though differing in detail, appear to subscribe to the same underlying philosophy which has developed from that for single layer boards. Arguments are advanced which question the validity of this philosophy in this environment of multilayer board technology. A new philosophy is developed in this report, which, it is hoped, will be more suited for use with multilayer boards. Based on this philosophy, an interconnection algorithm is then developed in a step by step fashion.

CSL-TR-75-101

Report Number: CSL-TR-75-102

Institution: Stanford University, Computer Systems Laboratory

Title: High performance emulation

Author: Wallach, Walter A. Jr.

Date: November 1975

Abstract: The Stanford EMMY is examined as an emulation engine. Using the 360 emulator and the DELtran interpreter as examples, the performance of the current EMMY architecture is examined as a high performance emulation vehicle. The problems of using a sequential, vertically organized processor for high speed emulation are developed and discussed. A flexible control structure for high speed emulation studies is derived from an existing high performance processor. This structure issues a stream of microinstructions to a central command bus, allowing user-defined execution resources to execute them in overlapped fashion. These execution resources may be added or deleted with little or no processor rewiring.

CSL-TR-75-102

Report Number: CSL-TR-76-106

Institution: Stanford University, Computer Systems Laboratory

Title: Mathematical models for the circuit layout problem

Author: vanCleemput, Willem M.

Date: February 1976

Abstract: In the first part of this paper the basic differences between the classical (placement, routing) and the topological approach to solving the circuit layout problem are outlined. After a brief survey of some existing mathematical models for the problem, an improved model is suggested. This model is based on the concept of partially oriented graph and contains more topological information than earlier models. This reduces the need for special constraints on the graph embedding algorithm. The models also allow pin and gate assignment in function of the layout, under certain conditions.

CSL-TR-76-106

Report Number: CSL-TR-76-108

Institution: Stanford University, Computer Systems Laboratory

Title: Cascade structure in totally self-checking networks

Author: Kolupaev, Stephen

Date: April 1976

Abstract: In the well-known totally self-checking (TSC) network, a failure must not change one output codeword into another. Called the fault-secure property, this permits a receiver of the net's output to assume that any codeword it receives is correct. Further, the self-testing property requires that each possible failure in the net must produce at least one non-code output. Thus a receiver can monitor the health of the network by watching for non-code outputs. In this paper we propose modifications of these two properties. The self-testing property is made more stringent. Each possible failure in the net is required to produce an output which is in a distinguished subset of the non-code outputs. The fault-secure requirement is modified to permit a fault to interchange certain output codewords. In particular, all outputs not in the distinguished subset are partitioned into equivalent classes, and a fault is permitted to change the output from one codeword to another codeword in the same class. However, a fault is not permitted to change the output from a codeword to any member of a different equivalence class (one not containing the correct output) . These modified properties define a generalization of the TSC network. A network which meets the modified properties is called a generalized self-checking (GSC) network. Self-checking and self-testing (Morphic) networks and TSC networks are special cases of the GSC network. Examining TSC networks, we find a further connection with the GSC network. It has been known for some time that not every subnetwork of a TSC network need by TSC. We show that every subnetwork of a TSC network is GSC, and every TSC network is a cascade of GSC networks. This establishes the GSC network as the basic building block from which every TSC network is constructed. We explore a brute-force method for constructing a desired TSC network by cascading GSC subnetworks. The method resorts to enumeration at many points of decision and thus is not a practical design tool. However, it does yield a very nice alternate realization of the Morphic OR, and suggests specializations which merit further study.

CSL-TR-76-108

Report Number: CSL-TR-76-111

Institution: Stanford University, Computer Systems Laboratory

Title: A distributed algorithm for constructing minimal spanning trees in computer-communication networks

Author: Dalal, Yogen K.

Date: June 1976

Abstract: This paper presents a distributed algorithm for constructing minimal spanning trees in computer-communication networks. The algorithm can be executed concurrently and asynchronously by the different computers of the network. This algorithm is also suitable for constructing minimal spanning trees using a multiprocessor computer system. There are many reasons for constructing minimal spanning trees in computer-communication networks since minimal spanning tree routing is useful in distributed operating systems for performing broadcast, in adaptive routing algorithms for transmitting delay estimates, and in other networks like the Packet Radio Network.

CSL-TR-76-111

Report Number: CSL-TR-76-113

Institution: Stanford University, Computer Systems Laboratory

Title: Error correction by alternate-date retry

Author: Shedletsky, John J.

Date: May 1976

Abstract: A new technique for low-cost error correction in computers is the alternate-data retry, (ADR). An ADR is initiated by the detection of an error in the initial execution of an operation. The ADR is a re-execution of the operation, but with an alternate representation of the initial data. The choice of the alternate representation and the design of the processing circuits combine to insure that even an error due to a permanent fault is not repeated during retry. Error-correction is provided at a hardware cost comparable to that of a conventional retry capability. Sufficient conditions are given for the design of circuits with an ADR capability. The application of an ADR capability to memory and to the data paths of a processor is illustrated.

CSL-TR-76-113

Report Number: CSL-TR-76-114

Institution: Stanford University, Computer Systems Laboratory

Title: EMMY/360 functional characteristics

Author: Wallach, Walter A. Jr.

Date: June 1976

Abstract: An emulation of the IBM System/360 architecture is presented - the EMMY/360. Problem state code which executes correctly on an IBM 360 will also execute correctly on the EMMY/360. Code producing execution exceptions will, in most cases, produce the same results on the two systems. Certain exceptions occurring on IBM 360 cannot occur on the EMMY/360, such as address specification exceptions for main store operands, and certain precise interrupts on IBM 360 will be imprecise on the EMMY/360, such as address exceptions. The EMMY/360 supports the Standard 360 instruction set with single precision floating point. The 360 input/output structure is not supported; I/O on the EMMY system is done by Function Call instruction, rather than channel program and Start-Test I/O.

CSL-TR-76-114

Report Number: CSL-TR-76-115

Institution: Stanford University, Computer Systems Laboratory

Title: Principles of self-checking processor design and an example

Author: Wakerly, John F.

Date: December 1975

Abstract: A self-checking processor has redundant hardware to insure that no likely failure can cause undetected errors and all likely failures are detected in normal operation. We show how error-detecting codes and self-checking circuits can be used to achieve these properties in a microprogrammed processor. The choice of error-detecting codes and the placement of checkers to monitor coded data paths are discussed. The use of codes to detect errors in arithmetic and logic operations and microprogram control units is described. An example processor design is given and some observations on the diagnosis and repair of such a processor are made. From the example design it appears that somewhat less than 50% overall redundancy is required to guarantee the detection of all failures that affect a single medium- or large-scale integration circuit package.

CSL-TR-76-115

Report Number: CSL-TR-76-116

Institution: Stanford University, Computer Systems Laboratory

Title: Asynchronous serial interface for connecting a PDP-11 to the ARPANET (BBN 1822)

Author: Crane, Ronald C.

Date: July 1976

Abstract: This report describes an interface to permit the connection of any PDP-11 to either the Packet radio network of the ARPAnet. The interface connects to an IMP on one side, meeting the specifications published in BBN report number 1822, NS RO 16 bit parallel interface (DRV-11 or DR11-C) as described in the DEC peripherals and interfacing handbook. The interface card itself is a double height board (5.2"x 8.5") which can be plugged into any peripheral slot in a PDP-11 backplane. The interface card is connected to the parallel interface card via two cables with Berg 40 pin connectors (DEC H-856) and to the IMP via an Amphenol bayonet connector (48-10R-18-315). All 3 cables and connectors are supplied with the I/O interface card. The parallel interface card (DEC DR11-C or DRV-11) together with the special I/O interface card described in this report comprise the 1822 interface. The report includes descriptions of the operation of circuits, programming, and diagnostics for the 1822 interface.

CSL-TR-76-116

Report Number: CSL-TR-76-117

Institution: Stanford University, Computer Systems Laboratory

Title: An "almost-exact" solution to the N-processor, M-memory bandwidth problem

Author: Rau, B. Ramakrishna

Date: June 1976

Abstract: A closed-form expression is derived for the memory bandwidth obtained when N processors are permitted to generate requests to M memory modules. Use of generating functions is made, in a rather unusual fashion, to obtain this expressio n. The one approximation involved is shown to result in only a very small error -- and that, too, only for small values of M and N. This expression, which is asymptotically exact, is shown to be more accurate than existing closed form approximations. Lastly, a family of asymptotically exact solutions are presented which are easier to evaluate than is the first one. Although these expressions are less accurate than the previously derived closed-form solution, they are, nevertheless, better than existing solutions.

CSL-TR-76-117

Report Number: CSL-TR-76-118

Institution: Stanford University, Computer Systems Laboratory

Title: The Stanford emulation laboratory

Author: Flynn, Michael J.

Author: Hoevel, Lee W.

Author: Neuhauser, Charles J.

Date: June 1976

Abstract: The Stanford Emulation Laboratory is designed to support general research in the area of emulation. Central to the laboratory is a universal host machine, the EMMY, which has been designed specifically to be an unbiased, yet efficient host for a wide range of target machine architectures. Microstore in the EMMY is dynamically microprogrammable and thus is used as the primary data storage resource of the emulator. Other laboratory equipment includes a reconfigurable main memory system and an independent control processor to monitor emulation experiments. Laboratory software, including two microassemblers, is briefly described. Three laboratory applications are described: (1) A conventional target machine emulation (a system 360), (2) 'microscopic' examination of emulated target machine I-streams, and (3) Direct execution of a high level language (Fortran II).

CSL-TR-76-118

Report Number: CSL-TR-76-119

Institution: Stanford University, Computer Systems Laboratory

Title: A simulator for the evaluation of digital system reliability

Author: Thompson, Peter Alan

Date: August 1977

Abstract: This report describes a simulation package designed to evaluate the reliability of digital systems. The simulator can be used to model many different types of systems, at varying levels of detail. The user is given much freedom to use the elements of the model in the way best suited to simulating the operation of a system in the presence of faults. The simulation package then generates random faults in the model, and uses a Monte Carlo analysis to obtain curves of reliability. Three examples are given of simulations of digital systems which have redundancy. The difference between this type of simulation and other simulation techniques is discussed.

CSL-TR-76-119

Report Number: CSL-TR-76-120

Institution: Stanford University, Computer Systems Laboratory

Title: Detection of intermittent faults in sequential circuits

Author: Savir, Jacob

Date: March 1978

Abstract: Testing for intermittent faults in digital circuits has been given significant attention in the past few years. However, very little theoretical work was done regarding their detection in sequential circuits. This paper shows that the testing properties of intermittent faults in sequential circuits can be studied by means of a probabilistic automaton. The evaluation and derivation of optimal intermittent fault detection experiments in sequential circuits is done by creating a product state table from the faulty and fault-free versions of the circuit under test. Both deterministic and random test procedures are discussed. The underlying optimality criterion maximizes the probability of fault detection.

CSL-TR-76-120

Report Number: CSL-TR-76-123

Institution: Stanford University, Computer Systems Laboratory

Title: Research in the Digital Systems Laboratory

Author: Faculty, The

Date: June 1976

Abstract: This report summarizes the research carried out in the Digital Systems Laboratory* at Stanford University during the period August 1975 through July 1976. Research investigations were concentrated into the following major areas: Computer Performance; Computer Reliability Studies, including fault-tolerant computing, evaluation of dual-computer configurations, and implementation of reliable software systems; Computer Architecture, including organization of computer systems, feasibility of real-time emulation, and directly executed languages; Design Automation of Digital Systems; Computer Networks, including network interconnection protocols, the 2000 terminal computing system, and packet-switched network technology/cost studies; LSI Multiprocessors; Compiler Implementation; and Parallel Computer Systems. Renamed Computer Systems Laboratory in 1978.

CSL-TR-76-123

Report Number: CSL-TR-76-125

Institution: Stanford University, Computer Systems Laboratory

Title: Performance bounds for parallel processors

Author: Lee, Ruby Bei-Loh

Date: November 1976

Abstract: A general model of computation on a p-parallel processor is proposed, distinguishing clearly between the logical parallelism (p* processes) inherent in a computation, and the physical parallelism (p processors) available in the computer organization. This shows the dependence of performance bounds on both the computation being executed and the computer architecture. We formally derive necessary and sufficient conditions for the maximum attainable speedup of a p-parallel processor over a uniprocessor to be Sp 2 min( F(p,ln p), F(p*,ln p*)), where ln p approximates Hp , the pth. harmonic number. We also verify that empirically-derived speedups are 0( F(p*,ln p*)). Finally, we discuss related performance measures of minimum execution time, maximum efficiency and minimum space-time product.

CSL-TR-76-125

Report Number: CSL-TR-76-126

Institution: Stanford University, Computer Systems Laboratory

Title: The optimal placement of dynamic recovery checkpoints in recoverable computer systems

Author: Warren-Angelucci, Wayne

Date: December 1976

Abstract: Reliability is an important concern of any computer system. No matter how carefully designed and constructed, computer systems fail. The rapid and systematic restoration of service after an error or malfunction is always a major design and operational goal. In order to overcome the effects of a failure, recovery must be performed to go from the failed sate to an operational state. This thesis describes a recovery method which guarantees that a computer system, its associated data bases and communication transactions will be restored to an operational and consistent state within a given time and cost bound after the occurrence of a system failure. This thesis considers the optimization of a specific software strategy - the rollback and recovery strategy, within the framework of a graph model of program flow which encompasses communication interfaces and data base transactions. Algorithms are developed which optimize the placement of dynamic recovery checkpoints. Presented is a method for statically pre-computing a set of optimal decision parameters for the associated program model, and run-time technique for dynamically determining the optimal placement of program recovery checkpoints.

CSL-TR-76-126

Report Number: CSL-TR-77-129

Institution: Stanford University, Computer Systems Laboratory

Title: On accuracy improvement and applicability conditions of diffusion approximation with applications to modelling of computer systems

Author: Yu, Philip S.

Date: January 1977

Abstract: Starting with single server queueing systems, we find a different way to estimate the diffusion parameters. The boundary condition is handled using the Feller's elementary return process. Extensive comparisons by asymptotic, simulation and numerical techniques have been conducted to establish the superiority of the proposed method compared with conventional methods. The limitation of the diffusion approximation is also investigated. When the coefficient of variation of interarrival time is larger than one, the mean queue length may vary over a wide range even if the mean and variance of interarrival time are kept unchanged. The diffusion approximation is applicable under the condition that the high variation of interarrival time conducted on 2-stage hyperexponential distributions. A similar anomaly is observed in two server closed queueing networks when the service time of any server has a large coefficient of variation. Again, a similar regularity condition on the service time distribution is required in order for the diffusion approximation to be applicable. For general queueing networks, the problems become more complicated. A simple way to estimate the coefficient of variation of interarrival time (when the network is decomposable) is proposed. Besides the anomalies cited before, networks under certain topologies, such as networks with feedback loops, especially self loops, can not be decomposed into separate single servers when the coefficient of variation of service time distributions become large, even if the large variations are due to a large number of short service times. Nevertheless, the decomposability of a network can be improved by replacing each server with a self loop by an equivalent server without a self loop. Finally, we consider the service center with a queue dependent service rate or arrival rate. Generalization to two server closed queueing networks where each server may have a self loop is also considered.

CSL-TR-77-129

Report Number: CSL-TR-77-130

Institution: Stanford University, Computer Systems Laboratory

Title: The structure of directly executed languages: a new theory of interpretive system design

Author: Hoevel, Lee W.

Author: Flynn, Michael J.

Date: March 1977

Abstract: This paper concerns two important issues in the design of optimal languages for direct execution in an interpretive system: binding the operand identifiers in an executable instruction unit to the arguments of the routine implementingthe operator defined by that instruction; and binding operand identifiers to execution variables. These issues are central to the performance of a system both in space and time. Historically, some form of "machine language" is used as the directly executable medium for a computing system. These languages traditionally are constrained to a single "n-address" instruction format; this leads to an excessive number of "overhead" instructions that do nothing but move values from one storage resource to another being imbedded in the executable instruction stream. We propose to reduce this overhead by increasing the number of instruction formats available at the directly executed language level. Machine languages are also constricted with respect to the manner in which operands can be "addressed" within an instruction. Usually, some form of indexed base-register scheme is available, along with a direct addressing mechanism for a few, "special" storage cells (i.e., registers and perhaps the zeroth page of main store). We propose a different identification mechanism--based on the Contour Model of Johnston. Using our scheme, only N bits are needed to encode any identifier in a scope containing less than 2**N distinct identifiers. Together, these two results lead to directly executed language designs which are optimal in the sense that: (1) k executable instructions are required to implement a source statement containing k functional operators; (2) the space required to represent the executable form of a source statement containing k distinct functional operators and v distinct variables approaches F*k + N*v -- where there are less than 2**F distinct functional operators in the scope of definition for the source statement, and less than 2**N distinct variables in this scope. (3) the time needed to execute the representation of a source statement containing k functional operators, d distinct variables in its domain, and r distinct variables in its range approaches d + r + k ; where time is measured in memory references.

CSL-TR-77-130

Report Number: CSL-TR-77-131

Institution: Stanford University, Computer Systems Laboratory

Title: Sequential prefetch strategies for instructions and data

Author: Rau, B. Ramakrishna

Date: January 1977

Abstract: An investigation of sequential prefetch as a means of reducing the average access time is conducted. The use of a target instruction buffer is shown to enhance the performance of instruction prefetch. The concept of generalized sequentiality is developed to enable the study of sequentiality in data streams. Generalized sequentiality is shown to be present to a significant degree in data streams from measurements on representative programs. This results is utilized to develop a data prefetch mechanism which is found to be capable of anticipating, on the average, about 75% of all data requests.

CSL-TR-77-131

Report Number: CSL-TR-77-132

Institution: Stanford University, Computer Systems Laboratory

Title: Manual for a general purpose simulator used to evaluate reliability of digital systems

Author: Thompson, Peter A.

Date: August 1977

Abstract: A simulation technique has been developed for the reliability evaluation of arbitrarily defined computer systems. The main simulation program is written in FORTRAN IV, and requires no changes to simulate many different systems. The user defines a model for a particular system by supplying a set of short FORTRAN subroutines, and a specially formatted block of numerical parameters. The subroutines specify the functional behavior of various subsystems comprising the model, while the numerical parameters describe how the subsystems are interconnected, their time delays what faults occur in each one, etc. The main simulation program uses this model to perform a Monte-Carlo type evaluation of the systems' reliability. This report supplements a basic description of the technique by supplying all the details necessary for writing subroutines, specifying numerical parameters, and using the main simulation program. The simulation is event-driven, and automatically generates pseudo-random faults and time delays according to parameters given by the user. Some problems typical of event simulators, such as ambiguities arising from random time-delay generation, can be solved by taking advantage of special facilities built into the simulation package. A complete source listing of the main program is included for reference.

CSL-TR-77-132

Report Number: CSL-TR-77-134

Institution: Stanford University, Computer Systems Laboratory

Title: Design of two-level fault-tolerant networks form threshold elements

Author: Butakov, Evguenij A.

Author: Posherstnick, Marat S.

Date: March 1977

Abstract: Only a small part of all Boolean functions of n-variables can be realized by one threshold element (T.E.). For all other functions the net must be built with at least two T.E.'s. The problem of constructing a fault-tolerant two-level network from T.E. is investigated. The notion of limiting function is introduced. It is shown that the use of these limiting functions induces a reduction in the number of possible candidates during the process of finding a realization of an arbitrary function by threshold functions. The method is based on the two-asummability property of threshold functions and therefore is applicable to completely specified Boolean functions with less than nine variables.

CSL-TR-77-134

Report Number: CSL-TR-77-135

Institution: Stanford University, Computer Systems Laboratory

Title: Passage time distributions for a class of queueing networks: closed, open, or mixed, with difference classes of customers with applications to computer system modeling

Author: Yu, Philip S.

Date: March 1977

Abstract: Networks of queues are important models of multiprogrammed time-shared computer systems and computer communication networks. Although equilibrium state probabilities of a broad class of network models have been derived in the past, analytic or approximate solutions for response time distributions or more general passage time distributions are still open problems. In this paper we formulate the passage time problem as a "hitting time" or "first passage time" problem in a Markov system and derive the analytic solution to passage time distributions of closed queueing networks. Efficient numerical approximation is also proposed. The result for closed queueing networks is further extended to obtain approximate passage time distributions for open queueing networks. Finally, we employ the techniques derived in this paper to study the interfault time and response time distribution and density functions of multiprogramming, size of main memory, service time of paging devices and rate of file I/O requests on the shape of distribution functions and density functions have been examined.

CSL-TR-77-135

Report Number: CSL-TR-77-136

Institution: Stanford University, Computer Systems Laboratory

Title: A structural design language for computer aided design of digital systems

Author: vanCleemput, Willem M.

Date: April 1977

Abstract: In this report a language (SDL) for describing structural properties of digital systems will be presented. SDL can be used at all levels of the design process, i.e. from the system level down to the circuit level. The language is intended as a complement to existing computer hardware description languages, which emphasize behavioral description. The language was motivated partly by the nature of the design process.

CSL-TR-77-136

Report Number: CSL-TR-77-137

Institution: Stanford University, Computer Systems Laboratory

Title: Performance analysis of computer communication networks via random access channels

Author: Yu, Philip S.

Date: April 1977

Abstract: The field of computer communication networks has grown very rapidly in the past few years. One way to communicate is via multiple access broadcast channels. A new class of random access schemes referred to as the Mp-persistent CSMA scheme is proposed. It incorporates the nonpersistent CSMA scheme and the 1-persistent CSMA scheme, both slotted and unslotted versions, as its special cases with p=0 and 1, respectively. The performance of the Mp-persistent CSMA scheme under packet switching is analyzed and compared with other random access schemes. By dynamically adjusting p, the unslotted version can achieve better performance in both throughput and delay than the currently available unslotted CSMA schemes under packet switching. Furthermore, the performance of various random access schemes under message switching is analyzed and compared with that under packet switching. In both slotted and unslotted versions of the M0-persistent CSMA scheme, the performance under message switching is superior to that under packet switching in the sense that not only the channel capacity is larger but also the average number of retransmissions per successful message under message switching is smaller than that per successful packet under packet switching. In dynamic reservation schemes, message switching leads to larger channel capacity. However, in both slotted and unslotted versions of the ALOHA scheme, the channel capacity is reduced when message switching is used instead of packet switching. This phenomenon may also happen in the Mp-persistent CSMA scheme as p deviates from 0 to 1 for certain distributions of message length. Hence, the performance under message switching may be superior to or inferior to that under packet switching depending upon the random access scheme being used and the distribution of message length (usually a large coefficient of variation of message length implies a large degradation of channel capacity in this case) for certain random access schemes. Nevertheless, for radio channels, message switching can achieve larger channel capacity if appropriate CSMA schemes are used. A mixed strategy which is a combination of message switching and packet switching is proposed to improve the performance of a point to point computer communication network when its terminal access networks communicate via highly utilized radio channels.

CSL-TR-77-137

Report Number: CSL-TR-77-138

Institution: Stanford University, Computer Systems Laboratory

Title: Program behavior and the performance of interleaved memories

Author: Rau, B. Ramakrishna

Date: May 1977

Abstract: One of the major factors influencing the performance of an interleaved memory system is the behavior of the request sequence, but this is normally ignored. This report examines this issue. Using trace driven simulations it is shown that the commonly used assumption, that all requests are equally likely to be to any module, is not valid. The duality of memory interference with paging is noted and this suggests the use of the Least-Recently-Used Stack Model to model program behavior. Simulation shows that this model is quite successful. An accurate expression for the bandwidth is derived based upon this model.

CSL-TR-77-138

Report Number: CSL-TR-77-139

Institution: Stanford University, Computer Systems Laboratory

Title: Properties and applications of the least-recently-used stack model

Author: Rau, B. Ramakrishna

Date: May 1977

Abstract: The Least-Recently-Used Stack Model (LRUSM) is known to be a good model of temporal locality. Yet, little analysis of this model has been performed and documented. Certain properties of the LRUSM are developed here. In particular, the concept of the Stack Working Set is introduced and expressions are derived for the forward recurrence time to the next reference to a page, for the time that a page spends in a cache of a given size and for the time from last reference to the page being replaced. The fault stream out of a cache memory is modelled and it is shown how this can be used to partially analyze a multilevel memory hierarchy. In addition, the Set Associative Buffer is analyzed and a necessary and sufficient condition for the optimality of the LRU replacement algorithm is advanced.

CSL-TR-77-139

Report Number: CSL-TR-77-142

Institution: Stanford University, Computer Systems Laboratory

Title: Optimal layout of CMOS functional arrays

Author: Uehara, T.

Author: vanCleemput, Willem M.

Date: March 1978

Abstract: Designers of MOS LSI circuits can take advantage of complex functional cells in order to achieve better performance. This paper discusses the implementation of a random logic function on an array of CMOS transistors. A graph-theoretical algorithm which minimizes the size of an array is presented. This method is useful for the design of cells used in conventional design automation systems.

CSL-TR-77-142

Report Number: CSL-TR-77-143

Institution: Stanford University, Computer Systems Laboratory

Title: SPRINT - an interactive system for printed circuit board design user's guide

Author: vanCleemput, Willem M.

Author: Bennett, T. C.

Author: Hupp, J. A.

Author: Stevens, K. R.

Date: June 1977

Abstract: The SPRINT system; for the design of printed circuit boards is a collection of programs that allows designers to interactively design two-sided boards using a Tektronix 4013 graphics terminal. The major parts of the system are: a compiler for SDL, the Structure Design Language, an interactive component placement program, an interactive manual conductor routing program, an automatic batch router, a via elimination program and a set of artwork generation programs.

CSL-TR-77-143

Report Number: CSL-TR-77-147

Institution: Stanford University, Computer Systems Laboratory

Title: Verifying concurrent programs with shared date classes

Author: Owicki, Susan S.

Date: August 1977

Abstract: Monitors are a valuable tool for organizing operations on shared data in concurrent programs. In some cases, however, the mutually exclusive procedure calls provided by monitors are overly restrictive. Such applications can be programmed using shared classes, which do not enforce mutual exclusion. This paper presents a method of verifying parallel programs containing shared classes. One first proves that each class procedure performs correctly when executed by itself, then shows that simultaneous execution of other class procedures can not interfere with its correct operation. Once a class has been verified, calls to its procedures may be treated as uninterruptable actions; this simplifies the proof of higher-level program components. Proof rules for classes and procedure calls are given in Hoare's axiomatic style. Several examples are verified, including two versions of the readers and writers problem and a dynamic resource allocator.

CSL-TR-77-147

Report Number: CSL-TR-77-149

Institution: Stanford University, Computer Systems Laboratory

Title: Interpretive machines

Author: Iliffe, John K.

Date: June 1977

Abstract: These lectures survey attempts to apply computers directly to high level languages using microprogrammed interpreters. The motivation for such work is to achieve language implementations that are more effective in some measure of translation, execution or response to the user than would otherwise be obtained. The implied comparison is with the established technique of compiling into a fixed general-purpose machine code prior to execution. It is argued that while substantial benefits can be expected from microprogramming it does not represent the best approach to design when the contributing factors are analyzed in a general system context, that is to say when wide performance range, multiple source language, and stringent security requirements have to be satisfied. An alternative is suggested, using a combination of interpretation and a primitive instruction set and providing security at the microprogram level.

CSL-TR-77-149

Report Number: CSL-TR-77-150

Institution: Stanford University, Computer Systems Laboratory

Title: Research in the Digital Systems Laboratory: August 1976-July 1977

Author: Staff

Date: July 1977

Abstract: This report summarizes the research carried out in the Digital Systems Laboratory at Stanford University during the period from August 1976 through July 1977. Research investigations were concentrated in the following areas: Computer Reliability and Testing, including detection of intermittent failures, testing for sequential circuits, self-checking linear feedback shift registers, simulation analysis of high-reliability systems, effects of failures on gracefully degradable systems, fault diagnosis in digital systems, and software reliability; Critical Fault-Pattern Determination; Computer Architecture, including trace facility, memory interleaving, and monitors for signal activity; Organization of Computer Systems, including an emulation research laboratory, emulators, and memory performance; Feasibility of Real-Time Emulation, including directly executable languages; Distributed Date Processing for Ballistic Missile Defense; Description Languages and Design for General-Purpose Computer Architectures, including evaluation of existing hardware description languages, development of a structural description language, applications of the structural design language, bounds for maximal parallelism, and parallel information processing in bilogical systems; Computer Networks, including broadcast protocols in packet-switched computer networks and the optimal placement of dynamic-recovery checkpoints in recoverable computer systems; Design and Verification of Reliable Software including specifications and proofs for abstract data types in concurrent programs, specification and verification of monitors, and operating system design; Design Automation, including a language for describing the structure of digital systems, the SPRINT printed-circuit design system, computer-aided layout of large-scale integrated circuits, and an interactive system for design capture; Database, including studies in distributed processing and problem solving, a database maintenance system, and the implementation of database in medicine; and Digital Incremental Computers. Renamed Computer Systems Laboratory in 1978.

CSL-TR-77-150

Report Number: CSL-TR-78-154

Institution: Stanford University, Computer Systems Laboratory

Title: Notes on modelling of computer systems and networks

Author: Yu, Philip S.

Date: April 1978

Abstract: Formulation of given computer system or network problems into abstract stochastic models is considered. Generally speaking, model formulation is an art. While analytic results are clearly not powerful enough to provide a "cookbook" approach to modelling, general methodology and difficulties on model formulation are discussed through examination of various computer system and network models. These models are presented in a systematic way based on the hierarchical approach.

CSL-TR-78-154

Report Number: CSL-TR-78-155

Institution: Stanford University, Computer Systems Laboratory

Title: The formal definition of a real-time language

Author: Hennessy, John L.

Author: Kieburtz, Richard B.

Date: July 1978

Abstract: This paper presents the formal definition of TOMAL (Task-Oriented Microprocessor Application Language), a programming language intended for real-time systems running on small processors. The formal definition addresses all aspects of the language. Because some modes of semantic definition seem particularly well-suited to certain aspects of a language, and not as suitable for others, the formal definition employs several, complementary modes of definition. The primary definition is axiomatic in the notation of Hoare; it is employed to define most of the transformations of data control states affected by statements of the language. Simple, denotational (but not lattice-theoretic) semantics complement the axiomatic semantics to define type-related features, such as the finding of names to types, data type coercions, and the evaluation of expressions. Together, the axiomatic and denotational semantics define all the features of the sequential language. An operational definition, not included in the paper, is used to define real-time execution, and to extend the axiomatic definition to account for all aspects of concurrent execution. Semantic constraints, sufficient to guarantee conformity of a program with the axiomatic definition, can be checked by analysis of a TOMAL program at compilation.

CSL-TR-78-155

Report Number: CSL-TR-78-156

Institution: Stanford University, Computer Systems Laboratory

Title: Optimal program control structures based on the concept of decision entropy

Author: Lee, Ruby Bei-Loh

Date: July 1978

Abstract: The ability to make decisions dynamically during program execution is a very powerful and valuable tool. Unfortunately, it also causes severe performance degradations in high-speed computer organizations which use parallel, pipelined or lookahead techniques to speed up program execution. An optimal control structure is one where the average number of decisions to be made during program execution is minimal among all control structures for the program. Since decisions are usually represented by conditional branch instructions, finding an optimal control structure is equivalent to minimizing the expected number of conditional branch instructions to be encountered per program execution. By decision entropy, we mean a quantitative characterization of the uncertainty in the instruction stream due to dynamic decisions imbedded in the program. We define this concept of decision entropy in the Channon information-theoretic sense. We show that a program's intrinsic decision entropy is an absolute lower bound on the expected number of decisions, or conditional branch instructions, per program execution. We show that this lower bound is achieved if each decision has maximum uncertainty. We also indicate how optimal control structures may be constructed.

CSL-TR-78-156

Report Number: CSL-TR-78-157

Institution: Stanford University, Computer Systems Laboratory

Title: Syndrome-testable design of combinational circuits

Author: Savir, Jacob

Date: October 1978

Abstract: Classical testing of combinational circuits requires a list of the fault-free responses of the circuit to the test set. For most practical circuits implemented today the large storage requirement for such a list make such a test procedure very expensive. In this paper we describe a method of designing combinational circuits in such a way that their test procedure will require the knowledge of only one characteristic of the fault-free circuit, called the syndrome. This solves the storage problem associated with the test procedure. It is shown that the syndrome-testable design is inexpensive and can be easily implemented by the logic designer.

CSL-TR-78-157

Report Number: CSL-TR-78-158

Institution: Stanford University, Computer Systems Laboratory

Title: Performance characterization of parallel computations

Author: Lee, Ruby Bei-Loh

Date: September 1978

Abstract: This paper defines and interprets quantitative measures by which we may characterize the absolute and relative performance of a parallel computation, compared with an equivalent serial computation. The absolute performance measures are the Parallelism Index, PI(P), the Utilization, U(P), and the maximum Quality, Q(P). The corresponding relative performance measures are the Speedup, S(P,1), the Efficiency, E(P,1), and the Quality, Q(P,1). We show how the corresponding absolute and relative performance measures are related via the Redundancy measure, R(P,1). We also examine the range of permissible values for each performance measure. Ideally, we would like to compare an optimal parallel computation with an optimal equivalent serial computation, in order to determine the performance improvements due solely to parallel versus serial processing. Toward this end, we define optimal parallel and serial computations, and show such optimality may be approximated in practice. In order to facilitate the calculation of the above performance measures, we show how the complexity of modelling an arbitrary parallel computation may be reduced substantially to two simple canonical forms, which we denote the computation's Parallelism Profile and TOP-form. Finally we show how all the canonical forms and performance measures may be generalized from one computation to a set of computations, to arrive at aggregate canonical and performance descriptions.

CSL-TR-78-158

Report Number: CSL-TR-78-159

Institution: Stanford University, Computer Systems Laboratory

Title: Specification and verification of network mail system

Author: Owicki, Susan S.

Date: November 1978

Abstract: Techniques for describing and verifying modular systems are illustrated using a simple network mail problem. The design is presented in a top-down style. At each level of refinement, the specifications of the higher level are verified from the specifications of lower level components.

CSL-TR-78-159

Report Number: CSL-TR-78-163

Institution: Stanford University, Computer Systems Laboratory

Title: An introduction to the DDL-P language

Author: Cory, Wendell E.

Author: Duley, J. R.

Author: vanCleemput, Willem M.

Date: March 1979

Abstract: This report describes the Pascal-based implementation of DDL (Digital Design Language) and its simulator.

CSL-TR-78-163

Report Number: CSL-TR-78-164

Institution: Stanford University, Computer Systems Laboratory

Title: DDL-P command language manual

Author: Cory, Wendell E.

Author: Duley, J. R.

Author: vanCleemput, Willem M.

Date: March 1979

Abstract: This report describes the command language for the simulator, associated with DDL (Digital Design Language).

CSL-TR-78-164

Report Number: CSL-TR-78-165

Institution: Stanford University, Computer Systems Laboratory

Title: Partitioning of digital systems

Author: Mei, K.

Author: vanCleemput, Willem M.

Author: Blount, M.

Author: Hanson, D. L.

Author: Payne, Thomas S.

Author: Savir, Jacob

Author: Scheffer, L. K.

Date: April 1979

Abstract: The aim of this study is to develop concepts and tools for understanding the influence of partitioning on the life-cycle cost of a system. Throughout this study three types of boards will be considered as examples to illustrate the concepts being developed. These three board types are being used by the U.S. Navy for various types of equipment. The types considered are: Type 1A: A small PC card with space for up to 8 IC's and a single 40-pin connector. Type 2A: A PC card with space for up to 18 IC's and a single 100-pin connector. Type 5X: A PC card with space for up to 55 IC's and two connectors: a 100-pin connector for the back plane connection and a 30-pin test point connector to be used for diagnostic purposes only.

CSL-TR-78-165

Report Number: CSL-TR-79-168

Institution: Stanford University, Computer Systems Laboratory

Title: UFORT: a Fortran-to-Universal PCODE Translator (FIXFOR-2)

Author: Chow, Frederick

Author: Nye, Peter

Author: Wiederhold, Gio

Date: January 1980

Abstract: The Fortran compiler described in this document, UFORT, was written specifically to serve in a Pascal environment using the Universal P-Code as an intermediate pseudomachine. The need for implementation of Fortran these days is due to the great volume of existing Fortran programs, rather than to a desire to have this language available to develop new programs. We have hence implemented the full, but traditional Fortran standard, rather than the recently adopted augmented Fortran standard. All aspects of Fortran which are commonly used in large scientific programs are available, including such features as SUBROUTINES, labelled COMMON, and COMPLEX arithmetic. In addition, a few common extensions, such as integers of different lengths and assignment of strings to variables, have been added.

CSL-TR-79-168

Report Number: CSL-TR-79-170

Institution: Stanford University, Computer Systems Laboratory

Title: Interpretive architectures: a theory of ideal language machines

Author: Flynn, Michael J.

Author: Hoevel, Lee

Date: February 1979

Abstract: This paper is a study in ideal computer architectures or program representations. An ideal architecture can be defined with respect to the representation that was used to originally describe a program, i.e. the higher level language. Traditional machine architectures name operations and objects which are presumed to be present in the host machine: a memory space of certain size, ALU operations, etc. An ideal machine framed about a specific higher level language assumes operations present in that language and uses these operations to describe relationships between objects described in the source representation. The notion of ideal is carefully constrained. The object program representation must be easily decompilable, (i.e. the source is readily reconstructable). It is simply assumed that the source itself is a good representation for the original problem, thus any nonassignment operation present in the source program statement will appear as a single instruction (operation) in the ideal representation. All named objects are defined with respect to the natural scope of definition of the source program. For simplicity of discussion, statistical behavior of the program or the language is assumed to be unknown; that is, Huffman codes are not used. From the above, a canonic interpretive form (CIF) or measure of a higher level language program is developed. CIF measures both static space to represent the program and dynamic time measurements of the number of instructions to be interpreted and the number of memory references these instructions will require. The CIF or ideal program representation is then compared using the Whetstone benchmark in its characteristics to several contemporary architectural approaches; IBM 370 Honeywell Level 66, Burroughs S-Language Fortran and DELtran, a quasi-ideal Fortran architecture based on CIF principles.

CSL-TR-79-170

Report Number: CSL-TR-79-171

Institution: Stanford University, Computer Systems Laboratory

Title: A theory of interpretive architectures: some notes on DEL design and a Fortran case study

Author: Hoevel, Lee

Author: Flynn, Michael J.

Date: February 1979

Abstract: An interpretive architecture is a program representation that peculiarly suits a particular high level language or class of languages. The architecture is a program representation which we call a directly executed language (DEL). In a companion paper we have explored the theory involved in the creation of ideal DEL forms and have analyzed how some traditional instruction sets compare to this measure. This paper is an attempt to develop a reasonably comprehensive theory of DEL synthesis. By assuming a flexible interpretation oriented host machine, synthesis involves three particular areas: (1) sequencing; both between image machine instructions and within the host interpreter, (2) action rules including both format for transformation and operation invoked, and finally, (3) the name space which includes both name structure and name environment. A complete implementation of a simple version of FORTRAN is described in the appendix of the paper. This DEL for FORTRAN called DELtran comes close to achieving the ideal program measures.

CSL-TR-79-171

Report Number: CSL-TR-79-174

Institution: Stanford University, Computer Systems Laboratory

Title: Pascal*: a Pascal based systems programming language

Author: Hennessy, John L.

Date: June 1980

Abstract: Pascal* (Pascal-star) is a new programming language which is upward compatible with standard Pascal and suitable for systems programming. Although there are several additions to the language, simplicity remains a major design goal. The major additions reflect trends evident in newer languages such as Euclid, Mesa, and Ada, including: modules, simple parametric types, structures constants and values, several minor extensions to the control structures of the language, random access files, arbitrary return types for functions, and an exception handling mechanism.

CSL-TR-79-174

Report Number: CSL-TR-79-175

Institution: Stanford University, Computer Systems Laboratory

Title: Symbolic debugging of optimized code

Author: Hennessy, John L.

Date: July 1979

Abstract: The long standing conflict between the optimization of code and the ability to symbolically debug the code is examined. The effects of local and global optimizations on the variables of a program are categorized and models for representing the effect of optimizations are given. These models are used by algorithms which determine the subset of variables whose values do not correspond to those in the original program. Algorithms for restoring these variables to their correct values are also developed. Empirical results from the application of these algorithms to local optimization are presented.

CSL-TR-79-175

Report Number: CSL-TR-79-176

Institution: Stanford University, Computer Systems Laboratory

Title: SYNDIA user's guide

Author: Cory, Wendell E.

Date: August 1979

Abstract: This report describes how to use the Syndia/Syngra system available at SU-SCORE. This system accepts a BNF-like grammar; specifications and automatically generates syntax diagrams on a Tektronix graphics terminal. Syndia is the major component of this system; Syndgra acts acts as an interface between Syndia and the SUDS2 graphics editor. Syndia performs no ambiguity or consistency checks on the BNF input. This report assumes that the reader is familiar with BNF and syntax diagram representations of grammars.

CSL-TR-79-176

Report Number: CSL-TR-79-177

Institution: Stanford University, Computer Systems Laboratory

Title: ADLIB user's manual

Author: Hill, Dwight D.

Date: August 1979

Abstract: ADLIB (A Design Language for Indicating Behavior) is a new computer design language recently developed at Stanford. ADLIB is a superset of PASCAL with special facilities for concurrency and interprocess communication. It is normally used under the SABLE simulation system.

CSL-TR-79-177

Report Number: CSL-TR-79-178

Institution: Stanford University, Computer Systems Laboratory

Title: Design automation at Stanford

Author: vanCleemput, Willem M.

Date: July 1979

Abstract: This report contains a copy of the visual aids used by the authors during the presentation of their work at the First Workshop on Design Automation at Stanford, held July 3-4, 1979. The topics covered a range from circuit level simulation and integrated circuit process modelling to high level languages and design techniques. The presentations are a survey of the activities in design automation at Stanford University.

CSL-TR-79-178

Report Number: CSL-TR-79-179

Institution: Stanford University, Computer Systems Laboratory

Title: Testability considerations in microprocessor-based design

Author: Hayes, John P.

Author: McCluskey, Edward J.

Date: November 1979

Abstract: This report contains a survey of testability conditions in microprocessor-based design. General issues of testability, testing methods, and fault modeling are presented. Specific techniques of testing and designing for testable microprocessor-based systems are discussed.

CSL-TR-79-179