%R CS-TN-93-1 %Z Wed, 08 Dec 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Incremental Updates of Inverted Lists for Text Document Retrieval %A Tomasic, Anthony %A Garcia-Molina, Hector %A Shoens, Kurt %D December 1993 %X The proliferation of the world's "information highways" has brought about renewed interest in efficient document indexing techniques. In this paper, the problem of incremental updates of inverted lists is addressed using a new dual-structure index. The index dynamically separates long and short inverted lists and optimizes the retrieval, update, and storage of each type of list. To study the behavior of the index, we describe a space of engineering trade-offs ranging from optimizing update time to optimizing query performance. We quantitatively explore this space by using actual data and hardware in combination with a simulation of an information retrieval system. We then describe the best algorithm for a variety of criteria. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/93/1/CS-TN-93-1.pdf %R CS-TN-93-2 %Z Wed, 08 Dec 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Efficacy of GlOSS for the Text Database Discovery Problem %A Gravano, Luis %A Garcia-Molina, Hector %A Tomasic, Anthony %D December 1993 %X The popularity of information retrieval has led users to a new problem: finding which text databases (out of thousands of candidate choices) are the most relevant to a user. Answering a given query with a list of relevant databases is the text database discovery problem. The first part of this paper presents a practical method for attacking this problem based on estimating the result size of a query for a given database. The method is termed GlOSS--Glossary of Servers Server. The second part of this paper evaluates GlOSS using four different semantics to answer a user's queries. Real users' queries were used in the experiments. We also describe several variations of GlOSS and compare their efficacy. In addition, we analyze the storage cost of our approach to the problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/93/2/CS-TN-93-2.pdf %R CS-TN-93-3 %Z Wed, 08 Dec 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Correct View Update Translations via Containment %A Tomasic, Anthony %D December 1993 %X One approach to the view update problem for deductive databases proves properties of translations -- that is, a language specifies the meaning of an update to the intensional database (IDB) in terms of updates to the extensional database (EDB). We argue that the view update problem should be viewed as a question of the expressive power of the translation language and the computational cost of demonstrating properties of a translation. We use an active rule based database language as a means of specifying translations of updates on the IDB into updates on the EDB. This paper uses the containment of one datalog program (or conjunctive query) by another to demonstrate that a translation is semantically correct. We show that the complexity of correctness is lower for insertion than for deletion. Finally, we discuss extensions to the translation language. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/93/3/CS-TN-93-3.pdf %R CS-TN-94-6 %Z Tue, 10 May 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Ascribing Beliefs %A Brafman, Ronen I.
%A Tennenholtz, Moshe %D December 1993 %X Models of agents that employ formal notions of mental states are useful and often easier to construct than models at the symbol (e.g., programming language) or physical (e.g., mechanical) level. In order to enjoy these benefits, we must supply a coherent picture of mental-level models, that is, a description of the various components of the mental level, their dynamics, and their inter-relations. However, these abstractions provide weak modelling tools unless (1) they are grounded in more concrete notions, and (2) we can show when it is appropriate to use them. In this paper we propose a model that grounds the mental state of the agent in its actions. We then characterize a class of {\em goal-seeking\/} agents that can be modelled as having beliefs. This paper emphasizes the task of belief ascription. On one level this is the practical task of deducing an agent's beliefs, and we look at assumptions that can help constrain the set of beliefs an agent can be ascribed, showing cases in which, under these assumptions, this set is unique. We also investigate the computational complexity of this task, characterizing a class of agents to whom belief ascription is tractable. But on a deeper level, our model of belief ascription supplies a concrete semantics for beliefs, one that is grounded in an observable notion -- action. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/6/CS-TN-94-6.pdf %R CS-TN-94-7 %Z Tue, 10 May 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Overcoming Unexpected Obstacles %A McCarthy, John %D May 1994 %X The present note illustrates how logical formalizations of common sense knowledge and reasoning can achieve some of the open-endedness of human common sense reasoning. A plan is made to fly from Glasgow to Moscow and is shown by circumscription to lead to the traveller arriving in Moscow. Then a fact about an unexpected obstacle---the traveller losing his ticket---is added without changing any of the previous facts, and the original plan can no longer be shown to work if it must take into account the new fact. However, an altered plan that includes buying a replacement ticket can now be shown to work. The formalism used is a modification of one developed by Vladimir Lifschitz, and I have been informed that the modification isn't correct and that I should go back to Lifschitz's original formalism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/7/CS-TN-94-7.pdf %R CS-TN-94-8 %Z Tue, 10 May 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Emulating Soft Real-Time Scheduling Using Traditional Operating System Schedulers %A Adelberg, Brad %A Garcia-Molina, Hector %A Kao, Ben %D May 1994 %X Real-time scheduling algorithms are usually only available in the kernels of real-time operating systems, and not in more general-purpose operating systems such as Unix. For some soft real-time problems, a traditional operating system may be the development platform of choice. This paper addresses methods of emulating real-time scheduling algorithms on top of standard time-share schedulers. We examine (through simulations) three strategies for priority assignment within a traditional multi-tasking environment. The results show that the emulation algorithms are comparable in performance to the real-time algorithms and in some instances outperform them.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/8/CS-TN-94-8.pdf %R CS-TN-94-9 %Z Tue, 07 Jun 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reasoning About The Effects of Communication On Beliefs %A Young, R. Michael %D June 1994 %X Perrault has presented a formal framework describing communicative action and the change of mental state of agents participating in the performance of speech acts. This approach, using an axiomatization in default logic, suffers from several drawbacks concerning the persistence of beliefs and ignorance over time. We provide an example which illustrates these drawbacks and then present a second approach which avoids these problems. This second approach, an axiomatization of belief transfer in a nonmonotonic modal logic of belief and time, is a reformulation of Perrault's main ideas within a logic which uses an ignorance-based semantics to ensure that ignorance is maximized. We present an axiomatization of this logic and describe the associated techniques for nonmonotonic reasoning. We then show how this approach deals with inter-agent communications in a more intuitively appealing way. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/9/CS-TN-94-9.pdf %R CS-TN-94-10 %Z Tue, 12 Jul 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Precision and Recall of GlOSS Estimators for Database Discovery %A Tomasic, Anthony %A Gravano, Luis %A Garcia-Molina, Hector %D July 1994 %X The availability of large numbers of network information sources has led to a new problem: finding which text databases (out of perhaps thousands of choices) are the most relevant to a query. We call this the text-database discovery problem. Our solution to this problem, GlOSS--Glossary-Of-Servers Server, keeps statistics on the available databases to decide which ones are potentially useful for a given query. In this paper we present different query-result size estimators for GlOSS and we evaluate them with metrics based on the precision and recall concepts of text-document information-retrieval theory. Our generalization of these metrics uses different notions of the set of relevant databases to define different query semantics. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/10/CS-TN-94-10.pdf %R CS-TN-94-11 %Z Mon, 19 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Exact and Approximate Cut Covers of Graphs %A Motwani, Rajeev %A Naor, Joseph %D September 1994 %X We consider the minimum cut cover problem for a simple, undirected graph G(V,E): find a minimum-cardinality family of cuts C in G such that each edge e in E belongs to at least one cut c in C. The cardinality of the minimum cut cover of G is denoted by c(G). The motivation for this problem comes from testing of electronic component boards. Loulou showed that the cardinality of a minimum cut cover in the complete graph on n vertices is precisely the ceiling of log n. However, determining the minimum cut cover of an arbitrary graph was posed as an open problem by Loulou. In this note we settle this open problem by showing that the cut cover problem is closely related to the graph coloring problem, thereby also obtaining a simple proof of Loulou's main result. We show that the problem is NP-complete in general, and moreover, the approximation version of this problem remains NP-complete. Some other observations are made, all of which follow as a consequence of the close connection to graph coloring.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/11/CS-TN-94-11.pdf %R CS-TN-94-12 %Z Mon, 10 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Cross-Validated C4.5: Using Error Estimation for Automatic Parameter Selection %A John, George H. %D October 1994 %X Machine learning algorithms for supervised learning are in wide use. An important issue in the use of these algorithms is how to set their parameters. While the default parameter values may be appropriate for a wide variety of tasks, they are not necessarily optimal for a given task. In this paper, we investigate the use of cross-validation to select parameters for the C4.5 decision tree learning algorithm. Experimental results on five datasets show that when cross-validation is applied to selecting an important parameter for C4.5, the accuracy of the induced trees on independent test sets is generally higher than the accuracy when using the default parameter value. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/12/CS-TN-94-12.pdf %R CS-TN-94-13 %Z Tue, 18 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Formalizing Context (Expanded Notes) %A McCarthy, John %A Buvac, Sasa %D October 1994 %X These notes discuss formalizing contexts as first-class objects. The basic relation is Ist(c,p). It asserts that the proposition p is true in the context c. The most important formulas relate the propositions true in different contexts. Introducing contexts as formal objects will permit axiomatizations in limited contexts to be expanded to transcend the original limitations. This seems necessary to provide AI programs using logic with certain capabilities that human fact representation and human reasoning possess. Fully implementing transcendence seems to require further extensions to mathematical logic, i.e., beyond the nonmonotonic inference methods first invented in AI and now studied as a new domain of logic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/13/CS-TN-94-13.pdf %R CS-TN-94-14 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Generalized Projections: A Powerful Query-Optimization Technique %A Harinarayan, Venky %A Gupta, Ashish %D November 1994 %X In this paper we introduce generalized projections (GPs). GPs capture aggregations, groupbys, conventional projection with duplicate elimination (Distinct), and duplicate-preserving projections. We develop a technique for pushing GPs down the query trees of select-project-join queries that may use aggregations like Max, Sum, etc. and that use arbitrary functions in their selection conditions. Our technique pushes aggregation computation, duplicate elimination, and function computation down to the lowest levels of a query tree. The technique also creates aggregations in queries that did not use aggregation to begin with. Our technique is important since applying aggregations early in query processing can provide significant performance improvements. In addition to their value in query optimization, generalized projections unify set and duplicate semantics, and help us better understand aggregations.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/14/CS-TN-94-14.pdf %R CS-TN-94-15 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reasoning Theories: Towards an Architecture for Open Mechanized Reasoning Systems %A Giunchiglia, Fausto %A Pecchiari, Paolo %A Talcott, Carolyn %D December 1994 %X Our ultimate goal is to provide a framework and a methodology which will allow users, and not only system developers, to construct complex reasoning systems by composing existing modules, or to add new modules to existing systems, in a ``plug and play'' manner. These modules and systems might be based on different logics; have different domain models; use different vocabularies and data structures; use different reasoning strategies; and have different interaction capabilities. This paper makes two main contributions towards our goal. First, it proposes a general architecture for a class of reasoning modules and systems called Open Mechanized Reasoning Systems (OMRSs). An OMRS has three components: a reasoning theory component which is the counterpart of the logical notion of formal system, a control component which consists of a set of inference strategies, and an interaction component which provides an OMRS with the capability of interacting with other systems, including OMRSs and human users. Second, it develops the theory underlying the reasoning theory component. This development is motivated by an analysis of state-of-the-art systems. The resulting theory is then validated by using it to describe the integration of the linear arithmetic module into the simplification process of the Boyer-Moore system, NQTHM. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/15/CS-TN-94-15.pdf %R CS-TN-95-16 %Z Mon, 13 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Meaning of Negative Premises in Transition System Specifications II %A Glabbeek, R.J. van %D February 1995 %X This paper reviews several methods to associate transition relations to transition system specifications with negative premises in Plotkin's structural operational style. Besides a formal comparison in terms of generality and relative consistency, the methods are also evaluated on their taste in determining which specifications are meaningful and which are not. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/16/CS-TN-95-16.pdf %R CS-TN-95-17 %Z Mon, 13 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Ntyft/ntyxt Rules Reduce to Ntree Rules %A Glabbeek, R.J. van %D February 1995 %X Groote and Vaandrager introduced the tyft/tyxt format for Transition System Specifications (TSSs), and established that for each TSS in this format that is well-founded, the bisimulation equivalence it induces is a congruence. In this paper, we construct for each TSS in tyft/tyxt format an equivalent TSS that consists of tree rules only. As a corollary we can give an affirmative answer to an open question, namely whether the well-foundedness condition in the congruence theorem for tyft/tyxt can be dropped. These results extend to tyft/tyxt with negative premises and predicates.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/17/CS-TN-95-17.pdf %R CS-TN-95-18 %Z Mon, 13 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Effective models of polymorphism, subtyping and recursion %A Mitchell, John %A Viswanathan, Ramesh %D March 1995 %X We develop a class of models of polymorphism, subtyping and recursion based on a combination of traditional recursion theory and simple domain theory. A significant property of our primary model is that types are coded by natural numbers using any index of their supremum operator. This leads to a distinctive view of polymorphic functions that has many of the usual parametricity properties. It also gives a distinctive but entirely coherent interpretation of subtyping. An alternate construction points out some peculiarities of computability theory based on natural number codings. Specifically, the polymorphic fixed point is computable by a single algorithm at all types when we construct the model over untyped call-by-value lambda terms, but not when we use Godel numbers for computable functions. This is consistent with trends away from natural numbers in the field of abstract recursion theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/18/CS-TN-95-18.pdf %R CS-TN-95-19 %Z Mon, 20 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast Approximation Algorithm for Minimum Cost Multicommodity Flow %A Kamath, Anil %A Palmon, Omri %A Plotkin, Serge %D March 1995 %X The minimum-cost multicommodity flow problem is one of the classical optimization problems that arises in a variety of contexts. Applications range from finding optimal ways to route information through communication networks to VLSI layout. In this paper, we describe an efficient deterministic approximation algorithm, which, given that there exists a multicommodity flow of cost $B$ that satisfies all the demands, produces a flow of cost at most $(1+\delta)B$ that satisfies a $(1-\epsilon)$-fraction of each demand. For constant $\delta$ and $\epsilon$, our algorithm runs in $O^*(kmn^2)$ time, which is an improvement over the previously fastest (deterministic) approximation algorithm for this problem due to Plotkin, Shmoys, and Tardos, which runs in $O^*(k^2m^2)$ time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/19/CS-TN-95-19.pdf %R CS-TN-95-20 %Z Mon, 20 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interior Point Algorithms for Exact and Approximate Solution of Multicommodity Flow Problems %A Kamath, Anil %A Palmon, Omri %D March 1995 %X In this paper, we present a new interior-point-based polynomial algorithm for the multicommodity flow problem and its variants. Unlike all previously known interior point algorithms for multicommodity flow, which have the same complexity for approximate and exact solutions, our algorithm improves the running time in the approximate case by a polynomial factor. For many cases, the exact bounds are better as well. Instead of using the conventional linear programming formulation for the multicommodity flow problem, we model it as a quadratic optimization problem which is solved using interior-point techniques. This formulation allows us to exploit the underlying structure of the problem and to solve it efficiently. The algorithm is also shown to have improved stability properties. The improved complexity results extend to minimum cost multicommodity flow, concurrent flow and generalized flow problems.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/20/CS-TN-95-20.pdf %R CS-TN-95-21 %Z Tue, 02 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies %A Gravano, Luis %A Garcia-Molina, Hector %D April 1995 %X As large numbers of text databases have become available on the Internet, it is getting harder to locate the right sources for given queries. In this paper we present gGlOSS, a generalized Glossary-Of-Servers Server, that keeps statistics on the available databases to estimate which databases are potentially the most useful for a given query. gGlOSS extends our previous work, which focused on databases using the boolean model of document retrieval, to cover databases using the more sophisticated vector-space retrieval model. We evaluate our new techniques using real-user queries and 53 databases. Finally, we further generalize our approach by showing how to build a hierarchy of gGlOSS brokers. The top level of the hierarchy is so small it could be widely replicated, even at end-user workstations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/21/CS-TN-95-21.pdf %R CS-TN-95-22 %Z Wed, 09 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Combining Register Allocation and Instruction Scheduling %A Motwani, Rajeev %A Palem, Krishna V. %A Sarkar, Vivek %A Reyen, Salem %D August 1995 %X We formulate combined register allocation and instruction scheduling within a basic block as a single optimization problem, with an objective cost function that more directly captures the primary measure of interest in code optimization --- the completion time of the last instruction. We show that although a simple instance of the combined problem is NP-hard, the combined problem is much easier to solve approximately than graph coloring, which is a common formulation used for the register allocation phase in phase-ordered solutions. Using our framework, we devise a simple and effective heuristic algorithm for the combined problem. This algorithm is called the (alpha,beta)-Combined Heuristic; parameters alpha and beta provide relative weights for controlling register pressure and instruction parallelism considerations in the combined heuristic. Preliminary experiments indicate that the combined heuristic yields improvements in the range of 16-21% compared to the phase-ordered solutions when the input graphs contain a balanced amount of register pressure and instruction-level parallelism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/22/CS-TN-95-22.pdf %R CS-TN-95-23 %Z Thu, 10 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Dynamic Maintenance of Kinematic Structures %A Halperin, Dan %A Latombe, Jean-Claude %A Motwani, Rajeev %D August 1995 %X We study the following dynamic data structure problem. Given a collection of rigid bodies moving in three-dimensional space and hinged together in a kinematic structure, our goal is to maintain a data structure that describes certain geometric features of these bodies, and to efficiently update it as the bodies move. This data structure problem seems to be fundamental, and it comes up in a variety of applications such as conformational search in molecular biology, simulation of hyper-redundant robots, collision detection and computer animation. In this note we present preliminary results on a few variants of the problem.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/23/CS-TN-95-23.pdf %R CS-TN-95-24 %Z Mon, 09 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Approximation Algorithms for $k$-Delivery TSP %A Chalasani, Prasad %A Motwani, Rajeev %D August 1995 %X We provide O(1) approximation algorithms for the following NP-hard problem called k-Delivery TSP: We have at our disposal a truck of capacity k, there are n depots and n customers at various locations in some metric space, and there is exactly one item at each depot (all items are identical). We want to find an optimal tour using the truck to deliver one item to each customer. Our algorithms run in time polynomial in both n and k. The 1-Delivery problem is one of finding an optimal tour that alternately visits depots and customers. For this case we use matroid intersection to obtain a polynomial-time 2-approximation algorithm, improving upon a factor 2.5 algorithm of Anily and Hassin. Using this approximation combined with certain lower-bounding arguments, we show a factor 11.5 approximation to the optimal k-Delivery tour. For the infinite-k case we show a factor 2 approximation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/24/CS-TN-95-24.pdf %R CS-TN-95-25 %Z Thu, 05 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity Measures for Assembly Sequences %A Goldwasser, Michael %A Latombe, Jean-Claude %A Motwani, Rajeev %D October 1995 %X Our work examines various complexity measures for two-handed assembly sequences. Many present assembly sequencers take a description of a product and output a valid assembly sequence. For many products there exists an exponentially large set of valid sequences, and a natural goal is to use automated systems to attempt to select wisely from the choices. Since assembly sequencing is a preprocessing phase for a long and expensive manufacturing process, any work towards finding a ``better'' assembly plan is of great value when it comes time to assemble the physical product in mass quantities. We take a step in this direction by introducing a formal framework for studying the optimization of several complexity measures. This framework focuses on the combinatorial aspect of the family of valid assembly sequences, while temporarily separating out the specific geometric assumptions inherent to the problem. With an exponential number of possibilities, finding the true optimal-cost solution seems hard. In fact, in the most general case, our results suggest that even finding an approximate solution is hard. Future work is directed towards using this model to study how the original geometric assumptions can be reintroduced to prove stronger approximation results. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/25/CS-TN-95-25.pdf %R CS-TN-95-26 %Z Thu, 05 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Mediation and Software Maintenance %A Wiederhold, Gio %D October 1995 %X This paper reports on recent work and directions in modern software architectures and their formal models with respect to software maintenance. Related earlier work, now entering practice, provides automatic creation of object structures for customer applications using such models and their algebra, and we will summarize that work. Our focus on maintenance is intended to attack the most costly and frustrating aspect of dealing with large-scale software systems: keeping them up-to-date and responsive to user needs in changing environments.
We introduce the concept of domain-specific mediators to partition the maintenance effort. Mediators are autonomous modules which create information objects out of source data. These modules are placed in an intermediate layer, bridging clients and servers, and contain the knowledge required to establish and maintain services in a coherent domain. A mediated architecture can reduce the cost growth of maintenance to a near-linear function of system size, whereas current system architectures have quadratic factors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/26/CS-TN-95-26.pdf %R CS-TN-95-27 %Z Fri, 27 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Comparing Very Large Database Snapshots %A Labio, Wilburt Juan %A Garcia-Molina, Hector %D May 1995 %X Detecting and extracting modifications from information sources is an integral part of data warehousing. For unsophisticated sources, in practice it is often necessary to infer modifications by periodically comparing snapshots of data from the source. Although this snapshot differential problem is closely related to traditional joins and outerjoins, there are significant differences, which lead to simple new algorithms. In particular, we present algorithms that perform (possibly lossy) compression of records. We also present a window algorithm that works very well if the snapshots are not "very different". The algorithms are studied via analysis and an implementation of two of them; the results illustrate the potential gains achievable with the new algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/95/27/CS-TN-95-27.pdf %R CS-TN-96-28 %Z Tue, 09 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Common Framework for Steerability, Motion Estimation and Invariant Feature Detection %A Hel-Or, Yacov %A Teo, Patrick C. %D January 1996 %X Many problems in computer vision and pattern recognition involve groups of transformations. In particular, motion estimation, steerable filter design and invariant feature detection are often formulated with respect to a particular transformation group. Traditionally, these problems have been investigated independently. From a theoretical point of view, however, the issues they address are similar. In this paper, we examine these common issues and propose a theoretical framework within which they can be discussed in concert. This framework is based on constructing a more natural representation of the image for a given transformation group. Within this framework, many existing techniques of motion estimation, steerable filter design and invariant feature detection appear as special cases. Furthermore, several new results are direct consequences of this framework. First, a canonical decomposition of all filters that can be steered with respect to any one-parameter group and any multi-parameter Abelian group is proposed. Filters steerable under various subgroups of the affine group are also tabulated. Second, two approximation techniques are suggested to deal with filters that cannot be steered exactly. Approximating steerable filters can also be used for motion estimation. Third, within this framework, invariant features can easily be constructed using traditional techniques for computing point invariance.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/28/CS-TN-96-28.pdf %R CS-TN-96-30 %Z Tue, 16 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Development of Type Systems for Object-Oriented Languages %A Fisher, Kathleen %A Mitchell, John C. %D January 1996 %X This paper, which is partly tutorial in nature, summarizes some basic research goals in the study and development of typed object-oriented programming languages. These include both immediate repairs to problems with existing languages and the long-term development of more flexible and expressive, yet type-safe, approaches to program organization and design. The technical part of the paper is a summary and comparison of three object models from the literature. We conclude by discussing approaches to selected research problems, including changes in the type of a method from superclass to subclass and the use of types that give information about the implementations as well as the interfaces of objects. Such implementation types seem essential for adequate typing of binary operations on objects, for example. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/30/CS-TN-96-30.pdf %R CS-TN-96-31 %Z Tue, 16 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Classes = Objects + Data Abstraction %A Fisher, Kathleen %A Mitchell, John C. %D January 1996 %X We describe a type-theoretic foundation for object systems that include ``interface types'' and ``implementation types.'' Our approach begins with a basic object calculus that provides a notion of object, method lookup, and object extension (an object-based form of inheritance). In this calculus, the type of an object gives its interface, as a set of methods and their types, but does not imply any implementation properties. We extend this object calculus with a higher-order form of data abstraction that allows us to declare supertypes of an abstract type and a list of methods guaranteed not to be present. This results in a flexible framework for studying and improving practical programming languages where the type of an object gives certain implementation guarantees, such as would be needed to statically determine the offset of a method or safely implement binary operations without exposing the internal representation of objects. We prove type soundness for the entire language using operational semantics and an analysis of typing derivations. One insight that is an immediate consequence of our analysis is a principled, type-theoretic explanation (for the first time, as far as we know) of the link between subtyping and inheritance in C++, Eiffel and related languages. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/31/CS-TN-96-31.pdf %R CS-TN-96-29 %Z Tue, 16 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Optimization Complexity of Constraint Satisfaction Problems %A Khanna, Sanjeev %A Sudan, Madhu %D December 1995 %X In 1978, Schaefer considered a subclass of languages in NP and proved a ``dichotomy theorem'' for this class. The subclass considered consisted of problems expressible as ``constraint satisfaction problems'', and the ``dichotomy theorem'' showed that every language in this class is either in P, or is NP-hard. This result is in sharp contrast to a result of Ladner, which shows that such a dichotomy does not hold for NP, unless NP=P. We consider the optimization version of the dichotomy question and show an analog of Schaefer's result for this case.
More specifically, we consider the optimization version of ``constraint satisfaction problems'' and show that every optimization problem in this class is either solvable exactly in P, or is MAX SNP-hard, and hence not approximable to within some constant factor in polynomial time, unless NP=P. This result does not follow directly from Schaefer's result. In particular, the set of problems that turn out to be hard in this case is quite different from the set of languages shown to be hard by Schaefer's result. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/29/CS-TN-96-29.pdf %R CS-TN-96-32 %Z Mon, 04 Mar 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Design of Multi-Parameter Steerable Functions Using Cascade Basis Reduction %A Teo, Patrick C. %A Hel-Or, Yacov %D March 1996 %X A new cascade basis reduction method of computing the optimal least-squares set of basis functions steering a given function is presented. The method combines the Lie group-theoretic and the singular value decomposition approaches in such a way that their respective strengths complement each other. Since the Lie group-theoretic approach is used, the set of basis and steering functions computed can be expressed analytically. Because the singular value decomposition method is used, this set of basis and steering functions is optimal in the least-squares sense. Furthermore, the computational complexity in designing basis functions for transformation groups with large numbers of parameters is significantly reduced. The efficiency of the cascade basis reduction method is demonstrated by designing a set of basis functions that steers a Gabor function under the four-parameter linear transformation group. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/32/CS-TN-96-32.pdf %R CS-TN-96-34 %Z Wed, 15 May 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast Estimation of Diameter and Shortest Paths (without Matrix Multiplication) %A Aingworth, Donald %A Chekuri, Chandra %A Indyk, Piotr %A Motwani, Rajeev %D May 1996 %X In the recent past, there has been considerable progress in devising algorithms for the all-pairs shortest paths problem running in time significantly smaller than the obvious time bound of O(n^3). Unfortunately, all the new algorithms are based on fast matrix multiplication algorithms that are notoriously impractical. Our work is motivated by the goal of devising purely combinatorial algorithms that match these improved running times. Our results come close to achieving this goal, in that we present algorithms with a small additive error in the length of the paths obtained. Our algorithms are easy to implement, have the desired property of being combinatorial in nature, and the hidden constants in the running time bound are fairly small. Our main result is an algorithm which solves the all-pairs shortest paths problem in unweighted, undirected graphs with an additive error of 2 in time O(n^{2.5} sqrt{log n}). This algorithm returns actual paths and not just the distances. In addition, we give more efficient algorithms with running time O(n^{1.5} sqrt{k log n} + n^2 log^2 n) for the case where we are only required to determine shortest paths between k specified pairs of vertices rather than all pairs of vertices.
The starting point for all our results is an O(m sqrt{n log n}) algorithm for distinguishing between graphs of diameter 2 and 4, and this is later extended to obtaining a ratio 2/3 approximation to the diameter in time O(m sqrt{n log n} + n^2 log n). Unlike in the case of all-pairs shortest paths, our results for approximate diameter computation can be extended to the case of directed graphs with arbitrary positive real weights on the edges. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/34/CS-TN-96-34.pdf %R CS-TN-96-33 %Z Mon, 15 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Computational Group-Theoretic Approach to Steerable Functions %A Teo, Patrick C. %A Hel-Or, Yacov %D April 1996 %X We present a computational, group-theoretic approach to steerable functions. The approach is group-theoretic in that the treatment involves continuous transformation groups to which elementary Lie group theory may be applied. The approach is computational in that the theory is constructive and leads directly to a procedural implementation. For functions that are steerable with $n$ basis functions under a $k$-parameter group, the procedure is efficient in that at most $nk+1$ iterations of the procedure are needed to compute all the basis functions. Furthermore, the procedure is guaranteed to return the minimum number of basis functions. If the function is not steerable, a numerical implementation of the procedure could be used to compute basis functions that approximately steer the function over a range of parameters. Examples of both applications are described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/33/CS-TN-96-33.pdf %R CS-TN-96-35 %Z Mon, 10 Jun 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Calculus for Concurrent Objects %A DiBlasio, Paolo %A Fisher, Kathleen %D June 1996 %X This paper presents an imperative and concurrent extension of the functional object-oriented calculus described in [FHM94]. It belongs to the family of so-called prototype-based object-oriented languages, in which objects are created from existing ones via the inheritance primitives of object extension and method override. Concurrency is introduced through the identification of objects and processes. To our knowledge, the resulting calculus is the first concurrent object calculus to be studied. We define an operational semantics for the calculus via a transition relation between configurations, which represent snapshots of the run-time system. Our static analysis includes a type inference system, which statically detects message-not-understood errors, and an effect system, which guarantees that synchronization code, specified via guards, is side-effect free. We present a subject reduction theorem, modified to account for imperative and concurrent features, and type and effect soundness theorems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/35/CS-TN-96-35.pdf %R CS-TN-96-36 %Z Thu, 13 Jun 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient Snapshot Differential Algorithms for Data Warehousing %A Garcia-Molina, Hector %A Labio, Wilburt Juan %D June 1996 %X Detecting and extracting modifications from information sources is an integral part of data warehousing. For unsophisticated sources, in practice it is often necessary to infer modifications by periodically comparing snapshots of data from the source.
Although this {\em snapshot differential} problem is closely related to traditional joins and outerjoins, there are significant differences, which lead to simple new algorithms. In particular, we present algorithms that perform (possibly lossy) compression of records. We also present a {\em window} algorithm that works very well if the snapshots are not ``very different.'' The algorithms are studied via analysis and an implementation of two of them; the results illustrate the potential gains achievable with the new algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/36/CS-TN-96-36.pdf %R CS-TN-96-37 %Z Wed, 09 Oct 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Improved Lower Bound for Load Balancing of Tasks with Unknown Duration %A Plotkin, Serge %A Ma, Yuan %D October 1996 %X Suppose there are n servers and a sequence of tasks, each of which arrives in an on-line fashion and can be handled by a subset of the servers. The level of the service required by a task is known upon arrival, but the duration of the service is unknown. The on-line load balancing problem is to assign each task to an appropriate server so that the maximum load on the servers is minimized. The best known lower bound on the competitive ratio for this problem was sqrt(n). However, the argument used to prove this lower bound relied on a sequence of tasks with exponential duration, and therefore this lower bound does not preclude an algorithm with a competitive ratio that is polylogarithmic in T, the maximum task duration. In this paper we prove a lower bound of sqrt(T), thereby proving that a competitive ratio that is polylogarithmic in T is impossible. This should be compared to the analogous case for known-duration tasks, where it is possible to achieve a competitive ratio that is logarithmic in T. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/37/CS-TN-96-37.pdf %R CS-TN-96-38 %Z Tue, 17 Dec 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Intractability of Assembly Sequencing: Unit Disks in the Plane %A Goldwasser, Michael %A Motwani, Rajeev %D December 1996 %X We consider the problem of removing a given disk from a collection of unit disks in the plane. At each step, we allow a disk to be removed by a collision-free translation to infinity, and the goal is to access a given disk using as few steps as possible. Recently there has been a focus on optimizing assembly sequences over various cost measures, though with very limited algorithmic success. We explain this lack of success, proving strong inapproximability results in this simple geometric setting. These inapproximability results, to the best of our knowledge, are the strongest hardness results known for any purely combinatorial problem in a geometric setting. As a stepping stone, we study the approximability of scheduling with AND/OR precedence constraints. The Disks problem can be formulated as a scheduling problem where the order of removals is to be scheduled. Before scheduling a disk to be removed, a path must be cleared, and so we get precedence constraints on the tasks; however, the form of such constraints differs from traditional scheduling in that there is a choice of which path to clear.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/38/CS-TN-96-38.pdf %R CS-TN-96-39 %Z Wed, 18 Dec 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity Measures for Assembly Sequences %A Goldwasser, Michael %A Motwani, Rajeev %D December 1996 %X Our work examines various complexity measures for two-handed assembly sequences. Although there has been a great deal of algorithmic success in finding feasible assembly sequences, there has been very little success towards optimizing the costs of sequences. We attempt to explain this lack of progress by proving the inherent difficulty in finding optimal, or even near-optimal, assembly sequences. We begin by introducing a formal framework for studying the optimization of several complexity measures. We consider a variety of different settings and natural cost measures for assembly sequences. We then define a graph-theoretic problem which is a generalization of assembly sequencing. For our virtual assembly sequencing problem we are able to use techniques common to the theory of approximability to prove the hardness of finding even near-optimal sequences for most cost measures in our generalized framework. Of course, hardness results in our generalized framework do not immediately carry over to the original geometric problems. We continue by realizing several of these hardness results in rather simple geometric settings, proving the difficulty of some of the original problems. These inapproximability results, to the best of our knowledge, are the strongest hardness results known for a purely combinatorial problem in a geometric setting. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/96/39/CS-TN-96-39.pdf %R CS-TN-97-40 %Z Tue, 28 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Content Ratings, and Other Third-Party Value-Added Information: Defining an Enabling Platform %A Roscheisen, Martin %A Winograd, Terry %A Paepcke, Andreas %D January 1997 %X This paper describes the ComMentor annotation architecture and its uses, with a specific emphasis on the content rating application. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/40/CS-TN-97-40.pdf %R CS-TN-97-41 %Z Mon, 03 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reducing Initial Latency in a Multimedia Storage System %A Chang, Edward %A Garcia-Molina, Hector %D February 1997 %X A multimedia server delivers presentations (e.g., videos, movies, games), providing high bandwidth and continuous real-time delivery. In this paper we present techniques for reducing the initial latency of presentations, i.e., for reducing the time between the arrival of a request and the start of the presentation. Traditionally, initial latency has not received much attention. This is because one major application of multimedia servers is ``movies on demand,'' where a delay of a few minutes before a new multi-hour movie starts is acceptable. However, latency reduction is important in interactive applications such as playing of video games and browsing of multimedia documents. Latency reduction is also crucial for improving access performance for media data in a multimedia database system. Various latency reduction schemes are proposed and analyzed, and their performance compared. We show that our techniques can significantly reduce (almost eliminate in some cases) initial latency without adversely affecting throughput.
Moreover, a novel on-disk partial data replication scheme that we propose proves to be far more cost-effective than previous attempts at reducing initial latency. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/41/CS-TN-97-41.pdf %R CS-TN-97-42 %Z Mon, 03 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T From User Access Patterns to Dynamic Hypertext Linking %A Yan, Tak Woon %A Jacobsen, Matthew %A Garcia-Molina, Hector %A Dayal, Umeshwar %D February 1997 %X This paper describes an approach for automatically classifying visitors of a web site according to their access patterns. User access logs are examined to discover clusters of users that exhibit similar information needs; e.g., users that access similar pages. This may result in a better understanding of how users visit the site, and lead to an improved organization of the hypertext documents for navigational convenience. More interestingly, based on what categories an individual user falls into, we can dynamically suggest links for him to navigate. In this paper, we describe the overall design of a system that implements these ideas, and elaborate on the preprocessing, clustering, and dynamic link suggestion tasks. We present some experimental results generated by analyzing the access log of a web site. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/42/CS-TN-97-42.pdf %R CS-TN-97-43 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Report on the May 18-19 1995 IITA Digital Libraries Workshop: Final Draft for Participant Review, August 4, 1995 %A Lynch, Clifford %A Garcia-Molina, Hector %D February 1997 %X This report summarizes the outcomes of a workshop on Digital Libraries held under the auspices of the US Government's Information Infrastructure Technology and Applications (IITA) Working Group in Reston, Virginia on May 18-19, 1995. The objective of the workshop was to refine the research agenda for digital libraries with specific emphasis on scaling and interoperability and the infrastructure needed to enable this research agenda. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/43/CS-TN-97-43.pdf %R CS-TN-97-44 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Addressing Heterogeneity in the Networked Information Environment %A Baldonado, Michelle Q. Wang %A Cousins, Steve B. %D February 1997 %X Several ongoing Stanford University Digital Library projects address the issue of heterogeneity in networked information environments. A networked information environment has the following components: users, information repositories, information services, and payment mechanisms. This paper describes three of the heterogeneity-focused Stanford projects -- InfoBus, REACH, and DLITE. The InfoBus project is at the protocol level, while the REACH and DLITE projects are both at the conceptual model level. The InfoBus project provides the infrastructure necessary for accessing heterogeneous services and utilizing heterogeneous payment mechanisms. The REACH project sets forth a uniform conceptual model for finding information in networked information repositories. The DLITE project presents a general task-based strategy for building user interfaces to heterogeneous networked information services.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/44/CS-TN-97-44.pdf %R CS-TN-97-45 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Information Needs in Technical Work Settings and Their Implications for the Design of Computer Tools %A Paepcke, Andreas %D February 1997 %X We interviewed information workers in multiple technical areas of a large, diverse company, and we describe some of the unsatisfied information needs we observed during our study. Two clusters of issues are described. The first covers how loosely coupled work groups use and share information. We show the need to structure information for multiple, partly unanticipated uses. We show how the construction of information compounds helps users accomplish some of this restructuring, and we explain how structuring flexibility is also required because of temperamental differences among users. The second cluster of issues revolves around collections of tightly coupled work groups. We show that information shared within such groups differs from information shared across group boundaries. We present the barriers to sharing which we saw operating both within groups and outside, and we explain the function of resource and contact broker that evolved in the settings we examined. For each of these issues we propose implications for information tool design. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/45/CS-TN-97-45.pdf %R CS-TN-97-46 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Proposal for Basing our Protocols on a General Information Exchange Level %A Winograd, Terry %D February 1997 %X In order to build our protocols in a way that will provide for long-term growth and extensibility, we should define a standard level on top of the CORBA/ILU level to deal with the management of information across objects on different servers. This level is based on a generalization of the protocols we have been working on. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/46/CS-TN-97-46.pdf %R CS-TN-97-47 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Modes of Information Integration %A Winograd, Terry %D February 1997 %X In order to better understand the different approaches to digital libraries, I compared a number of existing and proposed systems and developed a taxonomy that can be used in identifying the different tradeoffs they make in the overall design space. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/47/CS-TN-97-47.pdf %R CS-TN-97-48 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Conceptual Models for Comparison of Digital Library Systems and Approaches %A Winograd, Terry %D February 1997 %X This document is a working paper that grew out of the discussions at the Digital Libraries joint projects meeting in Washington on Nov. 8-9, 1994. It is intended as a first rough cut at a conceptual framework for understanding the significant differences among systems and ideas, so that we can better decide where to work for interoperability and where to take complementary approaches.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/48/CS-TN-97-48.pdf %R CS-TN-97-49 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Why You Won't Be Buying and Selling Information Yourself %A Winograd, Terry %D February 1997 %X A large part of the economics of electronic publishing of library materials will be based on site licensing, not on per-use fees. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/49/CS-TN-97-49.pdf %R CS-TN-97-50 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lightweight Objects for the Digital Library %A Winograd, Terry %D February 1997 %X We have been looking at the potential for integrating Xerox's GAIA system into the INFObus architecture. Ramana suggested we look at Xerox/Novell's Document Enhanced Networking (DEN) specification, which incorporates some of the GAIA ideas in a product. I went through the spec and had some realizations about what we are trying to do with the INFObus that I thought would be generally useful. The latter part of this message is a proposal for our own architecture. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/50/CS-TN-97-50.pdf %R CS-TN-97-51 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Proxy Is Where It's At! %A Winograd, Terry %D February 1997 %X Soon, more than 90% of access to the Internet will be through commercial proxies. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/51/CS-TN-97-51.pdf %R CS-TN-97-52 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Adaptive Agent for Automated Web Browsing %A Balabanovic, Marko %A Shoham, Yoav %A Yun, Yeogirl %D February 1997 %X The current exponential growth of the Internet precipitates a need for new tools to help people cope with the volume of information. To complement recent work on creating searchable indexes of the World-Wide Web and systems for filtering incoming e-mail and Usenet news articles, we describe a system which learns to browse the Internet on behalf of a user. Every day it presents a selection of interesting Web pages. The user evaluates each page, and given this feedback the system adapts and attempts to produce better pages the following day. After demonstrating that our system is able to learn a model of a user with a single well-defined interest, we present an initial experiment where over the course of 24 days the output of our system was compared to both randomly-selected and human-selected pages. It consistently performed better than the random pages, and was better than the human-selected pages half of the time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/52/CS-TN-97-52.pdf %R CS-TN-97-53 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Communication Agreement Framework of Access Control %A Roscheisen, Martin %A Winograd, Terry %D February 1997 %X We introduce a framework of access control which shifts the emphasis from the participants to their relationship. The framework is based on a communication model in which participants negotiate the mutually agreed-upon boundary conditions of their relationship in compact "communication pacts," called "commpacts." Commpacts can be seen as a third fundamental type next to access-control lists (ACLs) and capabilities.
We argue that in current networked environments characterized by multiple authorities and "trusted proxies," this model provides an encapsulation for interdependent authorization policies, which reduces the negotiation complexity of general (user- and content-dependent) distributed access control and provides a clear user-conceptual metaphor; it also generalizes work in electronic contracting and embeds naturally into the existing legal and institutional infrastructure. The framework is intended to provide a language enabling a social mechanism of coordinated expectation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/53/CS-TN-97-53.pdf %R CS-TN-97-54 %Z Wed, 05 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Combining CORBA and the World-Wide Web in the Stanford Digital Library Project %A Paepcke, Andreas %A Hassan, Scott %D February 1997 %X Describes in 1.5 pages how SIDL combines CORBA and the WWW. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/54/CS-TN-97-54.pdf %R CS-TN-97-55 %Z Mon, 24 Mar 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Performance Evaluation of Centralized and Distributed Index Schemes for a Page Server OODBMS %A Basu, Julie %A Keller, Arthur M. %A Poess, Meikel %D March 1997 %X Recent work on client-server data-shipping OODBs has demonstrated the usefulness of local data caching at client sites. However, none of the studies has investigated index-related performance issues in particular. References to index pages arise from associative queries and from updates on indexed attributes, often making indexes the most heavily used "hot spots" in a database. System performance is therefore quite sensitive to the index management scheme. This paper examines the effects of index caching, and investigates two schemes, one centralized and the other distributed, for index page management in a page server OODB. In the centralized scheme, index pages are not allowed to be cached at client sites; thus, communication with the central server is required for all index-based queries and index updates. The distributed index management scheme supports inter-transaction caching of index pages at client sites, and enforces a distributed index consistency control protocol similar to that of data pages. We study via simulation the performance of these two index management schemes under several different workloads and contention profiles, and identify scenarios where each of the two schemes performs better than the other. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/55/CS-TN-97-55.pdf %R CS-TN-97-56 %Z Mon, 24 Mar 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Approximation Algorithms for Directed Steiner Tree Problems %A Charikar, Moses %A Chekuri, Chandra %A Goel, Ashish %A Guha, Sudipto %D March 1997 %X We obtain the first non-trivial approximation algorithms for the Steiner Tree problem and the Generalized Steiner Tree problem in general directed graphs. Essentially no approximation algorithms were known for these problems. For the Directed Steiner Tree problem, we design a family of algorithms which achieve an approximation ratio of O(k^\epsilon) in time O(kn^{1/\epsilon}) for any fixed \epsilon > 0, where k is the number of terminals to be connected. For the Directed Generalized Steiner Tree Problem, we give an algorithm which achieves an approximation ratio of O(k^{2/3}\log^{1/3} k), where k is the number of pairs to be connected.
Related problems including the Group Steiner tree problem, the Node Weighted Steiner tree problem and several others can be reduced in an approximation preserving fashion to the problems we solve, giving the first non-trivial approximations to those as well. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/56/CS-TN-97-56.pdf %R CS-TN-97-57 %Z Tue, 15 Jul 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Axiomatizing Flat Iteration %A Glabbeek, R.J. van %D April 1997 %X Flat iteration is a variation on the original binary version of the Kleene star operation P*Q, obtained by restricting the first argument to be a sum of atomic actions. It generalizes prefix iteration, in which the first argument is a single action. Complete finite equational axiomatizations are given for five notions of bisimulation congruence over basic CCS with flat iteration, viz. strong congruence, branching congruence, eta-congruence, delay congruence and weak congruence. Such axiomatizations were already known for prefix iteration and are known not to exist for general iteration. The use of flat iteration has two main advantages over prefix iteration: 1. The current axiomatizations generalize to full CCS, whereas the prefix iteration approach does not allow an elimination theorem for an asynchronous parallel composition operator. 2. The greater expressiveness of flat iteration allows for much shorter completeness proofs. In the setting of prefix iteration, the most convenient way to obtain the completeness theorems for eta-, delay, and weak congruence was by reduction to the completeness theorem for branching congruence. In the case of weak congruence this turned out to be much simpler than the only direct proof found. In the setting of flat iteration on the other hand, the completeness theorems for delay and weak (but not eta-) congruence can equally well be obtained by reduction to the one for strong congruence, without using branching congruence as an intermediate step. Moreover, the completeness results for prefix iteration can be retrieved from those for flat iteration, thus obtaining a second indirect approach for proving completeness for delay and weak congruence in the setting of prefix iteration. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/57/CS-TN-97-57.pdf %R CS-TN-97-60 %Z Tue, 07 Oct 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient Linear Re-rendering for Interactive Lighting Design %A Teo, Patrick C. %A Simoncelli, Eero P. %A Heeger, David J. %D October 1997 %X We present a framework for interactive lighting design based on linear re-rendering. The rendering operation is linear with respect to light sources, assuming a fixed scene and camera geometry. This linearity means that a scene may be interactively re-rendered via linear combination of a set of basis images, each rendered under a particular basis light. We focus on choosing and designing a suitable set of basis lights. We provide examples of bases that allow 1) interactive adjustment of a spotlight direction, 2) interactive adjustment of the position of an area light, and 3) a combination in which light sources are adjusted in both position and direction. We discuss a method for reducing the size of the basis using principal components analysis in the image domain. 
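To make the basis-image idea of CS-TN-97-60 above concrete: once the basis images are rendered offline, re-rendering under a new light configuration is just a weighted sum. A minimal sketch in Python, assuming NumPy; the array shapes and weights are hypothetical stand-ins, not the authors' implementation:

import numpy as np

def rerender(basis_images, weights):
    # Rendering is linear in the lights for a fixed scene and camera, so a
    # new lighting configuration is a weighted sum of pre-rendered images.
    # basis_images: (n_lights, H, W, 3); weights: (n_lights,)
    return np.tensordot(weights, basis_images, axes=1)

basis = np.random.rand(16, 480, 640, 3)   # stand-ins for offline renders
weights = np.zeros(16)
weights[3] = 1.5                          # interactively brighten one basis light
frame = rerender(basis, weights)          # (480, 640, 3) re-rendered image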
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/60/CS-TN-97-60.pdf %R CS-TN-97-61 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Performance Analysis of an Associative Caching Scheme for Client-Server Databases %A Basu, Julie %A Poess, Meikel %A Keller, Arthur M. %D September 1997 %X This paper presents a detailed performance study of the associative caching scheme proposed in "A Predicate-based Caching Scheme for Client-Server Database Architectures," The VLDB Journal, Jan 1996. A client cache dynamically loads query results in the course of transaction execution, and formulates a description of its current contents. Predicate-based reasoning is used on the cache description to examine and maintain the cache. The benefits of the scheme include local evaluation of associative queries, at the cost of maintaining the cached query results through update notifications from the server. In this paper, we investigate through detailed simulation the behavior of this caching scheme for a client-server database under different workloads and contention profiles. An optimized version of our basic caching scheme is also proposed and studied. We examine both read-only and update transactions, with the effect of updates on the caching performance as our primary focus. Using an extended version of a standard database benchmark, we identify scenarios where these caching schemes improve the system performance and scalability, as compared to systems without client-side caching. Our results demonstrate that associative caching can be beneficial even for moderately high update activity. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/61/CS-TN-97-61.pdf %R CS-TN-97-59 %Z Thu, 18 Sep 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stability of Networks and Protocols in the Adversarial Queueing Model for Packet Routing %A Goel, Ashish %D September 1997 %X The adversarial queueing theory model for packet routing was suggested by Borodin et al. We give a complete and simple characterization of all networks that are universally stable in this model. We show that a specific greedy protocol, SIS (Shortest In System), is stable against a large class of stochastic adversaries. New applications such as multicast packet scheduling and job scheduling with precedence constraints are suggested for the adversarial model. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/59/CS-TN-97-59.pdf %R CS-TN-97-58 %Z Mon, 18 Aug 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Precedence Constrained Scheduling to Minimize Weighted Completion Time on a Single Machine. %A Chekuri, Chandra %A Motwani, Rajeev %D August 1997 %X We consider the problem of scheduling a set of jobs on a single machine with the objective of minimizing weighted (average) completion time. The problem is NP-hard when there are precedence constraints between jobs [12], and we provide a simple and efficient combinatorial 2-approximation algorithm. In contrast to our work, earlier approximation algorithms [9] achieving the same ratio are based on solving a linear programming relaxation of the problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/97/58/CS-TN-97-58.pdf %R CS-TN-98-62 %Z Tue, 21 Apr 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Type System for Object Initialization in the Java Bytecode Language %A Freund, Stephen N. %A Mitchell, John C.
%D April 1998 %X In the standard Java implementation, a Java language program is compiled to Java bytecode. This bytecode may be sent across the network to another site, where it is then interpreted by the Java Virtual Machine. Since bytecode may be written by hand, or corrupted during network transmission, the Java Virtual Machine contains a bytecode verifier that performs a number of consistency checks before code is interpreted. As illustrated by previous attacks on the Java Virtual Machine, these tests, which include type correctness, are critical for system security. In order to analyze existing bytecode verifiers and to understand the properties that should be verified, we develop a precise specification of statically-correct Java bytecode, in the form of a type system. Our focus in this paper is a subset of the bytecode language dealing with object creation and initialization. For this subset, we prove that for every Java bytecode program that satisfies our typing constraints, every object is initialized before it is used. The type system is easily combined with a previous system developed by Stata and Abadi for bytecode subroutines. Our analysis of subroutines and object initialization reveals a previously unpublished bug in the Sun JDK bytecode verifier. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/62/CS-TN-98-62.pdf %R CS-TN-98-63 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Collaborative value filtering on the Web %A Rodriguez-Mula, Gerard %A Garcia-Molina, Hector %A Paepcke, Andreas %D May 1998 %X Today's Internet search engines help users locate information based on the textual similarity of a query and potential documents. Given the large number of documents available, the user often finds too many documents, and even if the textual similarity is high, in many cases the matching documents are not relevant or of interest. Our goal is to explore other ways to decide if documents are "of value" to the user, i.e., to perform what we call "value filtering." In particular, we would like to capture access information that may tell us, within limits of privacy concerns, which user groups are accessing what data, and how frequently. This information can then guide users, for example, helping identify information that is popular, or that may have helped others before. This is a type of collaborative filtering or community-based navigation. Access information can either be gathered by the servers that provide the information, or by the clients themselves. Tracing accesses at servers is simple, but often information providers are not willing to share this information. We therefore are exploring client-side gathering. Companies like Alexa are currently using client gathering in the large. We are studying client gathering at a much smaller scale, where a small community of users with shared interest collectively track their information accesses. For this, we have developed a proxy system called the Knowledge Sharing System (KSS) that monitors the behavior of a community of users. Through this system we hope to: 1. Develop mechanisms for sharing browsing expertise among a community of users; and 2. Better understand the access patterns of a group of people with common interests, and develop good schemes for sharing this information.
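At its core, the client-side gathering that CS-TN-98-63 describes reduces to keeping shared access counts for a community. A minimal sketch with hypothetical names (this is not the KSS implementation):

from collections import Counter, defaultdict

class AccessLog:
    # Toy community tracker: group -> Counter of URL access frequencies.
    def __init__(self):
        self.counts = defaultdict(Counter)
    def record(self, group, url):
        self.counts[group][url] += 1
    def popular(self, group, n=10):
        # Pages most visited by this community: candidates "of value".
        return self.counts[group].most_common(n)

log = AccessLog()
log.record("dlib-readers", "http://www-diglib.stanford.edu/")
print(log.popular("dlib-readers"))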
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/63/CS-TN-98-63.pdf %R CS-TN-98-64 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Standard Textual Interchange Format for the Object Exchange Model (OEM) %A Goldman, Roy %A Chawathe, Sudarshan %A Crespo, Arturo %A McHugh, Jason %D May 1998 %X The Object Exchange Model (OEM) serves as the basic data model in numerous projects of the Stanford University Database Group, including Tsimmis, Lore and C. This document first defines and explains the model, and then it describes a syntax for textually encoding OEM. By adopting this syntax as a standard across all of our OEM projects, we hope to encourage interoperability and also to provide a consistent view of OEM to interested parties outside Stanford. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/64/CS-TN-98-64.pdf %R CS-TN-98-65 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Responsive Interaction for a Large Web Application: The Meteor Shower Architecture in the WebWriter II Editor %A Crespo, Arturo %A Chang, Bay-Wei %A Bier, Eric A. %D May 1998 %X Traditional server-based web applications allow access to server-hosted resources, but often exhibit poor responsiveness due to server load and network delays. Client-side web applications, on the other hand, provide excellent interactivity at the expense of limited access to server resources. The WebWriter II Editor, a direct manipulation HTML editor that runs in a web browser, uses both server-side and client-side processing in order to achieve the advantages of both. In particular, this editor downloads the document data structure to the browser and performs all operations locally. The user interface is based on HTML frames and includes individual frames for previewing the document and displaying general and specific control panels. All editing is done by JavaScript code residing in roughly twenty HTML pages that are downloaded into these frames as needed. Such a client-server architecture, based on frames, client-side data structures, and multiple JavaScript-enhanced HTML pages, appears promising for a wide variety of applications. This paper describes this architecture, the Meteor Shower Application Architecture, and its use in the WebWriter II Editor. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/65/CS-TN-98-65.pdf %R CS-TN-98-66 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Archival Storage for Digital Libraries %A Crespo, Arturo %A Garcia-Molina, Hector %D May 1998 %X We propose an architecture for Digital Library Repositories that assures long-term archival storage of digital objects. The architecture is formed by a federation of independent but collaborating sites, each managing a collection of digital objects. The architecture is based on the following key components: use of signatures as object handles, no deletions of digital objects, functional layering of services, the presence of an awareness service in all layers, and use of disposable auxiliary structures. Long-term persistence of digital objects is achieved by creating replicas at several sites.
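The use of signatures as object handles in CS-TN-98-66 is in essence content addressing: the handle is computed from the object's bytes, so independent sites derive the same name for the same object and can detect corruption on fetch. A minimal sketch, with SHA-256 standing in for whatever signature function the repository actually uses:

import hashlib

store = {}  # handle -> object bytes; objects are never deleted

def deposit(data: bytes) -> str:
    handle = hashlib.sha256(data).hexdigest()
    store.setdefault(handle, data)   # idempotent; no deletions by design
    return handle

def fetch(handle: str) -> bytes:
    data = store[handle]
    assert hashlib.sha256(data).hexdigest() == handle  # detect corruption
    return data

h = deposit(b"a digital object")
assert fetch(h) == b"a digital object"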
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/66/CS-TN-98-66.pdf %R CS-TN-98-67 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A GUI-Based Version of the SenseMaker Interface for Information Exploration %A Baldonado, Michelle Q Wang %A Winograd, Terry %D May 1998 %X SenseMaker is an interface for information exploration. The original HTML version of the interface relied on tables for display and forms for interaction. The new Java version is GUI-based. This video illustrates the new SenseMaker interface by presenting a hypothetical scenario of a user carrying out an information-exploration task. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/67/CS-TN-98-67.pdf %R CS-TN-98-68 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Presenting HTML Structure in Audio: User Satisfaction with Audio Hypertext %A James, Frankie %D May 1998 %X This paper discusses the results of a 2 by 4 mixed design experiment testing various ways of presenting HTML structures in audio. Four interface styles were tested: (1) one speaker, minimal sound effects, (2) one speaker, many sound effects, (3) many speakers, minimal sound effects, and (4) many speakers, many sound effects. The results obtained were both specific to the interfaces used (i.e., that the use of three different speakers to present heading levels was confusing) and more general (for example, natural sounds are more distinguishable and easier to remember than tones). A short discussion of typical HTML usage is also presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/68/CS-TN-98-68.pdf %R CS-TN-98-69 %Z Fri, 15 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Distinguishability vs. Distraction in Audio HTML Interfaces %A James, Frankie %D May 1998 %X In this paper, we present the findings and conclusions from a user study on audio interfaces. In the experiment we discuss, we studied a framework for choosing sounds for audio interfaces by comparing a prototype interface against two existing audio browsers. Our findings indicate that our initial framework, which was described as a separation between recognizable and non-recognizable sounds, could be better interpreted in the context of the distinguishability and distraction level of various types of sounds. We propose a new definition of how a sound can be called distracting and how to avoid this when creating audio interfaces. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/69/CS-TN-98-69.pdf %R CS-TN-98-70 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Merging Ranks from Heterogeneous Internet Sources %A Garcia-Molina, Hector %A Gravano, Luis %D May 1998 %X Many sources on the Internet and elsewhere rank the objects in query results according to how well these objects match the original query. For example, a real-estate agent might rank the available houses according to how well they match the user's preferred location and price. In this environment, ``meta-brokers'' usually query multiple autonomous, heterogeneous sources that might use varying result-ranking strategies. A crucial problem that a meta-broker then faces is extracting from the underlying sources the top objects for a user query according to the meta-broker's ranking function. This problem is challenging because these top objects might not be ranked high by the sources where they appear.
In this paper we discuss strategies for solving this ``meta-ranking'' problem. In particular, we present a condition that a source must satisfy so that a meta-broker can extract the top objects for a query from the source without examining its entire contents. Not only is this condition necessary but it is also sufficient, and we show an efficient algorithm to extract the top objects from sources that satisfy the given condition. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/70/CS-TN-98-70.pdf %R CS-TN-98-71 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford Digital Library Interoperability Protocol %A Hassan, Scott %A Paepcke, Andreas %D May 1998 %X Description of Stanford's interoperability protocol for interacting with search related proxy objects on the InfoBus. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/71/CS-TN-98-71.pdf %R CS-TN-98-72 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Stanford InfoBus and Its Service Layers: Augmenting the Internet with Higher-Level Information Management Protocols %A Roscheisen, Martin %A Baldonado, Michelle %A Chang, Kevin %A Gravano, Luis %A Ketchpel, Steven %A Paepcke, Andreas %D May 1998 %X The Stanford InfoBus is a prototype infrastructure developed as part of the Stanford Digital Libraries Project to extend the current Internet protocols with a suite of higher-level information management protocols. This paper surveys the five service layers provided by the Stanford InfoBus: protocols for managing items and collections (DLIOP), metadata (SMA), search (STARTS), payment (UPAI), and rights and obligations (FIRM). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/72/CS-TN-98-72.pdf %R CS-TN-98-73 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Proposal for I**3 Client Server Protocol %A Garcia-Molina, Hector %A Paepcke, Andreas %D May 1998 %X This document proposes a CORBA-based protocol for submitting queries to servers and for obtaining the results. It is a subset of the Stanford Digital Library Interoperability protocol (DLIOP). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/73/CS-TN-98-73.pdf %R CS-TN-98-74 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Predicate Rewriting for Translating Boolean Queries in a Heterogeneous Information System %A Chang, Chen-Chuan K. %A Garcia-Molina, Hector %A Paepcke, Andreas %D May 1998 %X Searching over heterogeneous information sources is difficult in part because of the non-uniform query languages. Our approach is to allow users to compose Boolean queries in one rich front-end language. For each user query and target source, we transform the user query into a subsuming query that can be supported by the source but that may return extra documents. The results are then processed by a filter query to yield the correct final results. In this paper we introduce the architecture and associated mechanism for query translation. In particular, we discuss techniques for rewriting predicates in Boolean queries into native subsuming forms, which is a basis of translating complex queries. In addition, we present experimental results for evaluating the cost of post-filtering. We also discuss the drawbacks of this approach and cases when it may not be effective. We have implemented prototype versions of these mechanisms and demonstrated them on heterogeneous Boolean systems.
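The translate-then-filter mechanism in CS-TN-98-74 can be sketched in a few lines: send the source a broader query it can support, then restore exact semantics locally. The source, documents, and predicates below are hypothetical:

class WordSource:
    # Stand-in for a Boolean service supporting only single-word title search.
    def __init__(self, docs):
        self.docs = docs
    def search(self, word):
        return [d for d in self.docs if word in d["title"].lower()]

def search_with_filter(source, native_word, exact_pred):
    candidates = source.search(native_word)          # subsuming native query
    return [d for d in candidates if exact_pred(d)]  # local post-filter

docs = [{"title": "Digital Library Interoperability"},
        {"title": "Digital Signal Processing"}]
src = WordSource(docs)
phrase = lambda d: "digital library" in d["title"].lower()  # not supported natively
print(search_with_filter(src, "digital", phrase))  # returns only the first document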
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/74/CS-TN-98-74.pdf %R CS-TN-98-75 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Extensible Constructor Tool for the Rapid, Interactive Design of Query Synthesizers %A Baldonado, Michelle %A Katz, Seth %A Paepcke, Andreas %A Chang, Chen-Chuan K. %A Garcia-Molina, Hector %A Winograd, Terry %D May 1998 %X We describe an extensible constructor tool that helps information experts (e.g., librarians) create specialized query synthesizers for heterogeneous digital-library environments. A query synthesizer provides a graphical user interface in which a digital-library patron can specify a high-level, fielded, multi-source query. Furthermore, a query synthesizer interacts with a query translator and an attribute translator to transform high-level queries into sets of source-specific queries. We discuss how the constructor can facilitate discovery of available attributes (e.g., title), collation of schemas from different sources, selection of input widgets for a synthesizer (e.g., a text box or a drop-down list widget to support input of controlled vocabulary), and other design aspects. We also describe a prototype constructor we implemented, based on the Stanford InfoBus and metadata architecture. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/75/CS-TN-98-75.pdf %R CS-TN-98-76 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interoperability for Digital Libraries Worldwide %A Paepcke, Andreas %A Chang, Chen-Chuan K. %A Garcia-Molina, Hector %A Winograd, Terry %D May 1998 %X Discusses the history and current directions of interoperability in different parts of computing systems relevant to Digital Libraries. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/76/CS-TN-98-76.pdf %R CS-TN-98-78 %Z Tue, 07 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Probabilistic Poly-time Framework for Protocol Analysis %A Lincoln, P. %A Mitchell, J. %A Mitchell, M. %A Scedrov, A. %D April 3, 1998 %X We develop a framework for analyzing security protocols in which protocol adversaries may be arbitrary probabilistic polynomial-time processes. In this framework, protocols are written in a restricted form of pi-calculus and security may be expressed as a form of observational equivalence, a standard relation from programming language theory that involves quantifying over possible environments that might interact with the protocol. Using an asymptotic notion of probabilistic equivalence, we relate observational equivalence to polynomial-time statistical tests and discuss some example protocols to illustrate the potential strengths of our approach. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/78/CS-TN-98-78.pdf %R CS-TN-98-77 %Z Tue, 07 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Linguistic Characterization of Bounded Oracle Computation and Probabilistic Polynomial Time %A Mitchell, J. %A Mitchell, M. %A Scedrov, A. %D May 4, 1998 %X We present a higher-order functional notation for polynomial-time computation with an arbitrary 0,1-valued oracle. This provides a linguistic characterization for classes such as NP and BPP, as well as a notation for probabilistic polynomial-time functions. The language is derived from Hofmann's adaptation of Bellantoni-Cook safe recursion, extended to oracle computation via work derived from that of Kapron and Cook.
Like Hofmann's language, ours is an applied version of typed lambda calculus with complexity bounds enforced by a type system. The type system uses a modal operator to distinguish between two types of numerical expressions, only one of which is allowed in recursion indices. The proof that the language captures precisely oracle polynomial time is model-theoretic, using adaptations of various techniques from category theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/77/CS-TN-98-77.pdf %R CS-TN-98-79 %Z Wed, 22 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T 2D BubbleUp: Managing Parallel Disks for Media Servers %A Chang, Edward %A Garcia-Molina, Hector %A Li, Chen %D July 1998 %X In this study we present a scheme called two-dimensional BubbleUp (2DB) for managing parallel disks in a multimedia server. Its goal is to reduce initial latency for interactive multimedia applications, while balancing disk loads to maintain high throughput. The 2DB scheme consists of a data placement and a request scheduling policy. The data placement policy replicates frequently accessed data and places them cyclically throughout the disks. The request scheduling policy attempts to maintain free ``service slots'' in the immediate future. These slots can then be used to quickly service newly arrived requests. Through examples and simulation, we show that our scheme significantly reduces initial latency and maintains throughput comparable to that of the traditional schemes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/79/CS-TN-98-79.pdf %R CS-TN-98-80 %Z Wed, 22 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T MEDIC: A Memory & Disk Cache for Multimedia Clients %A Chang, Edward %A Garcia-Molina, Hector %D July 1998 %X In this paper we propose an integrated memory and disk cache for a multimedia client. The cache cushions the multimedia decoder from input rate fluctuations and mismatches, and because data can be cached to disk, the acceptable fluctuations can be very large. This gives the media server much greater flexibility for load balancing, and lets the client operate efficiently when the network rate is much larger or smaller than the media display rate. We analyze the memory requirements for this cache, and analytically derive safe values for its control parameters. Using a realistic case study, we study the interaction between memory size, peak input rate, and disk performance, and show that a relatively modest amount of main memory can support a wide range of scenarios. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/80/CS-TN-98-80.pdf %R CS-TN-98-81 %Z Thu, 23 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T How to build a DLITE component %A Cousins, Steve B. %D July 1998 %X This paper describes how to build a DLITE component. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/81/CS-TN-98-81.pdf %R CS-TN-98-82 %Z Thu, 23 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Minimizing Memory Requirements in Media Servers %A Chang, Edward %A Chen, Yi-Yin %D July 1998 %X Poor memory management policies lead to lower throughput and excessive memory requirements. This problem is aggravated in multimedia databases by the large volume and real-time data requirements. This study explores the temporal and spatial relationships among concurrent media streams. 
Specifically, we propose adding proper delays to space out IOs in a media server to give more room for buffer sharing among streams. Memory requirements can be reduced by trading time for space. We present and prove theorems that state the optimal IO schedules for reducing memory requirements for two cases: streams with the same required display rate and different display rates. We also show how the theorems can be put in practice to improve system performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/82/CS-TN-98-82.pdf %R CS-TN-98-83 %Z Thu, 23 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford DLITE User Study %A Mortensen, Mark %D July 1998 %X User tests were conducted on the DLITE digital workspace. These consisted of observed use of the DLITE system, followed by an interview with the test administrator. The tests themselves were carried out both remotely over a network, and locally in the digital libraries lab on subjects with moderate computer knowledge. Initial tests resulted in system failures that caused DLITE to crash or become totally unusable. In subsequent tests, users noted a number of areas of DLITE that caused confusion, in particular the instantiation of queries, the purpose and functioning of the graphics in the upper-left corner of objects, and the obscuring of objects in the workspace when dragging large components. Given the reactions of users in the post-test interview, these problems do not appear to be flaws in the design of DLITE, but implementation errors not intrinsic to the model upon which the functionality is based. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/83/CS-TN-98-83.pdf %R CS-TN-98-84 %Z Thu, 23 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lessons from Developing Audio HTML Interfaces %A James, Frankie %D July 1998 %X In this paper, we discuss our previous research on the establishment of guidelines and principles for choosing sounds to use in an audio interface to HTML, called the AHA framework. These principles, along with issues related to the target audience such as user tasks, goals, and interests are factors that can help us to choose specific sounds for the interface. We conclude by describing scenarios of two potential users and the interfaces that would seem to be appropriate for them. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/84/CS-TN-98-84.pdf %R CS-TN-98-85 %Z Thu, 23 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Conjunctive Constraint Mapping for Data Translation %A Chang, Chen-Chuan K. %A Garcia-Molina, Hector %D July 1998 %X In this paper we present a mechanism for translating information in heterogeneous digital library environments. We model information as a set of conjunctive constraints that are satisfied by real-world objects (e.g., documents, their metadata). Through application of semantic rules and value transformation functions, constraints are mapped into ones understood and supported in another context. Our machinery can also deal with hierarchically structured information. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/85/CS-TN-98-85.pdf %R CS-TN-98-86 %Z Mon, 14 Sep 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Earth Mover's Distance as a Metric for Image Retrieval %A Rubner, Yossi %A Tomasi, Carlo %A Guibas, Leonidas J. %D September 1998 %X We introduce a metric between two distributions that we call the Earth Mover's Distance (EMD).
The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense. We show that the EMD has attractive properties for content-based image retrieval. The most important one, as we show, is that it matches perceptual similarity better than other distances used for image retrieval. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/86/CS-TN-98-86.pdf %R CS-TN-98-87 %Z Mon, 14 Dec 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Scheduling Algebra %A Glabbeek, R.J. van %A Rittgen, P. %D December 1998 %X The goal of this paper is to develop an algebraic theory of process scheduling. We specify a syntax for denoting processes composed of actions with given durations. Subsequently, we propose axioms for transforming any specification term of a scheduling problem into a term of all valid schedules. Here a schedule is a process in which all (implementational) choices (e.g. precise timing) are resolved. In particular, we axiomatize an operator restricting attention to the efficient schedules. These schedules turn out to be representable as trees, because in an efficient schedule actions start only at time zero or when a resource is released, i.e. upon termination of the action binding a required resource. All further delay would be useless. Nevertheless, we do not consider resource constraints explicitly here. We show that a normal form exists for every term of the algebra and establish soundness of our axiom system with respect to a schedule semantics, as well as completeness for efficient processes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/98/87/CS-TN-98-87.pdf %R CS-TN-99-88 %Z Mon, 12 Jul 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Truth Revelation in Rapid, Approximately Efficient Combinatorial Auctions %A Lehmann, Daniel %A O'Callaghan, Liadan Ita %A Shoham, Yoav %D July 1999 %X Some important classical mechanisms considered in Microeconomics and Game Theory require the solution of a difficult optimization problem. This is true of mechanisms for combinatorial auctions, which have in recent years assumed practical importance, and in particular of the gold standard for combinatorial auctions, the Generalized Vickrey Auction (GVA). Traditional analysis of these mechanisms - in particular, their truth revelation properties - assumes that the optimization problems are solved precisely. In reality, these optimization problems can usually be solved only in an approximate fashion. We investigate the impact on such mechanisms of replacing exact solutions by approximate ones. Specifically, we look at a particular greedy optimization method, which has empirically been shown to perform well. We show that the GVA payment scheme does not provide for a truth revealing mechanism. We introduce another scheme that does guarantee truthfulness for a restricted class of players. 
We demonstrate the latter property by identifying sufficient conditions for a combinatorial auction to be truth-revealing, conditions which have applicability beyond the specific auction studied here. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/99/88/CS-TN-99-88.pdf %R CS-TN-99-89 %Z Wed, 28 Jul 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Extending Greedy Multicast Routing to Delay Sensitive Applications %A Goel, Ashish %A Munagala, Kamesh %D July 1999 %X Given a weighted undirected graph G(V,E) and a subset R of V, a Steiner tree is a subtree of G that contains each vertex in R. We present an online algorithm for finding a Steiner tree that simultaneously approximates the shortest path tree and the minimum weight Steiner tree, when the vertices in the set R are revealed in an online fashion. This problem arises naturally while trying to construct source-based multicast trees of low cost and good delay. The cost of the tree we construct is within an O(log |R|) factor of the optimal cost, and the path length from the root to any terminal is at most O(1) times the shortest path length. The algorithm needs to perform at most one reroute for each node in the tree. Our algorithm extends the results of Khuller et al. and Awerbuch et al., who looked at the offline problem. We conduct extensive simulations to compare the performance of our algorithm (in terms of cost and delay) with that of two popular multicast routing strategies: shortest path trees and the online greedy Steiner tree algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/99/89/CS-TN-99-89.pdf %R CS-TN-99-90 %Z Fri, 20 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Simulation of Iterative Matching for Combined Input and Output Queueing %A Pichai, Srinivasan %A Mudulodu, Sriram %D August 1999 %X Since its introduction, the Stable Marriage problem has been a subject of interest in mathematics and computer science. Recently, the associated matching algorithm has found application in the area of switch scheduling algorithms for high performance switches. Input and output ports of the switch compute their preference lists based on expected departure times for an ideal output queued switch. The stable matching computed by the Gale-Shapley algorithm for this set of preferences determines the configuration of the interconnection fabric. The nature of the stable matching enables the emulation of an output-queued switch with combined input and output queueing using a speedup factor of 2. However, it is important to compute the stable match efficiently for high performance. Hence, parallel iterative versions of the algorithm have been proposed. In this report we investigate the convergence time of the parallel stable matching algorithm. The definition of the preference lists imposes special constraints on the problem and this reduces the worst case complexity of the algorithm. Simulations have shown that convergence time for the average case is also considerably lower than for the general version of the algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/99/90/CS-TN-99-90.pdf %R CS-TN-00-92 %Z Mon, 28 Feb 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Cost-Distance: Two Metric Network Design %A Meyerson, Adam %A Munagala, Kamesh %A Plotkin, Serge %D February 2000 %X We present the Cost-Distance problem: finding a Steiner tree which optimizes the sum of edge costs along one metric and the sum of source-sink distances along an unrelated second metric.
We give the first known O(log k) randomized approximation scheme for Cost-Distance, where k is the number of sources. We reduce many common network design problems to Cost-Distance, obtaining (in some cases) the first known logarithmic approximation for them. These problems include single-sink buy-at-bulk with variable pipe types between different sets of nodes, and facility location with buy-at-bulk type costs on edges. Our algorithm is also the algorithm of choice for several previous network design problems, due to its ease of implementation and fast running time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/92/CS-TN-00-92.pdf %R CS-TN-00-95 %Z Mon, 15 May 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Facility Location with Demand Dependent Costs and Generalized Clustering %A Guha, Sudipto %A Meyerson, Adam %A Munagala, Kamesh %D May 2000 %X We solve a variant of the facility location problem in which the cost of a facility depends on, and more specifically decreases with, the demand served. We show an application of this problem to generalized clustering problems that do not penalize large clusters. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/95/CS-TN-00-95.pdf %R CS-TN-00-96 %Z Mon, 05 Jun 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Improved Combinatorial Algorithms for Single Sink Edge Installation Problems. %A Guha, Sudipto %A Meyerson, Adam %A Munagala, Kamesh %D June 2000 %X We present the first constant approximation to the single sink buy-at-bulk network design problem, where we have to design a network by buying pipes of different costs and capacities per unit length to route demands at a set of sources to a single sink. The distances in the underlying network form a metric. This result improves the previous bound of log |S|, where S is the set of sources. We also present an improved constant approximation to the related Access Network Design problem. Our algorithms are randomized and fully combinatorial. They can be derandomized easily at the cost of a constant factor loss in the approximation ratio. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/96/CS-TN-00-96.pdf %R CS-TN-00-94 %Z Mon, 15 May 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Web Caching using Access Statistics %A Meyerson, Adam %A Munagala, Kamesh %A Plotkin, Serge %D May 2000 %X We present the problem of caching web pages under the assumption that each user has a fixed, known demand vector for the pages. Such demands could be computed using access statistics. We wish to place web pages in the caches in order to optimize the latency from user to page, under the constraints that each cache has limited memory and can support a limited total number of requests. When C caches are present with fixed locations, we present a constant factor approximation to the latency while exceeding capacity constraints by log C. We improve this result to a constant factor provided no replication of web pages is allowed. We present a constant factor approximation where the goal is to minimize the maximum latency. We also consider the case where we can place our own caches in the network for a cost, and produce a constant approximation to the sum of cache cost plus weighted average latency. Finally, we extend our results to incorporate page update latency, temporal variation in request rates, and economies of scale in cache costs.
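The objective that CS-TN-00-94 optimizes is easy to state in code, which also clarifies what the capacity constraints act on; the paper's algorithms themselves (not shown here) are approximation algorithms, and all names and data below are hypothetical:

def placement_cost(demand, latency, assignment):
    # Total demand-weighted latency of a cache placement.
    # demand[u][p]: user u's request rate for page p
    # latency[u][c]: latency from user u to cache c
    # assignment[p]: cache holding page p (no replication, for simplicity)
    return sum(rate * latency[u][assignment[p]]
               for u, pages in demand.items()
               for p, rate in pages.items())

demand = {"u1": {"pageA": 5, "pageB": 1}, "u2": {"pageA": 2}}
latency = {"u1": {"c1": 1, "c2": 4}, "u2": {"c1": 3, "c2": 1}}
print(placement_cost(demand, latency, {"pageA": "c1", "pageB": "c2"}))  # 5 + 4 + 6 = 15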
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/94/CS-TN-00-94.pdf %R CS-TN-00-93 %Z Mon, 15 May 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hierarchical Placement and Network Design Problems. %A Guha, Sudipto %A Meyerson, Adam %A Munagala, Kamesh %D May 2000 %X In this paper, we give the first constant-approximations for a number of layered network design problems. We begin by modeling hierarchical caching, where caches are placed in layers and each layer satisfies a fixed percentage of the demand (bounded miss rates). We present a constant approximation to the minimum total cost of placing the caches and routing demand through the layers. We extend this model to cover more general layered caching scenarios, giving the first constant approximation to the well studied multi-level facility location problem. We consider a facility location variant, the Load Balanced Facility Location problem in which every demand is served by a unique facility and each open facility must serve at least a certain amount of demand. By combining Load Balanced Facility Location with our results on hierarchical caching, we give the first constant approximation for the Access Network Design problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/93/CS-TN-00-93.pdf %R CS-TN-00-97 %Z Fri, 28 Jul 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hardware Support for Tamper-Resistant and Copy-Resistant Software %A Boneh, Dan %A Lie, David %A Lincoln, Pat %A Mitchell, John %A Mitchell, Mark %D July 2000 %X Although there have been many attempts to develop code transformations that yield tamper-resistant software, no reliable software-only methods are known. Motivated by numerous potential applications, we investigate a prototype hardware mechanism that supports software tamper-resistance with an atomic decrypt-and-execute operation. Our hardware architecture uses a novel combination of standard architectural units. As usual, security has its costs. In this design, the most difficult security tradeoffs involve testability and performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/00/97/CS-TN-00-97.pdf %R CS-TR-92-1432 %Z Thu, 28 Oct 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Overview of multidatabase transaction management %A Breitbart, Yuri %A Garcia-Molina, Hector %A Silberschatz, Avi %D October 1993 %X A multidatabase system (MDBS) is a facility that allows users access to data located in multiple autonomous database management systems (DBMSs). In such a system, global transactions are executed under the control of the MDBS. Independently, local transactions are executed under the control of the local DBMSs. Each local DBMS integrated by the MDBS may employ a different transaction management scheme. In addition, each local DBMS has complete control over all transactions (global and local) executing at its site, including the ability to abort at any point any of the transactions executing at its site. Typically, no design or internal DBMS structure changes are allowed in order to accommodate the MDBS. Furthermore, the local DBMSs may not be aware of each other, and, as a consequence, cannot coordinate their actions. Thus, traditional techniques for ensuring transaction atomicity and consistency in homogeneous distributed database systems may not be appropriate for an MDBS environment. 
The objective of this paper is to provide a brief review of the most current work in the area of multidatabase transaction management. We first define the problem and argue that the multidatabase research will become increasingly important in the coming years. We then outline basic research issues in multidatabase transaction management and review recent results in the area. We conclude the paper with a discussion of open problems and practical implications of this research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1432/CS-TR-92-1432.pdf %R CS-TR-92-1431 %Z Fri, 22 Oct 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Aggressive transmissions over redundant paths for time critical messages %A Kao, Ben %A Garcia-Molina, Hector %A Barbara, Daniel %D October 1993 %X Fault tolerant computer systems have redundant paths connecting their components. Given these paths, it is possible to use aggressive techniques to reduce the average value and variability of the response time for critical messages. One technique is to send a copy of a packet over an alternate path before it is known if the first copy failed or was delayed. A second technique is to split a single stream of packets over multiple paths. We analyze both approaches and show that they can provide significant improvements over conventional, conservative mechanisms. (A minimal sketch of the first technique appears after this block of records.) %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1431/CS-TR-92-1431.pdf %R CS-TR-92-1435 %Z Tue, 19 Oct 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lecture notes on approximation algorithms: Volume I %A Motwani, Rajeev %D October 1993 %X These lecture notes are based on the course CS351 (Dept. of Computer Science, Stanford University) offered during the academic year 1991-92. The notes below correspond to the first half of the course. The second half consists of topics such as MAX SNP, cliques, and colorings, as well as more specialized material covering topics such as geometric problems, Steiner trees and multicommodity flows. The second half is being revised to incorporate the implications of recent results in approximation algorithms and the complexity of approximation problems. Please let me know if you would like to be on the mailing list for the second half. Comments, criticisms and corrections are welcome; please send them by electronic mail to rajeev@cs.Stanford.edu. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1435/CS-TR-92-1435.pdf %R CS-TR-92-1426 %Z Wed, 03 Nov 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Proceedings of the ACM SIGPLAN Workshop on Continuations CW92 %A Danvy, Olivier (ed.) %A Talcott, Carolyn (ed.) %D November 1993 %X The notion of continuation is ubiquitous in many different areas of computer science, including logic, constructive mathematics, programming languages, and programming. This workshop aims at providing a forum for discussion of: new results and work in progress; work aimed at a better understanding of the nature of continuations; applications of continuations, and the relation of continuations to other areas of logic and computer science. This technical report serves as informal proceedings for CW92. It consists of submitted manuscripts bound together according to the program order.
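The duplicate-transmission idea of CS-TR-92-1431 above can be sketched as a race between two copies of a packet: send on both paths at once and keep whichever arrives first, never waiting to learn whether one copy failed or was delayed. The delays below are synthetic stand-ins for real network paths:

import asyncio, random

async def send_copy(base_delay):
    # One copy of the packet traversing one path, with random queueing delay.
    await asyncio.sleep(base_delay + random.expovariate(100.0))
    return base_delay

async def aggressive_send(primary, alternate):
    tasks = [asyncio.create_task(send_copy(primary)),
             asyncio.create_task(send_copy(alternate))]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()                # the slower copy is simply discarded
    return done.pop().result()

print(asyncio.run(aggressive_send(0.010, 0.012)))  # base delay of the winning path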
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1426/CS-TR-92-1426.pdf %R CS-TR-92-1423 %Z Fri, 05 Nov 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Time-lapse snapshots %A Dwork, Cynthia %A Herlihy, Maurice %A Plotkin, Serge A. %A Waarts, Orli %D November 1993 %X A snapshot scan algorithm takes an "instantaneous" picture of a region of shared memory that may be updated by concurrent processes. Many complex shared memory algorithms can be greatly simplified by structuring them around the snapshot scan abstraction. Unfortunately, the substantial decrease in conceptual complexity is quite often counterbalanced by an increase in computational complexity. In this paper, we introduce the notion of a weak snapshot scan, a slightly weaker primitive that has a more efficient implementation. We propose the following methodology for using this abstraction: first, design and verify an algorithm using the more powerful snapshot scan, and second, replace the more powerful but less efficient snapshot with the weaker but more efficient snapshot, and show that the weaker abstraction nevertheless suffices to ensure the correctness of the enclosing algorithm. We give two examples of algorithms whose performance can be enhanced while retaining a simple modular structure: bounded concurrent timestamping, and bounded randomized consensus. The resulting timestamping protocol is the fastest known bounded concurrent timestamping protocol. The resulting randomized consensus protocol matches the computational complexity of the best known protocol that uses only bounded values. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1423/CS-TR-92-1423.pdf %R CS-TR-92-1419 %Z Fri, 05 Nov 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast approximation algorithms for fractional packing and covering problems %A Plotkin, Serge A. %A Shmoys, David B. %A Tardos, Eva %D November 1993 %X This paper presents fast algorithms that find approximate solutions for a general class of problems, which we call fractional packing and covering problems. The only previously known algorithms for solving these problems are based on general linear programming techniques. The techniques developed in this paper greatly outperform the general methods in many applications, and are extensions of a method previously applied to find approximate solutions to multicommodity flow problems. Our algorithm is a Lagrangean relaxation technique; an important aspect of our results is that we obtain a theoretical analysis of the running time of a Lagrangean relaxation-based algorithm. We give several applications of our algorithms. The new approach yields several orders of magnitude of improvement over the best previously known running times for the scheduling of unrelated parallel machines in both the preemptive and the non-preemptive models, for the job shop problem, for the cutting-stock problem, and for the minimum cost multicommodity flow problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1419/CS-TR-92-1419.pdf %R CS-TR-92-1401 %Z Mon, 22 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T The performance impact of data reuse in parallel dense Cholesky factorization %A Rothberg, Edward %A Gupta, Anoop %D January 1992 %X This paper explores performance issues for several prominent approaches to parallel dense Cholesky factorization.
The primary focus is on issues that arise when blocking techniques are integrated into parallel factorization approaches to improve data reuse in the memory hierarchy. We first consider panel-oriented approaches, where sets of contiguous columns are manipulated as single units. These methods represent natural extensions of the column-oriented methods that have been widely used previously. On machines with memory hierarchies, panel-oriented methods significantly increase the achieved performance over column-oriented methods. However, we find that panel-oriented methods do not expose enough concurrency for problems that one might reasonably expect to solve on moderately parallel machines, thus significantly limiting their performance. We then explore block-oriented approaches, where square submatrices are manipulated instead of sets of columns. These methods greatly increase the amount of available concurrency, thus alleviating the problems encountered with panel-oriented methods. However, a number of issues, including scheduling choices and block-placement issues, complicate their implementation. We discuss these issues and consider approaches that solve the resulting problems. The resulting block-oriented implementation yields high processor utilization levels over a wide range of problem sizes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1401/CS-TR-92-1401.pdf %R CS-TR-92-1412 %Z Thu, 25 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Toward agent programs with circuit semantics %A Nilsson, Nils J. %D January 1992 %X New ideas are presented for computing and organizing actions for autonomous agents in dynamic environments - environments in which the agent's current situation cannot always be accurately discerned and in which the effects of actions cannot always be reliably predicted. The notion of "circuit semantics" for programs based on "teleo-reactive trees" is introduced. Program execution builds a combinational circuit which receives sensory inputs and controls actions. These formalisms embody a high degree of inherent conditionality and thus yield programs that are suitably reactive to their environments. At the same time, the actions computed by the programs are guided by the overall goals of the agent. The paper also speculates about how programs using these ideas could be automatically generated by artificial intelligence planning systems and adapted by learning methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1412/CS-TR-92-1412.pdf %R CS-TR-92-1441 %Z Sun, 28 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Motion planning in stereotaxic radiosurgery %A Schweikard, Achim %A Adler, John R. %A Latombe, Jean-Claude %D September 1992 %X Stereotaxic radiosurgery is a procedure which uses a beam of radiation as an ablative surgical instrument to destroy brain tumors. The beam is produced by a linear accelerator which is moved by a jointed mechanism. Radiation is concentrated by crossfiring at the tumor from multiple directions and the amount of energy deposited in normal brain tissues is reduced. Because access to the tumor is obstructed along some directions by critical regions (e.g., brainstem, optic nerves) and most tumors are not shaped like spheres, planning the path of the beam is often difficult and time-consuming.
This paper describes a computer-based planner developed to assist the surgeon in generating a satisfactory path, given the spatial distribution of the brain tissues obtained with medical imaging. Experimental results with the implemented planner are presented, including a comparison with manually generated paths. According to these results, automatic planning significantly improves energy deposition. It can also shorten the overall treatment, hence reducing the patient's pain and allowing the radiosurgery equipment to be used for more patients. Stereotaxic radiosurgery is an example of so-called "bloodless surgery". Computer-based planning techniques are expected to facilitate further development of this safer, less painful, and more cost-effective type of surgery. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1441/CS-TR-92-1441.pdf %R CS-TR-92-1446 %Z Sun, 28 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Independent updates and incremental agreement in replicated databases %A Ceri, Stefano %A Houtsma, Maurice A. W. %A Keller, Arthur M. %A Samarati, Pierangela %D October 1992 %X Update propagation and transaction atomicity are major obstacles to the development of replicated databases. Many practical applications, such as automated teller machine (ATM) networks, flight reservation, and part inventory control, do not really require these properties. In this paper we present an approach for incrementally updating a distributed, replicated database without requiring multi-site atomic commit protocols. We prove that the mechanism is correct, as it asymptotically performs all the updates on all the copies. Our approach has two important characteristics: it is progressive and non-blocking. Progressive means that the transaction's coordinator always commits, possibly together with a group of other sites. The update is later propagated asynchronously to the remaining sites. Non-blocking means that each site can take unilateral decisions at each step of the algorithm. Sites which cannot commit updates are brought to the same final state by means of a reconciliation mechanism. This mechanism uses the history logs, which are stored locally at each site, to bring sites to agreement. It requires a small auxiliary data structure, called a reception vector, to keep track of the time until which the other sites are guaranteed to be up-to-date. Several optimizations to the basic mechanism are also discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1446/CS-TR-92-1446.pdf %R CS-TR-92-1452 %Z Sun, 28 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Deadline assignment in a distributed soft real-time system %A Kao, Ben %A Garcia-Molina, Hector %D October 1992 %X In a distributed environment, tasks often have processing demands on multiple different sites. A distributed task is usually divided up into several subtasks, each one to be executed at some site in order. In a real-time system, an overall deadline is usually specified by an application designer indicating when a distributed task is to be finished. However, the problem of how a global deadline is automatically translated to the deadline of each individual subtask has not been well studied. This paper examines (through simulations) four strategies for subtask deadline assignment in a distributed soft real-time environment.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/92/1452/CS-TR-92-1452.pdf %R CS-TR-93-1491 %Z Tue, 19 Oct 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Subtask Deadline Assignment for Complex Distributed Soft Real-Time Tasks %A Kao, Ben %A Garcia-Molina, Hector %D October 1993 %X Complex distributed tasks often involve parallel execution of subtasks at different nodes. To meet the deadline of a global task, all of its parallel subtasks have to be finished on time. Compared to a local task (which involves execution at only one node), a global task may have a much harder time making its deadline because it is fairly likely that at least one of its subtasks runs into an overloaded node. Another problem with complex distributed tasks occurs when a global task consists of a number of serially executing subtasks. In this case, we have the problem of dividing up the end-to-end deadline of the global task and assigning the pieces to the intermediate subtasks. In this paper, we study both of these problems. Different algorithms for assigning deadlines to subtasks are presented and evaluated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/93/1491/CS-TR-93-1491.pdf %R CS-TR-93-1494 %Z Wed, 08 Dec 93 00:00:00 GMT %I Stanford University, Department of Computer Science %T Index Structures for Information Filtering Under the Vector Space Model %A Yan, Tak W. %A Garcia-Molina, Hector %D December 1993 %X With the ever-increasing volume of information generated, users of information systems are facing an information overload. It is desirable to support information filtering as a complement to traditional retrieval mechanisms. The number of users, and thus profiles (representing users' long-term interests), handled by an information filtering system is potentially huge, and the system has to process a constant stream of incoming information in a timely fashion. The efficiency of the filtering process is thus an important issue. In this paper, we study what data structures and algorithms can be used to efficiently perform large-scale information filtering under the vector space model, a retrieval model established as being effective. We apply the idea of the standard inverted index to index user profiles. We devise an alternative to the standard inverted index, in which, instead of indexing every term in a profile, we select only the significant ones to index. We evaluate their performance and show that the indexing methods require orders of magnitude fewer I/Os to process a document than when no index is used. We also show that the proposed alternative performs better in terms of I/O and CPU processing time in many cases. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/93/1494/CS-TR-93-1494.pdf %R CS-TR-93-1499 %Z Thu, 27 Jan 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Sandwich Theorem %A Knuth, Donald E. %D January 1994 %X This report contains expository notes about a function vartheta(G) that is popularly known as the Lovasz number of a graph G. There are many ways to define vartheta(G), and the surprising variety of different characterizations indicates in itself that vartheta(G) should be interesting. But the most interesting property of vartheta(G) is probably the fact that it can be computed efficiently, although it lies "sandwiched" between other classic graph numbers whose computation is NP-hard.
I have tried to make these notes self-contained so that they might serve as an elementary introduction to the growing literature on Lovasz's fascinating function. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/93/1499/CS-TR-93-1499.pdf %R CS-TR-94-1500 %Z Thu, 27 Jan 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T 1993 Publications Summary for the Stanford Database Group %A Siroker, Marianne %D January 1994 %X This Technical Report contains the first page of papers written by members of the Stanford Database Group during 1993. Readers interested in the full papers can fetch electronic copies via FTP. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1500/CS-TR-94-1500.pdf %R CS-TR-94-1501 %Z Mon, 28 Feb 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Deriving Properties of Belief Update from Theories of Action %A Val, Alvaro del %A Shoham, Yoav %D February 1994 %X We present an approach to database update as a form of nonmonotonic temporal reasoning, the main idea of which is the (circumscriptive) minimization of changes with respect to a set of facts declared ``persistent by default.'' The focus of the paper is on the relation between this approach and the update semantics recently proposed by Katsuno and Mendelzon. Our contribution in this regard is twofold: - We prove a representation theorem for KM semantics in terms of a restricted subfamily of the operators defined by our construction. - We show how the KM semantics can be generalized by relaxing our construction in a number of ways, each justified in certain intuitive circumstances and each corresponding to one specific postulate. It follows that there are reasonable update operators outside the KM family. Our approach is not dependent for its plausibility on this connection with KM semantics. Rather, it provides a relatively rich and flexible framework in which the frame and ramification problems can be solved in a systematic way by reasoning about default persistence of facts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1501/CS-TR-94-1501.pdf %R CS-TR-94-1502 %Z Mon, 28 Feb 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Natural Language Parsing as Statistical Pattern Recognition %A Magerman, David M. %D February 1994 %X Traditional natural language parsers are based on rewrite rule systems developed in an arduous, time-consuming manner by grammarians. A majority of the grammarian's efforts are devoted to the disambiguation process, first hypothesizing rules which dictate constituent categories and relationships among words in ambiguous sentences, and then seeking exceptions and corrections to these rules. In this work, I propose an automatic method for acquiring a statistical parser from a set of parsed sentences which takes advantage of some initial linguistic input, but avoids the pitfalls of the iterative and seemingly endless grammar development process. Based on distributionally-derived and linguistically-based features of language, this parser acquires a set of statistical decision trees which assign a probability distribution on the space of parse trees given the input sentence. By basing the disambiguation criteria selection on entropy reduction rather than human intuition, this parser development method is able to consider more sentences than a human grammarian can when making individual disambiguation rules.
In experiments, the decision tree parser significantly outperforms a grammarian's rule-based parser, achieving an accuracy rate of 78% compared to the rule-based parser's 69%. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1502/CS-TR-94-1502.pdf %R CS-TR-94-1503 %Z Mon, 28 Feb 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Deciding whether to plan to react %A Dabija, Vlad G. %D February 1994 %X Intelligent agents that operate in real-world real-time environments have limited resources. An agent must take these limitations into account when deciding which of two control modes - planning versus reaction - should control its behavior in a given situation. The main goal of this thesis is to develop a framework that allows a resource-bounded agent to decide at planning time which control mode to adopt for anticipated possible run-time contingencies. Using our framework, the agent: (a) analyzes a complete (conditional) plan for achieving a particular goal; (b) decides which of the anticipated contingencies require and allow for preparation of reactive responses at planning time; and (c) enhances the plan with prepared reactions for critical contingencies, while maintaining the size of the plan, the planning and response times, and the use of all other critical resources of the agent within task-specific limits. For a given contingency, the decision to plan or react is based on the characteristics of the contingency, the associated reactive response, and the situation itself. Contingencies that may occur in the same situation compete for reactive response preparation because of the agent's limited resources. The thesis also proposes a knowledge representation formalism to facilitate the acquisition and maintenance of knowledge involved in this decision process. We also show how the proposed framework can be adapted for the problem of deciding, for a given contingency, whether to prepare a special branch in the conditional plan under development or to leave the contingency for opportunistic treatment at execution time. We make a theoretical analysis of the properties of our framework and then demonstrate them experimentally. We also show experimentally that this framework can simulate several different styles of human reactive behaviors described in the literature and, therefore, can be useful as a basis for describing and contrasting such behaviors. Finally we demonstrate that the framework can be applied in a challenging real domain. That is: (a) the knowledge and data needed for the decision making within our framework exist and can be acquired from experts, and (b) the behavior of an agent that uses our framework improves according to response time, reliability and resource utilization criteria. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1503/CS-TR-94-1503.pdf %R CS-TR-94-1504 %Z Mon, 28 Feb 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Algebraic Approach to Rule Analysis in Expert Database Systems %A Baralis, Elena %A Widom, Jennifer %D February 1994 %X Expert database systems extend the functionality of conventional database systems by providing a facility for creating and automatically executing Condition-Action rules. While Condition-Action rules in database systems are very powerful, they also can be very difficult to program, due to the unstructured and unpredictable nature of rule processing. 
We provide methods for static analysis of Condition-Action rules; our methods determine whether a given rule set is guaranteed to terminate, and whether rule execution is confluent (has a guaranteed unique final state). Our methods are based on previous methods for analyzing rules in active database systems. We improve considerably on the previous methods by providing analysis criteria that are much less conservative: our methods often determine that a rule set will terminate or is confluent when previous methods could not. Our improved analysis is based on a ``propagation'' algorithm, which uses a formal approach based on an extended relational algebra to accurately determine when the action of one rule can affect the condition of another. Our algebraic approach yields methods that are applicable to a broad class of expert database rule languages. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1504/CS-TR-94-1504.pdf %R CS-TR-94-1505 %Z Fri, 04 Mar 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using a Position History-Based Protocol for Distributed Object Visualization %A Singhal, Sandeep K. %A Cheriton, David R. %D February 1994 %X Users of distributed virtual reality applications interact with users located across the network. Similarly, distributed object visualization systems store dynamic data at one host and render it in real-time at other hosts. Because data in both systems is animated and exhibits unpredictable behavior, providing up-to-date information about remote objects is expensive. Remote hosts must instead apply extrapolation between successive update packets to render the object's true animated behavior. This paper describes and analyzes a ``position history-based'' protocol in which hosts apply several recent position updates to track the position of remote objects. The history-based approach offers smooth, accurate visualizations of remote objects while providing a scalable solution. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1505/CS-TR-94-1505.pdf %R CS-TR-94-1506 %Z Mon, 28 Feb 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimized Memory-Based Messaging: Leveraging the Memory System for High-Performance Communication %A Cheriton, David R. %A Kutter, Robert A. %D February 1994 %X Memory-based messaging, passing messages between programs using shared memory, is a recognized technique for efficient communication that takes advantage of memory system performance. However, the conventional operating system support for this approach is inefficient, especially for large-scale multiprocessor interconnects, and is too complex to effectively support in hardware. This paper describes hardware and software optimizations for memory-based messaging that efficiently exploit the mechanisms of the memory system to provide superior communication performance. We describe the overall model of optimized memory-based messaging, its implementation in an operating system kernel and hardware support for this approach in a scalable multiprocessor architecture. The optimizations include address-valued signals, message-oriented memory consistency and automatic signaling on write. Performance evaluations show these extensions provide a three-to-five-fold improvement in communication performance over a comparable software-only implementation. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1506/CS-TR-94-1506.pdf %R CS-TR-94-1507 %Z Wed, 02 Mar 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography Department of Computer Science Technical Reports, 1963-1993 %A Mashack, Thea %D March 1994 %X This Bibliography lists all the reports published by the Department of Computer Science from 1963 through 1993. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1507/CS-TR-94-1507.pdf %R CS-TR-94-1508 %Z Tue, 22 Mar 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Inverse Kinematics of a Human Arm %A Kondo, Koichi %D March 1994 %X This paper describes a new inverse kinematics algorithm for a human arm. Potential applications of this algorithm include computer-aided design and concurrent engineering from the viewpoint of human factors. For example, it may be used to evaluate a new design in terms of its usability and to automatically generate instruction videos. The inverse kinematics algorithm is based on a sensorimotor transformation model developed in recent neurophysiological experiments. This method can be applied to both static arm postures and human manipulation motions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1508/CS-TR-94-1508.pdf %R CS-TR-94-1509 %Z Tue, 22 Mar 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Global Price Updates Help %A Goldberg, Andrew V. %A Kennedy, Robert %D March 1994 %X Periodic global updates of dual variables have been shown to yield a substantial speed advantage in implementations of push-relabel algorithms for the maximum flow and minimum cost flow problems. In this paper, we show that in the context of the bipartite matching and assignment problems, global updates yield a theoretical improvement as well. For bipartite matching, a push-relabel algorithm that matches the best known bound when global updates are used achieves a bound worse by a factor of the square root of n without them. A similar result holds for the assignment problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1509/CS-TR-94-1509.pdf %R CS-TR-94-1510 %Z Tue, 22 Mar 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Key Objects in Garbage Collection %A Hayes, Barry %D March 1994 %X When the cost of global garbage collection in a system grows large, the system can be redesigned to use generational collection. The newly-created objects usually have a much shorter half-life than average, and by concentrating the collector's efforts on them a large fraction of the garbage can be collected at a tiny fraction of the cost. The objects that survive generational collection may still become garbage, and the current practice is to perform occasional global garbage collections to purge these objects from the system, and again, the cost of doing these collections may become prohibitive when the volume of memory increases. Previous research has noted that the objects that survive generational collection often are born, promoted, and collected in large clusters. In this dissertation I show that carefully selected semantically or structurally important key objects can be drawn from the clusters and collected separately; when a key object becomes unreachable, the collector can take this as a hint to collect the cluster from which the key was drawn. To gauge the effectiveness of key objects, their use was simulated in ParcPlace's Objectworks\Smalltalk system.
The objects selected as keys were those that, as young objects, had pointers to them stored into old objects. The collector attempts to create a cluster for each key by gathering together all of the objects reachable from that key and from no previous key. Using this simple heuristic for key objects, the collector finds between 41% and 92% of the clustered garbage in a suite of simple test programs. Except for one program in the suite, about 95% of the time these key objects direct the collector to a cluster that is garbage. The exception should be heeded in improving the heuristics. In a replay of an interactive session, key object collection finds 59% of the clustered garbage and 66% of the suggested targets are indeed garbage. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1510/CS-TR-94-1510.pdf %R CS-TR-94-1511 %Z Tue, 19 Apr 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Co-Learning and the Evolution of Social Activity %A Shoham, Yoav %A Tennenholtz, Moshe %D April 1994 %X We introduce the notion of co-learning, which refers to a process in which several agents simultaneously try to adapt to one another's behavior so as to produce desirable global system properties. Of particular interest are two specific co-learning settings, which relate to the emergence of conventions and the evolution of cooperation in societies, respectively. We define a basic co-learning rule, called Highest Cumulative Reward (HCR), and show that it gives rise to quite nontrivial system dynamics. In general, we are interested in the eventual convergence of the co-learning system to desirable states, as well as in the efficiency with which this convergence is attained. Our results on eventual convergence are analytic; the results on efficiency properties include analytic lower bounds as well as empirical upper bounds derived from rigorous computer simulations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1511/CS-TR-94-1511.pdf %R CS-TR-94-1512 %Z Tue, 19 Apr 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Abstraction Planning in Real Time %A Washington, Richard %D April 1994 %X When a planning agent works in a complex, real-world domain, it is unable to plan for and store all possible contingencies and problem situations ahead of time. The agent needs to be able to fall back on an ability to construct plans at run time under time constraints. This thesis presents a method for planning at run time that incrementally builds up plans at multiple levels of abstraction. The plans are continually updated by information from the world, allowing the planner to adjust its plan to a changing world during the planning process. All the information is represented over intervals of time, allowing the planner to reason about durations, deadlines, and delays within its plan. In addition to the method, the thesis presents a formal model of the planning process and uses the model to investigate planning strategies. The method has been implemented, and experiments have been run to validate the overall approach and the theoretical model. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1512/CS-TR-94-1512.pdf %R CS-TR-94-1513 %Z Tue, 03 May 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Construction of Normative Decision Models Using Abstract Graph Grammars %A Egar, John W. %D May 1994 %X This dissertation addresses automated assistance for decision analysis in medicine.
In particular, I have investigated graph grammars as a representation for encoding how decision-theoretic models can be constructed from an unordered list of concerns. The modeling system that I have used requires a standard vocabulary to generate decision models; the models generated are qualitative, and require subsequent assessment of probabilities and utility values. This research has focused on the modeling of the qualitative structure of problems, given a standard vocabulary and given that subsequent assessment of probabilities and utilities is possible. The usefulness of the graph-grammar representation depends on the graph-grammar formalism's ability to describe a broad spectrum of qualitative decision models, on its ability to maintain a high quality in the models it generates, and on its clarity in describing topological constraints to researchers who design and maintain the actual grammar. I have found that graph grammars can be used to automatically generate decision models that are comparable to those produced by decision analysts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1513/CS-TR-94-1513.pdf %R CS-TR-94-1514 %Z Wed, 11 May 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Load Balancing Using Time Series Analysis for Soft Real Time Systems with Statistically Periodic Loads %A Hailperin, Max %D May 1994 %X This thesis provides design and analysis of techniques for global load balancing on ensemble architectures running soft-real-time object-oriented applications with statistically periodic loads. It focuses on estimating the instantaneous average load over all the processing elements. The major contribution is the use of explicit stochastic process models for both the loading and the averaging itself. These models are exploited via statistical time-series analysis and Bayesian inference to provide improved average load estimates, and thus to facilitate global load balancing. This thesis explains the distributed algorithms used and provides some optimality results. It also describes the algorithms' implementation and gives performance results from simulation. These results show that our techniques allow more accurate estimation of the global system loading, resulting in fewer object migrations than local methods. Our method is shown to provide superior performance, relative not only to static load-balancing schemes but also to many adaptive load-balancing methods. Results from a preliminary analysis of another system and from simulation with a synthetic load provide some evidence of more general applicability. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1514/CS-TR-94-1514.pdf %R CS-TR-94-1515 %Z Mon, 06 Jun 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Retrieving Semantically Distant Analogies %A Wolverton, Michael %D June 1994 %X Techniques that have traditionally been useful for retrieving same-domain analogies from small single-use knowledge bases, such as spreading activation and indexing on selected features, are inadequate for retrieving cross-domain analogies from large multi-use knowledge bases. Blind or near-blind search techniques like spreading activation will be overwhelmed by combinatorial explosion as the search goes deeper into the KB. And indexing a large multi-use KB on salient features is impractical, largely because a feature that may be useful for retrieval in one task may be useless for another task.
This thesis describes Knowledge-Directed Spreading Activation (KDSA), a method for retrieving analogies in a large semantic network. KDSA uses task-specific knowledge to guide a spreading activation search to a case or concept in memory that meets a desired similarity condition. The thesis also describes a specific instantiation of this method for the task of innovative design. KDSA has been validated in two ways. First, a theoretical model of knowledge base search demonstrates that KDSA is tractable for retrieving semantically distant analogies under a wide range of knowledge base configurations. Second, an implemented system that uses KDSA to find analogies for innovative design shows that the method is able to retrieve semantically distant analogies for a real task. Experiments with that system show trends as the knowledge base size grows that suggest the theoretical model's prediction of large knowledge base tractability is accurate. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1515/CS-TR-94-1515.pdf %R CS-TR-94-1516 %Z Mon, 08 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Framework for Reasoning Precisely with Vague Concepts %A Goyal, Nita %D May 1994 %X Many knowledge-based systems need to represent vague concepts such as ``old'' and ``tall''. The practical approach of representing vague concepts as precise intervals over numbers (e.g., ``old'' as the interval [70,110]) is well-accepted in Artificial Intelligence. However, there have been no systematic procedures, but only ad hoc methods, to delimit the boundaries of intervals representing the vague predicates. A key observation is that the vague concepts and their interval boundaries are constrained by the underlying domain knowledge. Therefore, any systematic approach to assigning interval boundaries must take the domain knowledge into account. Hence, in the dissertation, we present a framework to represent the domain knowledge and exploit it to reason about the interval boundaries via a query language. This framework comprises a constraint language to represent logical constraints on vague concepts, as well as numerical constraints on the interval boundaries; a query language to request information about the interval boundaries; and an algorithm to answer the queries. The algorithm preprocesses the constraints by extracting the numerical information from the logical constraints and combines it with the given numerical constraints. We have implemented the framework and applied it to the medical domain to illustrate its usefulness. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1516/CS-TR-94-1516.pdf %R CS-TR-94-1517 %Z Tue, 09 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reactive, Generative and Stratified Models of Probabilistic Processes %A Glabbeek, Rob J. van %A Smolka, Scott A. %A Steffen, Bernhard %D July 1994 %X We introduce three models of probabilistic processes, namely, reactive, generative and stratified. These models are investigated within the context of PCCS, an extension of Milner's SCCS in which each summand of a process summation expression is guarded by a probability and the sum of these probabilities is 1. For each model we present a structural operational semantics of PCCS and a notion of bisimulation equivalence which we prove to be a congruence.
We also show that the models form a hierarchy: the reactive model is derivable from the generative model by abstraction from the relative probabilities of different actions, and the generative model is derivable from the stratified model by abstraction from the purely probabilistic branching structure. Moreover, the classical nonprobabilistic model is derivable from each of these models by abstraction from all probabilities. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1517/CS-TR-94-1517.pdf %R CS-TR-94-1518 %Z Fri, 12 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T STeP: The Stanford Temporal Prover %A Manna, Zohar %A Anuchitanukul, Anuchit %A Bjorner, Nikolaj %A Browne, Anca %A Chang, Edward %A Colon, Michael %A de Alfaro, Luca %A Devarajan, Harish %A Sipma, Henny %A Uribe, Tomas %D June 1994 %X We describe the Stanford Temporal Prover (STeP), a system being developed to support the computer-aided formal verification of concurrent and reactive systems based on temporal specifications. Unlike systems based on model-checking, STeP is not restricted to finite-state systems. It combines model checking and deductive methods to allow the verification of a broad class of systems, including programs with infinite data domains, N-process programs, and N-component circuit designs, for arbitrary N. In short, STeP has been designed with the objective of combining the expressiveness of deductive methods with the simplicity of model checking. The verification process is for the most part automatic. User interaction occurs mostly at the highest, most intuitive level, primarily through a graphical proof language of verification diagrams. Efficient simplification methods, decision procedures, and invariant generation techniques are then invoked automatically to prove resulting first-order verification conditions with minimal assistance. We describe the performance of the system when applied to several examples, including the N-process dining philosophers program, Szymanski's N-process mutual exclusion algorithm, and a distributed N-way arbiter circuit. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1518/CS-TR-94-1518.pdf %R CS-TR-94-1519 %Z Fri, 26 Aug 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Probabilistic Roadmaps for Path Planning in High-Dimensional Configuration Spaces %A Kavraki, Lydia %A Svestka, Petr %A Latombe, Jean-Claude %A Overmars, Mark %D August 1994 %X A new motion planning method for robots in static workspaces is presented. This method proceeds according to two phases: a learning phase and a query phase. In the learning phase, a probabilistic roadmap is constructed and stored as a graph whose nodes correspond to collision-free configurations and edges to feasible paths between these configurations. These paths are computed using a simple and fast local planner. In the query phase, any given start and goal configurations of the robot are connected to two nodes of the roadmap; the roadmap is then searched for a path joining these two nodes. The method is general and easy to implement. It can be applied to virtually any type of holonomic robot. It requires selecting certain parameters (e.g., the duration of the learning phase) whose values depend on the considered scenes, that is, the robots and their workspaces. But these values turn out to be relatively easy to choose. Increased efficiency can also be achieved by tailoring some components of the method (e.g., the local planner) to the considered robots.
In this paper, the method is applied to planar articulated robots with many degrees of freedom. Experimental results show that path planning can be done in a fraction of a second on a contemporary workstation (approximately 150 MIPS), after learning for relatively short periods of time (a few dozen seconds). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1519/CS-TR-94-1519.pdf %R CS-TR-94-1520 %Z Thu, 08 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Adaptive Optimization for SELF: Reconciling High Performance with Exploratory Programming %A Holzle, Urs %D August 1994 %X Crossing abstraction boundaries often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction, while desirable from a design standpoint, may lead to very inefficient programs. Aggressively optimizing compilers can reduce this overhead but conflict with interactive programming environments because they introduce long compilation pauses and often preclude source-level debugging. Thus, programmers are caught on the horns of two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals. Four new techniques work together to achieve this: - Type feedback achieves high performance by allowing the compiler to inline message sends based on information extracted from the runtime system. - Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast compiler to generate initial code while automatically recompiling heavily used program parts with an optimizing compiler. - Dynamic deoptimization allows source-level debugging of optimized code by transparently recreating non-optimized code as needed. - Polymorphic inline caching speeds up message dispatch and, more significantly, collects concrete type information for the compiler. With better performance yet good interactive behavior, these techniques reconcile exploratory programming, ubiquitous abstraction, and high performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1520/CS-TR-94-1520.pdf %R CS-TR-94-1521 %Z Thu, 08 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Chu Spaces: A Model for Concurrency %A Gupta, Vineet %D August 1994 %X A Chu space is a binary relation between two sets. In this thesis we show that Chu spaces form a non-interleaving model of concurrency which extends event structures while endowing them with an algebraic structure whose natural logic is linear logic. We provide several equivalent definitions of Chu spaces, including two pictorial representations. Chu spaces represent processes as automata or schedules, and Chu duality gives a simple way of converting between schedules and automata. We show that Chu spaces can represent various concurrency concepts like conflict, temporal precedence and internal and external choice, and they distinguish between causing and enabling events. We present a process algebra for Chu spaces including the standard combinators like parallel composition, sequential composition, choice, interaction, restriction, and show that the various operational identities between these hold for Chu spaces. The solution of recursive domain equations is possible for most of these operations, giving us an expressive specification and programming language.
We define a history preserving equivalence between Chu spaces, and show that it preserves the causal structure of a process. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1521/CS-TR-94-1521.pdf %R CS-TR-94-1523 %Z Thu, 08 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Implementing Push-Relabel Method for the Maximum Flow Problem %A Cherkassky, Boris V. %A Goldberg, Andrew V. %D September 1994 %X We study efficient implementations of the push-relabel method for the maximum flow problem. The resulting codes are faster than the previous codes, and much faster on some problem families. The speedup is due to the combination of heuristics used in our implementation. We also exhibit a family of problems for which all known methods seem to have almost quadratic time growth rate. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1523/CS-TR-94-1523.pdf %R CS-TR-94-1524 %Z Thu, 08 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Continuous Verification by Discrete Reasoning %A de Alfaro, Luca %A Manna, Zohar %D September 1994 %X Two semantics are commonly used for the behavior of real-time and hybrid systems: a discrete semantics, in which the temporal evolution is represented as a sequence of snapshots describing the state of the system at certain times, and a continuous semantics, in which the temporal evolution is represented by a series of time intervals, and therefore corresponds more closely to the physical reality. Powerful verification rules are known for temporal logic formulas based on the discrete semantics. This paper shows how to transfer the verification techniques of the discrete semantics to the continuous one. We show that if a temporal logic formula has the property of finite variability, its validity in the discrete semantics implies its validity in the continuous one. This leads to a verification method based on three components: verification rules for the discrete semantics, axioms about time, and some temporal reasoning to bring the results together. This approach enables the verification of properties of real-time and hybrid systems with respect to the continuous semantics. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1524/CS-TR-94-1524.pdf %R CS-TR-94-1525 %Z Mon, 12 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Differential BDDs %A Anuchitanukul, Anuchit %A Manna, Zohar %D September 1994 %X In this paper, we introduce a class of Binary Decision Diagrams (BDDs) which we call Differential BDDs (DBDDs), and two transformations over DBDDs, called Push-up and Delta transformations. In DBDDs and their derived classes, such as Push-up DBDDs or Delta DBDDs, in addition to the ordinary node-sharing in normal Ordered Binary Decision Diagrams (OBDDs), some isomorphic substructures are collapsed together, forming an even more compact representation of boolean functions. The elimination of isomorphic substructures coincides with the repetitive occurrences of the same or similar small components in many applications of BDDs, such as in the representation of hardware circuits. The reduction in the number of nodes, from OBDDs to DBDDs, is potentially exponential, while boolean manipulations on DBDDs remain efficient.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1525/CS-TR-94-1525.pdf %R CS-TR-94-1526 %Z Fri, 16 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Combining Experiential and Theoretical Knowledge in the Domain of Semiconductor Manufacturing %A Mohammed, John Llewelyn %D September 1994 %X Semiconductor Manufacturing is characterized by complexity and continual, rapid change. These characteristics reduce the effectiveness of traditional diagnostic expert systems: the knowledge represented cannot adapt to changes in the manufacturing plan because the dependence of the knowledge on the plan is not explicitly represented. It is impractical to manually encode all the dependencies in a complex plan. We address this problem in two ways. First, we employ model-based techniques to encode theoretical knowledge, so that symbolic simulation of a new manufacturing plan can automatically glean diagnostic information. Our representation is sufficiently detailed to capture the plan's inherent causal dependencies, yet sufficiently abstract to make symbolic simulation practical. This theoretical knowledge can adapt to changes in the manufacturing plan. However, the expressiveness and tractability of our representational machinery limit the range of phenomena that we can represent. Second, we describe Generic Rules, which combine the expressiveness of heuristic rules with the robustness of theoretical models. Generic Rules are general patterns for heuristic rules, associated with model-based restrictions on the situations in which the patterns can be instantiated to form rules for new contexts. In this way, theoretical knowledge is employed to encode the dependence of heuristic knowledge on the manufacturing plan. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1526/CS-TR-94-1526.pdf %R CS-TR-94-1527 %Z Wed, 12 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T From Knowledge to Belief %A Koller, Daphne %D October 1994 %X When acting in the real world, an intelligent agent must make decisions under uncertainty. The standard solution requires it to assign degrees of belief to the relevant assertions. These should be based on the agent's knowledge. For example, a doctor deciding on the treatment for a patient should use information about that patient, statistical correlations between symptoms and diseases, default rules, and more. The random-worlds method induces degrees of belief from very rich knowledge bases, expressed in a language that augments first-order logic with statistical statements and default rules (interpreted as qualitative statistics). The method is based on the principle of indifference, treating all possible worlds as equally likely. It naturally derives important patterns of reasoning such as specificity, inheritance, indifference to irrelevant information, and a default assumption of independence. Its expressive power and intuitive semantics allow it to deal well with examples that are too complex for most other reasoning systems. We use techniques from finite model theory to analyze the computational aspects of random worlds. The problem of computing degrees of belief is undecidable in general. However, for unary knowledge bases, a tight connection to the principle of maximum entropy often allows us to compute degrees of belief. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1527/CS-TR-94-1527.pdf %R CS-TR-94-1528 %Z Thu, 27 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Architecture-Altering Operations for Evolving the Architecture of a Multi-Part Program in Genetic Programming %A Koza, John R. %D October 1994 %X Previous work described a way to evolutionarily select the architecture of a multi-part computer program from among preexisting alternatives in the population while concurrently solving a problem during a run of genetic programming. This report describes six new architecture-altering operations that provide a way to evolve the architecture of a multi-part program in the sense of actually changing the architecture of programs dynamically during the run. The new architecture-altering operations are motivated by the naturally occurring operation of gene duplication as described in Susumu Ohno's provocative 1970 book Evolution by Means of Gene Duplication as well as the naturally occurring operation of gene deletion. The six new architecture-altering operations are branch duplication, argument duplication, branch creation, argument creation, branch deletion and argument deletion. A connection is made between genetic programming and other techniques of automated problem solving by interpreting the architecture-altering operations as providing an automated way to specialize and generalize programs. The report demonstrates that a hierarchical architecture can be evolved to solve an illustrative symbolic regression problem using the architecture-altering operations. Future work will study the amount of additional computational effort required to employ the architecture-altering operations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1528/CS-TR-94-1528.pdf %R CS-TR-94-1529 %Z Mon, 31 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A knowledge-based method for temporal abstraction of clinical data %A Shahar, Yuval %D October 1994 %X This dissertation describes a domain-independent method specific to the task of abstracting higher-level concepts from time-stamped data. The framework includes a model of time, parameters, events and contexts. I applied my framework to several domains of medicine. My goal is to create, from time-stamped patient data, interval-based temporal abstractions such as "severe anemia for 3 weeks in the context of administering AZT." The knowledge-based temporal-abstraction method decomposes the task of abstracting higher-level abstractions from input data into five subtasks. These subtasks are solved by five domain-independent temporal-abstraction mechanisms. The temporal-abstraction mechanisms depend on four domain-specific knowledge types. I implemented the knowledge-based temporal-abstraction method in the RESUME system. RESUME accepts input and returns output at all levels of abstraction; accepts input out of temporal order, modifying a view of the past or of the present, as necessary; generates context-sensitive, controlled output; and maintains several possible concurrent interpretations of the data. I evaluated RESUME in the domains of protocol-based care, monitoring of children's growth, and therapy of diabetes. A formal specification of a domain's temporal-abstraction knowledge supports acquisition, maintenance, reuse, and sharing of that knowledge.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1529/CS-TR-94-1529.pdf %R CS-TR-94-1530 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Computing Multi-Arm Manipulation Trajectories %A Koga, Yoshihito %D October 1994 %X This dissertation considers the manipulation task planning problem of automatically generating the trajectories for several cooperating robot arms to manipulate a movable object to a goal location among obstacles. The planner must reason that the robots may need to change their grasp of the object to complete the task, for example, by passing it from one arm to another. Furthermore, the computed velocities and accelerations of the arms must satisfy the limits of the actuators. Past work strongly suggests that solving this problem in a rigorous fashion is intractable. We address this problem in a practical two-phase approach. In step one, using a heuristic, we compute a collision-free path for the robots and the movable object. For the case of multiple robot arms with many degrees of freedom, this step may fail to find the desired path even though it exists. Despite this limitation, experimental results of the implemented planner (for solving step one) show that it is efficient and reliable; for example, the planner is able to find complex manipulation motions for a system with seventy-eight degrees of freedom. In step two, we then find the time-parameterization of the path such that the dynamic constraints on the robot are satisfied. In fact, we find the time-optimal solution for the given path. We show simulation results for various complex examples. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1530/CS-TR-94-1530.pdf %R CS-TR-94-1531 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T On-Line Manipulation Planning for Two Robot Arms in a Dynamic Environment %A Li, Tsai-Yen %A Latombe, Jean-Claude %D December 1994 %X In a constantly changing and partially unpredictable environment, robot motion planning must be on-line. The planner receives a continuous flow of information about occurring events and generates new plans, while previously planned motions are being executed. This paper describes an on-line planner for two cooperating arms whose task is to grab parts of various types on a conveyor belt and transfer them to their respective goals while avoiding collision with obstacles. Parts arrive on the belt in random order, at any time. Both goals and obstacles may be dynamically changed. This scenario is typical of manufacturing cells serving machine-tools, assembling products, or packaging objects. The proposed approach breaks the overall planning problem into subproblems, each involving a low-dimensional configuration or configuration-time space, and orchestrates very fast primitives solving these subproblems. The resulting planner has been implemented and extensively tested in a simulated environment, as well as with a real dual-arm system. Its competitiveness has been evaluated against an oracle making (almost) the best decision at any one time; the results show that the planner compares extremely well. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1531/CS-TR-94-1531.pdf %R CS-TR-94-1533 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Randomized Query Processing in Robot Motion Planning %A Kavraki, L. %A Latombe, J-C. %A Motwani, R. %A Raghavan, P.
%D December 1994 %X The subject of this paper is the analysis of a randomized preprocessing scheme that has been used for query processing in robot motion planning. The attractiveness of the scheme stems from its general applicability to virtually any motion-planning problem, and its empirically observed success. In this paper we initiate a theoretical basis for explaining this empirical success. Under a simple assumption about the configuration space, we show that it is possible to perform a preprocessing step following which queries can be answered quickly. En route, we pose and give solutions to related problems on graph connectivity in the evasiveness model, and art-gallery theorems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1533/CS-TR-94-1533.pdf %R CS-TR-94-1522 %Z Thu, 08 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Compositional Verification of Reactive and Real-time Systems %A Chang, Edward %D December 1993 %X This thesis presents a compositional methodology for the verification of reactive and real-time systems. The correctness of a given system is established from the correctness of the system's components, each of which may be treated as a system itself and further reduced. When no further reduction is possible or desirable, global techniques for verification may be used to verify the bottom-level components. Transition modules are introduced as a suitable compositional model of computation. Various composition operations are defined on transition modules, including parallel composition, sequential composition, and iteration. A restricted assumption-guarantee style of specification is advocated, wherein the environment assumption is stated as a restriction on the environment's next-state relation. Compositional proof rules are provided in accordance with the safety-progress hierarchy of temporal properties. The compositional framework is then extended naturally to real-time transition modules and discrete-time metric temporal logic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1522/CS-TR-94-1522.pdf %R CS-TR-94-1532 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Planning the Collision-Free Paths of an Actively Flexible Manipulator %A Banon, Jose %D December 1994 %X Most robot manipulators consist of a small sequence of rigid links connected by articulated joints. However, robot dexterity is considerably enhanced when the number of joints is large or infinite. Additional joints make it possible to manipulate objects in cluttered environments where non-redundant robots are useless. In this paper we consider a simulated actively flexible manipulator (AFM), i.e. a manipulator whose flexibility can be directly controlled by its actuators. We propose an efficient method for planning the collision-free paths of an AFM in a three-dimensional workspace. We implemented this method on a graphic workstation and experimented with it on several examples. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/94/1532/CS-TR-94-1532.pdf %R CS-TR-91-1350 %Z Thu, 01 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem solving seminar. %A Chang, Edward %A Phillips, Steven J. %A Ullman, Jeffrey D. %D February 1991 %X This report contains transcripts of the classroom discussions of Stanford's Computer Science problem solving course for Ph.D. students, CS304, during Winter quarter 1990, and the first CS204 class for undergraduates, in the Spring of 1990. 
The problems, and the solutions offered by the classes, span a large range of ideas in computer science. Since they constitute a study both of programming and research paradigms, and of the problem solving process, these notes may be of interest to students of computer science, as well as computer science educators. The present report is the ninth in a series of such transcripts, continuing the tradition established in STAN-CS-77-606 (Michael J. Clancy, 1977), STAN-CS-79-707 (Chris Van Wyk, 1979), STAN-CS-81-863 (Allan A. Miller, 1981), STAN-CS-83-989 (Joseph S. Weening, 1983), STAN-CS-83-990 (John D. Hobby, 1983), STAN-CS-85-1055 (Ramsey W. Haddad, 1985), STAN-CS-87-1154 (Tomas G. Rokicki, 1987), and STAN-CS-89-1269 (Kenneth A. Ross, 1989). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1350/CS-TR-91-1350.pdf %R CS-TR-91-1351 %Z Thu, 01 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sequence vs. pipeline parallel multiple joins in Paradata %A Zhu, Liping %A Keller, Arthur M. %A Wiederhold, Gio %D February 1991 %X In this report we analyze and compare hash-join based parallel multi-join algorithms for sequenced and pipelined processing. The BBN Butterfly machine serves as the host for the performance analysis. The sequenced algorithm handles the multiple join operations in a conventional sequenced manner, except that it distributes the work load of each operation among all processors. The pipelined algorithms handle the different join operations in parallel, by dividing the processors into several groups, with the data flowing through these groups. The detailed timing tests revealed bus/memory contention that grows linearly with the number of processors. The existence of this contention leads to an optimal region for the number of processors for fixed join operands. We present the analytical and experimental formulae for both algorithms, which incorporate this contention. We discuss how to find an optimal point, and give heuristics for choosing the best processor partition in pipelined processing. The study shows that the pipelined algorithms produce the first joined result sooner than the sequenced algorithm and need less memory to store the intermediate result. The sequenced algorithm, on the other hand, takes less time to finish all the join operations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1351/CS-TR-91-1351.pdf %R CS-TR-91-1359 %Z Thu, 01 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T The benefits of relaxing punctuality %A Alur, Rajeev %A Feder, Tomas %A Henzinger, Thomas A. %D May 1991 %X The most natural, compositional way of modeling real-time systems uses a dense domain for time. The satisfiability of real-time constraints that are capable of expressing punctuality in this model is, however, known to be undecidable. We introduce a temporal language that can constrain the time difference between events only with finite (yet arbitrary) precision and show the resulting logic to be EXPSPACE-complete. This result allows us to develop an algorithm for the verification of timing properties of real-time systems with a dense semantics. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1359/CS-TR-91-1359.pdf %R CS-TR-91-1360 %Z Thu, 01 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sooner is safer than later. %A Henzinger, Thomas A.
%D May 1991 %X It has been repeatedly observed that the standard safety-liveness classification of properties of reactive systems does not fit real-time properties. This is because the implicit "liveness" of time shifts the spectrum towards the safety side. While, for example, response--that "something good" will happen, eventually--is a classical liveness property, bounded response--that "something good" will happen soon, within a certain amount of time--has many characteristics of safety. We account for this phenomenon formally by defining safety and liveness relative to a given condition, such as the progress of time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1360/CS-TR-91-1360.pdf %R CS-TR-91-1369 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Approximating matchings in parallel %A Fischer, Ted %A Goldberg, Andrew V. %A Plotkin, Serge %D June 1991 %X We show that for any constant k > 0, a matching with cardinality at least 1 - 1/(k+1) times the maximum can be computed in NC. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1369/CS-TR-91-1369.pdf %R CS-TR-91-1370 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T An NQTHM mechanization of "An Exercise in the Verification of Multi-Process Programs" %A Nagayama, Misao %A Talcott, Carolyn %D June 1991 %X This report presents a formal verification of the local correctness of a mutex algorithm using the Boyer-Moore theorem prover. The formalization follows closely an informal proof of Manna and Pnueli. The proof method of Manna and Pnueli is to first extract from the program a set of states and an induced transition system. One then proves suitable invariants. There are two variants of the proof. In the first (atomic) variant, compound tests involving quantification over a finite set are viewed as atomic operations. In the second (molecular) variant, this assumption is removed, making the details of the transitions and proof somewhat more complicated. The original Manna-Pnueli proof was formulated in terms of finite sets. This led to a concise and elegant informal proof, but one that is not easy to mechanize in the Boyer-Moore logic. In the mechanized version we use a dual isomorphic representation of program states based on finite sequences. Our approach was to outline the formal proof of each invariant, making explicit the case analyses, assumptions and properties of operations used. The outline served as our guide in developing the formal proof. The resulting sequence of events follows the informal plan quite closely. The main difficulties encountered were in discovering the precise form of the lemmas and hints necessary to guide the theorem prover. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1370/CS-TR-91-1370.pdf %R CS-TR-91-1374 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Polynomial dual network simplex algorithms %A Orlin, James B. %A Plotkin, Serge A. %A Tardos, Eva %D August 1991 %X We show how to use polynomial and strongly polynomial capacity scaling algorithms for the transshipment problem to design a polynomial dual network simplex pivot rule. Our best pivoting strategy leads to an O(m^2 log n) bound on the number of pivots, where n and m denote the number of nodes and arcs in the input network. If the demands are integral and at most B, we also give an O(m(m + n log n) min(log nB, m log n))-time implementation of a strategy that requires somewhat more pivots.
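For a rough sense of scale (an illustrative calculation of ours, not taken from the report): for a network with $n = 10^3$ nodes and $m = 10^4$ arcs, the pivot bound gives on the order of

    m^2 \log n = (10^4)^2 \cdot \log_2 10^3 \approx 10^8 \times 10 = 10^9

pivots, polynomial in the input size, in contrast to the exponential worst cases known for many classical pivot rules.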
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1374/CS-TR-91-1374.pdf %R CS-TR-91-1375 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast approximation algorithms for multicommodity flow problems %A Leighton, Tom %A Makedon, Fillia %A Plotkin, Serge %A Stein, Clifford %A Tardos, Eva %A Tragoudas, Spyros %D August 1991 %X In this paper, we describe the first polynomial-time combinatorial algorithms for approximately solving the multicommodity flow problem. Our algorithms are significantly faster than the best previously known algorithms, which were based on linear programming. For a k-commodity multicommodity flow problem, the running time of our randomized algorithm is (up to log factors) the same as the time needed to solve k single-commodity flow problems, thus giving the surprising result that approximately computing a k-commodity maximum-flow is not much harder than computing about k single-commodity maximum-flows in isolation. Given any multicommodity flow problem as input, our algorithm is guaranteed to provide a feasible solution to a modified flow problem in which all capacities are increased by a (1 + epsilon)-factor, or to provide a proof that there is no feasible solution to the original problem. We also describe faster approximation algorithms for multicommodity flow problems with a special structure, such as those that arise in the "sparsest cut" problems and the uniform concurrent flow problems if k <= the square root of m. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1375/CS-TR-91-1375.pdf %R CS-TR-91-1377 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T An evaluation of left-looking, right-looking and multifrontal approaches to sparse Cholesky factorization on hierarchical memory machines %A Rothberg, Edward %A Gupta, Anoop %D August 1991 %X In this paper we present a comprehensive analysis of the performance of a variety of sparse Cholesky factorization methods on hierarchical-memory machines. We investigate methods that vary along two different axes. Along the first axis, we consider three different high-level approaches to sparse factorization: left-looking, right-looking, and multifrontal. Along the second axis, we consider the implementation of each of these high-level approaches using different sets of primitives. The primitives vary based on the structures they manipulate. One important structure in sparse Cholesky factorization is a single column of the matrix. We first consider primitives that manipulate single columns. These are the most commonly used primitives for expressing the sparse Cholesky computation. Another important structure is the supernode, a set of columns with identical non-zero structures. We consider sets of primitives that exploit the supernodal structure of the matrix to varying degrees. We find that primitives that manipulate larger structures greatly increase the amount of exploitable data reuse, thus leading to dramatically higher performance on hierarchical-memory machines. We observe performance increases of two to three times when comparing methods based on primitives that make extensive use of the supernodal structure to methods based on primitives that manipulate columns. We also find that the overall approach (left-looking, right-looking, or multifrontal) is less important for performance than the particular set of primitives used to implement the approach.
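To make the column primitives concrete (a minimal dense Python/NumPy sketch of ours, not code from the report; the names cdiv and cmod follow common usage in this literature), the left-looking approach builds each column from previously computed columns; a supernodal variant would replace the inner cmod loop with one dense block update per group of columns sharing a nonzero structure:

    import numpy as np

    def left_looking_cholesky(A):
        # Dense left-looking Cholesky phrased in terms of the two column
        # primitives: cmod(j, k) updates column j using column k, and
        # cdiv(j) scales column j by its diagonal.
        n = A.shape[0]
        L = np.zeros_like(A, dtype=float)
        for j in range(n):
            col = A[j:, j].astype(float).copy()
            for k in range(j):                 # cmod(j, k)
                if L[j, k] != 0.0:             # sparsity: zero entries contribute nothing
                    col -= L[j, k] * L[j:, k]
            L[j, j] = np.sqrt(col[0])          # cdiv(j)
            L[j+1:, j] = col[1:] / L[j, j]
        return L

    A = np.array([[4., 2., 2.], [2., 5., 3.], [2., 3., 6.]])
    L = left_looking_cholesky(A)
    assert np.allclose(L @ L.T, A)

The data-reuse point in the abstract corresponds to the fact that a supernodal block update touches the updating columns once per block rather than once per cmod call.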
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1377/CS-TR-91-1377.pdf %R CS-TR-91-1381 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Implementing hypertext database relationships through aggregations and exceptions %A Hara, Yoshinori %A Keller, Arthur M. %A Rathmann, Peter K. %A Wiederhold, Gio %D September 1991 %X In order to combine hypertext with database facilities, we show how to extract an effective storage structure from given instance relationships. The schema of the structure recognizes clusters and exceptions. Extracting high-level structures is useful for providing a high performance browsing environment as well as efficient physical database design, especially when handling large amounts of data. This paper focuses on a clustering method, ACE, which generates aggregations and exceptions from the original graph structure in order to capture high level relationships. The problem of minimizing the cost function is NP-complete. We use a heuristic approach based on an extended Kernighan-Lin algorithm. We demonstrate our method on a hypertext application and on a standard random graph, compared with its analytical model. The reductions in main-memory storage, relative to the input database size, were 77.2% and 12.3%, respectively. The method is also useful for organizing secondary storage for efficient retrieval. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1381/CS-TR-91-1381.pdf %R CS-TR-91-1383 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Temporal proof methodologies for real-time systems %A Henzinger, Thomas A. %A Manna, Zohar %A Pnueli, Amir %D September 1991 %X We extend the specification language of temporal logic, the corresponding verification framework, and the underlying computational model to deal with real-time properties of reactive systems. The abstract notion of timed transition systems generalizes traditional transition systems conservatively: qualitative fairness requirements are replaced (and superseded) by quantitative lower-bound and upper-bound timing constraints on transitions. This framework can model real-time systems that communicate either through shared variables or by message passing and real-time issues such as time-outs, process priorities (interrupts), and process scheduling. We exhibit two styles for the specification of real-time systems. While the first approach uses bounded versions of temporal operators, the second approach allows explicit references to time through a special clock variable. Corresponding to the two styles of specification, we present and compare two fundamentally different proof methodologies for the verification of timing requirements that are expressed in these styles. For the bounded-operator style, we provide a set of proof rules for establishing bounded-invariance and bounded-response properties of timed transition systems. This approach generalizes the standard temporal proof rules for verifying invariance and response properties conservatively. For the explicit-clock style, we exploit the observation that every time-bounded property is a safety property and use the standard temporal proof rules for establishing safety properties.
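For orientation (illustrative formulas of ours in common notation, not quoted from the report), the requirement "every p is followed by q within 5 time units" reads roughly as follows in the two styles:

    % bounded-operator style: the time bound annotates the temporal operator
    \Box\,(p \rightarrow \Diamond_{\le 5}\, q)

    % explicit-clock style: the clock variable T is referenced directly
    \Box\,\forall t\,\big((p \wedge T = t) \rightarrow \Diamond\,(q \wedge T \le t + 5)\big)

The observation that every time-bounded property is a safety property is visible in the second form: any violation is exhibited by a finite prefix in which p occurred, the clock passed t + 5, and q never occurred.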
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1383/CS-TR-91-1383.pdf %R CS-TR-91-1387 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Assembling polyhedra with single translations %A Wilson, Randall %A Schweikard, Achim %D October 1991 %X The problem of partitioning an assembly of polyhedral objects into two subassemblies that can be separated arises in assembly planning. We describe an algorithm to compute the set of all translations separating two polyhedra with n vertices in O(n^4) steps and show that this is optimal. Given an assembly of k polyhedra with a total of n vertices, an extension of this algorithm identifies a valid translation and removable subassembly in O(k^2 n^4) steps if one exists. Based on the second algorithm, a polynomial-time method for finding a complete assembly sequence consisting of single translations is derived. An implementation incorporates several changes to achieve better average-case performance; experimental results obtained for composite objects consisting of isothetic polyhedra are described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1387/CS-TR-91-1387.pdf %R CS-TR-91-1389 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T The AGENT0 manual %A Torrance, Mark C. %A Viola, Paul A. %D April 1991 %X This document describes an implementation of AOP, an interpreter for programs written in a language called AGENT0. AGENT0 is a first stab at a programming language for the paradigm of Agent-Oriented Programming. It is currently under development at Stanford under the direction of Yoav Shoham. This implementation is the work of Paul A. Viola of MIT and Mark C. Torrance of Stanford. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1389/CS-TR-91-1389.pdf %R CS-TR-91-1391 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A logic for perception and belief %A Shoham, Yoav %A del Val, Alvaro %D September 1991 %X We present a modal logic for reasoning about perception and belief, captured respectively by the operators P and B. The B operator is the standard belief operator used in recent years, and the P operator is similarly defined. The contribution of the paper is twofold. First, in terms of P we provide a definition of perceptual indistinguishability, such as arises out of limited visual acuity. The definition is concise, intuitive (we find), and avoids traditional paradoxes. Second, we explore the bimodal B--P system. We argue that the relationship between the two modalities varies among settings: The agent may or may not have confidence in its perception, may or may not be accurate in it, and so on. We therefore define a number of agent types corresponding to these various assumptions, and for each such agent type we provide a sound and complete axiomatization of the B--P system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1391/CS-TR-91-1391.pdf %R CS-TR-91-1392 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A classification of update methods for replicated databases %A Ceri, Stefano %A Houtsma, Maurice A. W. %A Keller, Arthur M. %A Samarati, Pierangela %D October 1991 %X In this paper we present a classification of the methods for updating replicated databases. The main contribution of this paper is to present the various methods in the context of a structured taxonomy, which accommodates very heterogeneous methods.
Classes of update methods are presented through their general properties, such as the invariants that hold for them. Methods are reviewed both in their normal and abnormal behaviour (e.g., after a network partition). We show that several methods presented in the literature, sometimes in independent papers with no cross-reference, are indeed very much related, for instance because they share the same basic technique. We also show in what sense they diverge from the basic technique. This classification can serve as a basis for choosing the method that is most suitable to a specific application. It can also be used as a guideline to researchers who aim at developing new mechanisms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1392/CS-TR-91-1392.pdf %R CS-TR-91-1394 %Z Fri, 02 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Application-controlled physical memory using external page-cache management %A Harty, Kieran %A Cheriton, David R. %D October 1991 %X Next generation computer systems will have gigabytes of physical memory and processors in the 100 MIPS range or higher. Contrary to some conjectures, this trend requires more sophisticated memory management support for memory-bound computations such as scientific simulations and systems such as large-scale database systems, even though memory management for most programs will be less of a concern. We describe the design, implementation and evaluation of a virtual memory system that provides application control of physical memory using external page-cache management. In this approach, a sophisticated application is able to monitor and control the amount of physical memory it has available for execution, the exact contents of this memory, and the scheduling and nature of page-in and page-out using the abstraction of a physical page cache provided by the kernel. We claim that this approach can significantly improve performance for many memory-bound applications while reducing kernel complexity, yet does not complicate other applications or reduce their performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/91/1394/CS-TR-91-1394.pdf %R CS-TR-89-1267 %Z Tue, 27 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A really temporal logic. %A Alur, Rajeev %A Henzinger, Thomas A. %D July 1989 %X We introduce a real-time temporal logic for the specification of reactive systems. The novel feature of our logic, TPTL, is the adoption of temporal operators as quantifiers over time variables; every modality binds a variable to the time(s) it refers to. TPTL is demonstrated to be both a natural specification language and a suitable formalism for verification and synthesis. We present a tableau-based decision procedure and model-checking algorithm for TPTL. Several generalizations of TPTL are shown to be highly undecidable. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1267/CS-TR-89-1267.pdf %R CS-TR-89-1248 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficiency of the network simplex algorithm for the maximum flow problem %A Goldberg, Andrew V. %A Grigoriadis, Michael D. %A Tarjan, Robert E. %D February 1989 %X Goldfarb and Hao have proposed a network simplex algorithm that will solve a maximum flow problem on an n-vertex, m-arc network in at most nm pivots and O(n^2 m) time.
In this paper we describe how to implement their algorithm to run in O(nm log n) time by using an extension of the dynamic tree data structure of Sleator and Tarjan. This bound is less than a logarithmic factor larger than that of any other known algorithm for the problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1248/CS-TR-89-1248.pdf %R CS-TR-89-1250 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A sound and complete axiomatization of operational equivalence between programs with memory %A Mason, Ian %A Talcott, Carolyn %D March 1989 %X In this paper we present a formal system for deriving assertions about programs with memory. The assertions we consider are of the following three forms: (i) e diverges (i.e. fails to reduce to a value), written $e\uparrow$; (ii) $e_0$ and $e_1$ reduce to the same value and have exactly the same effect on memory, written $e_0 \simeq e_1$; and (iii) $e_0$ and $e_1$ reduce to the same value and have the same effect on memory up to production of garbage (are strongly isomorphic), written $e_0 \cong e_1$. The $e$ and $e_j$ are expressions of a first-order Scheme- or Lisp-like language with the data operations atom, eq, car, cdr, cons, setcar, setcdr, the control primitives let and if, and recursive definition of function symbols. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1250/CS-TR-89-1250.pdf %R CS-TR-89-1255 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T METAFONTware %A Knuth, Donald E. %A Rokicki, Tomas G. %A Samuel, Arthur L. %D May 1989 %X This report contains the complete WEB documentation for four utility programs that are often used in conjunction with METAFONT: GFtype, GFtoPK, GFtoDVI, and MFT. This report is analogous to TeXware, published in 1986 (STAN-CS-86-1097). METAFONTware completes the set. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1255/CS-TR-89-1255.pdf %R CS-TR-89-1259 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interior-point methods in parallel computation %A Goldberg, Andrew V. %A Plotkin, Serge A. %A Shmoys, David B. %A Tardos, Eva %D May 1989 %X In this paper we use interior-point methods for linear programming, developed in the context of sequential computation, to obtain a parallel algorithm for the bipartite matching problem. Our algorithm runs in $\tilde{O}(\sqrt{m})$ time. Our results extend to the weighted bipartite matching problem and to the zero-one minimum-cost flow problem, yielding $\tilde{O}(\sqrt{m} \log C)$ algorithms. This improves previous bounds on these problems and illustrates the importance of interior-point methods in the context of parallel algorithm design. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1259/CS-TR-89-1259.pdf %R CS-TR-89-1261 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pipelined parallel computations, and sorting on a pipelined hypercube. %A Mayr, Ernst W. %A Plaxton, C. Greg %D May 1989 %X This paper brings together a number of previously known techniques in order to obtain practical and efficient implementations of the prefix operation for the complete binary tree, hypercube and shuffle exchange families of networks. For each of these networks, we also provide a "pipelined" scheme for performing k prefix operations in O(k + log p) time on p processors. This implies a similar pipelining result for the "data distribution" operation of Ullman [16].
The data distribution primitive leads to a simplified implementation of the optimal merging algorithm of Varman and Doshi, which runs on a pipelined model of the hypercube [17]. Finally, a pipelined version of the multi-way merge sort of Nassimi and Sahni [10], running on the pipelined hypercube model, is described. Given p processors and n < p log p values to be sorted, the running time of the pipelined algorithm is O(log^2 p / log((p log p)/n)). Note that for the interesting case n = p this yields a running time of O(log^2 p / log log p), which is asymptotically faster than Batcher's bitonic sort [3]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1261/CS-TR-89-1261.pdf %R CS-TR-89-1264 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Chebyshev polynomials are not always optimal %A Fischer, Bernd %A Freund, Roland %D June 1989 %X We are concerned with the problem of finding, among all polynomials of degree at most n normalized to be 1 at c, the one with minimal uniform norm on Epsilon. Here, Epsilon is a given ellipse with both foci on the real axis and c is a given real point not contained in Epsilon. Problems of this type arise in certain iterative matrix computations, and, in this context, it is generally believed and widely referenced that suitably normalized Chebyshev polynomials are optimal for such constrained approximation problems. In this note, we show that this is not true in general. Moreover, we derive sufficient conditions which guarantee that Chebyshev polynomials are optimal. Also, some numerical examples are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1264/CS-TR-89-1264.pdf %R CS-TR-89-1266 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multi-level shared caching techniques for scalability in VMP-MC %A Cheriton, David R. %A Goosen, Hendrik A. %A Boyle, Patrick D. %D May 1989 %X The problem of building a scalable shared memory multiprocessor can be reduced to that of building a scalable memory hierarchy, assuming interprocessor communication is handled by the memory system. In this paper, we describe the VMP-MC design, a distributed parallel multi-computer based on the VMP multiprocessor design, that is intended to provide a set of building blocks for configuring machines from one to several thousand processors. VMP-MC uses a memory hierarchy based on shared caches, ranging from on-chip caches to board-level caches connected by busses to, at the bottom, a high-speed fiber optic ring. In addition to describing the building block components of this architecture, we identify the key performance issues associated with the design and provide performance evaluation of these issues using trace-driven simulation and measurements from the VMP. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1266/CS-TR-89-1266.pdf %R CS-TR-89-1268 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Addition machines %A Floyd, Robert W. %A Knuth, Donald E. %D July 1989 %X An addition machine is a computing device with a finite number of registers, limited to the following six types of operations: read x {input to register x}; x <-- y {copy register y to register x}; x <-- x + y {add register y to register x}; x <-- x - y {subtract register y from register x}; if x >= y {compare register x to register y}; write x {output from register x}. The register contents are assumed to belong to a given set A, which is an additive subgroup of the real numbers.
If A is the set of all integers, we say the device is an integer addition machine; if A is the set of all real numbers, we say the device is a real addition machine. We will consider how efficiently an integer addition machine can do operations such as multiplication, division, greatest common divisor, exponentiation, and sorting. We will also show that any addition machine with at least six registers can compute the ternary operation $x \lfloor y/z \rfloor$ with reasonable efficiency, given x, y, z in A with z not equal to 0. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1268/CS-TR-89-1268.pdf %R CS-TR-89-1269 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem solving seminar %A Ross, Kenneth A. %A Knuth, Donald E. %D July 1989 %X This report contains edited transcripts of the discussions held in Stanford's Computer Science problem solving course, CS304, during winter quarter 1989. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms were touched on during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." The present report is the eighth in a series of such transcripts, continuing the tradition established in STAN-CS-77-606 (Michael J. Clancy, 1977), STAN-CS-79-707 (Chris Van Wyk, 1979), STAN-CS-81-863 (Allan A. Miller, 1981), STAN-CS-83-989 (Joseph S. Weening, 1983), STAN-CS-83-990 (John D. Hobby, 1983), STAN-CS-85-1055 (Ramsey W. Haddad, 1985) and STAN-CS-87-1154 (Tomas G. Rokicki, 1987). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1269/CS-TR-89-1269.pdf %R CS-TR-89-1273 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sirpent[TM]: a high-performance internetworking approach %A Cheriton, David R. %D July 1989 %X A clear target for computer communication technology is to support a high-performance global internetwork. Current internetworking approaches use either concatenated virtual circuits, as in X.75, or a "universal" internetwork datagram, as in the DoD Internet IP protocol and the ISO connectionless network protocol (CLNP). Both approaches have significant disadvantages. This paper describes Sirpent[TM] (Source Internetwork Routing Protocol with Extended Network Transfer), a new approach to an internetwork architecture that makes source routing the basis for interconnection, rather than an option as in IP. Its benefits include simple switching with low per-packet processing and delay, support for accounting and congestion control, and scalability to a global internetwork. It also supports flexible, user-controlled routing such as required for security, policy-based routing and real-time applications. We also propose a specific internetwork protocol, called VIPER[TM], as a realization of the Sirpent approach. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1273/CS-TR-89-1273.pdf %R CS-TR-89-1275 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A new approach to stable matching problems %A Subramanian, Ashok %D August 1989 %X We show that Stable Matching problems are the same as problems about stable configurations of X-networks.
Consequences include easy proofs of old theorems, a new simple algorithm for finding a stable matching, an understanding of the difference between Stable Marriage and Stable Roommates, NP-completeness of Three-party Stable Marriage, CC-completeness of several Stable Matching problems, and a fast parallel reduction from the Stable Marriage problem to the Assignment problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1275/CS-TR-89-1275.pdf %R CS-TR-89-1276 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the network complexity of selection %A Plaxton, C. Greg %D August 1989 %X The selection problem is to determine the kth largest out of a given set of n keys, and its sequential complexity is well known to be linear. Thus, given a p processor parallel machine, it is natural to ask whether or not an O(n/p) selection algorithm can be devised for that machine. For the EREW PRAM, Vishkin has exhibited a straightforward selection algorithm that achieves optimal speedup for n = Omega(p log p log log p) [18]. For the network model, the sorting result of Leighton [12] and the token distribution result of Peleg and Upfal [13] together imply that Vishkin's algorithm can be adapted to run in the same asymptotic time bound on a certain class of bounded degree expander networks. On the other hand, none of the network families currently of practical interest have sufficient expansion to permit an efficient implementation of Vishkin's algorithm. The main result of this paper is an Omega((n/p) log log p + log p) lower bound for selection on any network that satisfies a particular low expansion property. The class of networks satisfying this property includes all of the common network families such as the tree, multi-dimensional mesh, hypercube, butterfly and shuffle exchange. When n/p is sufficiently large (for example, greater than log^2 p on the butterfly, hypercube and shuffle exchange), this result is matched by the upper bound presented in [14]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1276/CS-TR-89-1276.pdf %R CS-TR-89-1278 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The complexity of circuit value and network stability %A Mayr, Ernst W. %A Subramanian, Ashok %D August 1989 %X We develop a method for non-trivially restricting fanout in a circuit. We study the complexity of the Circuit Value problem and a new problem, Network Stability, when fanout is limited. This leads to new classes of problems within P. We conjecture that the new classes are different from P and incomparable to NC. One of these classes, CC, contains several natural complete problems, including Circuit Value for comparator circuits, Lex-first Maximal Matching, and problems related to Stable Marriage and Stable Roommates. When fanout is appropriately limited, we get positive results: a parallel algorithm for Circuit Value that runs in time about the square root of the number of gates, a linear-time sequential algorithm for Network Stability, and logspace reductions between Circuit Value and Network Stability. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1278/CS-TR-89-1278.pdf %R CS-TR-89-1280 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sticky bits and universality of consensus %A Plotkin, Serge A. %D August 1989 %X In this paper we consider implementation of atomic wait-free objects in the context of a shared-memory multiprocessor.
We introduce a new primitive object, the "Sticky-Bit", and show its universality by proving that any safe implementation of a sequential object can be transformed into a wait-free atomic one using only Sticky Bits and safe registers. The Sticky Bit may be viewed as a memory-oriented version of consensus. In particular, the results of this paper imply "universality of consensus" in the sense that given an algorithm to achieve n-processor consensus, we can transform any safe implementation of a sequential object into a wait-free atomic one using a polynomial number of additional safe bits. The presented results also imply that the Read-Modify-Write (RMW) hierarchy "collapses". More precisely, we show that although an object that supports a 1-bit atomic wait-free RMW is strictly more powerful than a safe register and an object that supports a 3-valued atomic wait-free RMW is strictly more powerful than a 1-bit RMW, the 3-valued RMW is universal in the sense that any RMW can be atomically implemented from a 3-valued atomic RMW in a wait-free fashion. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1280/CS-TR-89-1280.pdf %R CS-TR-89-1281 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Load balancing on the hypercube and shuffle-exchange %A Plaxton, C. Greg %D August 1989 %X Maintaining a balanced load is of fundamental importance on any parallel computer, since a strongly imbalanced load often leads to low processor utilization. This paper considers two load balancing operations: Balance and MultiBalance. The Balance operation corresponds to the token distribution problem considered by Peleg and Upfal [9] for certain expander networks. The MultiBalance operation balances several populations of distinct token types simultaneously. Efficient implementations of these operations will be given for the hypercube and shuffle-exchange, along with tight or near-tight lower bounds. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1281/CS-TR-89-1281.pdf %R CS-TR-89-1286 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast sparse matrix factorization on modern workstations %A Rothberg, Edward %A Gupta, Anoop %D October 1989 %X The performance of workstation-class machines has experienced a dramatic increase in the recent past. Relatively inexpensive machines which offer 14 MIPS and 2 MFLOPS performance are now available, and machines with even higher performance are not far off. One important characteristic of these machines is that they rely on a small amount of high-speed cache memory for their high performance. In this paper, we consider the problem of Cholesky factorization of a large sparse positive definite system of equations on a high performance workstation. We find that the major factor limiting performance is the cost of moving data between memory and the processor. We use two techniques to address this limitation; we decrease the number of memory references and we improve cache behavior to decrease the cost of each reference. When run on benchmarks from the Harwell-Boeing Sparse Matrix Collection, the resulting factorization code is almost three times as fast as SPARSPAK on a DECStation 3100. We believe that the issues brought up in this paper will play an important role in the effective use of high performance workstations on large numerical problems.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1286/CS-TR-89-1286.pdf %R CS-TR-89-1290 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reading list for the Qualifying Examination in Artificial Intelligence %A Myers, Karen %A Subramanian, Devika %A Zabih, Ramin %D November 1989 %X This report contains the reading list for the Qualifying Examination in Artificial Intelligence. Areas covered include search, representation, reasoning, planning and problem solving, learning, expert systems, vision, robotics, natural language, perspectives and AI programming. An extensive bibliography is also provided. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1290/CS-TR-89-1290.pdf %R CS-TR-89-1296 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Completing the temporal picture %A Manna, Zohar %A Pnueli, Amir %D December 1989 %X The paper presents a relatively complete proof system for proving the validity of temporal properties of reactive programs. The presented proof system improves on previous temporal systems, such as [MP83a] and [MP83b], in that it reduces the validity of program properties to pure assertional reasoning, not involving additional temporal reasoning. The proof system is based on the classification of temporal properties according to the Borel hierarchy, providing an appropriate proof rule for each of the main classes, such as safety, response, and progress properties. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1296/CS-TR-89-1296.pdf %R CS-TR-89-1288 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Programming and proving with function and control abstractions %A Talcott, Carolyn %D October 1989 %X Rum is an intensional semantic theory of function and control abstractions as computation primitives. It is a mathematical foundation for understanding and improving current practice in symbolic (Lisp-style) computation. The theory provides, in a single context, a variety of semantics ranging from structures and rules for carrying out computations to an interpretation as functions on the computation domain. Properties of powerful programming tools such as functions as values, streams, aspects of object oriented programming, escape mechanisms, and coroutines can be represented naturally. In addition a wide variety of operations on programs can be treated including program transformations which introduce function and control abstractions, compiling morphisms that transform control abstractions into function abstractions, and operations that transform intensional properties of programs into extensional properties. The theory goes beyond a theory of functions computed by programs, providing tools for treating both intensional and extensional properties of programs. This provides operations on programs with meanings to transform as well as meanings to preserve. Applications of this theory include expressing and proving properties of particular programs and of classes of programs and studying mathematical properties of computation mechanisms. Additional applications are the design and implementation of interactive computation systems and the mechanization of reasoning about computation. These notes are based on lectures given at the Western Institute of Computer Science summer program, 31 July - 1 August 1986.
Here we focus on programming and proving with function and control abstractions and present a variety of example programs, properties, and techniques for proving these properties. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1288/CS-TR-89-1288.pdf %R CS-TR-89-1244 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Software Performance on Nonlinear Least-Squares Problems %A Fraley, Christina %D January 1989 %X This paper presents numerical results for a large and varied set of problems using software that is widely available and has undergone extensive testing. The algorithms implemented in this software include Newton-based linesearch and trust-region methods for unconstrained optimization, as well as Gauss-Newton, Levenberg-Marquardt, and special quasi-Newton methods for nonlinear least squares. Rather than give a critical assessment of the software itself, our original purpose was to use the best available software to compare the underlying algorithms, to identify classes of problems for each method on which the performance is either very good or very poor, and to provide benchmarks for future work in nonlinear least squares and unconstrained optimization. The variability in the results made it impossible to meet either of the first two goals; however, the results are significant as a step toward explaining why these aims are so difficult to accomplish. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/89/1244/CS-TR-89-1244.pdf %R CS-TR-90-1298 %Z Thu, 22 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Leases: an efficient fault-tolerant mechanism for distributed file cache consistency. %A Gray, Cary G. %A Cheriton, David R. %D January 1990 %X Caching introduces the overhead and complexity of ensuring consistency, reducing some of its performance benefits. In a distributed system, caching must deal with the additional complications of communication and host failures. Leases are proposed as a time-based mechanism that provides efficient consistent access to cached data in distributed systems. Non-Byzantine failures affect performance, not correctness, with their effect minimized by short leases. An analytic model and an evaluation for file access in the V system show that leases of short duration provide good performance. The impact of leases on performance grows more significant in systems of larger scale and higher processor performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1298/CS-TR-90-1298.pdf %R CS-TR-90-1304 %Z Tue, 06 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A model of object-identities and values %A Matsushima, Toshiyuki %A Wiederhold, Gio %D February 1990 %X An algebraic formalization of the object-oriented data model is proposed. The formalism reveals that the semantics of the object-oriented model consists of two portions. One is expressed by an algebraic construct, which has essentially a value-oriented semantics. The other is expressed by object-identities, which characterize the essential difference between the object-oriented model and value-oriented models, such as the relational model and the logical database model. These two portions are integrated by a simple commutativity of modeling functions. The formalism includes the expression of integrity constraints in its construct, which provides the natural integration of the logical database model and the object-oriented database model.
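As a toy illustration of the identity/value split at the heart of this abstract (our example; the paper's formalization is algebraic, not code):

    # Two objects with equal values but distinct identities. A purely
    # value-oriented model (e.g., relational) cannot distinguish a from b;
    # an object-oriented model can, via object identity.
    a = {"part": "widget", "price": 10}
    b = {"part": "widget", "price": 10}

    print(a == b)       # True:  equal as values
    print(a is b)       # False: distinct object identities
    a["price"] = 12     # mutating a leaves b untouched
    print(b["price"])   # still 10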
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1304/CS-TR-90-1304.pdf %R CS-TR-90-1305 %Z Tue, 06 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A comparative evaluation of nodal and supernodal parallel sparse matrix factorization: detailed simulation results %A Rothberg, Edward %A Gupta, Anoop %D February 1990 %X In this paper we consider the problem of factoring a large sparse system of equations on a modestly parallel shared-memory multiprocessor with a non-trivial memory hierarchy. Using detailed multiprocessor simulation, we study the behavior of the parallel sparse factorization scheme developed at the Oak Ridge National Laboratory. We then extend the Oak Ridge scheme to incorporate the notion of supernodal elimination. We present detailed analyses of the sources of performance degradation for each of these schemes. We measure the impact of interprocessor communication costs, processor load imbalance, overheads introduced in order to distribute work, and cache behavior on overall parallel performance. For the three benchmark matrices which we study, we find that the supernodal scheme gives a factor of 1.7 to 2.7 performance advantage for 8 processors and a factor of 0.9 to 1.6 for 32 processors. The supernodal scheme exhibits higher performance due mainly to the fact that it executes many fewer memory operations and produces fewer cache misses. However, the natural task grain size for the supernodal scheme is much larger than that of the Oak Ridge scheme, making effective distribution of work more difficult, especially when the number of processors is large. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1305/CS-TR-90-1305.pdf %R CS-TR-90-1307 %Z Tue, 06 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Real-time logics: complexity and expressiveness %A Alur, Rajeev %A Henzinger, Thomas A. %D March 1990 %X The theory of the natural numbers with linear order and monadic predicates underlies propositional linear temporal logic. To study temporal logics for real-time systems, we combine this classical theory of infinite state sequences with a theory of time, via a monotonic function that maps every state to its time. The resulting theory of timed state sequences is shown to be decidable, albeit nonelementary, and its expressive power is characterized by omega-regular sets. Several more expressive variants are proved to be highly undecidable. This framework allows us to classify a wide variety of real-time logics according to their complexity and expressiveness. In fact, it follows that most formalisms proposed in the literature cannot be decided. We are, however, able to identify two elementary real-time temporal logics as expressively complete fragments of the theory of timed state sequences, and give tableau-based decision procedures. Consequently, these two formalisms are well-suited for the specification and verification of real-time systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1307/CS-TR-90-1307.pdf %R CS-TR-90-1312 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A validation structure based theory of plan modification and reuse %A Kambhampati, Subbarao %A Hendler, James A.
%D June 1990 %X A framework for the flexible and conservative modification of plans enables a planner to modify its plans in response to incremental changes in their specifications, to reuse its existing plans in new problem situations, and to efficiently replan in response to execution time failures. We present a theory of plan modification applicable to hierarchical nonlinear planning. Our theory utilizes the validation structure of stored plans to yield a flexible and conservative plan modification framework. The validation structure, which constitutes a hierarchical explanation of correctness of the plan with respect to the planner's own knowledge of the domain, is annotated on the plan as a by-product of initial planning. Plan modification is formalized as a process of removing inconsistencies in the validation structure of a plan when it is being reused in a new (changed) planning situation. The repair of these inconsistencies involves removing unnecessary parts of the plan and adding new non-primitive tasks to the plan to establish missing or failing validations. The resultant partially reduced plan (with a consistent validation structure) is sent to the planner for complete reduction. We discuss the development of this theory in the PRIAR system, present an empirical evaluation of this theory, and characterize its completeness, coverage, efficiency and limitations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1312/CS-TR-90-1312.pdf %R CS-TR-90-1313 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Book review: Potokovye Algoritmy (Flow Algorithms) by G. M. Adel'son-Vel'ski, E. A. Dinic, and A. V. Karzanov. %A Goldberg, Andrew V. %A Gusfield, Dan %D June 1990 %X This is a review of the book "Flow Algorithms" by Adel'son-Vel'ski, Dinic, and Karzanov, well-known researchers in the area of algorithm design and analysis. This remarkable book, published in 1975, is written in Russian and has never been translated into English. What is remarkable about the book is that it describes many major results obtained in the Soviet Union (and originally published in papers by 1976) that were independently discovered later (and in some cases much later) in the West. The book also contains some minor results that we believe are still unknown in the West. The book is well-written and a pleasure to read, at least for someone fluent in Russian. Although the book is fifteen years old and we believe that all the major results contained in it are known in the West by now, the book is still of great historical importance. Hence a complete review is in order. [from the Introduction] %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1313/CS-TR-90-1313.pdf %R CS-TR-90-1314 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Genetic programming: a paradigm for genetically breeding populations of computer programs to solve problems %A Koza, John R. %D June 1990 %X Many seemingly different problems in artificial intelligence, symbolic processing, and machine learning can be viewed as requiring discovery of a computer program that produces some desired output for particular inputs. When viewed in this way, the process of solving these problems becomes equivalent to searching a space of possible computer programs for a most fit individual computer program. The new "genetic programming" paradigm described herein provides a way to search for this most fit individual computer program. 
In this new "genetic programming" paradigm, populations of computer programs are genetically bred using the Darwinian principle of survival of the fittest and using a genetic crossover (recombination) operator appropriate for genetically mating computer programs. In this paper, the process of formulating and solving problems using this new paradigm is illustrated using examples from various areas. Examples come from the areas of machine learning of a function; planning; sequence induction; function function identification (including symbolic regression, empirical discovery, "data to function" symbolic integration, "data to function" symbolic differentiation); solving equations, including differential equations, integral equations, and functional equations); concept formation; automatic programming; pattern recognition, time-optimal control; playing differential pursuer-evader games; neural network design; and finding a game-playing strategyfor a discrete game in extensive form. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1314/CS-TR-90-1314.pdf %R CS-TR-90-1318 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations %A Rothberg, Edward %A Gupta, Anoop %D June 1990 %X In this paper we look at the problem of factoring large sparse systems of equations on high-performance multiprocessor workstations. While these multiprocessor workstations are capable of very high peak floating point computation rates, most existing sparse factorization codes achieve only a small fraction of this potential. A major limiting factor is the cost of memory accesses performed during the factorization. ln this paper, we describe a parallel factorization code which utilizes the supernodal structure of the matrix to reduce the number of memory references. We also propose enhancements that significantly reduce the overall cache miss rate. The result is greatly increased factorization performance. We present experimental results from executions of our codes on the Silicon Graphics 4D/380 multiprocessor. Using eight processors, we find that the supernodal parallel code achieves a computation rate of approximately 40 MFLOPS when factoring a range of benchmark matrices. This is more than twice as fast as the parallel nodal code developed at the Oak Ridge National Laboratory running on the SGI 4D/380. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1318/CS-TR-90-1318.pdf %R CS-TR-90-1321 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Tools and rules for the practicing verifier %A Manna, Z ohar %A Pnueli, Amir %D July 1990 %X The paper presents a minimal proof theory which is adequate for proving the main important temporal properties of reactive programs. The properties we consider consist of the classes of invariance, response, and precedence properties. For each of these classes we present a small set of rules that is complete for verifying properties belonging to this class. We illustrate the application of these rules by analyzing and verifying the properties of a new algorithm for mutual exclusion. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1321/CS-TR-90-1321.pdf %R CS-TR-90-1323 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Protograms %A Mozes, Eyal %A Shoham, Yoav %D July 1990 %X Motivated largely by tasks that require control of complex processes in a dynamic environment, we introduce a new computational construct called a protogram. A protogram is a program specifying an abstract course of action, a course that allows for a range of specific actions, from which a choice is made through interaction with other protograms. We discuss the intuition behind the notion, and then explore some of the details involved in implementing it. Specifically, we (a) describe a general scheme of protogram interaction, (b) describe a protogram interpreter that has been implemented, dealing with some special cases, (c) describe three applications of the protogram interpreter, one in data processing and two in robotics (both currently only implemented as simulations), (d) describe some more general possible implementations of a protogram interpreter, and (e) discuss how protograms can be useful for the Gofer project. We also briefly discuss the origins of protograms in psychology and linguistics, compare protograms to blackboard and subsumption architectures, and discuss directions for future research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1323/CS-TR-90-1323.pdf %R CS-TR-90-1324 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the complexity of monotonic inheritance with roles %A Guerreiro, Ramiro A. de T. %A Hemerly, S. %A Shoham, Yoav %D July 1990 %X We investigate the complexity of reasoning with monotonic inheritance hierarchies that contain, besides ISA edges, also ROLE (or FUNCTION) edges. A ROLE edge is an edge labelled with a name such as spouse of or brother of. We call such networks ISAR networks. Given a network with n vertices and m edges, we consider two problems: ($P_1$) determining whether the network implies an isa relation between two particular nodes, and ($P_2$) determining all isa relations implied by the network. As is well known, without ROLE edges the time complexity of $P_1$ is $O(m)$, and the time complexity of $P_2$ is $O(n^3)$. Unfortunately, the results do not extend naturally to ISAR networks, except in a very restricted case. For general ISAR networks we first give a polynomial algorithm by an easy reduction to propositional Horn theory. As the degree of the polynomial is quite high ($O(mn^4)$ for $P_1$, $O(mn^6)$ for $P_2$), we then develop a more direct algorithm. For both $P_1$ and $P_2$ its complexity is $O(n^3 + m^2)$. Actually, a finer analysis of the algorithm reveals a complexity of $O(nr \log r + n^2 r + n^3)$, where r is the number of different ROLE labels. One corollary is that if we fix the number of ROLE labels, the complexity of our algorithm drops back to $O(n^3)$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1324/CS-TR-90-1324.pdf %R CS-TR-90-1329 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T An interleaving model for real time. %A Henzinger, Thomas A. %A Manna, Zohar %A Pnueli, Amir %D September 1990 %X The interleaving model is both adequate and sufficiently abstract to allow for the practical specification and verification of many properties of concurrent systems.
We incorporate real time into this model by defining the abstract notion of a real-time transition system as a conservative extension of traditional transition systems: qualitative fairness requirements are replaced (and superseded) by quantitative lower-bound and upper-bound real-time requirements for transitions. We present proof rules to establish lower and upper real-time bounds for response properties of real-time transition systems. This proof system can be used to verify bounded-invariance and bounded-response properties, such as timely termination of shared-variables multi-process systems, whose semantics is defined in terms of real-time transition systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1329/CS-TR-90-1329.pdf %R CS-TR-90-1330 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel ICCG on a hierarchical memory multiprocessor - addressing the triangular solve bottleneck %A Rothberg, Edward %A Gupta, Anoop %D October 1990 %X The incomplete Cholesky conjugate gradient (ICCG) algorithm is a commonly used iterative method for solving large sparse systems of equations. In this paper, we study the parallel solution of sparse triangular systems of equations, the most difficult aspect of implementing the ICCG method on a multiprocessor. We focus on shared-memory multiprocessor architectures with deep memory hierarchies. On such architectures we find that previously proposed parallelization approaches result in little or no speedup. The reason is that these approaches cause significant increases in the amount of memory system traffic as compared to a sequential approach. Increases of as much as a factor of 10 on four processors were observed. In this paper we propose new techniques for limiting these increases, including data remappings to increase spatial locality, new processor synchronization techniques to decrease the use of auxiliary data structures, and data partitioning techniques to reduce the amount of interprocessor communication. With these techniques, memory system traffic is reduced to as little as one sixth of its previous volume. The resulting speedups are greatly improved as well, although they are still much less than linear. We discuss the factors that limit further speedups. We present both simulation results and results of experiments on an SGI 4D/340 multiprocessor. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1330/CS-TR-90-1330.pdf %R CS-TR-90-1337 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A simplifier for untyped lambda expressions %A Galbiati, Louis %A Talcott, Carolyn %D October 1990 %X Many applicative programming languages are based on the call-by-value lambda calculus. For these languages tools such as compilers, partial evaluators, and other transformation systems often make use of rewriting systems that incorporate some form of beta reduction. For purposes of automatic rewriting it is important to develop extensions of beta-value reduction and to develop methods for guaranteeing termination. This paper describes an extension of beta-value reduction and a method based on abstract interpretation for controlling rewriting to guarantee termination.
The main innovations are (1) the use of rearrangement rules in combination with beta-value reduction to increase the power of the rewriting system and (2) the definition of a non-standard interpretation of expressions, the generates relation, as a basis for designing terminating strategies for rewriting. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1337/CS-TR-90-1337.pdf %R CS-TR-90-1340 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Programming in QLisp %A Mason, Ian A. %A Pehoushek, Joseph D. %A Talcott, Carolyn L. %A Weening, Joseph S. %D October 1990 %X Qlisp is an extension of Common Lisp to support parallel programming. It was initially designed by John McCarthy and Richard Gabriel in 1984. Since then it has been under development both at Stanford University and Lucid, Inc. and has been implemented on several commercial shared-memory parallel computers. Qlisp is a queue-based, shared-memory, multi-processing language. This report is a tutorial introduction to the Stanford dialect of Qlisp. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1340/CS-TR-90-1340.pdf %R CS-TR-90-1342 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Modeling concurrency with geometry %A Pratt, Vaughan %D November 1990 %X The phenomena of branching time and true or noninterleaving concurrency find their respective homes in automata and schedules. But these two models of computation are formally equivalent via Birkhoff duality, an equivalence we expound on here in tutorial detail. So why should these phenomena prefer one over the other? We identify dimension as the culprit: 1-dimensional automata are skeletons permitting only interleaving concurrency, whereas true n-fold concurrency resides in transitions of dimension n. The truly concurrent automaton dual to a schedule is not a skeletal distributive lattice but a solid one! We introduce true nondeterminism and define it as monoidal homotopy; from this perspective nondeterminism in ordinary automata arises from forking and joining creating nontrivial homotopy. The automaton dual to a poset schedule is simply connected whereas that dual to an event structure schedule need not be, according to monoidal homotopy though not to group homotopy. We conclude with a formal definition of higher dimensional automaton as an n-complex or n-category, whose two essential axioms are associativity of concatenation within dimension and an interchange principle between dimensions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1342/CS-TR-90-1342.pdf %R CS-TR-90-1343 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Action logic and pure induction %A Pratt, Vaughan %D November 1990 %X In Floyd-Hoare logic, programs are dynamic while assertions are static (hold at states). In action logic the two notions become one, with programs viewed as on-the-fly assertions whose truth is evaluated along intervals instead of at states. Action logic is an equational theory ACT conservatively extending the equational theory REG of regular expressions with operations preimplication a --> b (had a then b) and postimplication b <-- a (b if-ever a). Unlike REG, ACT is finitely based, makes $a^*$ reflexive transitive closure, and has an equivalent Hilbert system. The crucial axiom is that of pure induction, $(a --> a)^* = a --> a$.
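As an aside on the action logic abstract above: the ASCII formula typesets more readably in standard notation. The block below is only a sketch in LaTeX; the residual reading of the two implications is our interpretive gloss on the parenthetical readings given in the abstract ("had a then b", "b if-ever a"), not a statement taken from the report itself.

```latex
% Pure induction in action logic, typeset from the ASCII formula above.
\[
  (a \rightarrow a)^{*} \;=\; a \rightarrow a
\]
% Interpretive gloss (an assumption, in the spirit of residuation): a -> b is
% the largest x with a;x <= b, and b <- a the largest x with x;a <= b.
\[
  a \rightarrow b \;=\; \max\{x : a\,;x \le b\}, \qquad
  b \leftarrow a \;=\; \max\{x : x\,;a \le b\}
\]
```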
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1343/CS-TR-90-1343.pdf %R CS-TR-90-1344 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T ParaDiGM: a highly scalable shared-memory multi-computer architecture %A Cheriton, David R. %A Goosen, Hendrik A. %A Boyle, Patrick D. %D November 1990 %X ParaDiGM is a highly scalable shared-memory multi-computer architecture. It is being developed to demonstrate the feasibility of building a relatively low-cost shared-memory parallel computer that scales to large configurations, and yet provides sequential programs with performance comparable to a high-end microprocessor. A key problem is building a scalable memory hierarchy. In this paper we describe the ParaDiGM architecture, highlighting the innovations of our approach and presenting results of our evaluation of the design. We envision that scalable shared-memory multiprocessors like ParaDiGM will soon become the dominant form of parallel processing, even for very large-scale computation, providing a uniform platform for parallel programming systems and applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1344/CS-TR-90-1344.pdf %R CS-TR-90-1345 %Z Wed, 14 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Nonholonomic motion planning versus controllability via the multibody car system example %A Laumond, Jean-Paul %D December 1990 %X A multibody car system is a non-nilpotent, non-regular, triangularizable and well-controllable system. One goal of the current paper is to prove this obscure assertion. But its main goal is to explain and illuminate what it means. Motion planning is an already old and classical problem in Robotics. A few years ago a new instance of this problem appeared in the literature: motion planning for nonholonomic systems. While useful tools in motion planning come from Computer Science and Mathematics (Computational Geometry, Real Algebraic Geometry), nonholonomic motion planning needs some Control Theory and more Mathematics (Differential Geometry). First of all, this paper tries to give a computational reading of the tools from Differential Geometric Control Theory required by planning. Then it shows that the presence of obstacles in the real world of a real robot challenges Mathematics with some difficult questions which are topological in nature, and have been solved only recently, within the framework of Sub-Riemannian Geometry. This presentation is based upon a reading of works recently developed by (1) Murray and Sastry, (2) Lafferiere and Sussmann, and (3) Bellaiche, Jacobs and Laumond. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/90/1345/CS-TR-90-1345.pdf %R CS-TR-95-1534 %Z Mon, 23 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Partial Information Based Integrity Constraint Checking %A Gupta, Ashish %D January 1995 %X Integrity constraints are useful for specifying consistent states of a database, especially in distributed database systems where data may be under the control of multiple database managers. Constraints need to be checked when the underlying database is updated. Integrity constraint checking in a distributed environment may involve a distributed transaction and the expenses associated with it: two-phase commit protocols, distributed concurrency control, network communication costs, and multiple interface layers if the databases are heterogeneous.
The information used for constraint checking may include the contents of base relations, constraint specifications, updates to the databases, schema restrictions, stored aggregates, etc. We propose using only a subset of the information potentially available for constraint checking. Thus, only data that is local to a site may be used for constraint checking, thereby avoiding distributed transactions. The approach is also useful in centralized systems, because subsets that are relatively inexpensive to access may be used for constraint checking. We discuss constraint checking for the following three subsets of the aforementioned information. 1. Constraint Subsumption: How to check one constraint C using a set of other constraint specifications {C0,...,Cn} and no data, and the knowledge that the constraints in set {C0,...,Cn} hold in the database? 2. Irrelevant Updates: How to check a constraint C using the database update, a set of other constraints {C0,...,Cn}, and the knowledge that the constraints {C,C0,...,Cn} all hold before the update? 3. Local Checking: How to check a constraint C using the database update, the contents of the updated relation, a set of other constraints {C0,...,Cn}, and the knowledge that the constraints {C,C0,...,Cn} all hold before the update? Local checking is the main focus and the main contribution of this thesis. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1534/CS-TR-95-1534.pdf %R CS-TR-95-1535 %Z Mon, 23 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Random Networks in Configuration Space for Fast Path Planning %A Kavraki, Lydia E. %D January 1995 %X In the main part of this dissertation we present a new path planning method which computes collision-free paths for robots of virtually any type moving among stationary obstacles. This method proceeds in two phases: a preprocessing phase and a query phase. In the preprocessing phase, a probabilistic network is constructed and stored as a graph whose nodes correspond to collision-free configurations and edges to feasible paths between these configurations. In the query phase, any given start and goal configurations of the robot are connected to two nodes of the network; the network is then searched for a path joining these two nodes. We apply our method to articulated robots with many degrees of freedom. Experimental results show that path planning can be done in a fraction of a second on a contemporary workstation ($\approx$ 150 MIPS), after relatively short preprocessing times (a few dozen to a few hundred seconds). In the second part of this dissertation, we present a new method that uses the Fast Fourier Transform to compute the obstacle map required by certain path planning algorithms. In the final part of this dissertation, we consider a problem from assembly planning. In assembly planning we are interested in generating feasible sequences of motions that construct a mechanical product from its individual parts. We prove that the monotone assembly partitioning problem in the plane is NP-complete. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1535/CS-TR-95-1535.pdf %R CS-TR-95-1536 %Z Mon, 23 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Locomotion With A Unit-Modular Reconfigurable Robot %A Yim, Mark %D January 1995 %X A unit-modular robot is a robot that is composed of modules that are all identical. Here we study the design and control of unit-modular dynamically reconfigurable robots.
This is based upon the design and construction of a robot called Polypod. We further choose statically stable locomotion as the task domain to evaluate the design and control strategy. The result is the creation of many unique locomotion modes. To gain insight into the capabilities of robots like Polypod we examine locomotion in general by building a functional taxonomy of locomotion. We show that Polypod is capable of generating all classes of statically stable locomotion, a feature unique to Polypod. Next, we propose methods to evaluate vehicles under different operating conditions, such as different terrains. We then evaluate and compare each mode of locomotion on Polypod. This study leads to interesting insights into the general characteristics of the corresponding classes of locomotion. Finally, since more modules are expected to increase robot capability, it is important to examine the limit to the number of modules that can be put together in a useful form. We answer this question by investigating the issues of structural stability, actuator strength, computation and control requirements. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1536/CS-TR-95-1536.pdf %R CS-TR-95-1537 %Z Mon, 23 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Real-Time Modification of Collision-Free Paths %A Quinlan, Sean %D January 1995 %X The modification of collision-free paths is proposed as the basis for a new framework to close the gap between global path planning and real-time sensor-based robot control. A physically-based model of a flexible string-like object, called an elastic band, is used to determine the modification of a path. The initial shape of the elastic is the free path generated by a planner. Subjected to artificial forces, the elastic band deforms in real time to a short and smooth path that maintains clearance from the obstacles. The elastic continues to deform as changes in the environment are detected by sensors, enabling the robot to accommodate uncertainties and react to unexpected and moving obstacles. While providing a tight connection between the robot and its environment, the elastic band preserves the global nature of the planned path. The greater part of this thesis deals with the design and implementation of elastic bands, with emphasis on achieving real-time performance even for robots with many degrees of freedom. To achieve these goals, we propose the concept of bubbles of free-space---a region of free-space around a given configuration of the robot generated from distance information. We also develop a novel algorithm for efficiently computing the distance between non-convex objects and a real-time algorithm for calculating a discrete approximation to the time-optimal parameterization of a path. These various developments are combined in a system that demonstrates the elastic band framework for a Puma 560 manipulator. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1537/CS-TR-95-1537.pdf %R CS-TR-95-1538 %Z Fri, 27 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T 1994 Publications Summary of the Stanford Database Group %A Hammer, Joachim %D January 1995 %X This technical report contains the first four pages of papers written by members of the Stanford Database Group during 1994. We believe that the first four pages convey the main ideas behind each paper better than a simple title and abstract do.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1538/CS-TR-95-1538.pdf %R CS-TR-95-1539 %Z Tue, 31 Jan 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reasoning About Uncertainty in Robot Motion Planning %A Lazanas, Anthony %D August 1994 %X In this thesis, we investigate the effects of uncertainty on the difficulty of robot motion planning, and we study the tradeoff between physical and computational complexity. We present a formulation of the general problem of robot motion planning with uncertainty from which a complete, correct, polynomial planner can be derived. The key idea is the existence of reduced uncertainty regions in the workspace (landmark regions). Planning is performed using the preimage backchaining method. We extend the standard definition of a ``nondirectional preimage'' to the case where a motion command depends on an arbitrary number of control parameters. The resulting multi-dimensional preimage can be represented with a polynomial number of 2-D slices, each computed for a critical combination of values of the parameters. We present implemented algorithms for one parameter (the commanded direction of motion) and for two parameters (the commanded direction of motion and the directional uncertainty). Experimentation with the algorithm using a real mobile robot has been successful. By engineering the workspace, we have been able to satisfy all the assumptions of our planning model. As a result, the robot has been able to operate for long periods of time with no failures. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1539/CS-TR-95-1539.pdf %R CS-TR-95-1542 %Z Fri, 10 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel Genetic Programming on a Network of Transputers %A Koza, John R. %A Andre, David %D January 1995 %X This report describes the parallel implementation of genetic programming in the C programming language using a PC 486 type computer (running Windows) acting as a host and a network of transputers acting as processing nodes. Using this approach, researchers of genetic algorithms and genetic programming can acquire computing power that is intermediate between the power of currently available workstations and that of supercomputers, at a cost that is intermediate between the two. A comparison is made of the computational effort required to solve the problem of symbolic regression of the Boolean even-5-parity function with different migration rates. Genetic programming required the least computational effort with an 8% migration rate. Moreover, this computational effort was less than that required for solving the problem with a serial computer and a panmictic population of the same size. That is, apart from the nearly linear speed-up in executing a fixed amount of code inherent in the parallel implementation of genetic programming, parallelization delivered more than linear speed-up in solving the problem using genetic programming. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1542/CS-TR-95-1542.pdf %R CS-TR-95-1543 %Z Tue, 14 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stereo Without Search %A Tomasi, Carlo %A Manduchi, Roberto %D February 1995 %X Search is not inherent in the correspondence problem.
We propose a representation of images, called intrinsic curves, that combines the ideas of associative storage of images with connectedness of the representation: intrinsic curves are the paths that a set of local image descriptors trace as an image scanline is traversed from left to right. Curves become surfaces when full images are considered instead of scanlines. Because only the path in the space of descriptors is used for matching, intrinsic curves lose track of space, and are invariant with respect to disparity under ideal circumstances. Establishing stereo correspondences then becomes a trivial lookup problem. We also show how to use intrinsic curves to match real images in the presence of noise, brightness bias, contrast fluctuations, and moderate geometric distortion, and we show how intrinsic curves can be used to deal with image ambiguity and occlusions. We carry out experiments on single-scanline matching to prove the feasibility of the approach and illustrate its main features. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1543/CS-TR-95-1543.pdf %R CS-TR-95-1546 %Z Fri, 17 Mar 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Symbolic Approximations for Verifying Real-Time Systems %A Wong-Toi, Howard %D December 1994 %X Real-time systems are appearing in more and more applications where their proper operation is critical, e.g. transport controllers and medical equipment. However, they are extremely difficult to design correctly. One approach to this problem is the use of formal description techniques and automatic verification. Unfortunately, automatic verification suffers from the state-explosion problem even without considering timing information. This thesis proposes a state-based approximation scheme as a heuristic for efficient yet accurate verification. We first describe a generic iterative approximation algorithm for checking safety properties of a transition system. Successively more accurate approximations of the reachable states are generated until the specification is proved either satisfied or violated. The algorithm automatically decides where the analysis needs to be more exact, and uses state partitioning to force the approximations to converge towards a solution. The method is complete for finite-state systems. The algorithm is applied to systems with hard real-time bounds. State approximations are performed over both timing information and control information. We also approximate the system's transition structure. Case studies include some timing properties of the MAC sublayer of the Ethernet protocol, the tick-tock service protocol, and a timing-based communication protocol where the sender's and receiver's clocks advance at variable rates. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1546/CS-TR-95-1546.pdf %R CS-TR-95-1540 %Z Fri, 03 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Model-Matching and Individuation for Model-Based Diagnosis %A Murdock, Janet L. %D January 1995 %X In model-based systems that reason about the physical world, models are attached to portions of the physical system. To make model-based systems more extensible and re-usable, this thesis explores automating model-matching. Models address particular individuals, portions of the physical world identified as separate entities. If the set of models is not fixed, one cannot carve the physical system into a fixed set of individuals.
Our goals are to develop methods for matching and individuating and to identify characteristics of physical equipment and models required by those methods. Our approach is to identify a set of characteristics, build a system that uses them, and test re-usability and extensibility. If the system correctly defines individuals and matches models, even when a model calls for individuals not previously defined, then we can conclude that we have identified some subset of the characteristics required. The system matches models to a series of equipment descriptions, simulating re-use. We also add a number of models, extending the system, and have it match the new models. Our investigation shows that the required characteristics are the 3-dimensional space and how that space is filled by functional components, phases, materials, and parameters. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1540/CS-TR-95-1540.pdf %R CS-TR-95-1541 %Z Tue, 07 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Random Sampling in Graph Optimization Problems %A Karger, David R. %D February 1995 %X The representative random sample is a central concept of statistics. It is often possible to gather a great deal of information about a large population by examining a small sample randomly drawn from it. This approach has obvious advantages in reducing the investigator's work, both in gathering and in analyzing the data. We apply the concept of a representative sample to combinatorial optimization. Our focus is optimization problems on undirected graphs. Highlights of our results include: The first (randomized) linear time minimum spanning tree algorithm; A (randomized) minimum cut algorithm with running time roughly O(n^2) as compared to previous roughly O(n^3) time bounds, as well as the first algorithm for finding all approximately minimal cuts and multiway cuts; An efficient parallelization of the minimum cut algorithm, providing the first parallel (RNC) algorithm for minimum cuts; A derandomization finding minimum cut in NC; Provably accurate approximations to network reliability; Very fast approximation algorithms for minimum cuts, s-t minimum cuts, and maximum flows; Significantly improved polynomial-time approximation bounds for network design problems; For coloring 3-colorable graphs, improvements in the approximation bounds from O(n^{3/8}) to O(n^{1/4}); An analysis of random sampling in matroids. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1541/CS-TR-95-1541.pdf %R CS-TR-95-1544 %Z Tue, 14 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Diameter Verification and Boolean Matrix Multiplication. %A Basch, Julien %A Khanna, Sanjeev %A Motwani, Rajeev %D February 1995 %X We present a practical algorithm that verifies whether a graph has diameter 2 in time O(n^{3} / log^{2} n). A slight adaptation of this algorithm yields a boolean matrix multiplication algorithm which runs in the same time bound, thereby allowing us to compute transitive closure and to verify whether a graph has diameter $d$, for any constant $d$, in O(n^{3} / log^{2} n) time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1544/CS-TR-95-1544.pdf %R CS-TR-95-1545 %Z Tue, 14 Feb 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Approximation Algorithms for the Largest Common Subtree Problem. %A Khanna, Sanjeev %A Motwani, Rajeev %A Yao, Frances F.
%D February 1995 %X The largest common subtree problem is to find a largest subtree which occurs as a common subgraph in a given collection of trees. We show that in the case of bounded-degree trees, we can achieve an approximation ratio of O((n log log n) / log^{2} n). In the case of trees with unbounded-degree nodes, we give an algorithm with approximation ratio O((n (log log n)^{2}) / log^{2} n) when the trees are unlabeled. An approximation ratio of O((n (log log n)^{2}) / log^{2} n) is also achieved for the case of labeled unbounded-degree trees, provided the number of distinct labels is O(log^{O(1)} n). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1545/CS-TR-95-1545.pdf %R CS-TR-95-1547 %Z Fri, 05 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sharp, Reliable Predictions using Supervised Mixture Models %A Roy, H. Scott %D March 1995 %X This dissertation develops a new way to make probabilistic predictions from a database of examples. The method looks for regions in the data where different predictions are appropriate, and it naturally extends clustering algorithms that have been used with great success in exploratory data analysis. In probabilistic terms, the new method looks at the same models as before, but it only evaluates them for the conditional probability they assign to a single feature rather than the joint probability they assign to all features. A good model is therefore forced to classify the data in a way that is useful for a single, desired prediction, rather than just identifying the strongest overall pattern in the data. The results of this dissertation extend the clean, Bayesian approach of the unsupervised AutoClass system to the supervised learning problems common in everyday practice. Highlights include clear probabilistic semantics, prediction and use of discrete, categorical, and continuous data, priors that avoid the overfitting problem, an explicit noise model to identify unreliable predictions, and the ability to handle missing data. A computer implementation, MultiClass, validates the ideas with performance that exceeds neural nets, decision trees, and other current supervised machine learning systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1547/CS-TR-95-1547.pdf %R CS-TR-95-1549 %Z Thu, 11 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Dynamic Selection of Models %A Rutledge, Geoffrey William %D March 1995 %X This dissertation develops an approach to high-stakes, model-based decision making under scarce computation resources, bringing together concepts and techniques from the disciplines of decision analysis, statistics, artificial intelligence, and simulation. A method is developed and implemented to solve a time-critical decision problem in the domain of critical-care medicine. This method selects models that balance prediction accuracy against the need for rapid action. Under a computation-time constraint, the optimal model for a model-based control application is a model that maximizes the tradeoff of model benefit (a measure of how accurately the model predicts the effects of alternative control settings) and model cost (a measure of the length of the model-induced computation delay). This work describes a real-time algorithm that selects, from a graph of models (GoM), a model that is accurate and that is computable within a time constraint.
The DSM algorithm is a metalevel reasoning strategy that relies on a dynamic-selection-of-models (DSM) metric to guide the search through a GoM that is organized according to the simplifying assumptions of the models. The DSM metric balances an estimate of the probability that a model will achieve the required prediction accuracy and the cost of the expected model-induced computation delay. The DSM algorithm provides an approach to automated reasoning about complex systems that applies at any level of computation-resource or computation-time constraint. The DSM algorithm is implemented in Konan, a program that performs dynamic selection of patient-specific models from a GoM of quantitative physiologic models. Konan selects models that allow a model-based control application (a ventilator-management advisor) to make real-time decisions for the control settings of a mechanical ventilator. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1549/CS-TR-95-1549.pdf %R CS-TR-95-1550 %Z Wed, 24 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Theory and Design of a Hybrid Pattern Recognition System %A Drakopoulos, John A. %D May 1995 %X Pattern recognition methods can be divided into four different categories: statistical or probabilistic, structural, possibilistic or fuzzy, and neural methods. A formal analysis shows that there is, in general, a computational complexity versus representational power trade-off between probabilistic and possibilistic or fuzzy set measures. Furthermore, sigmoidal theory shows that fuzzy set membership can be represented effectively by sigmoidal functions. Those results, together with the formalization of sigmoidal functions and, subsequently, multi-sigmoidal functions and neural networks, led to the development of a hybrid pattern recognition system called tFPR. tFPR is a hybrid fuzzy, neural, and structural pattern recognition system that uses fuzzy sets to represent multi-variate pattern classes that can be either static or dynamic, depending on time or some other parameter space. The membership functions of the fuzzy sets that represent pattern classes are modeled in three different ways. Simple sigmoidal configurations are used for simple patterns, a structural pattern recognition method is used for dynamic patterns, and multi-sigmoidal neural networks are used for pattern classes for which it is difficult to obtain a formal definition. Although efficiency is a very important consideration in tFPR, the main issues are knowledge acquisition and knowledge representation (in terms of pattern class descriptions). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1550/CS-TR-95-1550.pdf %R CS-TR-95-1548 %Z Wed, 10 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Routing and Admission Control in General Topology Networks %A Gawlick, Rainer %A Kamath, Anil %A Plotkin, Serge %A Ramakrishnan, K. G. %D May 1995 %X Emerging high-speed Broadband Integrated Services Digital Networks (B-ISDN) will carry traffic for services -- such as video-on-demand and video teleconferencing -- that require resource reservation along the path on which the traffic is sent. As a result, such networks will need effective {\em admission control} algorithms. The simplest approach is to use greedy admission control; in other words, accept every resource request that can be physically accommodated.
However, in the context of symmetric loss networks (networks with a complete graph topology), non-greedy admission control has been shown to be more effective than greedy admission control. This paper suggests a new {\em non-greedy} routing and admission control algorithm for {\em general topology} networks. In contrast to previous algorithms, our algorithm does not require advance knowledge of the traffic patterns. Our algorithm combines key ideas from a recently developed theoretical algorithm with a stochastic analysis developed in the context of reservation-based algorithms. We evaluate the performance of our algorithm using extensive simulations on an existing commercial network topology and on variants of that topology. The simulations show that our algorithm outperforms greedy admission control over a broad range of network environments. The simulations also illuminate some important characteristics of our algorithm. For example, we characterize the importance of the implicit routing effects of the admission control part of our algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1548/CS-TR-95-1548.pdf %R CS-TR-95-1552 %Z Mon, 17 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Embedded Teaching of Reinforcement Learners %A Brafman, Ronen I. %A Tennenholtz, Moshe %D June 1995 %X Knowledge plays an important role in an agent's ability to perform well in its environment. Teaching can be used to improve an agent's performance by enhancing its knowledge. We propose a specific model of teaching, which we call embedded teaching. An embedded teacher is an agent situated with a less knowledgeable ``student'' in a common environment. The teacher's goal is to lead the student to adopt a particular desired behavior. The teacher's ability to teach is affected by the dynamics of the common environment and may be limited by a restricted repertoire of actions or uncertainty about the outcome of actions; we explicitly represent these limitations as part of our model. In this paper, we address a number of theoretical issues including the characterization of a challenging embedded teaching domain and the computation of optimal teaching policies. We then incorporate these ideas in a series of experiments designed to evaluate our ability to teach two types of reinforcement learners. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1552/CS-TR-95-1552.pdf %R CS-TR-95-1553 %Z Wed, 19 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Modeling techniques and algorithms for probabilistic model-based diagnosis and repair %A Srinivas, Sampath %D July 1995 %X Model-based diagnosis centers on the use of a behavioral model of a system to infer diagnoses of anomalous behavior. For model-based diagnosis techniques to become practical, some serious problems in the modeling of uncertainty and in the tractability of uncertainty management have to be addressed. These questions include: How can we tractably generate diagnoses in large systems? Where do the prior probabilities of component failure come from when modeling a system? How do we tractably compute low-cost repair strategies? How can we do diagnosis even if only partial descriptions of device operation are available? This dissertation seeks to bring model-based diagnosis closer to being a viable technology by addressing these problems. We develop a set of tractable algorithms and modeling techniques that address each of the problems introduced above. 
Our approach synthesizes the techniques used in model-based diagnosis and techniques from the field of Bayesian networks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1553/CS-TR-95-1553.pdf %R CS-TR-95-1554 %Z Mon, 24 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Computer Science Technical Report (CS-TR) Project: Considerations from the Library Perspective. %A Lasher, Rebecca %A Reich, Vicky %A Anderson, Greg %D July 1995 %X In 1992 the Advanced Research Projects Agency (ARPA) funded a three-year grant to investigate questions related to large-scale, distributed digital libraries. The award focused research on Computer Science Technical Reports (CS-TR) and was granted to the Corporation for National Research Initiatives (CNRI) and five research universities. The ensuing collaborative research has focused on a broad spectrum of technical, social, and legal issues, and has encompassed all aspects of a very large, heterogeneous distributed digital library environment: acquisition, storage, organization, search, retrieval, display, use and intellectual property. The initial corpus of this digital library is a coherent digital collection of CS-TRs created at the five participating universities: Carnegie Mellon, Cornell, MIT, Stanford, and the University of California at Berkeley. The Corporation for National Research Initiatives serves as a collaborator and agent for the project. This technical report summarizes the accomplishments and collaborative efforts of the CS-TR project from a librarian's perspective; to do this we address the following questions: 1. Why do librarians and computer scientists make good research partners? 2. What has been learned? 3. What new questions have been articulated? 4. How can the accomplishments be moved into a service environment? 5. What actions and activities might follow from this effort? %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1554/CS-TR-95-1554.pdf %R CS-TR-95-1551 %Z Mon, 10 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two Methods for Checking Formulas of Temporal Logic %A McGuire, Hugh W. %D June 1995 %X This dissertation presents two methods for determining satisfiability or validity of formulas of Discrete Metric Annotated Linear Temporal Logic. This logic is convenient for representing and verifying properties of reactive and concurrent systems, including software and electronic circuits. The first method presented here is an algorithm for automatically deciding whether any given propositional temporal formula is satisfiable. This new algorithm efficiently extends the classical `semantic tableau' algorithm to formulas with temporal operators which refer to the past or are metric. Then, whereas classical proofs of correctness for such algorithms are existential, the proof here is constructive; it shows that for any given formula being checked, any model of the formula is embedded in the tableau. The second method presented in this dissertation is a deduction-calculus for determining the validity of predicate temporal formulas. This new deduction-calculus employs a refined, conservative version of classical approaches involving translation from temporal forms to first-order expressions with time reified. Here, quantifications are elided, and addition is used instead of classical complicated combinations of comparisons.
This scheme facilitates integration of powerful techniques such as associative-commutative unification and a Presburger decision-algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1551/CS-TR-95-1551.pdf %R CS-TR-95-1556 %Z Tue, 12 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Solving Unweighted and Weighted Bipartite Matching Problems in Theory and Practice %A Kennedy, J. Robert, Jr. %D August 1995 %X The push-relabel method has been shown to be efficient for solving maximum flow and minimum cost flow problems in practice, and periodic global updates of dual variables have played an important role in the best implementations. Nevertheless, global updates had not been known to yield any theoretical improvement in running time. In this work, we study techniques for implementing push-relabel algorithms to solve bipartite matching and assignment problems. We show that global updates yield a theoretical improvement in the bipartite matching and assignment contexts, and we develop a suite of efficient cost-scaling push-relabel implementations to solve assignment problems. For bipartite matching, we show that a push-relabel algorithm using global updates matches the best time bound known (roughly the number of edges times the square root of the number of nodes --- better for dense graphs) and performs worse by a factor of the square root of the number of nodes without the updates. We present a similar result for the assignment problem, for which an algorithm that assumes integer costs has running time asymptotically dominated by the number of edges times the number of nodes times a scaling factor logarithmic in the number of nodes and the largest magnitude of an edge cost in the problem. The bound we obtain matches the best cost-scaling bound known. We develop cost-scaling push-relabel implementations that take advantage of the assignment problem's special structure, and compare our codes against the best codes from the literature. The results show that the push-relabel method is very promising for practical use. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1556/CS-TR-95-1556.pdf %R CS-TR-95-1555 %Z Mon, 11 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Real-time Database Experiences in a Network Management Application %A Kiriha, Yoshiaki %D September 1995 %X This report discusses our experiences with real-time databases in the context of a network management system, in particular a MIB (Management Information Base) implementation. We propose an active and real-time MIB (ART-MIB) architecture that utilizes a real-time database system. The ART-MIB contains a variety of modules, such as a transaction manager, task manager, and resource manager. Among the functionalities provided by ART-MIB, we focus on transaction scheduling within a memory-based real-time database system. For the developed ART-MIB prototype, we have evaluated two typical real-time transaction scheduling algorithms: earliest deadline first (EDF) and highest value first (HVF). The main results of our performance comparison show that EDF outperforms HVF under a low load; however, HVF outperforms EDF in an overload situation. Furthermore, we validated that the performance crossover point depends closely on the size of the scheduler queue.
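The EDF/HVF comparison in the ART-MIB abstract above is easy to make concrete. The following is a minimal sketch, not code from the report: it contrasts the two priority rules on a toy transaction queue, with all names and fields invented for illustration.

```python
# Sketch of the two scheduling policies compared in the ART-MIB abstract:
# earliest deadline first (EDF) picks the pending transaction with the
# smallest deadline; highest value first (HVF) picks the one whose value
# is largest. Transaction fields here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    deadline: float  # absolute deadline (e.g., seconds from now)
    value: float     # value gained if the transaction commits in time

def pick_edf(queue):
    return min(queue, key=lambda t: t.deadline)

def pick_hvf(queue):
    return max(queue, key=lambda t: t.value)

queue = [Txn("poll", 5.0, 1.0), Txn("alarm", 2.0, 3.0), Txn("report", 9.0, 8.0)]
print(pick_edf(queue).name)  # alarm  -- tightest deadline
print(pick_hvf(queue).name)  # report -- highest value
```

One intuition consistent with the abstract's finding: under light load EDF's deadline ordering loses little value, while under overload HVF's willingness to sacrifice late, low-value work pays off.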
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1555/CS-TR-95-1555.pdf %R CS-TR-95-1557 %Z Wed, 11 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hierarchical Models of Synchronous Circuits for Formal Verification and Substitution %A Wolf, Elizabeth Susan %D October 1995 %X We develop a mathematical model of synchronous sequential circuits that supports both formal hierarchical verification and substitution. We have implemented and proved the correctness of automatic decision procedures for both of these applications using these models. For hierarchical verification, we model synchronous circuit specifications and implementations uniformly. Each of these descriptions provides both a behavioral and a structural view of the circuit or specification being modeled. We compare the behavior of a circuit model to a requirements specification in order to determine whether the circuit is an acceptable implementation of the specification. Our structural view of a circuit provides the capability to plug in one circuit component in place of another. We derive a requirements specification for the acceptable replacement components, in terms of the desired behavior of the full circuit. We also support nondeterministic specifications, which capture the minimum requirements of a circuit. Previous formalisms have relied on syntactic methods for distinguishing apparent from actual unlatched feedback loops in hierarchical hardware designs. However, these methods are not applicable to nondeterministic models. Our model of the behavior of a synchronous circuit within a single clock cycle provides a semantic method to identify cyclic dependencies even in the presence of nondeterminism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1557/CS-TR-95-1557.pdf %R CS-TR-95-1558 %Z Mon, 04 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Designing an Academic Firewall: Policy, Practice and Experience With SURF %A Greenwald, Michael B. %A Singhal, Sandeep K. %A Stone, Jonathan R. %A Cheriton, David R. %D December 1995 %X Corporate network firewalls are well-understood and are becoming commonplace. These firewalls establish a security perimeter that aims to block (or heavily restrict) both incoming and outgoing network communication. We argue that these firewalls are neither effective nor appropriate for academic or corporate research environments needing to maintain information security while still supporting the free exchange of ideas. In this paper, we present the Stanford University Research Firewall (SURF), a network firewall design that is suitable for a research environment. While still protecting information and computing resources behind the firewall, this firewall is less restrictive of outward information flow than the traditional model; can be easily deployed; and can give internal users the illusion of unrestricted e-mail, anonymous FTP, and WWW connectivity to the greater Internet. Our experience demonstrates that an adequate firewall for a research environment can be constructed for minimal cost using off-the-shelf software and hardware components. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1558/CS-TR-95-1558.pdf %R CS-TR-95-1559 %Z Fri, 12 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the number of equilibrium placements of mass distributions in elliptic potential fields %A Kavraki, Lydia E. 
%D December 1995 %X Recent papers have demonstrated the use of force fields for mechanical part orientation. The force field is realized on a plane on which the part is placed. The forces exerted on the part's contact surface translate and rotate the part to an equilibrium orientation. Part manipulation by force fields is very attractive since it requires no sensing. We describe force fields that result from elliptic potentials and induce only two stable equilibrium orientations for most parts. The proposed fields represent a considerable improvement over previously developed force fields, which produced O(n) equilibria for polygonal parts with n vertices. The successful realization of these force fields could significantly affect part manipulation in industrial automation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1559/CS-TR-95-1559.pdf %R CS-TR-95-1560 %Z Fri, 12 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Wrappers for Performance Enhancements and Oblivious Decision Graphs. %A Kohavi, Ron %D September 1995 %X In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs). For accuracy estimation, we investigate cross-validation and the .632 bootstrap. We show examples where they fail and conduct a large-scale study comparing them. We conclude that repeated runs of five-fold cross-validation give a good tradeoff between bias and variance for the problem of model selection used in later chapters. We define the wrapper approach and use it for feature subset selection and parameter tuning. We relate definitions of feature relevancy to the set of optimal features, which is defined with respect to both a concept and an induction algorithm. The wrapper approach requires a search space, operators, a search engine, and an evaluation function. We investigate all of them in detail and introduce compound operators for feature subset selection. Finally, we abstract the search problem into search with probabilistic estimates. We introduce decision tables with a default majority rule (DTMs) to test the conjecture that feature subset selection is a very powerful bias. The accuracy of induced DTMs is surprisingly high, and we conclude that this bias is extremely important for many real-world datasets. We show that the resulting decision tables are very small and can be succinctly displayed. We study properties of oblivious read-once decision graphs (OODGs) and show that they do not suffer from some inherent limitations of decision trees. We describe a general framework for constructing OODGs bottom-up and specialize it using the wrapper approach. We show that the graphs produced use fewer features than C4.5, the state-of-the-art decision tree induction algorithm, and are usually easier for humans to comprehend.
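The accuracy-estimation procedure the dissertation above settles on, repeated runs of five-fold cross-validation, is simple to sketch. The code below is illustrative only and not from the thesis; the trivial majority-class "inducer" is a stand-in assumption for whatever learning algorithm is being evaluated.

```python
# Minimal sketch of repeated k-fold cross-validation for accuracy estimation.
# The "inducer" here just predicts the majority class of the training folds,
# a deliberately trivial stand-in for a real learning algorithm.
import random

def repeated_kfold_accuracy(data, k=5, repeats=3, seed=0):
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [ex for j in range(k) if j != i for ex in folds[j]]
            labels = [label for _, label in train]
            majority = max(set(labels), key=labels.count)  # "training" step
            hits = sum(label == majority for _, label in test)
            accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)

# Toy dataset: (features, label) pairs; labels are True for multiples of 3.
data = [((x,), x % 3 == 0) for x in range(60)]
print(round(repeated_kfold_accuracy(data), 3))  # roughly 0.667 on this data
```

Averaging over several shuffles is what trades a little extra computation for lower variance in the estimate, the bias/variance tradeoff the abstract refers to.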
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1560/CS-TR-95-1560.pdf %R CS-TR-95-1561 %Z Tue, 16 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Techniques for Efficient Formal Verification Using Binary Decision Diagrams %A Hu, Alan John %D December 1995 %X The appeal of automatic formal verification is that it's automatic -- minimal human labor and expertise should be needed to get useful results and counterexamples. BDD (binary decision diagram)-based approaches have promised to allow automatic verification of complex, real systems. For large classes of problems, however (including many distributed protocols, multiprocessor systems, and network architectures), this promise has yet to be fulfilled. Indeed, the few successes have required extensive time and effort from sophisticated researchers in the field. This thesis identifies several common obstacles to BDD-based automatic formal verification and proposes techniques to overcome them by avoiding building certain problematic BDDs needed in the standard approaches and by exploiting automatically generated and user-supplied don't-care information. Several examples illustrate the effectiveness of the new techniques in enlarging the envelope of problems that can routinely be verified automatically. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1561/CS-TR-95-1561.pdf %R CS-TR-95-1562 %Z Tue, 16 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T STeP: The Stanford Temporal Prover (Educational Release) User's Manual %A Bjorner, Nikolaj %A Browne, Anca %A Chang, Eddie %A Colon, Michael %A Kapur, Arjun %A Manna, Zohar %A Sipma, Henny B. %A Uribe, Tomas E. %D November 1995 %X The STeP (Stanford Temporal Prover) system supports the computer-aided verification of reactive and real-time systems. It combines deductive methods with algorithmic techniques to allow the verification of a broad class of systems, including infinite-state systems and parameterized N-process programs. STeP provides the visual language of verification diagrams that allow the user to construct proofs hierarchically, starting from a high-level proof sketch. The availability of automatically generated bottom-up and top-down invariants and an integrated suite of decision procedures allow most verification conditions to be checked without user intervention. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/95/1562/CS-TR-95-1562.pdf %R CS-TR-87-1142 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Heuristic Refinement for Spatial Constraint Satisfaction Problems %A Brinkley, J. %A Buchanan, B. %A Altman, R. %A Duncan, B. %A Cornelius, C. %D January 1987 %X The problem of arranging a set of physical objects according to a set of constraints is formulated as a geometric constraint satisfaction problem (GCSP), in which the variables are the objects, the possible locations of the objects are the possible values for the variables, and the constraints are geometric constraints between objects. A GCSP is a type of multidimensional constraint satisfaction problem in which the number of objects and/or the number of possible locations per object is too large to permit direct solution by backtrack search. A method is described for reducing these numbers by refinement along two dimensions. The number of objects is reduced by refinement of the structure, representing a group of objects as a single abstract object before considering each object individually.
The abstraction used depends on domain-specific knowledge. The number of locations per object is reduced by applying node and arc consistency algorithms to refine the accessible volume of each object. Heuristics are employed to control the order of operations (and hence to affect the efficiency of search) but not to change the correctness in the sense that no solutions that would be found by backtrack search are eliminated. Application of the method to the problem of protein structure determination is described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1142/CS-TR-87-1142.pdf %R CS-TR-87-1144 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Considerations for Multiprocessor Topologies %A Byrd, Gregory %A Delagi, Bruce %D January 1987 %X Choosing a multiprocessor interconnection topology may depend on high-level considerations, such as the intended application domain and the expected number of processors. It certainly depends on low-level implementation details, such as packaging and communications protocols. We first use rough measures of cost and performance to characterize several topologies. We then examine how implementation details can affect the realizable performance of a topology. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1144/CS-TR-87-1144.pdf %R CS-TR-87-1146 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Point-to-Point Multicast Communications Protocol %A Byrd, Gregory %A Nakano, Russell %A Delagi, Bruce %D February 1987 %X Many network topologies have been proposed for connecting a large number of processor-memory pairs in a high-performance multiprocessor system. In terms of performance, however, the communications protocol decisions may be as crucial as topology. This paper describes a protocol to support point-to-point interprocessor communications with multicast. Dynamic, cut-through routing with local flow control is used to provide a high-throughput, low-latency communications path between processors. In addition, multicast transmissions are available, in which copies of a packet are sent to multiple destinations using common resources as much as possible. Special packet terminators and selective buffering are introduced to avoid deadlock during multicasts. A simulated implementation of the protocol is also described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1146/CS-TR-87-1146.pdf %R CS-TR-87-1147 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Layered Environment for Reasoning about Action %A Hayes-Roth, B. %A Garvey, A. %A Johnson, M. V. %A Hewett, M. %D November 1986 %X An intelligent system reasons about -- controls, explains, learns about -- its actions, thereby improving its efforts to achieve goals and function in its environment. In order to perform effectively, a system must have knowledge of the actions it can perform, the events and states that occur, and the relationships among instances of those actions, events and states. We represent such knowledge in a hierarchy of knowledge abstractions and impose uniform standards of knowledge content and representation on modules within each hierarchical level. We refer to the evolving set of such modules as the BB* environment.
To illustrate, we describe selected elements of BB*: * the foundational BB1 architecture * the ACCORD framework for solving arrangement problems by means of an assembly method * two applications of BB1-ACCORD, the PROTEAN system for modeling protein structures and the SIGHTPLAN system for designing construction-site layouts * two hypothetical multifaceted systems that integrate ACCORD, PROTEAN and SIGHTPLAN with other possible BB* frameworks and applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1147/CS-TR-87-1147.pdf %R CS-TR-87-1148 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Instrumented Architectural Simulation System %A Delagi, B. %A Saraiya, N. %A Nishimura, S. %A Byrd, G. %D January 1987 %X Simulation of systems at an architectural level can offer an effective way to study critical design choices if 1. the performance of the simulator is adequate to examine designs executing significant code bodies -- not just toy problems or small application fragments, 2. the details of the simulation include the critical details of the design, 3. the view of the design presented by the simulator instrumentation leads to useful insights on the problems with the design, and 4. there is enough flexibility in the simulation system so that the asking of unplanned questions is not suppressed by the weight of the mechanics involved in making changes either in the design or its measurement. A simulation system with these goals is described together with the approach to its implementation. Its application to the study of a particular class of multiprocessor hardware system architectures is illustrated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1148/CS-TR-87-1148.pdf %R CS-TR-87-1149 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Proceedings from the Nineteenth Annual Meeting of the Stanford Computer Forum %A Millen, K. Mac %A Diaz-Barriga, A. %A Tajnai, C. %D February 1987 %X Operating for almost two decades, the Stanford Computer Forum is a cooperative venture of the Computer Science Department and the Computer Systems Laboratory (a laboratory operated jointly by the Computer Science and Electrical Engineering Departments). CSD and CSL are internationally recognized for their excellence; their faculty members, research staff, and students are widely known for leadership in developing new ideas and trends in the organization, design and use of computers. They are in the forefront of applying research results to a wide range of applications. The Forum holds an annual meeting in February to which three representatives of each member company are invited. The meeting lasts two days and features technical sessions at which timely computer research at Stanford is described by advanced graduate students and faculty members. There are opportunities for informal discussions to complement the presentations. This report includes information on the Forum, the program, abstracts of the talks and viewgraphs used in the presentations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1149/CS-TR-87-1149.pdf %R CS-TR-87-1153 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimum Grip of a Polygon %A Markenscoff, Xanthippi %A Papadimitriou, Christos %D April 1987 %X It has been shown by Baker, Fortune and Grosse that any two-dimensional polygonal object can be prehended stably with three fingers, so that its weight (along the third dimension) is balanced.
In this paper we also show that form closure of a polygonal object can be achieved by four fingers (previous proofs were not complete). We formulate and solve the problem of finding the optimum stable grip or form closure of any given polygon. For stable grip it is most natural to minimize the forces needed to balance through friction the object's weight along the third dimension. For form closure, we minimize the worst-case forces needed to balance any unit force acting on the center of gravity of the object. The mathematical techniques used in the two instances are an interesting mix of optimization and Euclidean geometry. Our results lead to algorithms for the efficient computation of the optimum grip in each case. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1153/CS-TR-87-1153.pdf %R CS-TR-87-1154 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Programming and Problem-Solving Seminar %A Rokicki, T. G. %A Knuth, D. E. %D April 1987 %X This report contains edited transcripts of the discussions held in Stanford's course CS204, Problem Seminar, during winter quarter 1987. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms were touched on during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." The present report is the seventh in a series of such transcripts, continuing the tradition established in STAN-CS-77-606 (Michael J. Clancy, 1977), STAN-CS-79-707 (Chris Van Wyk, 1979), STAN-CS-81-863 (Allan A. Miller, 1981), STAN-CS-83-989 (Joseph S. Weening, 1983), STAN-CS-83-990 (John D. Hobby, 1983), and STAN-CS-85-1055 (Ramsey W. Haddad, 1985). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1154/CS-TR-87-1154.pdf %R CS-TR-87-1155 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Experiments in Automatic Theorem Proving %A Bellin, G. %A Ketonen, J. %D December 1986 %X The experiments described in this report are proofs in EKL of properties of different LISP programs operating on different representations of the same mathematical structures -- finite permutations. EKL is an interactive proof checker based upon the language of higher order logic, higher order unification and a decision procedure for a fragment of first order logic. The following questions are asked: What representations of mathematical structure and facts are better suited for formalization and also applicable to several interesting situations? What methods and strategies will make it possible to prove automatically an extensive body of mathematical knowledge? Can higher order logic be conveniently applied in the proof of elementary facts? The fact (*) that finite permutations form a group is proved from the axioms of arithmetic and elementary set theory, via the "Pigeon Hole Principle" (PHP). Permutations are represented (1) as association lists and (2) as lists of numbers. In representation (2), operations on permutations are represented (2.1) using predicates and (2.2) using functions. Proofs of (*) using the different representations are compared. The results and conclusions include the following. Methods to control the rewriting process and to replace logic inference by high order rewriting are presented. PHP is formulated as a second order statement which is then easily applied to (1) and (2).
This demonstrates the value of abstract, higher order formulation of facts for application in different contexts. A case is given in which representation of properties of programs by predicates may be more convenient than by functions. Evidence is given that convenient organization of proofs into lemmata is essential for large-scale computer-aided theorem proving. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1155/CS-TR-87-1155.pdf %R CS-TR-87-1156 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Dynamic Tree Expression Problem %A Mayr, Ernst W. %D May 1987 %X Presented is a uniform method for obtaining efficient parallel algorithms for a rather large class of problems. The method is based on a logic programming model, and it derives its efficiency from fast parallel routines for the evaluation of expression trees. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1156/CS-TR-87-1156.pdf %R CS-TR-87-1157 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Network Implementation of the DTEP Algorithm %A Mayr, E. W. %A Plaxton, C. G. %D May 1987 %X The dynamic tree expression problem (DTEP) was defined in [Ma87]. In this paper, efficient implementations of the DTEP algorithm are developed for the hypercube, butterfly, perfect shuffle and multidimensional mesh of trees families of networks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1157/CS-TR-87-1157.pdf %R CS-TR-87-1159 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Muir: A Tool for Language Design %A Winograd, Terry A. %D May 1987 %X Muir is a language design environment, intended for use in creating and experimenting with languages such as programming languages, specification languages, grammar formalisms, and logical notations. It provides facilities for a language designer to create a language specification, which controls the behavior of generic language manipulating tools typically found in a language-specific environment, such as structure editors, interactive interfaces, storage management and attribute analysis. It is oriented towards use with evolving languages, providing for mixed structures (combining different versions), semi-automated updating of structures from one language version to another, and incremental language specification. A new hierarchical grammar formalism serves as the framework for language specification, with multiple presentation formalisms and a unified interactive environment based on an extended notion of edit operations. A prototype version is operating and has been tested on a small number of languages. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1159/CS-TR-87-1159.pdf %R CS-TR-87-1160 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Strategic Computing Research and the Universities %A Winograd, Terry A. %D March 1987 %X The Strategic Computing Initiative offers the potential of new research funds for university computer science departments. As with all funds, they bring benefits and can have unwanted strings attached. In the case of military funding, the web of attached strings can be subtle and confusing. The goal of this paper is to delineate some of these entanglements and perhaps provide some guidance for loosening and eliminating them.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1160/CS-TR-87-1160.pdf %R CS-TR-87-1166 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel Execution of OPS5 in QLISP %A Okuno, H. G. %A Gupta, A. %D June 1987 %X Production systems (or rule-based systems) are widely used for the development of expert systems. To speed up the execution of production systems, a number of different approaches are being taken, a majority of them being based on the use of parallelism. In this paper, we explore the issues involved in the parallel implementation of OPS5 (a widely used production-system language) in QLISP (a parallel dialect of Lisp proposed by John McCarthy and Richard Gabriel). This paper shows that QLISP can easily encode most sources of parallelism in OPS5 that have been previously discussed in the literature. This is significant because the OPS5 interpreter is the first large program to be encoded in QLISP, and as a result, this is the first practical demonstration of the expressive power of QLISP. The paper also lists the most commonly used QLISP constructs in the parallel implementation (and the contexts in which they are used), which serve as a hint to the QLISP implementor about what to optimize. Also discussed is the exploitation of speculative parallelism in RHS-evaluation for OPS5. This has not been previously discussed in the literature. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1166/CS-TR-87-1166.pdf %R CS-TR-87-1168 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Representing Control Knowledge as Abstract Tasks and Metarules %A Clancey, W. J. %A Bock, C. %D April 1985 %X A poorly designed knowledge base can be as cryptic as an arbitrary program and just as difficult to maintain. Representing inference procedures abstractly, separately from domain facts and relations, makes the design more transparent and explainable. The combination of abstract procedures and a relational language for organizing domain knowledge provides a generic framework for constructing knowledge bases for related problems in other domains and also provides a useful starting point for studying the nature of strategies. In HERACLES, inference procedures are represented as abstract metarules, expressed in a form of the predicate calculus, organized and controlled as rule sets. A compiler converts the rules into Lisp code and allows domain relations to be encoded as arbitrary data structures for efficiency. Examples are given of the explanation and teaching capabilities afforded by this representation. Different perspectives for understanding HERACLES' inference procedure and how it defines knowledge bases are discussed in some detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1168/CS-TR-87-1168.pdf %R CS-TR-87-1170 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Viewing Knowledge Bases as Qualitative Models %A Clancey, William J. %D May 1986 %X The concept of a qualitative model provides a unifying perspective for understanding how expert systems differ from conventional programs. Knowledge bases contain qualitative models of systems in the world, that is, primarily non-numeric descriptions that provide a basis for explaining and predicting behavior and formulating action plans.
The prevalent view that a qualitative model must be a simulation, to the exclusion of prototypic and behavioral descriptions, has fragmented our field, so that we have failed to usefully synthesize what we have learned about modeling processes. For example, our ideas about "scoring functions" and "causal network traversal," developed apart from a modeling perspective, have obscured the inherent explanatory nature of diagnosis. While knowledge engineering has greatly benefited from the study of human experts as a means of informing model construction, overemphasis on modeling the expert's knowledge has detracted from the primary objective of modeling a system in the world. Placing AI squarely in the evolutionary line of teleologic and topologic modeling, this talk argues that the study of network representations has established a foundation for a science and engineering of qualitative models. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1170/CS-TR-87-1170.pdf %R CS-TR-87-1173 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Review of Winograd and Flores' Understanding Computers and Cognition %A Clancey, William J. %D July 1986 %X AI researchers and cognitive scientists commonly believe that thinking involves manipulating representations. Thinking involves search, inference, and making choices. This is how we model reasoning, and what goes on in the brain is similar. Winograd and Flores present a radically different view. They claim that our knowledge is not represented in the brain at all, but rather consists of an unformalized shared background, from which we articulate representations in order to cope with new situations. In contrast, computer programs contain only pre-selected objects and properties, and there is no basis for moving beyond this initial formalization when breakdown occurs. Winograd and Flores provide convincing arguments with examples familiar to most AI researchers. However, they significantly understate the role of representation in mediating intelligent behavior, specifically in the process of reflection, when representations are generated prior to physical action. Furthermore, they do not consider the practical benefits of expert systems and the extent of what can be accomplished. Nevertheless, the book is crisp and stimulating. It should make AI researchers more cautious about what they are doing, more aware of the nature of formalization, and more open to alternative views. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1173/CS-TR-87-1173.pdf %R CS-TR-87-1174 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Intelligent Tutoring Systems: A Tutorial Survey %A Clancey, William J. %D September 1986 %X This survey of Intelligent Tutoring Systems is based on a tutorial originally presented by John Seely Brown, Richard R. Burton (Xerox - PARC, USA) and William J. Clancey at the National Conference on AI (AAAI) in Austin, TX in August, 1984. The survey describes the components of tutoring systems, different teaching scenarios, and their relation to a theory of instruction.
The underlying pedagogical approach is to make latent knowledge manifest, which the research accomplishes by different forms of qualitative modeling: simulating physical processes; simulating expert problem-solving, including strategies for monitoring and controlling problem solving (metacognition); modeling the plans behind procedural behavior; and forcing articulation of model inconsistencies through the Socratic method of instruction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1174/CS-TR-87-1174.pdf %R CS-TR-87-1177 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Log Files: An Extended File Service Exploiting Write-Once Storage %A Finlayson, R. S. %A Cheriton, D. R. %D August 1987 %X A log service provides efficient storage and retrieval of data that is written sequentially (append-only) and not subsequently modified. Application programs and subsystems use log services for recovery, to record security audit trails, and for performance monitoring. Ideally, a log service should accommodate very large, long-lived logs, and provide efficient retrieval and low space overhead. In this paper, we describe the design and implementation of the Clio log service. Clio provides the abstraction of log files: readable, append-only files that are accessed in the same way as conventional files. The underlying storage medium is required only to be append-only; more general types of write access are not necessary. We show how log files can be implemented efficiently and robustly on top of such storage media -- in particular, write-once optical disk. In addition, we describe a general application software storage architecture that makes use of log files. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1177/CS-TR-87-1177.pdf %R CS-TR-87-1175 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using and Evaluating Differential Modeling in Intelligent Tutoring and Apprentice Learning Systems %A Wilkins, D. C. %D January 1987 %X A powerful approach to debugging and refining the knowledge structures of a problem solving agent is to differentially model the actions of the agent against a gold standard. This paper proposes a framework for exploring the inherent limitations of such an approach when a problem solver is differentially modeled against an expert system. A procedure is described for determining a performance upper bound for debugging via differential modeling, called the synthetic agent method. The synthetic agent method systematically explores the space of near miss training instances and expresses the limits of debugging in terms of the knowledge representation and control language constructs of the expert system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1175/CS-TR-87-1175.pdf %R CS-TR-87-1178 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Dynamic, Cut-Through Communications Protocol with Multicast %A Byrd, G. T. %A Nakano, R. %A Delagi, B. A. %D September 1987 %X This paper describes a protocol to support point-to-point interprocessor communications with multicast. Dynamic, cut-through routing with local flow control is used to provide a high-throughput, low-latency communications path between processors. In addition, multicast transmissions are available, in which copies of a packet are sent to multiple destinations using common resources as much as possible.
Special packet terminators and selective buffering are introduced to avoid deadlock during multicasts. A simulated implementation of the protocol is also described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1178/CS-TR-87-1178.pdf %R CS-TR-87-1180 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography; Department of Computer Science Technical Reports, 1963-1988 %A Na, Taleen M. %D January 1988 %X This report lists, in chronological order, all reports published by the Stanford Computer Science Department (CSD) since 1963. Each report is identified by CSD number, author's name, title, number of pages, and date. If a given report is available from the department at the time of the Bibliography's printing, its price is listed. For convenience, an author index, ordering information, codes, and alternative sources are also included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1180/CS-TR-87-1180.pdf %R CS-TR-87-1181 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Debugging Rule Sets When Reasoning Under Uncertainty %A Wilkins, D. C. %A Buchanan, B. G. %D May 1987 %X Heuristic inference rules with a measure of strength less than certainty have an unusual property: better individual rules do not necessarily lead to a better overall rule set. All less-than-certain rules contribute evidence towards erroneous conclusions for some problem instances, and the distribution of these erroneous conclusions over the instances is not necessarily related to individual rule quality. This has important consequences for automatic machine learning of rules, since rule selection is usually based on measures of quality of individual rules. In this paper, we explain why the most obvious and intuitively reasonable solution to this problem, incremental modification and deletion of rules responsible for wrong conclusions a la Teiresias, is not always appropriate. In our experience, it usually fails to converge to an optimal set of rules. Given a set of heuristic rules, we explain why the best rule set should be considered to be the element of the power set of rules that yields a global minimum error with respect to generating erroneous positive and negative conclusions. This selection process is modeled as a bipartite graph minimization problem and shown to be NP-complete. A solution method is described, the Antidote Algorithm, that performs a model-directed search of the rule space. On an example from medical diagnosis, the Antidote Algorithm significantly reduced the number of misdiagnoses when applied to a rule set generated from 104 training instances. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1181/CS-TR-87-1181.pdf %R CS-TR-87-1182 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Knowledge Base Refinement by Monitoring Abstract Control Knowledge %A Wilkins, D. C. %A Buchanan, B. G. %D August 1987 %X An explicit representation of the problem solving method of an expert system shell as abstract control knowledge provides a powerful foundation for learning. This paper describes the abstract control knowledge of the Heracles expert system shell for heuristic classification problems, and describes how the Odysseus apprenticeship learning program uses this representation to automate "end-game" knowledge acquisition. Particular emphasis is given to showing how abstract control knowledge facilitates the use of underlying domain theories by a learning program.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1182/CS-TR-87-1182.pdf %R CS-TR-87-1183 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Knowledge Engineer as Student: Metacognitive bases for asking good questions %A Clancey, W. J. %D January 1987 %X Knowledge engineers are efficient, active learners. They systematically approach domains and acquire knowledge to solve routine, practical problems. By modeling their methods, we may develop a basis for teaching other students how to direct their own learning. In particular, a knowledge engineer is good at detecting gaps in a knowledge base and asking focused questions to improve an expert system's performance. This ability stems from domain-general knowledge about: problem-solving procedures, the categorization of routine problem-solving knowledge, and domain and task differences. This paper studies these different forms of metaknowledge, and illustrates their incorporation in an intelligent tutoring system. A model of learning is presented that describes how the knowledge engineer detects problem-solving failures and tracks them back to gaps in domain knowledge, which are then reformulated as questions to ask a teacher. We describe how this model of active learning is being developed and tested in a knowledge acquisition program for an expert system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1183/CS-TR-87-1183.pdf %R CS-TR-87-1184 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Firmware Approach to Fast Lisp Interpreter %A Okuno, H. %A Osato, N. %A Takeuchi, I. %D September 1987 %X The approach of speeding up a Lisp interpreter by implementing it in firmware seems promising. A microcoded Lisp interpreter shows good performance for very simple benchmarks, while it often fails to provide good performance for larger benchmarks and applications unless speedup techniques are devised for it. This was the case for the TAO/ELIS system. This paper describes various techniques devised for the TAO/ELIS system in order to speed up the interpreter of the TAO language implemented on the ELIS Lisp machine. The techniques include data type dispatch, variable access, function call and so on. TAO is not only upward compatible with Common Lisp, but also incorporates logic programming, object-oriented programming and Fortran/C-like programming into Lisp programming. TAO also provides concurrent programming and supports multiple users (up to eight users). The TAO interpreter for these programming paradigms is coded fully in microcode. In spite of its rich functionality, the speed of interpreted code in TAO is comparable to that of compiled code on commercial Lisp machines. Furthermore, the speeds of interpreted code for the same program written in the various programming paradigms of TAO do not differ much. This speed balance is very important for the user. Another outstanding feature of the TAO/ELIS system is its firmware development environment. The Micro Assembler and Linker are written in TAO, which enables the user to use the capability of TAO in microcode. Since the debugging tools are also written in mini-Lisp, many new tools were developed in parallel with the debugging of the microcode. This high-level approach to firmware development environments is very important for high productivity of development.
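The data-type dispatch technique mentioned in the abstract above is easy to illustrate outside microcode: the interpreter's inner loop replaces a cascade of type tests with a single table lookup keyed on the object's type. The toy Python sketch below is only a schematic analogue of what the report describes for TAO/ELIS firmware; all names in it are hypothetical.

    # Toy analogue of data-type dispatch in an interpreter inner loop.
    # HANDLERS maps a datum's type to its evaluation routine, so lisp_eval
    # performs one table lookup instead of a chain of type tests.

    def eval_number(expr, env):
        return expr                      # numbers are self-evaluating

    def eval_symbol(expr, env):
        return env[expr]                 # symbols name values in the environment

    def eval_list(expr, env):
        fn, *args = expr                 # (operator arg1 arg2 ...)
        return lisp_eval(fn, env)(*[lisp_eval(a, env) for a in args])

    HANDLERS = {int: eval_number, float: eval_number,
                str: eval_symbol, list: eval_list}

    def lisp_eval(expr, env):
        return HANDLERS[type(expr)](expr, env)   # the single dispatch point

    env = {"+": lambda a, b: a + b}
    print(lisp_eval(["+", 1, ["+", 2, 3]], env))  # prints 6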
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1184/CS-TR-87-1184.pdf %R CS-TR-87-1185 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Blazenet: A Photonic Implementable Wide-Area Network %A Haas, Z. %A Cheriton, D. R. %D December 1987 %X High-performance wide-area networks are required to interconnect clusters of computers connected by local area and metropolitan area networks. Optical fiber technology provides long distance channels in the multi-gigabit per second range. The challenge is to provide switching nodes that handle these data rates with minimum delay, and at a reasonable cost. In this paper, we describe a packet switching network, christened Blazenet, that provides low delay and has minimal memory requirements. It can be extended to support multicast and priority delivery. Such a network can revolutionize the opportunities for distributed command and control, information and resource sharing, real-time conferencing, and wide-area parallel computation, to mention but a few applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1185/CS-TR-87-1185.pdf %R CS-TR-87-1186 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Hierarchy of Temporal Properties %A Manna, Zohar %A Pnueli, Amir %D October 1987 %X We propose a classification of temporal properties into a hierarchy which refines the known safety-liveness classification of properties. The new classification recognizes the classes of safety, guarantee, persistence, fairness, and hyper-fairness. The classification suggested here is based on the different ways a property of finite computations can be extended into a property of infinite computations. For properties that are expressible by temporal logic and predicate automata, we provide a syntactic characterization of the formulae and automata that specify properties in the different classes. We consider the verification of properties over a given program, and provide a unique proof principle for each class. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1186/CS-TR-87-1186.pdf %R CS-TR-87-1188 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Experiments with a Knowledge-Based System on a Multiprocessor %A Nakano, Russell %A Minami, Masafumi %D October 1987 %X This paper documents the results we obtained and the lessons we learned in the design, implementation, and execution of a simulated real-time application on a simulated parallel processor. Specifically, our parallel program ran 100 times faster on a 100-processor multiprocessor. The machine architecture is a distributed-memory multiprocessor. The target machine consists of 10 to 1000 processors, but because of simulator limitations, we ran simulations of machines consisting of 1 to 100 processors. Each processor is a computer with its own local memory, executing an independent instruction stream. There is no global shared memory; all processes communicate by message passing. The target programming environment, called Lamina, encourages a programming style that stresses performance gains through problem decomposition, allowing many processors to be brought to bear on a problem. The key is to distribute the processing load over replicated objects, and to increase throughput by building pipelined sequences of objects that handle stages of problem solving. We focused on a knowledge-based application that simulates real-time understanding of radar tracks, called Airtrac.
This paper describes a portion of the Airtrac application implemented in Lamina and a set of experiments that we performed. We confirmed the following hypotheses: 1) Performance of our concurrent program improves with additional processors, and thereby attains a significant level of speedup. 2) Correctness of our concurrent program can be maintained despite a high degree of problem decomposition and highly overloaded input data conditions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1188/CS-TR-87-1188.pdf %R CS-TR-87-1189 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Instrumented Architectural Simulation %A Delagi, Bruce A. %A Saraiya, Nakul %A Nishimura, Sayuri %A Byrd, Greg %D November 1987 %X Simulation of systems at an architectural level can offer an effective way to study critical design choices if (1) the performance of the simulator is adequate to examine designs executing significant code bodies -- not just toy problems or small application fragments, (2) the details of the simulation include the critical details of the design, (3) the view of the design presented by the simulator instrumentation leads to useful insights on the problems with the design, and (4) there is enough flexibility in the simulation system so that the asking of unplanned questions is not suppressed by the weight of the mechanics involved in making changes either in the design or its measurement. A simulation system with these goals is described together with the approach to its implementation. Its application to the study of a particular class of multiprocessor hardware system architectures is illustrated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/87/1189/CS-TR-87-1189.pdf %R CS-TR-88-1195 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Lower Bound for Radio Broadcast %A Bar-Noy, A. %A Linial, N. %A Peleg, D. %D February 1988 %X A radio network is a synchronous network of processors that communicate by transmitting messages to their neighbors, where a processor receives a message in a given step if and only if it is silent in this step and precisely one of its neighbors transmits. In this paper we prove the existence of a family of radius-2 networks on n vertices for which any broadcast schedule requires at least Omega((log n / log log n)^2) rounds of transmissions. This almost matches an upper bound of O(log^2 n) rounds for networks of radius 2 proved earlier by Bar-Yehuda, Goldreich, and Itai. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1195/CS-TR-88-1195.pdf %R CS-TR-88-1196 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Motion Planning with Uncertainty: The Preimage Backchaining Approach %A Latombe, Jean-Claude %D March 1988 %X This paper addresses the problem of planning robot motions in the presence of uncertainty. It explores an approach to this problem, known as the preimage backchaining approach. Basically, a preimage is a region in space, such that if the robot executes a certain motion command from within this region, it is guaranteed to attain a target and to terminate in it. Preimage backchaining consists of reasoning backward from a given goal region, by computing preimages of the goal, and then recursively preimages of the preimages, until some preimages include the initial region where it is known at planning time that the robot will be before executing the motion plan.
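The backchaining loop described in the abstract above is simple once a preimage operator is available; the difficulty lives in computing preimages under control and sensing uncertainty. The following toy Python sketch is a hypothetical one-dimensional illustration, not Latombe's general formulation: a region is an interval, each command moves the robot by a chosen distance with error bounded by EPS, and the preimage of an interval therefore shifts it back and shrinks it by EPS on each side.

    # Toy one-dimensional preimage backchaining. A region is an interval
    # (lo, hi); the command "move by d" has control error up to +/- EPS, so
    # the preimage of a target interval is the set of start points from which
    # the command is guaranteed to terminate inside the target.

    EPS = 0.1

    def preimage(d, region):
        lo, hi = region
        return (lo - d + EPS, hi - d - EPS)

    def contains(region, sub):
        return region[0] <= sub[0] and sub[1] <= region[1]

    def backchain(goal, start, commands, max_depth=5):
        """Backward breadth-first search from the goal region; returns a
        guaranteed command sequence, or None if none is found in depth."""
        frontier = [(goal, [])]
        for _ in range(max_depth):
            successors = []
            for region, plan in frontier:
                for d in commands:
                    pre = preimage(d, region)
                    if pre[0] >= pre[1]:        # empty preimage: dead end
                        continue
                    if contains(pre, start):    # sequence is guaranteed to work
                        return [d] + plan
                    successors.append((pre, [d] + plan))
            frontier = successors
        return None

    # Two moves of 1.1 reach the goal interval from anywhere in the start region.
    print(backchain(goal=(2.0, 2.6), start=(0.0, 0.1), commands=[1.1, 0.5]))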
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1196/CS-TR-88-1196.pdf %R CS-TR-88-1197 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The VMP Multiprocessor: Initial Experience, Refinements and Performance Evaluation %A Cheriton, D. R. %A Gupta, A. %A Boyle, P. D. %A Goosen, H. A. %D March 1988 %X VMP is an experimental multiprocessor being developed at Stanford University, suitable for high-performance workstations and server machines. Its primary novelty lies in the use of software management of the per-processor caches and the design decisions in the cache and bus that make this approach feasible. The design and some uniprocessor trace-driven simulations indicating its performance have been reported previously. In this paper, we present our initial experience with the VMP design based on a running prototype as well as various refinements to the design. Performance evaluation is based both on measurement of actual execution as well as trace-driven simulation of multiprocessor executions from the Mach operating system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1197/CS-TR-88-1197.pdf %R CS-TR-88-1199 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Projections of Vector Addition System Reachability Sets are Semilinear %A Buning, H. K. %A Lettman, T. %A Mayr, E. W. %D March 1988 %X The reachability sets of Vector Addition Systems of dimension six or more can be non-semilinear. This may be one reason why the inclusion problem (as well as the equality problem) for reachability sets of vector addition systems in general is undecidable, even though the reachability problem itself is known to be decidable. We show that any one-dimensional projection of the reachability set of an arbitrary vector addition system is semilinear, and hence, "simple". %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1199/CS-TR-88-1199.pdf %R CS-TR-88-1200 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel Approximation Algorithms for Bin Packing %A Anderson, R. J. %A Mayr, E. W. %A Warmuth, M. K. %D March 1988 %X We study the parallel complexity of polynomial heuristics for the bin packing problem. We show that some well-known (and simple) methods like first-fit-decreasing are P-complete, and it is hence very unlikely that they can be efficiently parallelized. On the other hand, we exhibit an optimal NC algorithm that achieves the same performance bound as does FFD. Finally, we discuss parallelization of polynomial approximation algorithms for bin packing based on discretization. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1200/CS-TR-88-1200.pdf %R CS-TR-88-1203 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the Semantics of Temporal Logic Programming %A Baudinet, Marianne %D June 1988 %X Recently, several researchers have suggested directly exploiting in a programming language temporal logic's ability to describe changing worlds. The resulting languages are quite diverse. They are based on different subsets of temporal logic and use a variety of execution mechanisms. So far, little attention has been paid to the formal semantics of these languages. In this paper, we study the semantics of an instance of temporal logic programming, namely, the TEMPLOG language defined by Abadi and Manna. We first give declarative semantics for TEMPLOG, in model-theoretic and in fixpoint terms.
Then, we study its operational semantics and prove soundness and completeness theorems for the temporal-resolution proof method underlying its execution mechanism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1203/CS-TR-88-1203.pdf %R CS-TR-88-1206 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Parallel Lisp Simulator %A Weening, Joseph S. %D May 1988 %X CSIM is a simulator for parallel Lisp, based on a continuation passing interpreter. It models a shared-memory multiprocessor executing programs written in Common Lisp, extended with several primitives for creating and controlling processes. This paper describes the structure of the simulator, measures its performance, and gives an example of its use with a parallel Lisp program. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1206/CS-TR-88-1206.pdf %R CS-TR-88-1208 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Toetjes %A Feder, Tomas %D June 1988 %X A number is secretly chosen from the interval [0, 1], and n players try to guess this number. When the secret number is revealed, the player with the closest guess wins. We describe an optimal strategy for a version of this game. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1208/CS-TR-88-1208.pdf %R CS-TR-88-1209 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Combinatorial Algorithms for the Generalized Circulation Problem %A Goldberg, A. V. %A Plotkin, S. A. %A Tardos, E. %D June 1988 %X We consider a generalization of the maximum flow problem in which the amounts of flow entering and leaving an arc are linearly related. More precisely, if x(e) units of flow enter an arc e, x(e) * gamma(e) units arrive at the other end. For instance, nodes of the graph can correspond to different currencies, with the multipliers being the exchange rates. We require conservation of flow at every node except a given source node. The goal is to maximize the amount of flow excess at the source. This problem is a special case of linear programming, and therefore can be solved in polynomial time. In this paper we present the first polynomial time combinatorial algorithms for this problem. The algorithms are simple and intuitive. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1209/CS-TR-88-1209.pdf %R CS-TR-88-1211 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sublinear-Time Parallel Algorithms %A Goldberg, A. V. %A Plotkin, S. A. %A Vaidya, P. M. %D June 1988 %X This paper presents the first sublinear-time deterministic parallel algorithms for bipartite matching and several related problems, including maximal node-disjoint paths, depth-first search, and flows in zero-one networks. Our results are based on a better understanding of the combinatorial structure of the above problems, which leads to new algorithmic techniques. In particular, we show how to use maximal matching to extend, in parallel, a current set of node-disjoint paths and how to take advantage of the parallelism that arises when a large number of nodes are "active" during an execution of a push/relabel network flow algorithm. We also show how to apply our techniques to design parallel algorithms for the weighted versions of the above problems.
In particular, we present sublinear-time deterministic parallel algorithms for finding a minimum-weight bipartite matching and for finding a minimum-cost flow in a network with zero-one capacities, if the weights are polynomially bounded integers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1211/CS-TR-88-1211.pdf %R CS-TR-88-1210 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T String-Functional Semantics for Formal Verification of Synchronous Circuits %A Bronstein, Alexandre %A Talcott, Carolyn L. %D June 1988 %X A new functional semantics is proposed for synchronous circuits, as a basis for reasoning formally about that class of hardware systems. Technically, we define an extensional semantics with monotonic length-preserving functions on finite strings, and an intensional semantics based on functionals on those functions. As support for the semantics we prove the equivalence of the extensional semantics with a simple operational semantics, as well as a characterization of circuits which obey the "every loop is clocked" design rule. Also, we develop the foundations in complete detail both to increase confidence in the theory, and as a prerequisite to its future mechanization. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1210/CS-TR-88-1210.pdf %R CS-TR-88-1214 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multicast Routing in Internetworks and Extended LANs %A Deering, Stephen E. %D June 1988 %X Multicasting is used within local-area networks to make distributed applications more robust and more efficient. The growing need to distribute applications across multiple, interconnected networks, and the increasing availability of high-performance, high-capacity switching nodes and networks, lead us to consider providing LAN-style multicasting across an internetwork. In this paper, we propose extensions to two common internetwork routing algorithms -- distance-vector routing and link-state routing -- to support low-delay datagram multicasting. We also suggest modifications to the single-spanning-tree routing algorithm, commonly used by link-layer bridges, to reduce the costs of multicasting in large extended LANs. Finally, we show how different link-layer and network-layer multicast routing algorithms can be combined hierarchically to support multicasting across large, heterogeneous internetworks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1214/CS-TR-88-1214.pdf %R CS-TR-88-1218 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Square Meshes are not always Optimal %A Bar-Noy, Amotz %A Peleg, David %D August 1988 %X In this paper we consider mesh connected computers with multiple buses, providing broadcast facilities along rows and columns. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1218/CS-TR-88-1218.pdf %R CS-TR-88-1225 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel Approximation Algorithms %A Mayr, Ernst W. %D September 1988 %X Many problems of great practical importance are hard to solve computationally, at least if exact solutions are required. We survey a number of (NP- or P-complete) problems for which fast parallel approximation algorithms are known: the 0-1 knapsack problem, bin packing, the minimal makespan problem, the list scheduling problem, greedy scheduling, and the high density subgraph problem.
Algorithms for these problems are presented highlighting the underlying techniques and principles, and several types of parallel approximation schemes are exhibited. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1225/CS-TR-88-1225.pdf %R CS-TR-88-1226 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Making Intelligent Systems Adaptive %A Hayes-Roth, Barbara %D October 1988 %X Contemporary intelligent systems are isolated problem-solvers. They accept particular classes of problems, reason about them, perhaps request additional information, and eventually produce solutions. By contrast, human beings and other intelligent animals continuously adapt to the demands and opportunities presented by a dynamic environment. Adaptation plays a critical role in everyday behaviors, such as conducting a conversation, as well as in sophisticated professional behaviors, such as monitoring critically ill medical patients. To make intelligent systems similarly adaptive, we must augment their reasoning capabilities with capabilities for perception and action. Equally important, we must endow them with an attentional mechanism to allocate their limited computational resources among competing perceptions, actions, and cognitions, in real time. In this paper, we discuss functional objectives for "adaptive intelligent systems," an architecture designed to achieve those objectives, and our continuing study of both objectives and architecture in the context of particular tasks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1226/CS-TR-88-1226.pdf %R CS-TR-88-1227 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finding Minimum-Cost Flows by Double-Scaling %A Ahuja, R. K. %A Goldberg, A. V. %A Orlin, J. B. %A Tarjan, R. E. %D October 1988 %X Several researchers have recently developed new techniques that give fast algorithms for the minimum-cost flow problem. In this paper we combine several of these techniques to yield an algorithm running in O(nm log log U log(nC)) time on networks with n vertices, m edges, maximum arc capacity U, and maximum arc cost magnitude C. The major techniques used are the capacity-scaling approach of Edmonds and Karp, the excess-scaling approach of Ahuja and Orlin, the cost-scaling approach of Goldberg and Tarjan, and the dynamic tree data structure of Sleator and Tarjan. For nonsparse graphs with large maximum arc capacity, we obtain a similar but slightly better bound. We also obtain a slightly better bound for the (noncapacitated) transportation problem. In addition, we discuss a capacity-bounding approach to the minimum-cost flow problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1227/CS-TR-88-1227.pdf %R CS-TR-88-1228 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Parallel Algorithm for Finding a Blocking Flow in an Acyclic Network %A Goldberg, A. V. %A Tarjan, R. E. %D November 1988 %X We propose a simple parallel algorithm for finding a blocking flow in an acyclic network. On an n-vertex, m-arc network, our algorithm runs in O(n log n) time and O(nm) space using an m-processor EREW PRAM. A consequence of our algorithm is an O(n^2 (log n) log(nC))-time, O(nm)-space, m-processor algorithm for the minimum-cost circulation problem, on a network with integer arc capacities of magnitude at most C.
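For contrast with the PRAM algorithm in the abstract above, the same subproblem has a standard sequential solution: a blocking flow leaves every source-sink path with at least one saturated arc. The Python sketch below (a toy assumed graph format, not the paper's parallel algorithm) finds one by repeated depth-first search, pruning arcs that are saturated or can no longer reach the sink.

    # Sequential sketch of a blocking flow in an acyclic network. The graph
    # format is an assumption: cap[u][v] > 0 gives the capacity of arc
    # u -> v; cap is consumed (arcs are deleted) as the search proceeds.

    def blocking_flow(cap, s, t):
        flow = {u: dict.fromkeys(arcs, 0) for u, arcs in cap.items()}

        def dfs(u, pushed):
            if u == t:
                return pushed
            for v in list(cap.get(u, {})):
                residual = cap[u][v] - flow[u][v]
                if residual > 0:
                    got = dfs(v, min(pushed, residual))
                    if got > 0:
                        flow[u][v] += got
                        return got
                # Arc is saturated or cannot reach t: prune it so no later
                # search retries it (this keeps the total work bounded).
                del cap[u][v]
            return 0

        while dfs(s, float('inf')) > 0:
            pass
        return flow

    cap = {'s': {'a': 2, 'b': 1}, 'a': {'t': 1}, 'b': {'t': 2}}
    print(blocking_flow(cap, 's', 't'))   # saturates arcs a->t and s->b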
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1228/CS-TR-88-1228.pdf %R CS-TR-88-1229 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Distributing Intelligence within an Individual %A Hayes-Roth, B. %A Hewett, M. %A Washington, R. %A Hewett, R. %A Seiver, A. %D November 1988 %X Distributed artificial intelligence (DAI) refers to systems in which decentralized, cooperative agents work synergistically to perform a task. Alternative specifications of DAI resemble particular biological or social systems, such as teams, contract nets, or societies. Our DAI model resembles a single individual comprising multiple loosely coupled agents for perception, action, and cognition functions. We demonstrate the DAI individual in the Guardian system for intensive-care monitoring and argue that it is more appropriate than the prevalent team model for a large class of similar applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1229/CS-TR-88-1229.pdf %R CS-TR-88-1230 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Specification and Verification of Concurrent Programs by For-All Automata %A Manna, Zohar %A Pnueli, Amir %D November 1988 %X For-all automata are non-deterministic finite-state automata over infinite sequences. They differ from conventional automata in that a sequence is accepted if all runs of the automaton over the sequence are accepting. These automata are suggested as a formalism for the specification and verification of temporal properties of concurrent programs. It is shown that they are as expressive as extended temporal logic (ETL), and, in some cases, provide a more compact representation of properties than temporal logic. A structured diagram notation is suggested for the graphical representation of these automata. A single sound and complete proof rule is presented for proving that all computations of a program have the property specified by a for-all automaton. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1230/CS-TR-88-1230.pdf %R CS-TR-88-1233 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Procedural Semantics for Well Founded Negation in Logic Programs %A Ross, Kenneth A. %D November 1988 %X We introduce global SLS-resolution, a procedural semantics for well-founded negation as defined by Van Gelder, Ross and Schlipf. Global SLS-resolution extends Przymusinski's SLS-resolution, and may be applied to all programs, whether locally stratified or not. Global SLS-resolution is defined in terms of global trees, a new data structure representing the dependence of goals on derived negative subgoals. We prove that global SLS-resolution is sound with respect to the well-founded semantics, and complete for non-floundering queries. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1233/CS-TR-88-1233.pdf %R CS-TR-88-1234 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Average Number of Stable Matchings %A Pittel, Boris %D December 1988 %X The probable behavior of an instance of size n of the stable marriage problem, chosen uniformly at random, is studied.
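As background for the stable marriage abstract above: the classical Gale-Shapley proposal algorithm produces one stable matching for any instance, while the report asks how many stable matchings a random instance has on average. The following minimal Python sketch is the standard textbook algorithm, not taken from the report.

    # Textbook Gale-Shapley proposal algorithm: produces the man-optimal
    # stable matching. Preference lists are assumed complete (permutations
    # of the opposite side), which guarantees termination.

    def gale_shapley(men_prefs, women_prefs):
        rank = {w: {m: r for r, m in enumerate(prefs)}
                for w, prefs in women_prefs.items()}
        next_choice = dict.fromkeys(men_prefs, 0)   # next index to propose to
        engaged = {}                                # woman -> man
        free = list(men_prefs)
        while free:
            m = free.pop()
            w = men_prefs[m][next_choice[m]]
            next_choice[m] += 1
            if w not in engaged:
                engaged[w] = m                      # w accepts her first proposal
            elif rank[w][m] < rank[w][engaged[w]]:
                free.append(engaged[w])             # w trades up; old partner freed
                engaged[w] = m
            else:
                free.append(m)                      # w rejects m; he proposes again
        return engaged

    men = {"a": ["x", "y"], "b": ["y", "x"]}
    women = {"x": ["a", "b"], "y": ["b", "a"]}
    print(gale_shapley(men, women))   # {'y': 'b', 'x': 'a'}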
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1234/CS-TR-88-1234.pdf %R CS-TR-88-1236 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Time for Action: On the Relation between Time, Knowledge, and Action %A Shoham, Yoav %D December 1988 %X We consider the role played by the concept of action in AI. We first briefly summarize the advantages and limitations of past approaches to taking the concept as primitive, as embodied in the situation calculus and dynamic logic. We also briefly summarize the alternative, namely adopting a temporal framework, and point out its complementary advantages and limitations. We then propose a framework that retains the advantages of both viewpoints, and that ties the notion of action closely to that of knowledge. Specifically, we propose starting with the notion of time lines, and defining the notion of action as the ability to make certain choices among sets of time lines. Our definitions shed new light on the connection between time, action, knowledge and ignorance, choice-making, feasibility, and simultaneous reasoning about the same events at different levels of detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1236/CS-TR-88-1236.pdf %R CS-TR-88-1237 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Belief as Defeasible Knowledge %A Shoham, Yoav %A Moses, Yoram %D December 1988 %X We investigate the relation between the notions of knowledge and belief. Contrary to the well-known slogan about knowledge being "justified, true belief," we propose that belief be viewed as defeasible knowledge. Specifically, we offer a definition of belief as knowledge-relative-to-assumptions, and tie the definition to the notion of nonmonotonicity. Our definition has several advantages. First, it is short. Second, we do not need to add anything to the logic of knowledge: the right properties of belief fall out of the definition and the properties of knowledge. Third, the connection between knowledge and belief is derived from one fundamental principle, which is more enlightening than a collection of arbitrary-seeming axioms relating the two notions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1237/CS-TR-88-1237.pdf %R CS-TR-88-1239 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sorting, Minimal Feedback Sets and Hamilton Paths in Tournaments %A Bar-Noy, Amotz %A Naor, Joseph %D December 1988 %X We present a general method for translating sorting by comparisons algorithms to algorithms that compute a Hamilton path in a tournament. The translation is based on the relation between minimal feedback sets and Hamilton paths in tournaments. We prove that there is a one-to-one correspondence between the set of minimal feedback sets and the set of Hamilton paths. In the comparison model, all the tradeoffs for sorting between the number of processors and the number of rounds hold when a Hamilton path is computed. For the CRCW model, with O(n) processors, we show the following: (i) two paths in a tournament can be merged in O(log log n) time (Valiant's algorithm); (ii) a Hamilton path can be computed in O(log n) time (Cole's algorithm). This improves a previous algorithm for computing a Hamilton path.
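The sorting-to-Hamilton-path translation in the abstract above can be made concrete with the classic binary-insertion construction: because a tournament orients every pair of vertices, a new vertex can always be inserted into an existing path by binary search, exactly as in insertion sorting. The Python sketch below illustrates the general method (it is not the authors' CRCW construction); beats(u, v), the arc-direction predicate, is an assumed O(1) oracle.

    # Binary insertion sort adapted to a tournament. beats(u, v) is True iff
    # the arc between u and v points from u to v; since every pair is
    # oriented, an insertion point for each new vertex always exists.

    def hamilton_path(vertices, beats):
        path = []
        for v in vertices:
            if not path:
                path.append(v)
            elif beats(v, path[0]):
                path.insert(0, v)                  # v -> head: prepend
            elif beats(path[-1], v):
                path.append(v)                     # tail -> v: append
            else:
                # Invariant: path[lo] -> v and v -> path[hi].
                lo, hi = 0, len(path) - 1
                while hi - lo > 1:
                    mid = (lo + hi) // 2
                    if beats(path[mid], v):
                        lo = mid
                    else:
                        hi = mid
                path.insert(hi, v)                 # path[lo] -> v -> path[hi]
        return path

    # A transitive tournament (arc from smaller to larger) yields sorted order.
    print(hamilton_path([3, 1, 4, 2], lambda u, v: u < v))   # [1, 2, 3, 4]

Each insertion costs O(log k) comparisons, mirroring how a sorting-by-comparisons algorithm carries over to Hamilton path construction.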
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1239/CS-TR-88-1239.pdf %R CS-TR-88-1240 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Separating the EREW and CREW PRAM Models %A Gafni, E. %A Naor, J. %A Ragde, P. %D December 1988 %X In [6], Snir proposed the Selection Problem (searching in a sorted table) to show that the CREW PRAM is strictly more powerful than the EREW PRAM. This problem defines a partial function, that is, one that is defined only on a restricted set of inputs. Recognizing whether an arbitrary input belongs to this restricted set is hard for both CREW and EREW PRAMs. The existence of a total function that exhibits the power of the CREW model over the EREW model was an open problem. Here we solve this problem by generalizing the Selection Problem to a Decision Tree problem which is defined on a full domain and to which Snir's lower bound applies. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/88/1240/CS-TR-88-1240.pdf %R CS-TR-85-1035 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T RESIDUE: a deductive approach to design synthesis %A Finger, J. J. %A Genesereth, Michael R. %D January 1985 %X We present a new approach to deductive design synthesis, the Residue Approach, in which designs are represented as sets of constraints. Previous approaches, such as PROLOG [18] or the work of Manna and Waldinger [11], express designs as bindings on single terms. We give a complete and sound procedure for finding sets of propositions constituting a legal design. The size of the search space of the procedure and the advantages and disadvantages of the Residue Approach are analysed. In particular we show how Residue can avoid backtracking caused by making design decisions of overly coarse granularity. In contrast, it is awkward for the single term approaches to do the same. In addition we give a rule for constraint propagation in deductive synthesis, and show its use in pruning the design space. Finally, Residue is related to other work, in particular, to Default Logic [16] and to Assumption-Based Truth Maintenance [1]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1035/CS-TR-85-1035.pdf %R CS-TR-85-1036 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Learning control heuristics in BB1 %A Hayes-Roth, Barbara %A Hewett, Michael %D January 1985 %X BB1, a blackboard system building architecture, ameliorates the knowledge acquisition bottleneck with generic knowledge sources that learn control heuristics. Some learning knowledge sources replace the knowledge engineer, interacting directly with domain experts. Others operate autonomously. The paper presents a trace from the illustrative knowledge source, Understand-Preference, running in PROTEAN, a blackboard system for elucidating protein structure. Understand-Preference is triggered when a domain expert overrides one of BB1's scheduling recommendations. It identifies and encodes the heuristic underlying the expert's scheduling decision. The trace illustrates how learning knowledge sources exploit BB1's rich representation of domain and control knowledge, actions, and results. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1036/CS-TR-85-1036.pdf %R CS-TR-85-1037 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Expressiveness and language choice %A MacKinlay, Jock %A Genesereth, Michael R.
%D January 1985 %X Specialized languages are often more appropriate than general languages for expressing certain information. However, specialized languages must be chosen carefully because they do not allow all sets of facts to be stated. This paper considers the problems associated with choosing among specialized languages. Methods are presented for determining that a set of facts is expressible in a language, for identifying when additional facts are stated accidentally, and for choosing among languages that can express a set of facts. This research is being used to build a system that automatically chooses an appropriate graphical language to present a given set of facts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1037/CS-TR-85-1037.pdf %R CS-TR-85-1038 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Uniform hashing is optimal %A Yao, Andrew C. %D January 1985 %X It was conjectured by J. Ullman that uniform hashing is optimal in its expected retrieval cost among all open-address hashing schemes (JACM 19 (1972), 569-575). In this paper we show that, for any open-address hashing scheme, the expected cost of retrieving a record from a large table which is alpha-fraction full is at least (1/alpha) log(1/(1 - alpha)) + o(1). This proves Ullman's conjecture to be true in the asymptotic sense. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1038/CS-TR-85-1038.pdf %R CS-TR-85-1043 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Constructing a perfect matching is in Random NC %A Karp, Richard M. %A Upfal, Eli %A Wigderson, Avi %D March 1985 %X We show that the problem of constructing a perfect matching in a graph is in the complexity class Random NC: i.e., the problem is solvable in polylog time by a randomized parallel algorithm using a polynomial-bounded number of processors. We also show that several related problems lie in Random NC. These include: (i) Constructing a perfect matching of maximum weight in a graph whose edge weights are given in unary notation; (ii) Constructing a maximum-cardinality matching; (iii) Constructing a matching covering a set of vertices of maximum weight in a graph whose vertex weights are given in binary; (iv) Constructing a maximum s-t flow in a directed graph whose edge weights are given in unary. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1043/CS-TR-85-1043.pdf %R CS-TR-85-1047 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Smooth, easy to compute interpolating splines %A Hobby, John D. %D January 1985 %X We present a system of interpolating splines with first and approximate second order geometric continuity. The curves are easily computed in linear time by solving a system of linear equations without the need to resort to any kind of successive approximation scheme. Emphasis is placed on the need to find aesthetically pleasing curves in a wide range of circumstances; favorable results are obtained even when the knots are very unequally spaced or widely separated. The curves are invariant under scaling, rotation, and reflection, and the effects of a local change fall off exponentially as one moves away from the disturbed knot. Approximate second order continuity is achieved by using a linear "mock curvature" function in place of the actual endpoint curvature for each spline segment and choosing tangent directions at knots so as to equalize these.
This avoids extraneous solutions and other forms of undesirable behavior without seriously compromising the quality of the results. The actual spline segments can come from any family of curves whose endpoint curvatures can be suitably approximated, but we propose a specific family of parametric cubics. There is freedom to allow tangent directions and "tension" parameters to be specified at knots, and special "curl" parameters may be given for additional control near the endpoints of open curves. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1047/CS-TR-85-1047.pdf %R CS-TR-85-1048 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some constructions for order-theoretic models of concurrency %A Pratt, Vaughan %D March 1985 %X We give "tight" and "loose" constructions suitable for specifying processes represented as sets of pomsets (partially ordered multisets). The tight construction is suitable for specifying "primitive" processes; it introduces the dual notions of concurrence and orthocurrence. The loose construction specifies a process in terms of a net of communicating subprocesses; it introduces the notion of a utilization embedding a process in a net. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1048/CS-TR-85-1048.pdf %R CS-TR-85-1049 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The pomset model of parallel processes: unifying the temporal and the spatial %A Pratt, Vaughan %D January 1985 %X No abstract. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1049/CS-TR-85-1049.pdf %R CS-TR-85-1050 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast sequential algorithms to find shuffle-minimizing and shortest paths in a shuffle-exchange network %A Hershberger, John %A Mayr, Ernst %D May 1985 %X This paper analyzes the problem of finding shortest paths and shuffle-minimizing paths in an n-node shuffle-exchange network, where n = $2^m$. Such paths have the properties needed by the Valiant-Brebner permutation routing algorithm, unlike the trivial (m - 1)-shuffle paths usually used for shuffle-exchange routing. The Valiant-Brebner algorithm requires n simultaneous route computations, one for each packet to be routed, which can be done in parallel. We give fast sequential algorithms for both problems we consider. Restricting the shortest path problem to allow only paths that use fewer than m shuffles provides intuition applicable to the general problem. Linear-time pattern matching techniques solve part of the restricted problem; as a consequence, a path using fewest shuffles can be found in O(m) time, which is optimal up to a constant factor. The shortest path problem is equivalent to the problem of finding the Hamming distances between a bitstring and all shifted instances of another. An application of the fast Fourier transform solves this problem and the shortest path problem in O(m log m) time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1050/CS-TR-85-1050.pdf %R CS-TR-85-1051 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Special relations in automated deduction %A Manna, Zohar %A Waldinger, Richard %D May 1985 %X Two deduction rules are introduced to give streamlined treatment to relations of special importance in an automated theorem-proving system.
These rules, the relation replacement and relation matching rules, generalize to an arbitrary binary relation the paramodulation and E-resolution rules, respectively, for equality, and may operate within a nonclausal or clausal system. The new rules depend on an extension of the notion of polarity to apply to subterms as well as to subsentences, with respect to a given binary relation. The rules allow us to eliminate troublesome axioms, such as transitivity and monotonicity, from the system; proofs are shorter and more comprehensible, and the search space is correspondingly deflated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1051/CS-TR-85-1051.pdf %R CS-TR-85-1053 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Transaction classification to survive a network partition %A Apers, Peter M. G. %A Wiederhold, Gio %D August 1984 %X When comparing centralized and distributed databases, one of the advantages of distributed databases is said to be the greater availability of the data. Availability is defined as having access to the stored data for update and retrieval, even when some distributed sites are down due to hardware failures. We investigate the functioning of a distributed database whose underlying computer network may fail. A classification of transactions is given to allow an implementation of different levels of operability. Some transactions can be guaranteed to commit in spite of a network partition, while others have to wait until the state of potential transactions in the other partitions is also known. An algorithm is given to compute a classification. Based on histories of transactions kept in the different partitions, a merge of histories is computed, generating the new values for some data items when communication is re-established. The algorithm to compute the merge of the histories makes use of a knowledge base containing knowledge about the transactions, to decide whether to merge, delete, or delay a transaction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1053/CS-TR-85-1053.pdf %R CS-TR-85-1055 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Programming and Problem-Solving Seminar %A Haddad, Ramsey W. %A Knuth, Donald E. %D June 1985 %X This report contains edited transcripts of the discussions held in Stanford's course CS204, Problem Seminar, during winter quarter 1985. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms were touched on during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." The present report is the sixth in a series of such transcripts, continuing the tradition established in STAN-CS-77-606 (Michael J. Clancy, 1977), STAN-CS-79-707 (Chris Van Wyk, 1979), STAN-CS-81-863 (Allan A. Miller, 1981), STAN-CS-83-989 (Joseph S. Weening, 1983), STAN-CS-83-990 (John D. Hobby, 1983). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1055/CS-TR-85-1055.pdf %R CS-TR-85-1056 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Nonclausal temporal deduction %A Abadi, Martin %A Manna, Zohar %D June 1985 %X We present a proof system for propositional temporal logic. This system is based on nonclausal resolution; proofs are natural and generally short.
Its extension to first-order temporal logic is considered. Two variants of the system are described. The first one is for a logic with $\Box$ ("always"), $\Diamond$ ("sometime"), and $\bigcirc$ ("next"). The second variant is an extension of the first one to a logic with the additional operators U ("until") and P ("precedes"). Each of these variants is proved complete. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1056/CS-TR-85-1056.pdf %R CS-TR-85-1058 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Host groups: a multicast extension for datagram internetworks %A Cheriton, David R. %A Deering, Stephen E. %D July 1985 %X The extensive use of local networks is beginning to drive requirements for internetwork facilities that connect these local networks. In particular, the availability of multicast addressing in many local networks and its use by sophisticated distributed applications motivates providing multicast across internetworks. In this paper, we propose a model of service for multicast in an internetwork, describe how this service can be used, and describe aspects of its implementation, including how it would fit into one existing internetwork architecture, namely the US DoD Internet Architecture. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1058/CS-TR-85-1058.pdf %R CS-TR-85-1062 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computer Science comprehensive examinations, 1981/82-1984/85 %A Keller, Arthur M. %D August 1985 %X This report is a collection of the eight comprehensive examinations from Winter 1982 through Spring 1985 prepared by the faculty and students of Stanford's Computer Science Department together with solutions to the problems posed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1062/CS-TR-85-1062.pdf %R CS-TR-85-1065 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Review of Sowa's "Conceptual Structures" %A Clancey, William J. %D March 1985 %X "Conceptual Structures" is a bold, provocative synthesis of logic, linguistics, and Artificial Intelligence research. At the very least, Sowa has provided a clean, well-grounded notation for knowledge representation that many researchers will want to emulate and build upon. At its best, Sowa's notation and proofs hint at what a future Principia Mathematica of knowledge and reasoning may look like. No other AI text achieves so much in breadth, style, and mathematical precision. This is a book that everyone in AI and cognitive science should know about, and that experienced researchers will profit from studying in some detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1065/CS-TR-85-1065.pdf %R CS-TR-85-1066 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Heuristic classification %A Clancey, William J. %D June 1985 %X A broad range of well-structured problems--embracing forms of diagnosis, catalog selection, and skeletal planning--are solved in "expert systems" by the method of heuristic classification. These programs have a characteristic inference structure that systematically relates data to a pre-enumerated set of solutions by abstraction, heuristic association, and refinement. In contrast with previous descriptions of classification reasoning, particularly in psychology, this analysis emphasizes the role of a heuristic in routine problem solving as a non-hierarchical, direct association between concepts. 
In contrast with other descriptions of expert systems, this analysis specifies the knowledge needed to solve a problem, independent of its representation in a particular computer language. The heuristic classification problem-solving model provides a useful framework for characterizing kinds of problems, for designing representation tools, and for understanding non-classification (constructive) problem-solving methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1066/CS-TR-85-1066.pdf %R CS-TR-85-1067 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Acquiring, representing, and evaluating a competence model of diagnostic strategy %A Clancey, William J. %D August 1985 %X NEOMYCIN is a computer program that models one physician's diagnostic reasoning within a limited area of medicine. NEOMYCIN's diagnostic procedure is represented in a well-structured way, separately from the domain knowledge it operates upon. We are testing the hypothesis that such a procedure can be used to simulate both expert problem-solving behavior and a good teacher's explanations of reasoning. The model is acquired by protocol analysis, using a framework that separates an expert's causal explanations of evidence from his descriptions of knowledge relations and strategies. The model is represented by a procedural network of goals and rules that are stated in terms of the effect the problem solver is trying to have on his evolving model of the world. The model is evaluated for sufficiency by testing it in different settings requiring expertise, such as providing advice and teaching. The model is evaluated for plausibility by arguing that the constraints implicit in the diagnostic procedure are imposed by the task domain and human computational capability. This paper discusses NEOMYCIN's diagnostic procedure in detail, viewing it as a memory aid, as a set of operators, as proceduralized constraints, and as a grammar. This study provides new perspectives on the nature of "knowledge compilation" and how an expert-teacher's explanations relate to a working program. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1067/CS-TR-85-1067.pdf %R CS-TR-85-1068 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T GUIDON-WATCH: a graphic interface for viewing a knowledge-based system %A Richer, Mark H. %A Clancey, William J. %D August 1985 %X This paper describes GUIDON-WATCH, a graphic interface that uses multiple windows and a mouse to allow a student to browse a knowledge base and view reasoning processes during diagnostic problem solving. Methods are presented for providing multiple views of hierarchical structures, overlaying results of a search process on top of static structures to make the strategy visible, and graphically expressing evidence relations between findings and hypotheses. This work demonstrates the advantages of stating a diagnostic search procedure in a well-structured, rule-based language, separate from domain knowledge. A number of issues in software design are also considered, including the automatic management of a multiple-window display. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1068/CS-TR-85-1068.pdf %R CS-TR-85-1072 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Programming and Problem-Solving Seminar %A Mayr, Ernst W. %A Anderson, Richard J. %A Hochschild, Peter H. 
%D October 1985 %X This report contains edited transcripts of the discussions held in Stanford's course CS204, Problem Seminar, during winter quarter 1984. The course topics consisted of five problems coming from different areas of computer science. The problems were discussed in class and solved and programmed by the students working in teams. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1072/CS-TR-85-1072.pdf %R CS-TR-85-1074 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Designing new typefaces with Metafont %A Southall, Richard %D September 1985 %X The report discusses issues associated with the symbolic design of new typefaces using programming languages such as Metafont. A consistent terminology for the subject area is presented. A schema for type production systems is described that lays stress on the importance of communication between the designer of a new typeface and the producer of the fonts that embody it. The methods used for the design of printers' type from the sixteenth century to the present day are surveyed in the context of this schema. The differences in the designer's task in symbolic and graphic design modes are discussed. A new typeface design made with Metafont is presented, and the usefulness of Metafont as a tool for making new designs is considered. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1074/CS-TR-85-1074.pdf %R CS-TR-85-1075 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Expert systems: working systems and the research literature %A Buchanan, Bruce G. %D October 1985 %X Expert systems are the subject of considerable interest among persons in AI research or applications. There is no single definition of an expert system, and thus no precisely defined set of programs or set of literature references that represent work on expert systems. This report provides (a) a characterization of what an expert system is, (b) a list of expert systems in routine use or field testing, and (c) a list of relevant references in the AI research literature. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1075/CS-TR-85-1075.pdf %R CS-TR-85-1076 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some approaches to knowledge acquisition %A Buchanan, Bruce G. %D July 1985 %X Knowledge acquisition is not a single, monolithic problem for AI. There are many ways to approach the topic in order to understand issues and design useful tools for constructing knowledge-based systems. Several of those approaches are being explored in the Knowledge Systems Laboratory (KSL) at Stanford. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1076/CS-TR-85-1076.pdf %R CS-TR-85-1079 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two processor scheduling is in NC %A Helmbold, David %A Mayr, Ernst %D October 1985 %X We present a parallel algorithm for the two processor scheduling problem. This algorithm constructs an optimal schedule for unit execution time task systems with arbitrary precedence constraints using a polynomial number of processors and running in time polylog in the size of the input. Whereas previous parallel solutions for the problem made extensive use of randomization, our algorithm is completely deterministic and based on an interesting decomposition technique. It is also of independent relevance for two more reasons.
It provides another example of the apparent difference in complexity between decision and search problems in the context of fast parallel computation, and it gives an NC-algorithm for the matching problem in certain restricted cases. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1079/CS-TR-85-1079.pdf %R CS-TR-85-1084 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Taliesin: a distributed bulletin board system %A Edighoffer, Judy L. %A Lantz, Keith A. %D September 1985 %X This paper describes a computer bulletin board facility intended to support replicated bulletin boards on a network that may frequently be in a state of partition. The two major design issues covered are the choice of a name space and the choice of replication algorithms. The impact of the name space on communication costs is explained. A special purpose replication algorithm that provides high availability and response despite network partition is introduced. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1084/CS-TR-85-1084.pdf %R CS-TR-85-1086 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards a universal directory service %A Lantz, Keith A. %A Edighoffer, Judy L. %A Hitson, Bruce L. %D August 1985 %X Directory services and name servers have been discussed and implemented for a number of distributed systems. Most have been tightly interwoven with the particular distributed systems of which they are a part; a few are more general in nature. In this paper we survey recent work in this area and discuss the advantages and disadvantages of a number of approaches. From this, we are able to extract some fundamental requirements of a naming system capable of handling a wide variety of object types in a heterogeneous environment. We outline how these requirements can be met in a universal directory service. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1086/CS-TR-85-1086.pdf %R CS-TR-85-1087 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Preemptable remote execution facilities for the V-system %A Theimer, Marvin M. %A Lantz, Keith A. %A Cheriton, David R. %D September 1985 %X A remote execution facility allows a user of a workstation-based distributed system to offload programs onto idle workstations, thereby providing the user with access to computational resources beyond that provided by his personal workstation. In this paper, we describe the design and performance of the remote execution facility in the V distributed system, as well as several implementation issues of interest. In particular, we focus on network transparency of the execution environment, preemption and migration of remotely executed programs, and avoidance of residual dependencies on the original host. We argue that preemptable remote execution allows idle workstations to be used as a "pool of processors" without interfering with use by their owners and without significant overhead for the normal execution of programs. In general, we conclude that the cost of providing preemption is modest compared to providing a similar amount of computation service by dedicated "computation engines". %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1087/CS-TR-85-1087.pdf %R CS-TR-85-1080 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The compleat guide to MRS %A Russell, Stuart %D June 1985 %X MRS is a logic programming system with extensive meta-level facilities.
As such it can be used to implement virtually all kinds of artificial intelligence applications in a wide variety of architectures. This guide is intended to be a comprehensive text and reference for MRS. It also attempts to explain the foundations of the logic programming approach from the ground up, and it is hoped that it will thus provide access, even for the uninitiated, to all the benefits of AI methods. The only prerequisites for understanding MRS are a passing acquaintance with LISP and an open mind. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/85/1080/CS-TR-85-1080.pdf %R CS-TR-86-1085 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography of Computer Science reports, 1963-1986 %A Berg, Kathryn A. %A Marashian, Taleen %D June 1986 %X This report lists, in chronological order, all reports published by the Stanford Computer Science Department since 1963. Each report is identified by a Computer Science number, author's name, title, National Technical Information Service (NTIS) retrieval number (i.e., AD-XXXXXX), date, and number of pages. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1085/CS-TR-86-1085.pdf %R CS-TR-86-1093 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A general reading list for artificial intelligence %A Subramanian, Devika %A Buchanan, Bruce G. %D December 1985 %X This reading list is based on the syllabus for the course CS229b offered in Winter 1985. This course was an intensive 10 week survey intended as preparation for the 1984-85 qualifying examination in Artificial Intelligence at Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1093/CS-TR-86-1093.pdf %R CS-TR-86-1094 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Expert systems: working systems and the research literature %A Buchanan, Bruce G. %D December 1985 %X Many expert systems have moved out of development laboratories into field test and routine use. About sixty such systems are listed. Academic research laboratories are contributing manpower to fuel the commercial development of AI. But the quantity of AI research may decline as a result unless the applied systems are experimented with and analyzed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1094/CS-TR-86-1094.pdf %R CS-TR-86-1095 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A torture test for METAFONT %A Knuth, Donald E. %D January 1986 %X Programs that claim to be implementations of METAFONT84 are supposed to be able to process the test routine contained in this report, producing the outputs contained in this report. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1095/CS-TR-86-1095.pdf %R CS-TR-86-1096 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A model-theoretic approach to updating logical databases %A Wilkins, Marianne Winslett %D January 1986 %X We show that it is natural to extend the concept of database updates to encompass databases with incomplete information. Our approach embeds the incomplete database and the updates in the language of first-order logic, which we believe has strong advantages over relational tables and traditional data manipulation languages in the incomplete information situation. We present semantics for our update operators, and also provide an efficient algorithm to perform the operations.
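The update semantics in the Wilkins abstract above can be made concrete with a possible-worlds reading: an incomplete database stands for the set of complete databases consistent with what is known, and an update is applied to each such world independently. The Python sketch below is purely illustrative, under that assumption; the toy representation and function names are hypothetical and do not come from the report.

    # Hypothetical toy representation: an incomplete database is a set of
    # possible worlds, each world a frozenset of ground atoms.
    def insert(worlds, atom):
        # After inserting a ground atom, it holds in every possible world.
        return {frozenset(w | {atom}) for w in worlds}

    def delete(worlds, atom):
        # After deleting a ground atom, it holds in no possible world.
        return {frozenset(w - {atom}) for w in worlds}

    # We know p(a); we are unsure whether q(b) holds.
    db = {frozenset({"p(a)"}), frozenset({"p(a)", "q(b)"})}
    db = insert(db, "r(c)")
    db = delete(db, "q(b)")
    assert all("r(c)" in w and "q(b)" not in w for w in db)
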
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1096/CS-TR-86-1096.pdf %R CS-TR-86-1097 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T TEXware. %A Knuth, Donald E. %D April 1986 %X This report documents four TEX utility programs: The POOLtype processor (Version 2, July 1983), The TFtoPL processor (Version 2.5, September 1985), The PLtoTF processor (Version 2.3, August 1985), and The DVItype processor (Version 2.8, August 1984). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1097/CS-TR-86-1097.pdf %R CS-TR-86-1100 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Modal theorem proving %A Abadi, Martin %A Manna, Zohar %D May 1986 %X We describe resolution proof systems for several modal logics. First we present the propositional versions of the systems and prove their completeness. The first-order resolution rule for classical logic is then modified to handle quantifiers directly. This new resolution rule enables us to extend our propositional systems to complete first-order systems. The systems for the different modal logics are closely related. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1100/CS-TR-86-1100.pdf %R CS-TR-86-1102 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Data independent recursion in deductive databases %A Naughton, Jeffrey F. %D February 1986 %X Some recursive definitions in deductive database systems can be replaced by equivalent nonrecursive definitions. In this paper we give a linear-time algorithm that detects many such definitions, and specify a useful subset of recursive definitions for which the algorithm is complete. It is unlikely that our algorithm can be extended significantly, as recent results by Gaifman [5] and Vardi [19] show that the general problem is undecidable. We consider two types of initialization of the recursively defined relation: arbitrary initialization, and initialization by a given nonrecursive rule. This extends earlier work by Minker and Nicolas [10], and by Ioannidis [7], and is related to bounded tableau results by Sagiv [14]. Even if there is no equivalent nonrecursive definition, a modification of our algorithm can be used to optimize a recursive definition and improve the efficiency of the compiled evaluation algorithms proposed in Henschen and Naqvi [6] and in Bancilhon et al. [3]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1102/CS-TR-86-1102.pdf %R CS-TR-86-1104 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T CS229b: a survey of AI classnotes for Winter 84-85 %A Subramanian, Devika %D April 1986 %X These are the compiled classnotes for the course CS229b offered in Winter 1985. This course was an intensive 10 week survey intended as preparation for the 1984-85 qualifying examination in Artificial Intelligence at Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1104/CS-TR-86-1104.pdf %R CS-TR-86-1105 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Software-controlled caches in the VMP multiprocessor %A Cheriton, David R. %A Slavenburg, Gert A. %A Boyle, Patrick D. %D March 1986 %X VMP is an experimental multiprocessor that follows the familiar basic design of multiple processors, each with a cache, connected by a shared bus to global memory.
Each processor has a synchronous, virtually addressed, single master connection to its cache, providing very high memory bandwidth. An unusually large cache page size and fast sequential memory copy hardware make it feasible for cache misses to be handled in software, analogously to the handling of virtual memory page faults. Hardware support for cache consistency is limited to a simple state machine that monitors the bus and interrupts the processor when a cache consistency action is required. In this paper, we show how the VMP design provides the high memory bandwidth required by modern high-performance processors with a minimum of hardware complexity and cost. We also describe simple solutions to the consistency problems associated with virtually addressed caches. Simulation results indicate that the design achieves good performance providing data contention is not excessive. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1105/CS-TR-86-1105.pdf %R CS-TR-86-1106 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A timely resolution %A Abadi, Martin %A Manna, Zohar %D April 1986 %X We present a novel proof system R for First-order (Linear) Temporal Logic. This system extends our Propositional Temporal Logic proof system ([AM]). The system R is based on nonclausal resolution; proofs are natural and generally short. Special quantifier rules, unification techniques, and a resolution rule are introduced. We relate R to other proof systems for First-order Temporal Logic and discuss completeness issues. The system R should be useful as a tool for such tasks as verification of concurrent programs and reasoning about hardware devices. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1106/CS-TR-86-1106.pdf %R CS-TR-86-1109 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A proof editor for propositional temporal logic %A Casley, Ross %D May 1986 %X This report describes PTL, a program to assist in constructing proofs in propositional logic extended by the operators $\Box$ ("always"), $\Diamond$ ("eventually") and $\bigcirc$ ("at the next step"). This is called propositional temporal logic and is one of two systems of logic presented by Abadi and Manna in [1]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1109/CS-TR-86-1109.pdf %R CS-TR-86-1114 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimizing function-free recursive inference rules %A Naughton, Jeffrey F. %D May 1986 %X Recursive inference rules arise in recursive definitions in logic programming systems and in database systems with recursive query languages. Let D be a recursive definition of a relation t. We say that D is minimal if for any predicate p in a recursive rule in D, p must appear in a recursive rule in any definition of t. We show that testing for minimality is in general undecidable. However, we do present an efficient algorithm for a useful class of recursive rules, and show how to use it to transform a recursive definition to a minimal recursive definition. Evaluating the optimized definition will avoid redundant computation without the overhead of caching intermediate results and run-time checking for duplicate goals.
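The kind of redundancy targeted in the Naughton abstract above can be seen in a transitive-closure definition that adds the quadratic rule t(X,Y) :- t(X,Z), t(Z,Y) on top of the usual linear rule t(X,Y) :- e(X,Z), t(Z,Y): the extra rule changes the work done but not the relation computed, so a minimality test would let an optimizer drop it. The Python sketch below (a hypothetical encoding, not the paper's algorithm) checks this by naive bottom-up evaluation.

    def closure(edges, use_redundant_rule):
        t = set(edges)  # rule 1: t(X,Y) :- e(X,Y)
        while True:
            # rule 2 (linear): t(X,Y) :- e(X,Z), t(Z,Y)
            new = {(x, w) for (x, y) in edges for (z, w) in t if y == z}
            if use_redundant_rule:
                # rule 3 (redundant): t(X,Y) :- t(X,Z), t(Z,Y)
                new |= {(x, w) for (x, y) in t for (z, w) in t if y == z}
            if new <= t:
                return t
            t |= new

    edges = {(1, 2), (2, 3), (3, 4)}
    assert closure(edges, False) == closure(edges, True)
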
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1114/CS-TR-86-1114.pdf %R CS-TR-86-1115 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The heuristic refinement method for deriving solution structures of proteins %A Buchanan, Bruce G. %A Hayes-Roth, Barbara %A Lichtarge, Olivier %A Altman, Russ %A Brinkley, James %A Hewett, Michael %A Cornelius, Craig %A Duncan, Bruce %A Jardetzky, Oleg %D March 1986 %X A new method is presented for determining structures of proteins in solution. The method uses constraints inferred from analytic data to successively refine both the locations for parts of the structure and the levels of detail for describing those parts. A computer program, called PROTEAN, which encodes this method, has been partially implemented and was used to derive structures for the lac-repressor headpiece from experimental data. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1115/CS-TR-86-1115.pdf %R CS-TR-86-1116 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Inductive knowledge acquisition for rule-based expert systems %A Fu, Li-Min %A Buchanan, Bruce G. %D October 1985 %X The RL program was developed to construct knowledge bases automatically in rule-based expert systems, primarily in MYCIN-like evidence-gathering systems where there is uncertainty about data as well as the strength of inference, and where rules are chained together or combined to infer complex hypotheses. This program comprises three subprograms: (1) a subprogram that learns confirming rules, which employs a heuristic search commencing with the most general hypothesis; (2) a subprogram that learns rules containing intermediate concepts, which exploits the old partial knowledge or defines new intermediate concepts, based on heuristics; (3) a subprogram that learns disconfirming rules, which is based on the expert's heuristics to formulate disconfirming rules. RL's validity has been demonstrated with a performance program that diagnoses the causes of jaundice. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1116/CS-TR-86-1116.pdf %R CS-TR-86-1117 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Empirical Study of Distributed Application Performance %A Lantz, Keith %A Nowicki, William %A Theimer, Marvin %D October 1985 %X A major reason for the rarity of distributed applications, despite the proliferation of networks, is the sensitivity of their performance to various aspects of the network environment. We demonstrate that distributed applications can run faster than local ones, using common hardware. We also show that the primary factors affecting performance are, in approximate order of importance: speed of the user's workstation, speed of the remote host (if any), and the high-level (above the transport level) protocols used. In particular, the use of batching, pipelining, and structure in high-level protocols reduces the degradation often experienced between networks of different bandwidth. Less significant, but still noticeable improvements result from proper design and implementation of underlying transport protocols. Ultimately, with proper application of these techniques, network bandwidth is rendered virtually insignificant.
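The claim in the Lantz, Nowicki and Theimer abstract that protocol structure can outweigh raw bandwidth is easy to see in a back-of-the-envelope latency model: when each request costs one network round trip, round-trip latency dominates, and batching several requests per round trip recovers most of the loss. The Python fragment below is such a model; the numbers are invented for illustration and are not measurements from the paper.

    import math

    def total_time(n, rtt, per_request, batch):
        # Each batch of requests costs one round trip, plus the
        # per-request processing work for all n requests.
        return math.ceil(n / batch) * rtt + n * per_request

    # 1000 requests, 50 ms round-trip time, 1 ms of server work each:
    print(total_time(1000, 0.050, 0.001, batch=1))    # ~51 s unbatched
    print(total_time(1000, 0.050, 0.001, batch=100))  # ~1.5 s batched
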
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1117/CS-TR-86-1117.pdf %R CS-TR-86-1118 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Applications of Parallel Scheduling to Perfect Graphs %A Helmbold, David %A Mayr, Ernst %D June 1986 %X We combine a parallel algorithm for the two processor scheduling problem, which runs in polylog time on a polynomial number of processors, with an algorithm to find transitive orientations of graphs where they exist. Both algorithms together solve the maximum clique problem and the minimum coloring problem for comparability graphs, and the maximum matching problem for co-comparability graphs. These parallel algorithms can also be used to identify permutation graphs and interval graphs, important subclasses of perfect graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1118/CS-TR-86-1118.pdf %R CS-TR-86-1119 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Simulation of an Ultracomputer with Several 'Hot Spots' %A Rosenblum, David S. %A Mayr, Ernst W. %D June 1986 %X This report describes the design and results of a time-driven simulation of an Ultracomputer-like multiprocessor in the presence of several "hot spots," or memory modules which are frequent targets of requests. Such hot spots exist during execution of parallel programs in which the several threads of control synchronize through manipulation of a small number of shared variables. The simulated system comprises N processing elements (PEs) and N shared memory modules connected by an N x N buffered, packet-switched Omega network. The simulator was designed to accept a wide variety of system configurations to enable observation of many different characteristics of the system behavior. We present the results of four experiments: (1) General simulation of several 16-PE configurations, (2) General simulation of several 512-PE configurations, (3) Determination of critical queue lengths as a function of request rate (512 PEs) and (4) Determination of the effect of hot spot spacing on system performance (512 PEs). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1119/CS-TR-86-1119.pdf %R CS-TR-86-1123 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Blackboard Systems %A Nii, H. Penny %D June 1986 %X The first blackboard system was the HEARSAY-II speech understanding system that evolved between 1971 and 1976. Subsequently, many systems have been built that have similar system organizations and run-time behavior. The objectives of this document are: (1) to define what is meant by "blackboard systems," and (2) to show the richness and diversity of blackboard system designs. The article begins with a discussion of the underlying concept behind all blackboard systems, the blackboard model of problem solving. In order to bridge the gap between a model and working systems, the blackboard framework, an extension of the basic blackboard model, is introduced, including a detailed description of the model's components and their behavior. A model does not come into existence on its own and is usually an abstraction of many examples. In section 2, the history of ideas is traced and the designs of some application systems that helped shape the blackboard model are detailed. We then describe and contrast existing blackboard systems. Blackboard systems can generally be divided into two categories: application and skeletal systems.
In application systems the blackboard system components are integrated with the domain knowledge required to solve the problem at hand. Skeletal systems are devoid of domain knowledge, and, as the name implies, consist of the essential system components from which application systems can be built by the addition of knowledge and the specification of control (i.e. meta-knowledge). Application systems will be discussed in Section 3, and skeletal systems will be discussed elsewhere. In Section 3.6, we summarize the features of the application systems and in Section 4 present the author's perspective on the utility of the blackboard approach to problem solving and knowledge engineering. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1123/CS-TR-86-1123.pdf %R CS-TR-86-1124 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient Matching Algorithms for the SOAR/OPS5 Production System %A Scales, Daniel J. %D June 1986 %X SOAR is a problem-solving and learning program intended to exhibit intelligent behavior. SOAR uses a modified form of the OPS5 production system for storage of and access to long-term knowledge. As with most programs which use production systems, the match phase of SOAR's production system dominates all other SOAR processing. This paper describes the results of an investigation of various ways of speeding up the matching process in SOAR through additions and changes to the OPS5 matching algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1124/CS-TR-86-1124.pdf %R CS-TR-86-1125 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The CAOS System %A Schoen, Eric %D March 1986 %X The CAOS system is a framework designed to facilitate the development of highly concurrent real-time signal interpretation applications. It explores the potential of multiprocessor architectures to improve the performance of expert systems in the domain of signal interpretation. CAOS is implemented in Lisp on a (simulated) collection of processor-memory sites, linked by a high-speed communications subsystem. The "virtual machine" on which it depends provides remote evaluation and packet-based message exchange between processes, using virtual circuits known as streams. To this presentation layer, CAOS adds (1) a flexible process scheduler, and (2) an object-centered notion of agents, dynamically-instantiable entities which model interpreted signal features. This report documents the principal ideas, programming model, and implementation of CAOS. A model of real-time signal interpretation, based on replicated "abstraction" pipelines, is presented. For some applications, this model offers a means by which large numbers of processors may be utilized without introducing synchronization-necessitated software bottlenecks. The report concludes with a description of the performance of a large CAOS application over various sizes of multiprocessor configurations. Lessons about problem decomposition grain size, global problem solving control strategy, and appropriate service provided to CAOS by the underlying architecture are discussed.
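The replicated "abstraction" pipelines mentioned in the CAOS abstract above can be caricatured in a few lines: each interpretation stage is a pool of interchangeable workers, and successive signal samples are dealt round-robin across the replicas so that no single stage serializes the pipeline. The Python sketch below is a deliberately simplified illustration; the names and the scheduling policy are invented, not CAOS's.

    from itertools import cycle

    def replicated_stage(fn, replicas):
        # Round-robin dispatch over identical copies of one stage;
        # each call returns (replica_used, result).
        turn = cycle(range(replicas))
        return lambda x: (next(turn), fn(x))

    smooth = replicated_stage(lambda x: x / 2.0, replicas=3)
    detect = replicated_stage(lambda x: x > 1.0, replicas=2)

    for sample in [0.5, 3.0, 2.2, 0.1]:
        i, s = smooth(sample)
        j, feature = detect(s)
        print(f"sample {sample}: smooth[{i}] -> detect[{j}] feature={feature}")
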
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1125/CS-TR-86-1125.pdf %R CS-TR-86-1126 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T CAREL: A Visible Distributed Lisp %A Davies, Byron %D March 1986 %X CAREL is a Lisp implementation designed to be a high-level interactive systems programming language for a distributed-memory multiprocessor. CAREL insulates the user from the machine language of the multiprocessor architecture, but still makes it possible for the user to specify explicitly the assignment of tasks to processors in the multiprocessor network. CAREL has been implemented to run on a TI Explorer Lisp machine using Stanford's CARE multiprocessor simulator. CAREL is more than a language: real-time graphical displays provided by the CARE simulator make CAREL a novel graphical programming environment for distributed computing. CAREL enables the user to create programs interactively and then watch them run on a network of simulated processors. As a CAREL program executes, the CARE simulator graphically displays the activity of the processors and the transmission of data through the network. Using this capability, CAREL has demonstrated its utility as an educational tool for multiprocessor computing. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1126/CS-TR-86-1126.pdf %R CS-TR-86-1129 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Beta Operations: Efficient Implementation of a Primitive Parallel Operation %A Cohn, Evan R. %A Haddad, Ramsey W. %D August 1986 %X We consider the primitive parallel operation of the Connection Machine, the Beta Operation. Let the input size of the problem be N and the output size M. We show how to perform the Beta Operation on an N-node hypercube in $O(\log N + \log^2 M)$ time. For a $\sqrt{N} \times \sqrt{M}$ mesh-of-trees, we require $O(\log N + \sqrt{M})$ time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1129/CS-TR-86-1129.pdf %R CS-TR-86-1131 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Processor Renaming in Asynchronous Environments %A Bar-Noy, Amotz %A Peleg, David %D September 1986 %X Fischer, Lynch and Paterson proved that in a completely asynchronous system "weak agreement" cannot be achieved even in the presence of a single "benign" fault. Following the direction proposed in Attiya, Bar-Noy, Dolev and Koller (Aug 1986), we demonstrate the interesting fact that some weaker forms of processor cooperation are still achievable in such a situation, and in fact, even in the presence of up to t < n/2 such faulty processors. In particular, we show that n processors, each having a distinct name taken from an unbounded ordered domain, can individually choose new distinct names from a space of size n + t (where n is an obvious lower bound). In case the new names are required also to preserve the original order, we give an algorithm in which the space of new names is of size ${2^t}(n - t + 1) - 1$, which is tight. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1131/CS-TR-86-1131.pdf %R CS-TR-86-1132 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimizing Datalog Programs %A Sagiv, Yehoshua %D March 1986 %X Datalog programs, i.e., Prolog programs without function symbols, are considered. It is assumed that a variable appearing in the head of a rule must also appear in the body of the rule.
The input of a program is a set of ground atoms (which are given in addition to the program's rules) and, therefore, can be viewed as an assignment of relations to some of the program's predicates. Two programs are equivalent if they produce the same result for all possible assignments of relations to the extensional predicates (i.e., the predicates that do not appear as heads of rules). Two programs are uniformly equivalent if they produce the same result for all possible assignments of initial relations to all the predicates (i.e., both extensional and intensional). The equivalence problem for Datalog programs is known to be undecidable. It is shown that uniform equivalence is decidable, and an algorithm is given for minimizing a Datalog program under equivalence. A technique for removing parts of a program that are redundant under equivalence (but not under uniform equivalence) is developed. A procedure for testing uniform equivalence is also developed for the case in which the database satisfies some constraints. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1132/CS-TR-86-1132.pdf %R CS-TR-86-1134 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T UIO: A Uniform I/O System Interface for Distributed Systems %A Cheriton, David R. %D November 1986 %X A uniform I/O interface allows programs to be written relatively independent of specific I/O services and yet work with a wide variety of the I/O services available in a distributed environment. Ideally, the interface provides this uniform access without excessive complexity in the interface or loss of performance. However, a uniform interface does not arise from careful design of individual system interfaces alone; it requires explicit definition. In this paper, we describe the UIO (uniform I/O) system interface that has been used for the past five years in the V distributed operating system, focusing on the key design issues. This interface provides several extensions beyond the I/O interface of UNIX, including support for record I/O, locking, atomic transactions and replication, as well as attributes that indicate whether optional semantics and operations are available. We also describe our experience in using and implementing this interface with a variety of different I/O services plus the performance of both local and network I/O. We conclude that the UIO interface provides a uniform I/O system interface with significant functionality, wide applicability and no significant performance penalty. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1134/CS-TR-86-1134.pdf %R CS-TR-86-1136 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Experiment in Knowledge-based Signal Understanding Using Parallel Architectures %A Brown, Harold %A Schoen, Eric %A Delagi, Bruce %D October 1986 %X This report documents an experiment investigating the potential of a parallel computing architecture to enhance the performance of a knowledge-based signal understanding system. The experiment consisted of implementing and evaluating an application encoded in a parallel programming extension of Lisp and executing on a simulated multiprocessor system. The chosen application for the experiment was a knowledge-based system for interpreting pre-processed, passively acquired radar emissions from aircraft. The application was implemented in an experimental concurrent, asynchronous object-oriented framework.
This framework, in turn, relied on the services provided by the underlying hardware system. The hardware system for the experiment was a simulation of various sized grids of processors with inter-processor communication via message-passing. The experiment investigated the effects of various high-level control strategies on the quality of the problem solution, the speedup of the overall system performance as a function of the number of processors in the grid, and some of the issues in implementing and debugging a knowledge-based system on a message-passing multiprocessor system. In this report we describe the software and (simulated) hardware components of the experiment and present the qualitative and quantitative experimental results. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1136/CS-TR-86-1136.pdf %R CS-TR-86-1137 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Leaf File Access Protocol %A Mogul, Jeffrey %D December 1986 %X Personal computers are superior to timesharing systems in many ways, but they are inferior in this respect: they make it harder for users to share files. A local area network provides a substrate upon which file sharing can be built; one must also have a protocol for sharing files. This report describes Leaf, one of the first protocols to allow remote access to files. Leaf is a remote file access protocol rather than a file transfer protocol. Unlike a file transfer protocol, which must create a complete copy of a file, a file access protocol provides random access directly to the file itself. This promotes sharing because it allows simultaneous access to a file by several remote users, and because it avoids the creation of new copies and the associated consistency-maintenance problem. The protocol described in this report is nearly obsolete. It is interesting for historical reasons, primarily because it was perhaps the first non-proprietary remote file access protocol actually implemented, and also because it serves as a case study in practical protocol design. The specification of Leaf is included as an appendix; it has not been widely available outside of Stanford. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1137/CS-TR-86-1137.pdf %R CS-TR-86-1139 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Local Shape from Specularity %A Healey, Glenn %A Binford, Thomas O. %D June 1986 %X We show that highlights in images of objects with specularly reflecting surfaces provide significant information about the surfaces which generate them. A brief survey is given of specular reflectance models which have been used in computer vision and graphics. For our work, we adopt the Torrance-Sparrow specular model which, unlike most previous models, considers the underlying physics of specular reflection from rough surfaces. From this model we derive powerful relationships between the properties of a specular feature in an image and local properties of the corresponding surface. We show how this analysis can be used for both prediction and interpretation in a vision system. A shape from specularity system has been implemented to test our approach. The performance of the system is demonstrated by careful experiments with specularly reflecting objects. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1139/CS-TR-86-1139.pdf %R CS-TR-86-1130 %Z Mon, 01 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Detecting Edges %A Nalwa, Vishvjit S. 
%A Binford, Thomas O. %D March 1986 %X An edge in an image corresponds to a discontinuity in the intensity surface of the underlying scene. It can be approximated by a piecewise straight curve composed of edgels, i.e., short, linear edge-elements, each characterized by a direction and a position. The approach to edgel-detection here is to fit a series of one-dimensional surfaces to each window (kernel of the operator) and accept the surface-description which is adequate in the least squares sense and has the fewest parameters. (A one-dimensional surface is one which is constant along some direction.) The tanh is an adequate basis for the step-edge and its combinations are adequate for the roof-edge and the line-edge. The proposed method of step-edgel detection is robust with respect to noise; for (step-size/${\sigma}_{noise}$) >= 2.5, it has subpixel position localization (${\sigma}_{position}$ < 1/3) and an angular localization better than $10^\circ$; further, it is designed to be insensitive to smooth shading. These results are demonstrated by some simple analysis, statistical data and edgel-images. Also included is a comparison, of performance on a real image, with a typical operator (Difference-of-Gaussians). The results indicate that the proposed operator is superior with respect to detection, localization and resolution. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/86/1130/CS-TR-86-1130.pdf %R CS-TR-84-1004 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A computational theory of higher brain function %A Goldschlager, Leslie M. %D April 1984 %X The higher functions of the brain are believed to occur in the cortex. This region of the brain is modelled as a memory surface which performs both storage and computation. Concepts are modelled as patterns of activity on the memory surface, and the model explains how these patterns interact with one another to give the computations which the brain performs. The method of interaction can explain the formation of abstract concepts, association of ideas and train of thought. It is shown that creativity, self, consciousness and free will are explainable within the same framework. A theory of sleep is presented which is consistent with the model. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1004/CS-TR-84-1004.pdf %R CS-TR-84-1005 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Adequate proof principles for invariance and liveness properties of concurrent programs %A Manna, Zohar %A Pnueli, Amir %D May 1984 %X This paper presents proof principles for establishing invariance and liveness properties of concurrent programs. Invariance properties are established by systematically checking that they are preserved by every atomic instruction in the program. The methods for establishing liveness properties are based on 'well-founded assertions' and are applicable to both "just" and "fair" computations. These methods do not assume a decrease of the rank at each computation step. It is sufficient that there exists one process which decreases the rank when activated. Fairness then ensures that the program will eventually attain its goal. In the finite state case such proofs can be represented by diagrams. Several examples are given.
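The invariance principle in the Manna-Pnueli abstract above (check that every atomic instruction preserves the assertion) can be mimicked mechanically on a finite-state toy program. The Python sketch below enumerates every atomic step from every state satisfying a strengthened, inductive invariant for an invented two-process lock protocol; the program and all names are hypothetical, not the paper's formalism.

    from itertools import product

    # Toy program: state is (lock, pc1, pc2); lock is 0 (free), 1 (held
    # by P1), 2 (held by P2); each pc is 0 (trying) or 1 (critical).
    def steps(state):
        lock, pc1, pc2 = state
        if pc1 == 0 and lock == 0: yield (1, 1, pc2)  # P1 acquires
        if pc1 == 1:               yield (0, 0, pc2)  # P1 releases
        if pc2 == 0 and lock == 0: yield (2, pc1, 1)  # P2 acquires
        if pc2 == 1:               yield (0, pc1, 0)  # P2 releases

    # Inductive invariant: a process is in its critical section only
    # while it holds the lock. Mutual exclusion follows immediately.
    def inv(state):
        lock, pc1, pc2 = state
        return (pc1 == 0 or lock == 1) and (pc2 == 0 or lock == 2)

    assert inv((0, 0, 0))  # holds initially
    for state in product((0, 1, 2), (0, 1), (0, 1)):
        if inv(state):
            assert all(inv(s) for s in steps(state))  # preserved by each step
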
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1005/CS-TR-84-1005.pdf %R CS-TR-84-1006 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T EKL - an interactive proof checker user's reference manual %A Ketonen, Jussi %A Weening, Joseph S. %D June 1984 %X EKL is an interactive proof checker and constructor. Its main goal is to facilitate the checking of mathematical proofs. Some of the special features of EKL are: * The language of EKL can be extended all the way to finite-order predicate logic with typed lambda-calculus. * Several proofs can be handled at the same time. * Metatheoretic reasoning allows formal extensions of the capabilities of EKL. * EKL is a programmable system. The MACLISP language is available to the user, and LISP functions can be written to create input to EKL, thereby allowing expression of proofs in an arbitrary input language. This document is a reference manual for EKL. Each of the sections discusses a major part of the language, beginning with an overview of that area, and proceeding to a detailed discussion of available features. To gain an acquaintance with EKL, it is recommended that you read only the introductory part of each section. EKL may be used both at the Stanford Artificial Intelligence Laboratory (SAIL) computer system, and on DEC TOPS-20 systems that support MACLISP. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1006/CS-TR-84-1006.pdf %R CS-TR-84-1007 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Queue-based multi-processing Lisp %A Gabriel, Richard P. %A McCarthy, John %D June 1984 %X This report presents a dialect of Lisp, called QLAMBDA, which supports multi-processing. Along with the definition of the dialect, the report presents programming examples and performance studies of some programs written in QLAMBDA. Unlike other proposed multi-processing Lisps, QLAMBDA provides only a few very powerful and intuitive primitives rather than a number of parallel variants of familiar constructs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1007/CS-TR-84-1007.pdf %R CS-TR-84-1009 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity of a top-down capture rule %A Sagiv, Yehoshua %A Ullman, Jeffrey D. %D July 1984 %X Capture rules were introduced in [U] as a method for planning the evaluation of a query expressed in first-order logic. We examine a capture rule that is substantiated by a simple top-down implementation of restricted Horn clause logic. A necessary and sufficient condition for the top-down algorithm to converge is shown. It is proved that, provided there is a bound on the number of arguments of predicates, the test can be performed in polynomial time; however, if the arity of predicates is made part of the input, then the problem of deciding whether the top-down algorithm converges is NP-hard. We then consider relaxation of some of our constraints on the form of the logic, showing that success of the top-down algorithm can still be tested in polynomial time if the number of arguments is limited and in exponential time if not. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1009/CS-TR-84-1009.pdf %R CS-TR-84-1012 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T TABLOG: the deductive-tableau programming language %A Malachi, Yonathan %A Manna, Zohar %A Waldinger, Richard %D June 1984 %X TABLOG (Tableau Logic Programming Language) is a language based on first-order predicate logic with equality that combines functional and logic programming. TABLOG incorporates advantages of LISP and PROLOG. A program in TABLOG is a list of formulas in a first-order logic (including equality, negation, and equivalence) that is more general and more expressive than PROLOG's Horn clauses. Whereas PROLOG programs must be relational, TABLOG programs may define either relations or functions. While LISP programs yield results of a computation by returning a single output value, TABLOG programs can be relations and can produce several results simultaneously through their arguments. TABLOG employs the Manna-Waldinger deductive-tableau proof system as an interpreter in the same way that PROLOG uses a resolution-based proof system. Unification is used by TABLOG to match a call with a line in the program and to bind arguments. The basic rules of deduction used for computing are nonclausal resolution and rewriting by means of equality and equivalence. A pilot interpreter for the language has been implemented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1012/CS-TR-84-1012.pdf %R CS-TR-84-1014 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A P-complete problem and approximations to it %A Anderson, Richard %A Mayr, Ernst W. %D September 1984 %X The P-complete problem that we will consider is the High Degree Subgraph Problem. This problem is: given a graph G = (V,E) and an integer k, find the maximum induced subgraph of G that has all nodes of degree at least k. After showing that this problem is P-complete, we will discuss two approaches to finding approximate solutions to it in NC. We will give a variant of the problem that is also P-complete that can be approximated to within a factor of c in NC, for any c < 1/2, but cannot be approximated by a factor of better than 1/2 unless P = NC. We will also give an algorithm that finds a subgraph with moderately high minimum degree. This algorithm exhibits an interesting relationship between its performance and the time it takes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1014/CS-TR-84-1014.pdf %R CS-TR-84-1018 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Classification problem solving %A Clancey, William J. %D July 1984 %X A broad range of heuristic programs--embracing forms of diagnosis, catalog selection, and skeletal planning--accomplish a kind of well-structured problem solving called classification. These programs have a characteristic inference structure that systematically relates data to a pre-enumerated set of solutions by abstraction, heuristic association, and refinement. This level of description specifies the knowledge needed to solve a problem, independent of its representation in a particular computer language. The classification problem-solving model provides a useful framework for recognizing and representing similar problems, for designing representation tools, and for understanding the problem-solving methods used by non-classification programs.
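The three-step inference structure named in this abstract (data abstraction, heuristic association, refinement against a pre-enumerated solution set) is easy to exhibit in miniature. A toy sketch; the rules, thresholds and vocabulary below are invented for illustration and are not drawn from the report.

    # Minimal heuristic-classification skeleton: abstract the raw data,
    # heuristically associate the abstractions with a solution class,
    # then refine within a pre-enumerated set.  All rules are invented.

    def abstract_data(findings):
        """Data abstraction: raw measurements -> qualitative abstractions."""
        abstractions = set()
        if findings["temp_c"] >= 38.0:
            abstractions.add("febrile")
        if findings["wbc"] > 11000:
            abstractions.add("elevated-wbc")
        return abstractions

    HEURISTIC_MATCH = {frozenset(["febrile", "elevated-wbc"]): "infection"}
    REFINEMENTS = {"infection": ["bacterial", "viral"]}   # pre-enumerated

    def classify(findings):
        abstractions = abstract_data(findings)
        solution_class = HEURISTIC_MATCH.get(frozenset(abstractions))
        return solution_class, REFINEMENTS.get(solution_class, [])

    print(classify({"temp_c": 38.6, "wbc": 13500}))
    # -> ('infection', ['bacterial', 'viral'])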
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1018/CS-TR-84-1018.pdf %R CS-TR-84-1023 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A method for managing evidential reasoning in a hierarchical hypothesis space %A Gordon, Jean %A Shortliffe, Edward H. %D September 1984 %X No abstract. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1023/CS-TR-84-1023.pdf %R CS-TR-84-1024 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T How to share memory in a distributed system %A Upfal, Eli %A Wigderson, Avi %D October 1984 %X We study the power of shared-memory in models of parallel computation. We describe a novel distributed data structure that eliminates the need for shared memory without significantly increasing the run time of the parallel computation. More specifically, we show how a complete network of processors can deterministically simulate one PRAM step in $O(\log n\,(\log\log n)^2)$ time, when both models use n processors, and the size of the PRAM's shared memory is polynomial in n. (The best previously known upper bound was the trivial $O(n)$.) We also establish that this upper bound is nearly optimal. We prove that an on-line simulation of T PRAM steps by a complete network of processors requires $\Omega(T \log n/\log\log n)$ time. A simple consequence of the upper bound is that an Ultracomputer (the only currently feasible general-purpose parallel machine) can simulate one step of a PRAM (the most convenient parallel model to program) in $O((\log n \log\log n)^2)$ steps. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1024/CS-TR-84-1024.pdf %R CS-TR-84-1025 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast scheduling algorithms on parallel computers %A Helmbold, David %A Mayr, Ernst %D November 1984 %X With the introduction of parallel processing, scheduling problems have generated great interest. Although there are good sequential algorithms for many scheduling problems, there are few fast parallel scheduling algorithms. In this paper we present several good scheduling algorithms that run on EREW PRAMs. For the unit time execution case, we have algorithms that will schedule n jobs with intree or outtree precedence constraints in $O(\log n)$ time. The intree algorithm requires $n^3$ processors, and the outtree algorithm requires $n^4$ processors. Another type of scheduling problem is list scheduling, where a list of n jobs with integer execution times is to be scheduled in list order. We show that the general list scheduling problem on two identical processors is polynomial-time complete, and therefore is not likely to have a fast parallel algorithm. However, when the length of the (binary representation of the) execution times is bounded by $O(\log^c n)$ there is an NC algorithm using $n^4$ processors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1025/CS-TR-84-1025.pdf %R CS-TR-84-1027 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A torture test for TEX %A Knuth, Donald E. %D November 1984 %X Programs that claim to be implementations of TEX82 are supposed to be able to process the test routine contained in this report, producing the outputs contained in this report.
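A note on the arithmetic behind the Ultracomputer consequence stated in CS-TR-84-1024 above (the composition is our reading, assuming an $O(\log n)$ slowdown for emulating one step of the complete network on an Ultracomputer): $O(\log n) \cdot O(\log n\,(\log\log n)^2) = O(\log^2 n\,(\log\log n)^2) = O((\log n \log\log n)^2)$ steps per PRAM step, which is exactly the bound quoted there.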
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1027/CS-TR-84-1027.pdf %R CS-TR-84-1028 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel graph algorithms %A Hochschild, Peter H. %A Mayr, Ernst W. %A Siegel, Alan R. %D December 1984 %X This paper presents new paradigms to solve efficiently a variety of graph problems on parallel machines. These paradigms make it possible to discover and exploit the "parallelism" inherent in many classical graph problems. We abandon attempts to force sequential algorithms into parallel environments, for such attempts usually result in transforming a good uniprocessor algorithm into a hopelessly greedy parallel algorithm. We show that by employing more local computation and mild redundancy, a variety of problems can be solved in a resource- and time-efficient manner on a variety of architectures. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1028/CS-TR-84-1028.pdf %R CS-TR-84-1032 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Solving the Prisoner's Dilemma %A Genesereth, Michael R. %A Ginsberg, Matthew L. %A Rosenschein, Jeffrey S. %D November 1984 %X A framework is proposed for analyzing various types of rational interaction. We consider a variety of restrictions of participants' moves; each leads to a different characterization of rational behavior. Under an assumption of "common rationality," it is proven that participants will cooperate, rather than defect, in the Prisoner's Dilemma. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1032/CS-TR-84-1032.pdf %R CS-TR-84-1034 %Z Sat, 27 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T BB1: an architecture for blackboard systems that control, explain, and learn about their own behavior %A Hayes-Roth, Barbara %D December 1984 %X BB1 implements a domain-independent "blackboard control architecture" for AI systems that control, explain, and learn about their own problem-solving behavior. A BB1 system comprises: a user-defined domain blackboard, a pre-defined control blackboard, user-defined domain and control knowledge sources, a few generic control knowledge sources, and a pre-defined basic control loop. The architecture's run-time user interface provides capabilities for: displaying the blackboard, knowledge sources, and pending knowledge source actions, recommending an action for execution, explaining a recommendation, accepting a user's override, executing a designated action, and running without user intervention. BB1 supports a variety of control behavior ranging from execution of pre-defined control procedures to dynamic construction and modification of complex control plans during problem solving. It explains problem-solving actions by showing their roles in the underlying control plan. It learns new control heuristics from experience, applies them within the current problem-solving session, and uses them to construct new control plans in subsequent sessions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1034/CS-TR-84-1034.pdf %R CS-TR-84-1003 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallelism and greedy algorithms %A Anderson, Richard %A Mayr, Ernst %D April 1984 %X A number of greedy algorithms are examined and are shown to be probably inherently sequential.
Greedy algorithms are presented for finding a maximal path, for finding a maximal set of disjoint paths in a layered dag, and for finding the largest induced subgraph of a graph that has all vertices of degree at least k. It is shown that for all of these algorithms, the problem of determining if a given node is in the solution set of the algorithm is P-complete. This means that it is unlikely that these sequential algorithms can be sped up significantly using parallelism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/84/1003/CS-TR-84-1003.pdf %R CS-TR-83-962 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography of Stanford Computer Science reports, 1963-1983 %A Berg, Kathryn A. %D March 1983 %X This report lists, in chronological order, all reports published by the Stanford Computer Science Department since 1963. Each report is identified by a Computer Science number, author's name, title, National Technical Information Service (NTIS) retrieval number (i.e., AD-XXXXXX), date, and number of pages. If an NTIS number is not given, it means that the report is probably not available from NTIS. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/962/CS-TR-83-962.pdf %R CS-TR-83-963 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A hardware semantics based on temporal intervals %A Halpern, Joseph %A Manna, Zohar %A Moszkowski, Ben %D March 1983 %X We present an interval-based temporal logic that permits the rigorous specification of a variety of hardware components and facilitates describing properties such as correctness of implementation. Conceptual levels of circuit operation ranging from detailed quantitative timing and signal propagation up to functional behavior are integrated in a unified way. After giving some motivation for reasoning about hardware, we present the propositional and first-order syntax and semantics of the temporal logic. In addition we illustrate techniques for describing signal transitions as well as for formally specifying and comparing a number of delay models. Throughout the discussion, the formalism provides a means for examining such concepts as device equivalence and internal states. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/963/CS-TR-83-963.pdf %R CS-TR-83-964 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Proving precedence properties: the temporal way %A Manna, Zohar %A Pnueli, Amir %D April 1983 %X This paper explores the three important classes of temporal properties of concurrent programs: invariance, liveness and precedence. It presents the first methodological approach to the precedence properties, while providing a review of the invariance and liveness properties. The approach is based on the 'unless' operator, which is a weak version of the 'until' operator. For each class of properties, we present a single complete proof principle. Finally, we show that the properties of each class are decidable over finite state programs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/964/CS-TR-83-964.pdf %R CS-TR-83-965 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An approach to type design and text composition in Indian scripts %A Ghosh, Pijush K. %D April 1983 %X The knowledge of letters exerts a dual enchantment. When it uncovers the relationships between a series of arbitrary symbols and the sounds of speech, it fills us with joy.
For others the visible expression of the letters, their graphical forms, their history and their development become fascinating. The advent of digital information technology has opened new vistas in the concept of letter forms. Unfortunately the graphics industry in India has remained almost unaffected by these technological advances, especially in the field of type design and text composition. This report strives to demonstrate how to use various tools and techniques, so that the new technology can cope with the plurality of Indian scripts. To start with, all you need to know is the basic shapes of the letters of the Roman alphabet and the sounds they represent. With this slender thread of knowledge an enjoyable study of letter design and text composition in Indian scripts can begin. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/965/CS-TR-83-965.pdf %R CS-TR-83-966 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A formal approach to lettershape description for type design %A Ghosh, Pijush K. %A Bigelow, Charles A. %D May 1983 %X This report is designed to explore some analytic means of specifying lettershapes. Computer representation and analysis of lettershape have made use of two diametrically different approaches, one representing a shape by its boundary, the other by its skeleton or medial axis. Generally speaking, the boundary representation is conceptually simpler for the designer, but the skeletal representation provides more insight into the "piecedness" of the shape. Donald Knuth's METAFONT is one of the sophisticated lettering design systems which has basically adopted the medial axis approach. Moreover, the METAFONT system has introduced the idea of metafont-description of a letter, i.e., to give a rigorous definition of the shape of a letter in such a way that many styles are obtained from a single definition by changing only a few user-defined parameters. That is why we have considered the METAFONT system as our starting point and have shown how we can arrive at the definition of a formal language for specifying lettershapes. We have also introduced a simple mathematical model for decomposing a letter into its constituent elements. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/966/CS-TR-83-966.pdf %R CS-TR-83-967 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Verification of concurrent programs: a temporal proof system %A Manna, Zohar %A Pnueli, Amir %D June 1983 %X A proof system based on temporal logic is presented for proving properties of concurrent programs based on the shared-variables computation model. The system consists of three parts: the general uninterpreted part, the domain dependent part and the program dependent part. In the general part we give a complete proof system for first-order temporal logic with detailed proofs of useful theorems. This logic enables reasoning about general time sequences. The domain dependent part characterizes the special properties of the domain over which the program operates. The program dependent part introduces program axioms which restrict the time sequences considered to be execution sequences of a given program. The utility of the full system is demonstrated by proving invariance, liveness and precedence properties of several concurrent programs. Derived proof principles for these classes of properties are obtained and lead to a compact representation of proofs.
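For readers tracking the three property classes that recur in these temporal-logic reports, here is one representative formula of each kind in standard notation (our illustration; $C_1$, $C_2$, request and granted are placeholder assertions): invariance (safety), $\Box\,\lnot(C_1 \land C_2)$ -- mutual exclusion holds at all times; liveness (eventuality), $\Box(request \rightarrow \Diamond granted)$ -- every request is eventually granted; precedence, "$p$ unless $q$" -- $p$ continues to hold up to the first occurrence of $q$, the weak variant of 'until' that does not require $q$ ever to occur.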
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/967/CS-TR-83-967.pdf %R CS-TR-83-969 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reasoning in interval temporal logic %A Moszkowski, Ben %A Manna, Zohar %D July 1983 %X Predicate logic is a powerful and general descriptive formalism with a long history of development. However, since the logic's underlying semantics have no notion of time, statements such as "I increases by 2" cannot be directly expressed. We discuss interval temporal logic (ITL), a formalism that augments standard predicate logic with operators for time-dependent concepts. Our earlier work used ITL to specify and reason about hardware. In this paper we show how ITL can also directly capture various control structures found in conventional programming languages. Constructs are given for treating assignment, iteration, sequential and parallel computations and scoping. The techniques used permit specification and reasoning about such algorithms as concurrent Quicksort. We compare ITL with the logic-based programming languages Lucid and Prolog. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/969/CS-TR-83-969.pdf %R CS-TR-83-971 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Letterform design systems %A Ruggles, Lynn %D April 1983 %X The design of letterforms requires a skilled hand, an eye for fine detail and an understanding of the letterforms themselves. This work has traditionally been done by experienced artisans, but in the last fifteen years there have been attempts to integrate the design process with the use of computers in order to create digital type forms. The use of design systems for the creation of these digital forms has led to an analysis of the way type designs are created by type designers. Their methods have been integrated into a variety of systems for creating digital forms. This paper describes these design systems and discusses the relevant issues for the success of the systems that exist and are used today. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/971/CS-TR-83-971.pdf %R CS-TR-83-972 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Experience with a regular expression compiler %A Karlin, Anna R. %A Trickey, Howard W. %A Ullman, Jeffrey D. %D June 1983 %X The language of regular expressions is a useful one for specifying certain sequential processes at a very high level. They allow easy modification of designs for circuits, like controllers, that are described by patterns of events they must recognize and the responses they must make to those patterns. This paper discusses the compilation of such expressions into reasonably compact layouts. The translation of regular expressions into nondeterministic automata by two different methods is discussed, along with the advantages of each method. A major part of the compilation problem is selection of good state codes for the nondeterministic automata; one successful strategy is explained in the paper. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/972/CS-TR-83-972.pdf %R CS-TR-83-973 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The distributed V kernel and its performance for diskless workstations %A Cheriton, David R. %A Zwaenepoel, Willy %D July 1983 %X The distributed V kernel is a message-oriented kernel that provides uniform local and network interprocess communication.
It is primarily being used in an environment of diskless workstations connected by a high-speed local network to a set of file servers. We describe a performance evaluation of the kernel, with particular emphasis on the cost of network file access. Our results show that over a local network: 1. Diskless workstations can access remote files with minimal performance penalty. 2. The V message facility can be used to access remote files at comparable cost to any well-tuned specialized file access protocol. We conclude that it is feasible to build a distributed system with all network communication using the V message facility even when most of the network nodes have no secondary storage. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/973/CS-TR-83-973.pdf %R CS-TR-83-974 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Chinese Meta-Font %A Hobby, John %A Guoan, Gu %D July 1983 %X METAFONT is Donald E. Knuth's system for alphabet design. The system allows an entire family of fonts or "meta-fonts" to be specified precisely and mathematically so that it can be produced in different sizes and styles for different raster devices. We present a new technique for defining Chinese characters hierarchically with METAFONT. We define METAFONT subroutines for commonly used portions of strokes and then combine some of these into routines for drawing complete strokes. Parameters describe the skeletons of the strokes and the stroke routines are carefully designed to transform themselves appropriately. This allows us to handle all of the basic strokes with only 14 different routines. The stroke routines in turn are used to build up groups of strokes and radicals. Special routines for positioning control points ensure that the strokes will join properly in a variety of different styles. The radical routines are parameterized to allow them to be placed at different locations in the typeface and to allow for adjusting their size and shape. Key points are positioned relative to the bounding box for the radical, and the special positioning routines find other points that must be passed to the stroke routines. We use this method to design high quality Song style characters. Global parameters control the style, and we show how these can be used to create Song and Long Song from the same designs. Other settings can produce other familiar styles or even new styles. We show how it is possible to create completely different styles, such as Bold style, merely by substituting different stroke routines. The global parameters can be used to augment simple scaling by altering stroke width and other details to account for changes in size. We can adjust stroke widths to help even out the overall darkness of the characters. We also show how it is possible to experiment with new ideas such as adjusting character widths individually. While many of our characters are based on existing designs, the stroke routines facilitate the design of new characters without the need to refer to detailed drawings. The skeletal parameters and special positioning routines make it easy to position the strokes properly. In our previous paper, in contrast to this, we parameterized the strokes according to their boundaries and copied an existing design. The previous approach made it very difficult to create different styles with the same METAFONT program.
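The hierarchy this abstract describes (stroke routines parameterized by skeleton points and global style parameters, radicals that place key points relative to a bounding box) can be sketched schematically. The routine names, parameters and the two-stroke "radical" below are invented for illustration and do not correspond to the report's METAFONT code.

    # Toy pipeline in the spirit of the hierarchical scheme: stroke
    # routines take skeleton points plus style parameters; radical
    # routines place key points relative to a bounding box.  All names
    # and parameters are invented for illustration.

    def horizontal_stroke(start, end, width, style):
        # a real routine would emit outline curves; we return a description
        return ("hstroke", start, end, width, style)

    def box_point(bbox, fx, fy):
        """Key point at fractional position (fx, fy) in bbox = (x0, y0, x1, y1)."""
        x0, y0, x1, y1 = bbox
        return (x0 + fx * (x1 - x0), y0 + fy * (y1 - y0))

    def radical_two_bars(bbox, params):
        """A made-up two-stroke radical; global params control style and weight."""
        w, s = params["stroke_width"], params["style"]
        return [
            horizontal_stroke(box_point(bbox, 0.1, 0.7), box_point(bbox, 0.9, 0.7), w, s),
            horizontal_stroke(box_point(bbox, 0.1, 0.3), box_point(bbox, 0.9, 0.3), w, s),
        ]

    # The same radical at a new location/size and in a different style:
    print(radical_two_bars((0, 0, 100, 100), {"stroke_width": 6, "style": "song"}))
    print(radical_two_bars((40, 10, 90, 60), {"stroke_width": 9, "style": "bold"}))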
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/974/CS-TR-83-974.pdf %R CS-TR-83-980 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The WEB system of structured documentation %A Knuth, Donald E. %D September 1983 %X This memo describes how to write programs in the WEB language (Version 2.3, September 1983); and it also includes the full WEB documentation for WEAVE and TANGLE, the programs that read WEB input and produce TEX and PASCAL output, respectively. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/980/CS-TR-83-980.pdf %R CS-TR-83-985 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T First grade TEX: a beginner's TEX manual %A Samuel, Arthur L. %D November 1983 %X This is an introductory ready-reference TEX82 manual for the beginner who would like to do First Grade TEX work. Only the most basic features of the TEX system are discussed in detail. Other features are summarized in an appendix and references are given to the more complete documentation available elsewhere. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/985/CS-TR-83-985.pdf %R CS-TR-83-989 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem-solving seminar %A Knuth, Donald E. %A Weening, Joseph S. %D December 1983 %X This report contains edited transcripts of the discussions held in Stanford's course CS 204, Problem Seminar, during autumn quarter 1981. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms were touched on during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." The present report is the fourth in a series of such transcripts, continuing the tradition established in CS606 (Michael J. Clancy, 1977), CS707 (Chris Van Wyk, 1979), and CS863 (Allan A. Miller, 1981). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/989/CS-TR-83-989.pdf %R CS-TR-83-990 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem-solving seminar %A Hobby, John D. %A Knuth, Donald E. %D December 1983 %X This report contains edited transcripts of the discussions held in Stanford's course CS204, Problem Seminar, during autumn quarter 1982. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms were touched on during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." The present report is the fifth in a series of such transcripts, continuing the tradition established in STAN-CS-77-606 (Michael J. Clancy, 1977), STAN-CS-79-707 (Chris Van Wyk, 1979), STAN-CS-81-863 (Allan A. Miller, 1981), STAN-CS-83-989 (Joseph S. Weening, 1983). 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/990/CS-TR-83-990.pdf %R CS-TR-83-991 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel algorithms for arithmetics, irreducibility and factoring of GFq-polynomials %A Morgensteren, Moshe %A Shamir, Eli %D December 1983 %X A new algorithm for testing irreducibility of polynomials over finite fields without gcd computations makes it possible to devise efficient parallel algorithms for polynomial factorization. We also study the probability that a random polynomial over a finite field has no factors of small degree. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/991/CS-TR-83-991.pdf %R CS-TR-83-992 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The language of an interactive proof checker %A Ketonen, Jussi %A Weening, Joseph S. %D December 1983 %X We describe the underlying language for EKL, an interactive theorem-proving system currently under development at the Stanford Artificial Intelligence Laboratory. Some of the reasons for its development as well as its mathematical properties are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/992/CS-TR-83-992.pdf %R CS-TR-83-994 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sorting by recursive partitioning %A Chapiro, Daniel M. %D December 1983 %X We present a new O(n lg lg n) time sort algorithm that is more robust than O(n) distribution sorting algorithms. The algorithm uses a recursive partition-concatenate approach, partitioning each set into a variable number of subsets using information gathered dynamically during execution. Sequences are partitioned using statistical information computed during the sort for each sequence. Space complexity is O(n) and is independent from the order and distribution of the data. If the data is originally in a list, only O($\sqrt{n}$) extra space is necessary. The algorithm is insensitive to the initial ordering of the data, and it is much less sensitive to the distribution of the values of the sorting keys than distribution sorting algorithms. Its worst-case time is O(n lg lg n) across all distributions that satisfy a new "fractalness" criterion. This condition, which is sufficient but not necessary, is satisfied by any set with bounded length keys and bounded repetition of each key. If this condition is not satisfied, its worst case performance degrades gracefully to O(n lg n). In practice, this occurs when the density of the distribution over $\Omega(n)$ of the keys is a fractal curve (for sets of numbers whose values are bounded), or when the distribution has very heavy tails with arbitrarily long keys (for sets of numbers whose precision is bounded). In some preliminary tests, it was faster than Quicksort for sets of more than 150 elements. The algorithm is practical, works basically "in place", can be easily implemented and is particularly well suited both for parallel processing and for external sorting. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/994/CS-TR-83-994.pdf %R CS-TR-83-995 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The advantages of abstract control knowledge in expert system design %A Clancey, William J. %D November 1983 %X A poorly designed knowledge base can be as cryptic as an arbitrary program and just as difficult to maintain.
Representing control knowledge abstractly, separately from domain facts and relations, makes the design more transparent and explainable. A body of abstract control knowledge provides a generic framework for constructing knowledge bases for related problems in other domains and also provides a useful starting point for studying the nature of strategies. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/995/CS-TR-83-995.pdf %R CS-TR-83-996 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Strategic explanations for a diagnostic consultation system %A Hasling, Diane Warner %A Clancey, William J. %A Rennels, Glenn %D November 1983 %X This paper examines the problem of automatic explanation of reasoning, especially as it relates to expert systems. By explanation we mean the ability of a program to discuss what it is doing in some understandable way. We first present a general framework in which to view explanation and review some of the research done in this area. We then focus on the explanation system for NEOMYCIN, a medical consultation program. A consultation program interactively helps a user to solve a problem. Our goal is to have NEOMYCIN explain its problem-solving strategies. An explanation of strategy describes the plan the program is using to reach a solution. Such an explanation is usually concrete, referring to aspects of the current problem situation. Abstract explanations articulate a general principle, which can be applied in different situations; such explanations are useful in teaching and in explaining by analogy. We describe the aspects of NEOMYCIN that make abstract strategic explanations possible--the representation of strategic knowledge explicitly and separately from domain knowledge--and demonstrate how this representation can be used to generate explanations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/996/CS-TR-83-996.pdf %R CS-TR-83-945 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Perseus: retrospective on a portable operating system %A Zwaenepoel, Willy %A Lantz, Keith A. %D February 1983 %X We describe the operating system Perseus, developed as part of a study into the issues of computer communications and their impact on operating system and programming language design. Perseus was designed to be portable by virtue of its kernel-based structure and its implementation in Pascal. In particular, machine-dependent code is limited to the kernel and most operating system functions are provided by server processes, running in user mode. Perseus was designed to evolve into a distributed operating system by virtue of its interprocess communication facilities, based on message-passing. This paper presents an overview of the system and gives an assessment of how far it satisfied its original goals. Specifically, we evaluate its interprocess communication facilities and kernel-based structure, followed by a discussion of portability. We close with a brief history of the project, pointing out major milestones and stumbling blocks along the way.
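The structure attributed to Perseus here -- a small machine-dependent kernel whose job is message passing, with system services as ordinary user-mode server processes -- is the classic pattern; below is a minimal sketch of the request/reply flow, with invented names and Python queues standing in for the kernel primitives.

    # Minimal sketch of kernel-mediated message passing: servers run as
    # ordinary processes and all service requests travel as messages.
    # Queues stand in for the kernel's send/receive/reply primitives;
    # everything here is invented for illustration.
    import threading, queue

    requests = queue.Queue()   # "kernel" channel to the file server

    def file_server():         # a user-mode server process
        while True:
            op, arg, reply = requests.get()
            if op == "shutdown":
                break
            reply.put(f"served {op}({arg})")   # reply message to the client

    def client(path):
        reply = queue.Queue(maxsize=1)
        requests.put(("read", path, reply))    # send
        return reply.get()                     # block on receive (RPC-style)

    server = threading.Thread(target=file_server)
    server.start()
    print(client("/etc/motd"))                 # served read(/etc/motd)
    requests.put(("shutdown", None, None))
    server.join()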
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/83/945/CS-TR-83-945.pdf %R CS-TR-82-998 %Z Mon, 29 May 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Knowledge engineering: a daily activity on a hospital ward %A Mulsant, Benoit %A Servan-Schreiber, David %D September 1983 %X Two common barriers against the development and diffusion of Expert Systems in Medicine are the difficulty of design and the low level of acceptance. This paper reports on an original experiment which suggests potential solutions to these issues: the task of Knowledge Engineering is performed by medical students and residents on a hospital ward using a sophisticated Knowledge Acquisition System, EMYCIN. The Knowledge Engineering sessions are analysed in detail and a structured method is proposed. A transcript of a sample run of the resulting program is presented along with an evaluation of its performance, acceptance, educational potential and amount of endeavour required. The impact of the Knowledge Engineering process itself is then assessed from both the residents' and the medical students' standpoints. Finally, the possibility of generalizing the experiment is examined. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/998/CS-TR-82-998.pdf %R CS-TR-82-892 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An algorithm for reducing acyclic hypergraphs %A Kuper, Gabriel M. %D January 1982 %X This report is a description of an algorithm to compute efficiently the Graham reduction of an acyclic hypergraph with sacred nodes. To apply the algorithm we must already have a tree representation of the hypergraph, and therefore it is useful when we have a fixed hypergraph and wish to compute Graham reductions many times, as we do in the System/U query interpretation algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/892/CS-TR-82-892.pdf %R CS-TR-82-895 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T GLISP users' manual %A Novak, Gordon S., Jr. %D January 1982 %X GLISP is a high-level, LISP-based language which is compiled into LISP. GLISP provides a powerful abstract datatype facility, allowing description and use of both LISP objects and objects in A.I. representation languages. GLISP language features include PASCAL-like control structures, infix expressions with operators which facilitate list manipulation, and reference to objects in PASCAL-like or English-like syntax. English-like definite reference to features of objects which are in the current computational context is allowed; definite references are understood and compiled relative to a knowledge base of object descriptions. Object-centered programming is supported; GLISP can substantially improve runtime performance of object-centered programs by optimized compilation of references to objects. This manual describes the GLISP language and use of GLISP within INTERLISP. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/895/CS-TR-82-895.pdf %R CS-TR-82-903 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Coloring maps and the Kowalski doctrine %A McCarthy, John %D April 1982 %X It is attractive to regard an algorithm as composed of the logic determining what the results are and the control determining how the result is obtained.
Logic programmers like to regard programming as controlled deduction, and there have been several proposals for controlling the deduction expressed by a Prolog program and not always using Prolog's normal backtracking algorithm. The present note discusses a map coloring program proposed by Pereira and Porto and two coloring algorithms that can be regarded as control applied to its logic. However, the control mechanisms required go far beyond those that have been contemplated in the Prolog literature. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/903/CS-TR-82-903.pdf %R CS-TR-82-908 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Neomycin: reconfiguring a rule-based expert system for application to teaching %A Clancey, William J. %A Letsinger, Reed %D May 1982 %X NEOMYCIN is a medical consultation system in which MYCIN's knowledge base is reorganized and extended for use in GUIDON, a teaching program. The new system constitutes a psychological model for doing diagnosis designed to provide a basis for interpreting student behavior and teaching diagnostic strategy. The model separates out kinds of knowledge that are procedurally embedded in MYCIN's rules and so inaccessible to the teaching program. The key idea is to represent explicitly and separately a domain-independent diagnostic strategy in the form of meta-rules, knowledge about the structure of the problem space, causal and data/hypothesis rules and world facts. As a psychological model, NEOMYCIN captures the forward-directed, "compiled associations" mode of reasoning that characterizes expert behavior. Collection and interpretation of data are focused by the "differential" or working memory of hypotheses. Moreover, the knowledge base is broadened so that GUIDON can teach a student when to consider a specific infectious disease and what competing hypotheses to consider, essentially the knowledge a human would need in order to use the MYCIN consultation system properly. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/908/CS-TR-82-908.pdf %R CS-TR-82-909 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Plan recognition strategies in student modeling: prediction and description %A London, Bob %A Clancey, William J. %D May 1982 %X This paper describes the student modeler of the GUIDON2 tutor, which understands plans by a dual search strategy. It first produces multiple predictions of student behavior by a model-driven simulation of the expert. Focused, data-driven searches then explain incongruities. By supplementing each other, these methods lead to an efficient and robust plan understander for a complex domain. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/909/CS-TR-82-909.pdf %R CS-TR-82-910 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Exploration of Teaching and Problem-Solving Strategies, 1979-1982 %A Clancey, William J. %A Buchanan, Bruce %D May 1982 %X This is the final report for Contract N-00014-79-C-0302, covering the period of 15 March 1979 through 14 March 1982. The goal of the project was to develop methods for representing teaching and problem-solving knowledge in computer-based tutorial systems. One focus of the work was formulation of principles for managing a case method tutorial dialogue; the other major focus was investigation of the use of a production rule representation for the subject material of a tutorial program.
The main theme pursued by this research is that representing teaching and problem-solving knowledge separately and explicitly enhances the ability to build, modify and test complex tutorial programs. Two major computer programs were constructed. One was the tutorial program, GUIDON, which uses a set of explicit "discourse procedures" for carrying on a case method dialogue with a student. GUIDON uses the original MYCIN knowledge base as subject material, and as such, was an experiment in exploring the ways in which production rules can be used in tutoring. GUIDON's teaching knowledge is separate from and compatible with any knowledge base that is encoded in MYCIN's rule language. Demonstrations of GUIDON were given for two medical and one engineering application. Thus, the generality of this kind of system goes beyond being able to teach about any problem in a "case library"--it also allows teaching expertise to be transferred and tested in multiple problem domains. The second major program is the consultation program, NEOMYCIN. This is a second generation system in which MYCIN's knowledge has been reconfigured to make explicit distinctions that are important for teaching. Unlike MYCIN, the program uses the hypothesis-oriented approach and predominantly forward-directed reasoning. As such, NEOMYCIN is consistent with and extends psychological models of diagnostic problem-solving. The program differs from other knowledge-based AI systems in that reasoning is completely controlled by a set of explicit meta-rules. These meta-rules are domain independent and constitute the diagnostic procedure to be taught to students: the tasks of diagnosis and heuristics for attending to and confirming relevant diagnostic hypotheses. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/910/CS-TR-82-910.pdf %R CS-TR-82-911 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography of Stanford Computer Science reports, 1963-1982 %A Roberts, Barbara J. %A Marashian, Irris %D May 1982 %X This report lists, in chronological order, all reports published by the Stanford Computer Science Department since 1963. Each report is identified by a Computer Science number, author's name, title, National Technical Information Service (NTIS) retrieval number (i.e., AD-XXXXXX), date, and number of pages. If the NTIS number is not given, it means that the report is probably not available from NTIS. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/911/CS-TR-82-911.pdf %R CS-TR-82-912 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The implication and finite implication problems for typed template dependencies %A Vardi, Moshe Y. %D May 1982 %X The class of typed template dependencies is a class of data dependencies that includes embedded multivalued and join dependencies. We show that the implication and the finite implication problems for this class are unsolvable. An immediate corollary is that this class has no formal system for finite implication. We also show how to construct a finite set of typed template dependencies whose implication and finite implication problems are unsolvable. The class of projected join dependencies is a proper subclass of the above class, and it slightly generalizes embedded join dependencies. It is shown that the implication and the finite implication problems for this class are also unsolvable.
An immediate corollary is that this class has no universe-bounded formal system for either implication or finite implication. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/912/CS-TR-82-912.pdf %R CS-TR-82-914 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using string matching to compress Chinese characters %A Guoan, Gu %A Hobby, John %D May 1982 %X A new method for font compression is introduced and compared to existing methods. A very compact representation is achieved by using a variant of McCreight's string matching algorithm to compress the bounding contour. Results from an actual implementation are given showing the improvement over other methods and how this varies with resolution and character complexity. Compression ratios of up to 150 are achieved for Chinese characters. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/914/CS-TR-82-914.pdf %R CS-TR-82-915 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Verification of concurrent programs: proving eventualities by well-founded ranking %A Manna, Zohar %A Pnueli, Amir %D May 1982 %X In this paper, one of a series on verification of concurrent programs, we present proof methods for establishing eventuality and until properties. The methods are based on well-founded ranking and are applicable to both "just" and "fair" computations. These methods do not assume a decrease of the rank at each computation step. It is sufficient that there exists one process which decreases the rank when activated. Fairness then ensures that the program will eventually attain its goal. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/915/CS-TR-82-915.pdf %R CS-TR-82-922 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An approach to verifying completeness and consistency in a rule-based expert system %A Suwa, Motoi %A Scott, A. Carlisle %A Shortliffe, Edward H. %D June 1982 %X We describe a program for verifying that a set of rules in an expert system comprehensively spans the knowledge of a specialized domain. The program has been devised and tested within the context of the ONCOCIN System, a rule-based consultant for clinical oncology. The stylized format of ONCOCIN's rules has allowed the automatic detection of a number of common errors as the knowledge base has been developed. This capability suggests a general mechanism for correcting many problems with knowledge base completeness and consistency before they can cause performance errors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/922/CS-TR-82-922.pdf %R CS-TR-82-923 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Explanatory power for medical expert systems: studies in the representation of causal relationships for clinical consultations %A Wallis, Jerold W. %A Shortliffe, Edward H. %D July 1982 %X This paper reports on experiments designed to identify and implement mechanisms for enhancing the explanation capabilities of reasoning programs for medical consultation. The goals of an explanation system are discussed, as is the additional knowledge needed to meet these goals in a medical domain. We have focussed on the generation of explanations that are appropriate for different types of system users. This task requires a knowledge of what is complex and what is important; it is further strengthened by a classification of the associations or causal mechanisms inherent in the inference rules.
A causal representation can also be used to aid in refining a comprehensive knowledge base so that the reasoning and explanations are more adequate. We describe a prototype system which reasons from causal inference rules and generates explanations that are appropriate for the user. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/923/CS-TR-82-923.pdf %R CS-TR-82-926 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Principles of rule-based expert systems %A Buchanan, Bruce G. %A Duda, Richard O. %D August 1982 %X Rule-based expert systems are surveyed. The most important considerations are representation and inference. Rule-based systems make strong assumptions about the representation of knowledge as conditional sentences and about the control of inference in one of three ways. The problem of reasoning with incomplete or inexact information is also discussed, as are several other issues regarding the design of expert systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/926/CS-TR-82-926.pdf %R CS-TR-82-927 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Combining state machines and regular expressions for automatic synthesis of VLSI circuits %A Ullman, Jeffrey D. %D September 1982 %X We discuss a system for translating regular expressions into logic equations or PLA's, with particular attention to how we can obtain both the benefits of regular expressions and state machines as input languages. An extended example of the method is given, and the results of our approach are compared with hand design; in this example we use less than twice the area of a hand-designed, machine optimized PLA. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/927/CS-TR-82-927.pdf %R CS-TR-82-928 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automated ambulatory medical record systems in the U.S. %A Kuhn, Ingeborg M. %A Wiederhold, Gio %A Rodnick, Jonathan E. %A Ramsey-Klee, Diane M. %A Benett, Sanford %A Beck, Donald D. %D August 1982 %X This report presents an overview of the developments in Automated Ambulatory Medical Record Systems (AAMRS) from 1975 to the present. A summary of findings from a 1975 state-of-the-art review is presented along with the current findings of a follow-up study of a selected number of the AAMRS operating today. The studies revealed that effective automated medical record systems have been developed for ambulatory care settings and that they are now in the process of being transferred to other sites or users, either privately or as a commercial product. Since 1975 there have been no significant advances in system design. However, progress has been substantial in terms of achieving production goals. Even though a variety of systems are commercially available, there is a continuing need for research and development to improve the effectiveness of the systems in use today. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/928/CS-TR-82-928.pdf %R CS-TR-82-931 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T PUFF: an expert system for interpretation of pulmonary function data %A Aikins, Janice S. %A Kunz, John C. %A Shortliffe, Edward H. %A Fallat, Robert J. %D September 1982 %X The application of Artificial Intelligence techniques to real-world problems has produced promising research results, but seldom has a system become a useful tool in its domain of expertise.
Notable exceptions are the DENDRAL and MOLGEN systems. This paper describes PUFF, a program that interprets lung function test data and has become a working tool in the pulmonary physiology lab of a large hospital. Elements of the problem that paved the way for its success are examined, as are significant limitations of the solution that warrant further study. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/931/CS-TR-82-931.pdf %R CS-TR-82-932 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Expert systems research: modeling the medical decision making process %A Shortliffe, Edward H. %A Fagan, Lawrence M. %D September 1982 %X During the quarter century since the birth of the branch of computer science known as artificial intelligence (AI), much of the research has focused on developing symbolic models of human inference. In the last decade several related AI research themes have come together to form what is now known as "expert systems research." In this paper we review AI and expert systems to acquaint the reader with the field and to suggest ways in which this research will eventually be applied to advanced medical monitoring. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/932/CS-TR-82-932.pdf %R CS-TR-82-933 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An algorithmic method for studying percolation clusters %A Klein, Shmuel T. %A Shamir, Eli %D September 1982 %X In percolation theory one studies configurations, based on some infinite lattice, where the sites of the lattice are randomly made F (full) with probability p or E (empty) with probability 1-p. For p > $p_c$, the set of configurations which contain an infinite cluster (a connectivity component) has probability 1. Using an algorithmic method and a rearrangement lemma for Bernoulli sequences, we compute the boundary-to-body quotient of infinite clusters and prove it has the definite value (1-p)/p with probability 1. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/933/CS-TR-82-933.pdf %R CS-TR-82-947 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Modelling degrees of item interest for a general database query system %A Rowe, Neil C. %D April 1982 %X Many databases support decision-making. Often this means choices between alternatives according to partly subjective or conflicting criteria. Database query languages are generally designed for precise, logical specification of the data of interest, and tend to be awkward in the aforementioned circumstances. Information retrieval research suggests several solutions, but there are obstacles to generalizing these ideas to most databases. To address this problem we propose a methodology for automatically deriving and monitoring "degrees of interest" among alternatives for a user of a database system. This includes (a) a decision theory model of the value of information to the user, and (b) inference mechanisms, based in part on ideas from artificial intelligence, that can tune the model to observed user behavior. This theory has important applications to improving efficiency and cooperativeness of the interface between a decision-maker and a database system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/947/CS-TR-82-947.pdf %R CS-TR-82-949 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The r-Stirling numbers %A Broder, Andrei Z.
%D December 1982 %X The r-Stirling numbers of the first and second kind count restricted permutations and respectively restricted partitions, the restriction being that the first r elements must be in distinct cycles and respectively distinct subsets. The combinatorial and algebraic properties of these numbers, which in most cases generalize similar properties of the regular Stirling numbers, are explored starting from the above definition. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/949/CS-TR-82-949.pdf %R CS-TR-82-950 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Learning physical description from functional definitions, examples and precedents %A Winston, Patrick H. %A Binford, Thomas O. %A Katz, Boris %A Lowry, Michael %D January 1983 %X It is too hard to tell vision systems what things look like. It is easier to talk about purpose and what things are for. Consequently, we want vision systems to use functional descriptions to identify things, when necessary, and we want them to learn physical descriptions for themselves, when possible. This paper describes a theory that explains how to make such systems work. The theory is a synthesis of two sets of ideas: ideas about learning from precedents and exercises developed at MIT and ideas about physical description developed at Stanford. The strength of the synthesis is illustrated by way of representative experiments. All of these experiments have been performed with an implemented system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/950/CS-TR-82-950.pdf %R CS-TR-82-951 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Five paradigm shifts in programming language design and their realization in Viron, a dataflow programming environment %A Pratt, Vaughan %D December 1982 %X We describe five paradigm shifts in programming language design, some old and some relatively new, namely Effect to Entity, Serial to Parallel, Partition Types to Predicate Types, Computable to Definable, and Syntactic Consistency to Semantic Consistency. We argue for the adoption of each. We exhibit a programming language, Viron, that capitalizes on these shifts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/951/CS-TR-82-951.pdf %R CS-TR-82-953 %Z Thu, 01 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Partial bibliography of work on expert systems %A Buchanan, Bruce G. %D December 1982 %X Since 1971 many publications on expert systems have appeared in conference proceedings and the technical literature. Over 200 titles are listed in the bibliography. Many relevant publications are omitted because they overlap publications on the list; others should be called to my attention. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/82/953/CS-TR-82-953.pdf %R CS-TR-81-836 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Verification of concurrent programs, Part I: The temporal framework %A Manna, Zohar %A Pnueli, Amir %D June 1981 %X This is the first in a series of reports describing the application of temporal logic to the specification and verification of concurrent programs. We first introduce temporal logic as a tool for reasoning about sequences of states. Models of concurrent programs based both on transition graphs and on linear-text representations are presented and the notions of concurrent and fair executions are defined.
The general temporal language is then specialized to reason aboaut those execution sequences that are fair computations of a concurrent program. Subsequently, the language is used to describe properties of concurrent programs. The set of interesting properties is classified into invariance (safety), eventuality (liveness), and precedence (until) properties. Among the properties studied are: partial correctness, global invariance, clean behavior, mutual exclusion, absence of deadlock, termination, total correctness, intermittent assertions, accessibility, responsiveness, safe liveness, absence of unsolicited response, fair responsiveness, and precedence. In the following reports of this series, we will use the temporal formalism to develop proof methodologies for proving the properties discussed here. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/836/CS-TR-81-836.pdf %R CS-TR-81-837 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Research on expert systems %A Buchanan, Bruce G. %D March 1981 %X All AI programs are essentially reasoning programs. And, to the extent that they reason well about a problem area, all exhibit some expertise at problem solving. Programs that solve the Tower of Hanoi puzzle, for example, reason about the goal state and the initial state in order to find "expert-level" solutions. Unlike other programs, however, the claims about expert systems are related to questions of usefulness and understandability as well as performance. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/837/CS-TR-81-837.pdf %R CS-TR-81-838 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Dynamic program building %A Brown, Peter %D February 1981 %X This report argues that programs are better regarded as dynamic running objects rather than as static textual ones. The concept of dynamic building, whereby a program is constructed as it runs, is described. The report then describes the Build system, which is an implementation of dynamic building for an interactive algebraic programming language. Dynamic building aids the locating of run-time errors, and is especially valuable in environments where programs are relatively short but run-time errors are frequent and/or costly. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/838/CS-TR-81-838.pdf %R CS-TR-81-839 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Short WAITS %A Samuel, Arthur L. %D February 1981 %X This is an introductory manual describing the SU-AI timesharing system that is available primarily for sponsored research in the Computer Science Department. The present manual is written for the beginner and the user interested primarily in the message handling capability as well as for the experienced computer user and programmer who either is unfamiliar with the SU-AI computer or who uses it infrequently. References are made to the available hard-copy manuals and to the extensive on-line document files where more complete information can be obtained. 
The principal advantages of this system are: 1) The availability of a large repertoire of useful system features; 2) The large memory; 3) The large file storage system; 4) The ease with which one can access other computers via the ARPA net; 5) The file transfer facilities via the EFTP program and the ETHERNET; 6) The XGP and the DOVER printers and the large collections of fonts available for them; and 7) The fast and convenient E editor with its macro facilities. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/839/CS-TR-81-839.pdf %R CS-TR-81-846 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Byzantine Generals strike again %A Dolev, Danny %D March 1981 %X Can unanimity be achieved in an unreliable distributed system? This problem was named "The Byzantine Generals Problem," by Lamport, Pease and Shostak [1980]. The results obtained in the present paper prove that unanimity is achievable in any distributed system if and only if the number of faulty processors in the system is: 1) less than one third of the total number of processors; and 2) less than one half of the connectivity of the system's network. In cases where unanimity is achievable, algorithms to obtain it are given. This result forms a complete characterization of networks in light of the Byzantine Problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/846/CS-TR-81-846.pdf %R CS-TR-81-847 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The optimal locking problem in a directed acyclic graph %A Korth, Henry F. %D March 1981 %X We assume a multiple granularity database locking scheme similar to that of Gray, et al. [197S] in which a rooted directed acyclic graph is used to represent the levels of granularity. We prove that even if it is known in advance exactly what database references the transaction will make, it is NP-complete to find the optimal locking strategy for the transaction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/847/CS-TR-81-847.pdf %R CS-TR-81-848 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the problem of inputting Chinese characters %A Tang, Chih-sung %D April 1981 %X If Chinese-speaking society is to make the best use of computers, it is important to develop an easy, quick, and convenient way to input Chinese characters together with other conventional characters. Many people have tried to approach this problem by designing special typewriters for Chinese character input, but such methods have serious deficiencies and they do not take advantage of the fact that the input process is just part of a larger system in which a powerful computer lies behind the keyboard. The purpose of this note is to clarify the problem and to illustrate a promising solution based entirely on a standard ASCII keyboard. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/848/CS-TR-81-848.pdf %R CS-TR-81-849 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Experiments on the Knee Criterion in a multiprogrammed computer system %A Nishigaki, Tohru %D March 1981 %X Although the effectiveness of the Knee Criterion as a virtual memory management strategy is widely accepted, it has been impossible to take advantage of it in a practical system, because little information is available about the program behavior of executing jobs. 
A new memory management technique to achieve the Knee Criterion in a multiprogrammed virtual memory system is developed. The technique, termed the Optimum Working-set Estimator (OWE), abstracts the programs' behavior from their past histories by exponential smoothing, and modifies their working set window sizes in order to attain the Knee Criterion. The OWE method was implemented and investigated. Measurements demonstrate its ability to control a variety of jobs. Furthermore, the results also reveal that the throughput improvement is possible in a space-squeezing environment. This technique is expected to increase the efficiency of multiprogrammed virtual memory systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/849/CS-TR-81-849.pdf %R CS-TR-81-851 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Binding in information processing %A Wiederhold, Gio %D May 1981 %X The concept of binding, as used in programming systems, is analyzed and defined in a number of contexts. The attributes of variables to be bound and the phases of binding are enumerated. The definition is then broadened to cover general issues in information systems. Its applicability is demonstrated in a wide range of system design and implementation issues. A number of Database Management Systems are categorized according to the terms defined. A first-order quantitative model is developed and compared with current practice. The concepts and the model are considered helpful when used as a tool for the global design phase of large information systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/851/CS-TR-81-851.pdf %R CS-TR-81-854 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the security of public key protocols %A Dolev, Danny %A Yao, Andrew C. %D May 1981 %X Recently, the use of public key encryption to provide secure network communication has received considerable attention. Such public key systems are usually effective against passive eavesdroppers, who merely tap the lines and try to decipher the message. It has been pointed out, however, that an improperly designed protocol could be vulnerable to an active saboteur, one who may impersonate another user or alter the message being transmitted. In this paper we formulate several models in which the security of protocols can be discussed precisely. Algorithms and characterizations that can be used to determine protocol security in these models will be given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/854/CS-TR-81-854.pdf %R CS-TR-81-863 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem-solving seminar %A Knuth, Donald E. %A Miller, Allan A. %D June 1981 %X This report contains a record of the autumn 1980 session of CS 204, a problem-solving and programming seminar taught at Stanford that is primarily intended for first-year Ph.D. students. The seminar covers a large range of topics, research paradigms, and programming paradigms in computer science, so these notes will be of interest to graduate students, professors, and professional computer scientists. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/863/CS-TR-81-863.pdf %R CS-TR-81-865 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Toward a unified logical basis for programming languages %A Tang, Chih-sung %D June 1981 %X In recent years, more and more computer scientists have been paying attention to temporal logic, since there are many properties of programs that can be described only by bringing the time parameter into consideration. But existing temporal logic languages, such as Lucid, in spite of their mathematical elegance, are still far from practical. I believe that a practical temporal-logic language, once it came into being, would have a wide spectrum of applications. XYZ/E is a temporal-logic language. Like other logic languages, it is a logic system as well as a programming language. But unlike them, it can express all conventional data structures and control structures, nondeterminate or concurrent programs, even programs with branching-time order. We find that the difficulties met in other logic languages often stem from the fact that they try to deal with these structures at a higher level. XYZ/E adopts another approach. We divide the language into two forms: the internal form and the external form. The former is lower level, while the latter is higher. Just as any logic system contains rules of abbreviation, so also in XYZ/E there are rules of abbreviation to transform the internal form into the external form, and vice versa. These two forms can be considered to be different representations of the same thing. We find that this approach can ameliorate many problems of formalization. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/865/CS-TR-81-865.pdf %R CS-TR-81-867 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ADAM - an Ada based language for multi-processing %A Luckham, David C. %A Larsen, Howard J. %A Stevenson, David R. %A Henke, Friedrich W. von %D July 1981 %X Adam is an experimental language derived from Ada. It was developed to facilitate study of issues in Ada implementation. The two primary objectives which motivated the development of Adam were: to program supervisory packages for multitask scheduling, and to formulate algorithms for compilation of Ada tasking. Adam is a subset of the sequential program constructs of Ada combined with a set of parallel processing constructs which are lower level than Ada tasking. In addition, Adam places strong restrictions on sharing of global objects between processes. Import declarations and propagate declarations are included. A compiler has been implemented in Maclisp on a DEC PDP-10. It produces assembly code for a PDP-10. It supports separate compilation, generics, exceptions, and parallel processes. Algorithms translating Ada tasking into Adam parallel processing have been developed and implemented. An experimental compiler for most of the final Ada language design, including task types and task rendezvous constructs, based on the Adam compiler, is presently available on PDP-10's. This compiler uses a procedure call implementation of task rendezvous, but will be used to develop and study alternate implementations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/867/CS-TR-81-867.pdf %R CS-TR-81-868 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Last Whole Errata Catalog %A Knuth, Donald E.
%D July 1981 %X This list supplements previous errata published in Stanford reports CS551 (1976) and CS712 (1979). It includes the first corrections and changes to the second edition of volume two (published January, 1981) as well as to the most recent printings of volumes one and three (first published in 1975). In addition to the errors listed here, about half of the occurrences of 'which' in volumes one and three should be changed to 'that'. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/868/CS-TR-81-868.pdf %R CS-TR-81-869 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computer Science comprehensive examinations, 1978/79-1980/81 %A Tajnai, Carolyn E. %D August 1981 %X The Stanford Computer Science Comprehensive Examination was conceived Spring Quarter 1971/72 and since then has been given winter and spring quarters each year. The 'Comp' serves several purposes in the department. There are no course requirements in the Ph.D. and the Ph.D. Minor programs, and only one (CS293, Computer Laboratory) in the Master's program. Therefore, the 'Comp' fulfills the breadth and depth requirements. The Ph.D. Minor and Master's student must pass at the Master's level to be eligible for the degree. For the Ph.D. student it serves as a "Rite of Passage"; the exam must be passed at the Ph.D. level by the end of six quarters of full-time study (excluding summers) for the student to continue in the program. This report is a collection of comprehensive examinations from Winter Quarter 1978/79 through Spring Quarter 1980/81. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/869/CS-TR-81-869.pdf %R CS-TR-81-871 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Good layouts for pattern recognizers %A Trickey, Howard W. %D August 1981 %X A system to lay out custom circuits that recognize regular languages can be a useful VLSI design automation tool. This paper describes the algorithms used in an implementation of a regular expression compiler. Layouts that use a network of programmable logic arrays (PLA's) have smaller areas than those of some other methods, but there are the problems of partitioning the circuit and then placing the individual PLA's. Regular expressions have a structure which allows a novel solution to these problems: dynamic programming can be used to find layouts which are in some sense optimal. Various search pruning heuristics have been used to increase the speed of the compiler, and the experience with these is reported in the conclusions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/871/CS-TR-81-871.pdf %R CS-TR-81-875 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computation of matrix chain products: Part I, Part II %A Hu, T. C. %A Shing, M. T. %D September 1981 %X This paper considers the computation of matrix chain products of the form $M_1 \times M_2 \times \cdots \times M_{n-1}$. If the matrices are of different dimensions, the order in which the product is computed affects the number of operations. An optimum order is an order which minimizes the total number of operations. Some theorems about an optimum order of computing the matrices are presented in part I. Based on these theorems, an O(n log n) algorithm for finding an optimum order is presented in part II.
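As a concrete point of reference for the optimum-order problem above, the textbook dynamic program solves it in $O(n^3)$ time; the O(n log n) algorithm of part II is substantially more involved. The following minimal Python sketch implements only that baseline, and its function name and zero-based indexing are illustrative assumptions, not taken from the paper.

    def matrix_chain_cost(dims):
        # Matrix M_i has shape dims[i] x dims[i+1], for i = 0 .. n-1.
        # cost[i][j] = minimum scalar multiplications for M_i x ... x M_j.
        n = len(dims) - 1
        cost = [[0] * n for _ in range(n)]
        for length in range(2, n + 1):          # length of the subchain
            for i in range(n - length + 1):
                j = i + length - 1
                cost[i][j] = min(
                    cost[i][k] + cost[k + 1][j]
                    + dims[i] * dims[k + 1] * dims[j + 1]
                    for k in range(i, j)        # split point of the product
                )
        return cost[0][n - 1]

    # Example: a (10x30)(30x5)(5x60) chain needs 4500 multiplications
    # when parenthesized as (M_0 M_1) M_2.
    assert matrix_chain_cost([10, 30, 5, 60]) == 4500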
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/875/CS-TR-81-875.pdf %R CS-TR-81-876 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On linear area embedding of planar graphs %A Dolev, Danny %A Trickey, Howard W. %D September 1981 %X Planar embedding with minimal area of graphs on an integer grid is one of the major issues in VLSI. Valiant [1981] gave an algorithm to construct a planar embedding for trees in linear area; he also proved that there are planar graphs that require quadratic area. We give an algorithm to embed outerplanar graphs in linear area. We extend this algorithm to work for every planar graph that has the following property: for every vertex there exists a path of length less than K to the exterior face, where K is a constant. Finally, finding a minimal embedding area is shown to be NP-complete for forests, and hence for more general types of graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/876/CS-TR-81-876.pdf %R CS-TR-81-879 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interlisp-VAX: a report %A Masinter, Larry M. %D August 1981 %X This report documents the results of a study to evaluate the feasibility of implementing the Interlisp language to run on the DEC VAX computer. Specific goals of the study were to: 1) assess the technical status of the on-going implementation project at USC-ISI; 2) estimate the expected performance of Interlisp on the VAX family of machines as compared to Interlisp-10, other Lisp systems for the VAX, and other Interlisp implementations where performance data were available; and 3) identify serious obstacles and alternatives to the timely completion of an effective Interlisp-VAX system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/879/CS-TR-81-879.pdf %R CS-TR-81-880 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Well structured parallel programs are not easier to schedule %A Mayr, Ernst W. %D September 1981 %X The scheduling problem for unit time task systems with arbitrary precedence constraints is known to be NP-complete. We show that the same is true even if the precedence constraints are restricted to certain subclasses which make the corresponding parallel programs more structured. Among these classes are those derived from hierarchic cobegin-coend programming constructs, level graph forests, and the parallel or serial composition of an out-tree and an in-tree. In each case, the completeness proof depends heavily on the number of processors being part of the problem instances. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/880/CS-TR-81-880.pdf %R CS-TR-81-883 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On program transformations for abstract data types and concurrency %A Pepper, P. %D October 1981 %X We study transformation rules for a particular class of abstract data types, namely types that are representable by recursive mode declarations. The transformations are tailored to the development of efficient tree traversal and they allow for concurrency. The techniques are exemplified by an implementation of concurrent insertion and deletion in 2-3-trees. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/883/CS-TR-81-883.pdf %R CS-TR-81-887 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finding the convex hull of a simple polygon %A Graham, Ronald L.
%A Yao, Frances %D November 1981 %X It is well known that the convex hull of a set of n points in the (Euclidean) plane can be found by an algorithm having worst-case complexity O(n log n). In this note we give a short linear algorithm for finding the convex hull in the case that the (ordered) set of points form the vertices of a simple (i.e., non-self-intersecting) polygon. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/887/CS-TR-81-887.pdf %R CS-TR-81-889 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T AL users' manual %A Mujtaba, Shahid %A Goldman, Ron %D December 1981 %X AL is a high-level programming language for manipulator control useful in industrial assembly research. This document describes the current state of the AL system now in operation at the Stanford Artificial Intelligence Laboratory, and teaches the reader how to use it. The system consists of the AL compiler and runtime system and the source code interpreter, POINTY, which facilitates specifying representation of parts, and interactive execution of AL statements. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/889/CS-TR-81-889.pdf %R CS-TR-81-894 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Methodology for building an intelligent tutoring system %A Clancey, William J. %D October 1981 %X Over the past 6 years we have been developing a computer program to teach medical diagnosis. Our research synthesizes and extends results in artificial intelligence (AI), medicine, and cognitive psychology. This paper describes the progression of the research, and explains how theories from these fields are combined in a computational model. The general problem has been to develop an "intelligent tutoring system" by adapting the MYCIN "expert system." This conversion requires a deeper understanding of the nature of expertise and explanation than originally required for developing MYCIN, and a concomitant shift in perspective from simple performance goals to attaining psychological validity in the program's reasoning process. Others have written extensively about the relation of artificial intelligence to cognitive science (e.g., [Pylyshyn, 1978] [Boden, 1977]). Our purpose here is not to repeat those arguments, but to present a case study which will provide a common point for further discussion. To this end, to help evaluate the state of cognitive science, we will outline our methodology and survey what resources and viewpoints have helped our research. We will also discuss pitfalls that other AI-oriented cognitive scientists may encounter. Finally, we will present some questions coming out of our work which might suggest possible collaboration with other fields of research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/894/CS-TR-81-894.pdf %R CS-TR-81-896 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The epistemology of a rule-based expert system: a framework for explanation %A Clancey, William J. %D November 1981 %X Production rules are a popular representation for encoding heuristic knowledge in programs for scientific and medical problem solving. However, experience with one of these programs, MYCIN, indicates that the representation has serious limitations: people other than the original rule authors find it difficult to modify the rule set, and the rules are unsuitable for use in other settings, such as for application to teaching.
These problems are rooted in fundamental limitations in MYCIN's original rule representation: the view that expert knowledge can be encoded as a uniform, weakly-structured set of if/then associations is found to be wanting. To illustrate these problems, this paper examines MYCIN's rules from the perspective of a teacher trying to justify them and to convey a problem-solving approach. We discover that individual rules play different roles, have different kinds of justifications, and are constructed using different rationales for the ordering and choice of premise clauses. This design knowledge, consisting of structural and strategic concepts which lie outside the representation, is shown to be procedurally embedded in the rules. Moreover, because the data/hypothesis associations are themselves a proceduralized form of underlying disease models, they can only be supported by appealing to this deeper level of knowledge. Making explicit this structural, strategic and support knowledge enhances the ability to understand and modify the system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/896/CS-TR-81-896.pdf %R CS-TR-81-898 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Separability as a physical database design methodology %A Whang, Kyu-Young %A Wiederhold, Gio %A Sagalowicz, Daniel %D October 1981 %X A theoretical approach to the optimal design of large multifile physical databases is presented. The design algorithm is based on the theory that, given a set of join methods that satisfy a certain property called "separability," the problem of optimal assignment of access structures to the whole database can be reduced to the subproblem of optimizing individual relations independently of one another. Coupling factors are defined to represent all the interactions among the relations. This approach not only reduces the complexity of the problem significantly, but also provides a better understanding of underlying mechanisms. A closed noniterative formula is introduced for estimating the number of block accesses in a database organization, and the error analyzed. This formula, an approximation of Yao's exact formula, has a maximum error of 3.7%, and significantly reduces the computation time by eliminating the iterative loop. It also achieves a much higher accuracy than an approximation proposed by Cardenas. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/81/898/CS-TR-81-898.pdf %R CS-TR-80-779 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Problematic features of programming languages: a situational-calculus approach %A Manna, Zohar %A Waldinger, Richard J. %D September 1980 %X Certain features of programming languages, such as data structure operations and procedure call mechanisms, have been found to resist formalization by classical techniques. An alternate approach is presented, based on a "situational calculus," which makes explicit reference to the states of a computation. For each state, a distinction is drawn between an expression, its value, and the location of the value. Within this conceptual framework, the features of a programming language can be described axiomatically. Programs in the language can then be synthesized, executed, verified, or transformed by performing deductions in this axiomatic system. Properties of entire classes of programs, and of programming languages, can also be expressed and proved in this way. The approach is amenable to machine implementation.
In a situational-calculus formalism it is possible to model precisely many "problematic" features of programming languages, including operations on such data structures as arrays, pointers, lists, and records, and such procedure call mechanisms as call-by-reference, call-by-value, and call-by-name. No particular obstacle is presented by aliasing between variables, by declarations, or by recursive procedures. The paper is divided into three parts, focusing respectively on the assignment statement, on data structure operations, and on procedure call mechanisms. In this first part, we introduce the conceptual framework to be applied throughout and present the axiomatic definition of the assignment statement. If suitable restrictions on the programming language are imposed, the well-known Hoare assignment axiom can then be proved as a theorem. However, our definition can also describe the assignment statement of unrestricted programming languages, for which the Hoare axiom does not hold. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/779/CS-TR-80-779.pdf %R CS-TR-80-780 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Computer Modern family of typefaces %A Knuth, Donald E. %D January 1980 %X This report gives machine-independent definitions of all the styles of type planned for use in future editions of "The Art of Computer Programming." Its main purpose is to provide a detailed example of a complete family of font definitions using METAFONT, so that people who want new symbols for their own books and papers will understand how to incorporate them easily. The fonts are intended to have the same spirit as those used in earlier editions of "The Art of Computer Programming," but each character has been redesigned and defined in the METAFONT idiom. It is hoped that some readers will be inspired to make similar definitions of other important families of fonts. The bulk of this report consists of about 400 short METAFONT programs for the various symbols needed, and as such it is pretty boring, but there are some nice illustrations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/780/CS-TR-80-780.pdf %R CS-TR-80-785 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Equations and rewrite rules: a survey %A Huet, Gerard %A Oppen, Derek C. %D January 1980 %X Equations occur frequently in mathematics, logic and computer science. In this paper, we survey the main results concerning equations, and the methods available for reasoning about them and computing with them. The survey is self-contained and unified, using traditional abstract algebra. Reasoning about equations may involve deciding if an equation follows from a given set of equations (axioms), or if an equation is true in a given theory. When used in this manner, equations state properties that hold between objects. Equations may also be used as definitions; this use is well known in computer science: programs written in applicative languages, abstract interpreter definitions, and algebraic data type definitions are clearly of this nature. When these equations are regarded as oriented "rewrite rules," we may actually use them to compute. In addition to covering these topics, we discuss the problem of "solving" equations (the "unification" problem), the problem of proving termination of sets of rewrite rules, and the decidability and complexity of word problems and of combinations of equational theories.
We restrict ourselves to first-order equations, and do not treat equations which define non-terminating computations or recent work on rewrite rules applied to equational congruence classes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/785/CS-TR-80-785.pdf %R CS-TR-80-786 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Algorithms in modern mathematics and computer science %A Knuth, Donald E. %D January 1980 %X The life and work of the ninth century scientist al-Khwarizmi, "the father of algebra and algorithms," is surveyed briefly. Then a random sampling technique is used in an attempt to better understand the kinds of thinking that good mathematicians and computer scientists do and to analyze whether such thinking is significantly "algorithmic" in nature. (This is the text of a talk given at the opening session of a symposium on "Algorithms in Modern Mathematics and Computer Science" held in Urgench, Khorezm Oblast', Uzbek S.S.R., during the week of September 16-22, 1979.) %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/786/CS-TR-80-786.pdf %R CS-TR-80-788 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Circumscription - a form of non-monotonic reasoning %A McCarthy, John %D February 1980 %X Humans and intelligent computer programs must often jump to the conclusion that the objects they can determine to have certain properties or relations are the only objects that do. Circumscription formalizes such conjectural reasoning. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/788/CS-TR-80-788.pdf %R CS-TR-80-789 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ADA exceptions: specification and proof techniques %A Luckham, David C. %A Polak, Wolfgang %D February 1980 %X A method of documenting exception propagation and handling in Ada programs is proposed. Exception propagation declarations are introduced as a new component of Ada specifications. This permits documentation of those exceptions that can be propagated by a subprogram. Exception handlers are documented by entry assertions. Axioms and proof rules for Ada exceptions are given. These rules are simple extensions of previous rules for Pascal and define an axiomatic semantics of Ada exceptions. As a result, Ada programs specified according to the method can be analysed by formal proof techniques for consistency with their specifications, even if they employ exception propagation and handling to achieve required results (i.e. non-error situations). Example verifications are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/789/CS-TR-80-789.pdf %R CS-TR-80-790 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Databases in healthcare %A Wiederhold, Gio %D March 1980 %X This report defines database design and implementation technology as applicable to healthcare. The relationship of technology to various healthcare settings is explored, and the effect on healthcare costs, quality, and access is evaluated. A summary of relevant development directions is included. Detailed examples of 5 typical clinical applications (public health, clinical trials, clinical research, ambulatory care, and hospitals) are appended. There is an extended bibliography.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/790/CS-TR-80-790.pdf %R CS-TR-80-792 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MAINSAIL implementation overview %A Wilcox, Clark R. %A Dageforde, Mary L. %A Jirak, Gregory A. %D March 1980 %X The MAINSAIL programming language and the supporting implementations have been developed over the past five years as an integrated approach to a viable machine-independent system suitable for the development of large, portable programs. Particular emphasis has been placed on minimizing the effort involved in moving the system to a new machine and/or operating system. For this reason, almost all of the compiler and runtime support is written in MAINSAIL, and is utilized in each implementation without alteration. This use of a high-level language to support its own implementation has proved to be a significant advantage in terms of documentation and maintenance, without unduly affecting the execution speed. This paper gives an overview of the compiler and runtime implementation strategies, and indicates what an implementation requires for the machine-dependent and operating-system-dependent parts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/792/CS-TR-80-792.pdf %R CS-TR-80-794 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Recent developments in the complexity of combinatorial algorithms %A Tarjan, Robert Endre %D June 1980 %X The last three years have witnessed several major advances in the area of combinatorial algorithms. These include improved algorithms for matrix multiplication and maximum network flow, a polynomial-time algorithm for linear programming, and steps toward a polynomial-time algorithm for graph isomorphism. This paper surveys these results and suggests directions for future research. Included is a discussion of recent work by the author and his students on dynamic dictionaries, network flow problems, and related questions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/794/CS-TR-80-794.pdf %R CS-TR-80-796 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Essential E %A Samuel, Arthur L. %D March 1980 %X This is an introductory manual describing the display-oriented text editor E that is available on the Stanford A.I. Laboratory PDP-10 computer. The present manual is intended to be used as an aid for the beginner as well as for experienced computer users who either are unfamiliar with the E editor or use it infrequently. Reference is made to the two on-line manuals that help the beginner to get started and that provide a complete description of the editor for the experienced user. E is commonly used for writing computer programs and for preparing reports and memoranda. It is not a document editor, although it does provide some facilities for getting a document into a pleasing format. The primary emphasis is that of speed, both in terms of the number of key strokes required of the user and in terms of the demands made on the computer system. At the same time, E is easy to learn and it offers a large range of facilities that are not available on many editors. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/796/CS-TR-80-796.pdf %R CS-TR-80-797 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Read-only transactions in a distributed database %A Garcia-Molina, Hector %A Wiederhold, Gio %D April 1980 %X A read-only transaction or query is a transaction which does not modify any data. Read-only transactions could be processed with general transaction processing algorithms, but in many cases it is more efficient to process read-only transactions with special algorithms which take advantage of the knowledge that the transaction only reads. This paper defines the various consistency and currency requirements that read-only transactions may have. The processing of the different classes of read-only transactions in a distributed database is discussed. The concept of R insularity is introduced to characterize both the read-only and update algorithms. Several simple update and read-only transaction processing algorithms are presented to illustrate how the query requirements and the update algorithms affect the read-only transaction processing algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/797/CS-TR-80-797.pdf %R CS-TR-80-799 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multidimensional additive spline approximation %A Friedman, Jerome H. %A Grosse, Eric %A Stuetzle, Werner %D May 1980 %X We describe an adaptive procedure that approximates a function of many variables by a sum of (univariate) spline functions $s_m$ of selected linear combinations $a_m \cdot x$ of the coordinates: $\theta(x) = \sum_{1\le m\le M} s_m(a_m \cdot x)$. The procedure is nonlinear in that not only the spline coefficients but also the linear combinations are optimized for the particular problem. The sample need not lie on a regular grid, and the approximation is affine invariant, smooth, and lends itself to graphical interpretation. Function values, derivatives, and integrals are cheap to evaluate. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/799/CS-TR-80-799.pdf %R CS-TR-80-807 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Path-regular graphs %A Matula, David W. %A Dolev, Danny %D June 1980 %X A graph is vertex-[edge-]path-regular if a list of shortest paths, allowing multiple copies of paths, exists where every pair of vertices are the endvertices of the same number of paths and each vertex [edge] occurs in the same number of paths of the list. The dependencies and independencies between the various path-regularity, regularity of degree, and symmetry properties are investigated. We show that every connected vertex-[edge-]symmetric graph is vertex-[edge-]path-regular, but not conversely. We show that the product of any two vertex-path-regular graphs is vertex-path-regular but not conversely, and the iterated product $G \times G \times \cdots \times G$ is edge-path-regular if and only if G is edge-path-regular. An interpretation of path-regular graphs is given regarding the efficient design of concurrent communication networks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/807/CS-TR-80-807.pdf %R CS-TR-80-808 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Final report: Basic Research in Artificial Intelligence and Foundations of Programming %A McCarthy, John %A Binford, Thomas O. %A Luckham, David C. %A Manna, Zohar %A Weyhrauch, Richard W.
%A Earnest, Les %D May 1980 %X Recent research results are reviewed in the areas of formal reasoning, mathematical theory of computation, program verification, and image understanding. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/808/CS-TR-80-808.pdf %R CS-TR-80-811 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An extended semantic definition of Pascal for proving the absence of common runtime errors %A German, Steven M. %D June 1980 %X We present an axiomatic definition of Pascal which is the logical basis of the Runcheck system, a working verifier for proving the absence of runtime errors such as arithmetic overflow, array subscripting out of range, and accessing an uninitialized variable. Such errors cannot be detected at compile time by most compilers. Because the occurrence of a runtime error may depend on the values of data supplied to a program, techniques for assuring the absence of errors must be based on program specifications. Runcheck accepts Pascal programs documented with assertions, and proves that the specifications are consistent with the program and that no runtime errors can occur. Our axiomatic definition is similar to Hoare's axiom system, but it takes into account certain restrictions that have not been considered in previous definitions. For instance, our definition accurately models uninitialized variables, and requires a variable to have a well defined value before it can be accessed. The logical problems of introducing the concept of uninitialized variables are discussed. Our definition of expression evaluation deals more fully with function calls than previous axiomatic definitions. Some generalizations of our semantics are presented, including a new method for verifying programs with procedure and function parameters. Our semantics can be easily adapted to similar languages, such as ADA. One of the main potential problems for the user of a verifier is the need to write detailed, repetitious assertions. We develop some simple logical properties of our definition which are exploited by Runcheck to reduce the need for such detailed assertions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/811/CS-TR-80-811.pdf %R CS-TR-80-821 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Semiantichains and unichain coverings in direct products of partial orders %A West, Douglas B. %A Tovey, Craig A. %D September 1980 %X We conjecture a generalization of Dilworth's theorem to direct products of partial orders. In particular, we conjecture that the largest "semiantichain" and the smallest "unichain covering" have the same size. We consider a special class of semiantichains and unichain coverings and determine when equality holds for them. This conjecture implies the existence of k-saturated partitions. A stronger conjecture, for which we also prove a special case, implies the Greene-Kleitman result on simultaneous k and (k + 1)-saturated partitions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/821/CS-TR-80-821.pdf %R CS-TR-80-824 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T LCCD, a language for Chinese character design %A Mei, Tung Yun %D October 1980 %X LCCD is a computer system able to produce aesthetically pleasing Chinese characters for use on raster-oriented printing devices.
It is analogous to METAFONT, in that the user writes a little program that explains how to draw each character; but it uses different types of simulated 'pens' that are more appropriate to the Chinese idiom, and it includes special scaling features so that a complex character can easily be built up from simpler ones, in an interactive manner. This report contains a user's manual for LCCD, together with many illustrative examples and a discussion of the algorithms used. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/824/CS-TR-80-824.pdf %R CS-TR-80-826 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A database approach to communication in VLSI design %A Wiederhold, Gio %A Beetem, Anne %A Short, Garrett %D October 1980 %X This paper describes recent and planned work at Stanford in applying database technology to the problems of VLSI design. In particular, it addresses the issue of communication within a design's different representations and hierarchical levels in a multiple designer environment. We demonstrate the heretofore questioned utility of using commercial database systems, at least while developing a versatile, flexible, and generally efficient model and its associated communication paths. Completed work and results from initial work using DEC DBMS-20 are presented, including macro expansion within the database, and signalling of changes to higher structural levels. Considerable discussion regarding overall philosophy for continued work is also included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/826/CS-TR-80-826.pdf %R CS-TR-80-827 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the parallel computation for the knapsack problem %A Yao, Andrew Chi-Chih %D November 1980 %X We are interested in the complexity of solving the knapsack problem with n input real numbers on a parallel computer with real arithmetic and branching operations. A processor-time tradeoff constraint is derived; in particular, it is shown that an exponential number of processors have to be used if the problem is to be solved in time $t \le \sqrt{n}/2$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/827/CS-TR-80-827.pdf %R CS-TR-80-829 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The dinner table problem %A Aspvall, Bengt %A Liang, Frank M. %D December 1980 %X This report contains two papers inspired by the "dinner table problem": If n people are seated randomly around a circular table for two meals, what is the probability that no two people sit together at both meals? We show that this probability approaches $e^{-2}$ as $n \rightarrow \infty$, and also give a closed form. We then observe that in many similar problems on permutations with restricted position, the number of permutations satisfying a given number of properties is approximately Poisson distributed. We generalize our asymptotic argument to prove such a limit theorem, and mention applications to the problems of derangements, menages, and the asymptotic number of Latin rectangles. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/829/CS-TR-80-829.pdf %R CS-TR-80-830 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two linear-time algorithms for five-coloring a planar graph %A Matula, David W. %A Shiloach, Yossi %A Tarjan, Robert E.
%D November 1980 %X A "sequential processing" algorithm using bicolor interchange that five-colors an n vertex planar graph in $O(n^2)$ time was given by Matula, Marble, and Isaacson [1972]. Lipton and Miller used a "batch processing" algorithm with bicolor interchange for the same problem and achieved an improved O(n log n) time bound [1978]. In this paper we use graph contraction arguments instead of bicolor interchange and improve both the sequential processing and batch processing methods to obtain five-coloring algorithms that operate in O(n) time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/830/CS-TR-80-830.pdf %R CS-TR-80-832 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Scheduling wide graphs %A Dolev, Danny %D December 1980 %X The problem of scheduling a partially ordered set of unit length tasks on m identical processors is known to be NP-complete. There are efficient algorithms for only a few special cases of this problem. In this paper we explore the relations between the structure of the precedence graph (the partial order) and optimal schedules. We prove that in finding an optimal schedule for certain systems it suffices to consider at each step high roots which belong to at most the m-1 highest components of the precedence graph. This result reduces the number of cases we have to check during the construction of an optimal schedule. Our method may lead to the development of linear scheduling algorithms for many practical cases and to better bounds for complex algorithms. In particular, in the case the precedence graph contains only inforest and outforest components, this result leads to efficient algorithms for obtaining an optimal schedule on two or three processors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/832/CS-TR-80-832.pdf %R CS-TR-80-850 %Z Thu, 08 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Performing remote operations efficiently on a local computer network %A Spector, Alfred Z. %D December 1980 %X This paper presents a communication model for local networks, whereby processes execute generalized remote references that cause operations to be performed by remote processes. This remote reference/remote operation model provides a taxonomy of primitives that (1) are naturally useful in many applications and (2) can be efficiently implemented. The motivation for this work is our desire to develop systems architectures for local network based multiprocessors that support distributed applications requiring frequent interprocessor communication. After a section containing a brief overview, Section 2 of this paper discusses the remote reference/remote operation model. In it, we derive a set of remote reference types that can be supported by a communication system carefully integrated with the local network interface. The third section exemplifies a communication system that provides one remote reference type. These references (i.e., remote load, store, compare-and-swap, enqueue, and dequeue) take about 150 microseconds, or 50 average instruction times, to perform on Xerox Alto computers connected by a 2.94 megabit Ethernet. The last section summarizes this work and proposes a complete implementation resulting in a highly efficient communication system.
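To make the remote reference taxonomy just described concrete, here is a minimal Python sketch of the server side of such a scheme, dispatching the five reference types named in the abstract (load, store, compare-and-swap, enqueue, dequeue). The class and method names are hypothetical; the implementation described in the paper is integrated with the Alto's Ethernet interface rather than structured this way.

    from collections import deque

    class RemoteOperationServer:
        # Hypothetical stand-in for a process that executes remote
        # references on behalf of processes elsewhere on the network.
        def __init__(self):
            self.cells = {}    # addressable memory words
            self.queues = {}   # addressable shared queues

        def perform(self, op, addr, *args):
            # Each request names an operation and a target address;
            # the return value models the reply sent to the requester.
            if op == "load":
                return self.cells.get(addr)
            if op == "store":
                self.cells[addr] = args[0]
                return None
            if op == "compare_and_swap":
                old, new = args
                current = self.cells.get(addr)
                if current == old:
                    self.cells[addr] = new
                return current
            if op == "enqueue":
                self.queues.setdefault(addr, deque()).append(args[0])
                return None
            if op == "dequeue":
                q = self.queues.get(addr)
                return q.popleft() if q else None
            raise ValueError("unknown remote operation: %s" % op)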
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/850/CS-TR-80-850.pdf %R CS-TR-80-768 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Causal nets or what is a deterministic computation %A Gacs, Peter %A Levin, Leonid A. %D October 1980 %X We introduce the concept of causal nets - it can be considered as the most general and elementary concept of the history of a deterministic computation (sequential or parallel). Causality and locality are distinguished as the only important properties of nets representing such records. Different types of complexities of computations correspond to different geometrical characteristics of the corresponding causal nets - which have the advantage of being finite objects. Synchrony becomes a relative notion. Nets can have symmetries; therefore it will make sense to ask what can be computed from arbitrary symmetric inputs. Here, we obtain a complete group-theoretical characterization of the kind of symmetries that can be allowed in parallel computations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/80/768/CS-TR-80-768.pdf %R CS-TR-78-709 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Design and analysis of a data structure for representing sorted lists %A Brown, Mark R. %A Tarjan, Robert E. %D December 1978 %X In this paper we explore the use of 2-3 trees to represent sorted lists. We analyze the worst-case cost of sequences of insertions and deletions in 2-3 trees under each of the following three assumptions: (i) only insertions are performed; (ii) only deletions are performed; (iii) deletions occur only at the small end of the list and insertions occur only away from the small end. Our analysis leads to a data structure for representing sorted lists when the access pattern exhibits a (perhaps time-varying) locality of reference. This structure has many of the properties of the representation proposed by Guibas, McCreight, Plass, and Roberts [1977], but it is substantially simpler and may be practical for lists of moderate size. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/709/CS-TR-78-709.pdf %R CS-TR-78-649 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T DENDRAL and Meta-DENDRAL: their applications dimension %A Buchanan, Bruce G. %A Feigenbaum, Edward A. %D February 1978 %X The DENDRAL and Meta-DENDRAL programs assist chemists with data interpretation problems. The design of each program is described in the context of the chemical inference problems the program solves. Some chemical results produced by the programs are mentioned. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/649/CS-TR-78-649.pdf %R CS-TR-78-651 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Proving termination and multiset orderings %A Dershowitz, Nachum %A Manna, Zohar %D March 1978 %X A common tool for proving the termination of programs is the well-founded set, a set ordered in such a way as to admit no infinite descending sequences. The basic approach is to find a termination function that maps the elements of the program into some well-founded set, such that the value of the termination function is continually reduced throughout the computation. All too often, the termination functions required are difficult to find and are of a complexity out of proportion to the program under consideration.
However, by providing more sophisticated well-founded sets, the corresponding termination functions can be simplified. Given a well-founded set S, we consider multisets over S, "sets" that admit multiple occurrences of elements taken from S. We define an ordering on all finite multisets over S that is induced by the given ordering on S. This multiset ordering is shown to be well-founded. The value of the multiset ordering is that it permits the use of relatively simple and intuitive termination functions in otherwise difficult termination proofs. In particular, we apply the multiset ordering to provide simple proofs of the termination of production systems, programs defined in terms of sets of rewriting rules. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/651/CS-TR-78-651.pdf %R CS-TR-78-652 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Simplification by cooperating decision procedures %A Nelson, Charles Gregory %A Oppen, Derek C. %D April 1978 %X We describe a simplifier for use in program manipulation and verification. The simplifier finds a normal form for any expression over the language consisting of individual variables, the usual boolean connectives, equality, the conditional function cond (denoting if-then-else), the numerals, the arithmetic functions and predicates +, - and $\leq$, the LISP constants, functions and predicates nil, car, cdr, cons and atom, the functions store and select for storing into and selecting from arrays, and uninterpreted function symbols. Individual variables range over the union of the reals, the set of arrays, LISP list structure and the booleans true and false. The simplifier is complete; that is, it simplifies every valid formula to true. Thus it is also a decision procedure for the quantifier-free theory of reals, arrays and list structure under the above functions and predicates. The organization of the simplifier is based on a method for combining decision procedures for several theories into a single decision procedure for a theory combining the original theories. More precisely, given a set S of functions and predicates over a fixed domain, a satisfiability program for S is a program which determines the satisfiability of conjunctions of literals (signed atomic formulas) whose predicate and function symbols are in S. We give a general procedure for combining satisfiability programs for sets S and T into a single satisfiability program for S $\cup$ T, given certain conditions on S and T. The simplifier described in this paper is currently used in the Stanford Pascal Verifier. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/652/CS-TR-78-652.pdf %R CS-TR-78-653 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multi-terminal 0-1 flow %A Shiloach, Yossi %D April 1978 %X Given an undirected 0-1 flow network with n vertices and m edges, we present an O($n^2$(m+n)) algorithm which generates all ($n\choose 2$) maximal flows between all the pairs of vertices. Since O($n^2$(m+n)) is also the size of the output, this algorithm is optimal up to a constant factor. 
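For orientation, the maximum flow between a single pair of vertices in an undirected 0-1 network can be computed by repeated breadth-first augmentation, as in the Python sketch below. This naive per-pair routine is only a building block, not the paper's O($n^2$(m+n)) all-pairs algorithm; the function name and graph representation are assumptions made for the example.

    from collections import deque

    def unit_max_flow(adj, s, t):
        # adj: {vertex: list of neighbors}, symmetric (undirected graph).
        # Every undirected edge has capacity 1; residual capacity is
        # tracked separately for each direction of each edge.
        cap = {(u, v): 1 for u in adj for v in adj[u]}
        flow = 0
        while True:
            parent = {s: None}
            queue = deque([s])
            while queue and t not in parent:    # BFS for an augmenting path
                u = queue.popleft()
                for v in adj[u]:
                    if v not in parent and cap[(u, v)] > 0:
                        parent[v] = u
                        queue.append(v)
            if t not in parent:
                return flow                     # no augmenting path remains
            v = t
            while parent[v] is not None:        # push one unit along the path
                u = parent[v]
                cap[(u, v)] -= 1
                cap[(v, u)] += 1
                v = u
            flow += 1

    # Example: in a 4-cycle a-b-c-d-a there are two edge-disjoint
    # paths from a to c, so the maximum 0-1 flow is 2.
    ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
    assert unit_max_flow(ring, "a", "c") == 2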
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/653/CS-TR-78-653.pdf %R CS-TR-78-654 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The two paths problem is polynomial %A Shiloach, Yossi %D April 1978 %X Given an undirected graph G = (V,E) and vertices $s_1$,$t_1$;$s_2$,$t_2$, the problem is to determine whether or not G admits two vertex disjoint paths $P_1$ and $P_2$, connecting $s_1$ with $t_1$ and $s_2$ with $t_2$ respectively. This problem is solved by an O($n\cdot m$) algorithm (n = |V|, m = |E|). An important by-product of the paper is a theorem that states that if G is 4-connected and non-planar, then such paths $P_1$ and $P_2$ exist for any choice of $s_1$, $s_2$, $t_1$, and $t_2$ (as was conjectured by Watkins [1968]). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/654/CS-TR-78-654.pdf %R CS-TR-78-655 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On accuracy and unconditional stability of linear multistep methods for second order differential equations %A Dahlquist, Germund %D April 1978 %X Linear multistep methods for the solution of the equation y" = f(t,y) are studied by means of the test equation y" = -$\omega^2$y, with $\omega$ real. It is shown that the order of accuracy cannot exceed 2 for an unconditionally stable method. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/655/CS-TR-78-655.pdf %R CS-TR-78-657 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the model theory of knowledge %A McCarthy, John %A Sato, Masahiko %A Hayashi, Takeshi %A Igarashi, Shigeru %D April 1978 %X Another language for expressing "knowing that" is given together with axioms and rules of inference and a Kripke type semantics. The formalism is extended to time-dependent knowledge. Completeness and decidability theorems are given. The problem of the wise men with spots on their foreheads and the problem of the unfaithful wives are expressed in the formalism and solved. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/657/CS-TR-78-657.pdf %R CS-TR-78-661 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Variations of a pebble game on graphs %A Gilbert, John R. %A Tarjan, Robert Endre %D September 1978 %X We examine two variations of a one-person pebble game played on directed graphs, which has been studied as a model of register allocation. The black-white pebble game of Cook and Sethi is shown to require as many pebbles in the worst case as the normal pebble game, to within a constant factor. For another version of the pebble game, the problem of deciding whether a given number of pebbles is sufficient for a given graph is shown to be complete in polynomial space. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/661/CS-TR-78-661.pdf %R CS-TR-78-662 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T New algorithms in bin packing %A Yao, Andrew Chi-Chih %D September 1978 %X In the bin-packing problem a list L of n numbers is to be packed into unit-capacity bins. For any algorithm S, let r(S) be the maximum ratio S(L)/$L^*$ for large $L^*$, where S(L) denotes the number of bins used by S and $L^*$ denotes the minimum number needed. In this paper we give an on-line O(n log n)-time algorithm RFF with r(RFF) = 5/3, and an off-line polynomial-time algorithm RFFD with r(RFFD) = (11/9)-$\epsilon$ for some fixed $\epsilon$ > 0.
These are strictly better, respectively, than two prominent algorithms -- the First-Fit (FF) which is on-line with r(FF) = 17/10, and the First-Fit-Decreasing (FFD) with r(FFD) = 11/9. Furthermore, it is shown that any on-line algorithm S must have r(S) $\geq$ 3/2. We also discuss the question "how well can an O(n)-time algorithm perform?", showing that, in the generalized d-dimensional bin-packing, any O(n)-time algorithm S must have r(S) $\geq$ d. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/662/CS-TR-78-662.pdf %R CS-TR-78-665 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SCALD: Structured Computer-Aided Logic Design %A McWilliams, Thomas M. %A Widdoes, Lawrence C., Jr. %D March 1978 %X SCALD, a graphics-based hierarchical digital logic design system, is described and an example of its use is given. SCALD provides a total computer-aided design environment which inputs a high-level description of a digital system, and produces output for computer-aided manufacture of the system. SCALD has been used in the design of an operational, 15-MIPS, 5500-chip ECL-10k processor. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/665/CS-TR-78-665.pdf %R CS-TR-78-666 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The SCALD physical design subsystem %A McWilliams, Thomas M. %A Widdoes, Lawrence C., Jr. %D March 1978 %X The SCALD physical design subsystem is described. SCALD supports the automatic construction of ECL-10k logic on wire wrap cards from the output of a hierarchical design system. Results of its use in the design of an operational 15-MIPS 5500-chip processor are presented and discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/666/CS-TR-78-666.pdf %R CS-TR-78-668 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T BAOBAB, a parser for a rule-based system using a semantic grammar %A Bonnet, Alain %D September 1978 %X Until a knowledge-based system is able to learn by itself, it must acquire new knowledge and new heuristics from human experts. This is traditionally done with the aid of a computer programmer acting as intermediary. The direct transfer of knowledge from an expert to the system requires a natural-language processor capable of handling a substantial subset of English. The development of such a natural-language processor is a long-term goal of automating knowledge acquisition; facilitating the interface between the expert and the system is a first step toward this goal. This paper describes BAOBAB, a program designed and implemented for MYCIN (Shortliffe 1974), a medical consultation system for infectious disease diagnosis and therapy selection. BAOBAB is concerned with the problem of parsing - recognizing natural language sentences and encoding them into MYCIN's internal representation. For this purpose, it uses a semantic grammar in which the non-terminal symbols denote semantic categories (e.g., infections and symptoms), or conceptual categories which are common tools of knowledge representation in artificial intelligence (e.g., attributes, objects, values and predicate functions). This differs from a syntactic grammar in which non-terminal symbols are syntactic elements such as nouns or verbs.
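To make the semantic-grammar idea of CS-TR-78-668 above concrete, here is a minimal sketch in Python; the grammar, the category names (FINDING, ORGANISM, SITE), and the vocabulary are hypothetical stand-ins, not BAOBAB's actual grammar or MYCIN's internal representation.

    # Toy semantic grammar: non-terminals are semantic categories,
    # not syntactic ones such as noun phrase or verb.
    SEMANTIC_GRAMMAR = {
        "FINDING":  [["ORGANISM", "grew", "from", "SITE"]],
        "ORGANISM": [["e.coli"], ["pseudomonas"]],
        "SITE":     [["blood"], ["urine"]],
    }

    def parse(category, words):
        """Parse a word list as an instance of a semantic category.
        Returns a dict of category bindings, or None if nothing matches."""
        for production in SEMANTIC_GRAMMAR[category]:
            bindings = {}
            i = 0
            matched = True
            for symbol in production:
                if symbol in SEMANTIC_GRAMMAR:
                    # Semantic non-terminal; in this toy, each spans one word.
                    if i < len(words) and parse(symbol, words[i:i + 1]) is not None:
                        bindings[symbol] = words[i]
                        i += 1
                    else:
                        matched = False
                        break
                elif i < len(words) and words[i] == symbol:
                    i += 1          # literal word in the production
                else:
                    matched = False
                    break
            if matched and i == len(words):
                return bindings
        return None

    print(parse("FINDING", "e.coli grew from blood".split()))
    # -> {'ORGANISM': 'e.coli', 'SITE': 'blood'}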
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/668/CS-TR-78-668.pdf %R CS-TR-78-670 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Information bounds are weak in the shortest distance problem %A Graham, Ronald L. %A Yao, Andrew C. %A Yao, F. Frances %D September 1978 %X In the all-pair shortest distance problem, one computes the matrix D = ($d_{ij}$) where $d_{ij}$ is the minimum weighted length of any path from vertex i to vertex j in a directed complete graph with a weight on each edge. In all the known algorithms, a shortest path $p_{ij}$ achieving $d_{ij}$ is also implicitly computed. In fact, $\log_3$ f(n) is an information-theoretic lower bound where f(n) is the total number of distinct patterns ($p_{ij}$) for n-vertex graphs. As f(n) potentially can be as large as $2^{n^3}$, one might hope that a non-trivial lower bound can be derived this way in the decision tree model. We study the characterization and enumeration of realizable patterns, and show that f(n) $\leq C^{n^2}$. Thus no lower bound greater than C$n^2$ can be derived from this approach. We prove as a corollary that the Triangular polyhedron $T^{(n)}$, defined in $E^{(n\choose 2)}$ by $d_{ij} \geq 0$ and the triangle inequalities $d_{ij} + d_{jk} \geq d_{ik}$, has at most $C^{n^2}$ faces of all dimensions, thus resolving an open question in a similar information bound approach to the shortest distance problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/670/CS-TR-78-670.pdf %R CS-TR-78-673 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A numerical library and its support %A Chan, Tony F. %A Coughran, William M., Jr. %A Grosse, Eric H. %A Heath, Michael T. %D November 1978 %X Reflecting on four years of numerical consulting at the Stanford Linear Accelerator Center, we point out solved and outstanding problems in selecting and installing mathematical software, helping users, maintaining the library and monitoring its use, and managing the consulting operation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/673/CS-TR-78-673.pdf %R CS-TR-78-674 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finite element approximation and iterative solution of a class of mildly non-linear elliptic equations %A Chan, Tony F. %A Glowinski, Roland %D November 1978 %X We describe in this report the numerical analysis of a particular class of nonlinear Dirichlet problems. We consider an equivalent variational inequality formulation on which the problems of existence, uniqueness and approximation are easier to discuss. We prove in particular the convergence of an approximation by piecewise linear finite elements. Finally, we describe and compare several iterative methods for solving the approximate problems and particularly some new algorithms of augmented Lagrangian type, which contain as special cases some well-known alternating direction methods. Numerical results are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/674/CS-TR-78-674.pdf %R CS-TR-78-678 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reasoning about recursively defined data structures %A Oppen, Derek C. %D July 1978 %X A decision algorithm is given for the quantifier-free theory of recursively defined data structures which, for a conjunction of length n, decides its satisfiability in time linear in n.
The first-order theory of recursively defined data structures, in particular the first-order theory of LISP list structure (the theory of CONS, CAR and CDR), is shown to be decidable but not elementary recursive. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/678/CS-TR-78-678.pdf %R CS-TR-78-679 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Steplength algorithms for minimizing a class of nondifferentiable functions %A Murray, Walter %A Overton, Michael L. %D November 1978 %X Four steplength algorithms are presented for minimizing a class of nondifferentiable functions which includes functions arising from $\ell_1$ and $\ell_\infty$ approximation problems and penalty functions arising from constrained optimization problems. Two algorithms are given for the case when derivatives are available wherever they exist and two for the case when they are not available. We take the view that although a simple steplength algorithm may be all that is required to meet convergence criteria for the overall algorithm, from the point of view of efficiency it is important that the step achieve as large a reduction in the function value as possible, given a certain limit on the effort to be expended. The algorithms include the facility for varying this limit, producing anything from an algorithm requiring a single function evaluation to one doing an exact linear search. They are based on univariate minimization algorithms which we present first. These are normally at least quadratically convergent when derivatives are used and superlinearly convergent otherwise, regardless of whether or not the function is differentiable at the minimum. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/679/CS-TR-78-679.pdf %R CS-TR-78-680 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography of Stanford Computer Science reports, 1963-1978 %A Stanley, Connie J. %D November 1978 %X This report lists, in chronological order, all reports published by the Stanford Computer Science Department since 1963. Each report is identified by Computer Science number, author's name, title, National Technical Information Service (NTIS) retrieval number, date, and number of pages. Complete listings of Theses, Artificial Intelligence Memos, and Heuristic Programming Reports are given in the Appendix. Also, for the first time, each report has been marked as to its availability for ordering and the cost if applicable. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/680/CS-TR-78-680.pdf %R CS-TR-78-683 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Storing a sparse table %A Tarjan, Robert Endre %D December 1978 %X The problem of storing and searching large sparse tables arises in compiling and in other areas of computer science. The standard technique for storing such tables is hashing, but hashing has poor worst-case performance. We consider good worst-case methods for storing a table of n entries, each an integer between 0 and N-1. For dynamic tables, in which look-ups and table additions are intermixed, the use of a trie requires O(kn) storage and allows O($\log_k$(N/n)) worst-case access time, where k is an arbitrary parameter. For static tables, in which the entire table is constructed before any look-ups are made, we propose a method which requires O(n $\log^{(\ell)}$ n) storage and allows O($\ell \log_n N$) access time, where $\ell$ is an arbitrary parameter.
Choosing $\ell$ = $\log^* n$ gives a method with O(n) storage and O(($\log^* n$)($\log_n N$)) access time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/683/CS-TR-78-683.pdf %R CS-TR-78-684 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The matrix inverse eigenvalue problem for periodic Jacobi matrices %A Boley, Daniel L. %A Golub, Gene H. %D December 1978 %X A stable numerical algorithm is presented for generating a periodic Jacobi matrix from two sets of eigenvalues and the product of the off-diagonal elements of the matrix. The algorithm requires a simple generalization of the Lanczos algorithm. It is shown that the matrix is not unique, but the algorithm will generate all possible solutions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/684/CS-TR-78-684.pdf %R CS-TR-78-687 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Prolegomena to a theory of formal reasoning %A Weyhrauch, Richard W. %D December 1978 %X This paper is an introduction to the mechanization of a theory of reasoning. Currently, formal systems are out of favor with the AI community. The aim of this paper is to explain how formal systems can be used in AI by explaining how traditional ideas of logic can be mechanized in a practical way. The paper presents several new ideas. Each of these is illustrated by giving simple examples of how this idea is mechanized in the reasoning system FOL. That is, this is not just theory but there is an existing running implementation of these ideas. In this paper: 1) we show how to mechanize the notion of model using the idea of a simulation structure and explain why this is particularly important to AI, 2) we show how to mechanize the notion of satisfaction, 3) we present a very general evaluator for first order expressions, which subsumes PROLOG and which we propose as a natural way of thinking about logic programming, 4) we show how to formalize metatheory, 5) we describe reflection principles, which connect theories to their metatheories in a way new to AI, 6) we show how these ideas can be used to dynamically extend the strength of FOL by "implementing" subsidiary deduction rules, and how this in turn can be extended to provide a method of describing and proving theorems about heuristics for using these rules, 7) we discuss one notion of what it could mean for a computer to learn and give an example, 8) we describe a new kind of formal system that has the property that it can reason about its own properties, 9) we give examples of all of the above. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/687/CS-TR-78-687.pdf %R CS-TR-78-689 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An $n^{\log n}$ algorithm for the two-variable-per-constraint linear programming satisfiability problem %A Nelson, Charles Gregory %D November 1978 %X A simple algorithm is described which determines the satisfiability over the reals of a conjunction of linear inequalities, none of which contains more than two variables. In the worst case the algorithm requires time O($mn^{\lceil \log_2 n \rceil + 3}$), where n is the number of variables and m the number of inequalities. Several considerations suggest that the algorithm may be useful in practice: it is simple to implement, it is fast for some important special cases, and if the inequalities are satisfiable it provides valuable information about their solution set.
The algorithm is particularly suited to applications in mechanical program verification. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/689/CS-TR-78-689.pdf %R CS-TR-78-690 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A deductive approach to program synthesis %A Manna, Zohar %A Waldinger, Richard J. %D November 1978 %X Program synthesis is the systematic derivation of a program from a given specification. A deductive approach to program synthesis is presented for the construction of recursive programs. This approach regards program synthesis as a theorem-proving task and relies on a theorem-proving method that combines the features of transformation rules, unification, and mathematical induction within a single framework. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/690/CS-TR-78-690.pdf %R CS-TR-78-693 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A class of solutions to the gossip problem %A West, Douglas B. %D November 1978 %X We characterize and count optimal solutions to the gossip problem in which no one hears his own information. That is, we consider graphs with n vertices where the edges have a linear ordering such that an increasing path exists from each vertex to every other, but there is no increasing path from any vertex to itself. Such graphs exist only when n is even, in which case the minimum number of edges is 2n-4, as in the original gossip problem. We characterize optimal solutions of this sort (NOHO-graphs) using a correspondence with a set of permutations and binary sequences. This correspondence enables us to count these solutions and several subclasses of solutions. The numbers of solutions in each class are simple powers of 2 and 3, with exponents determined by n. We also show constructively that NOHO-graphs are planar and Hamiltonian, and we mention applications to related problems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/693/CS-TR-78-693.pdf %R CS-TR-78-694 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computer science at Stanford, 1977-1978 %A King, Jonathan J. %D November 1978 %X This is a review of research and teaching in the Stanford Computer Science Department during the 1977-1978 academic year. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/694/CS-TR-78-694.pdf %R CS-TR-78-699 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SACON: a knowledge-based consultant for structural analysis %A Bennett, James %A Creary, Lewis %A Engelmore, Robert S. %A Melosh, Robert %D September 1978 %X In this report we describe an application of artificial intelligence (AI) methods to structural analysis. We describe the development and (partial) implementation of an "automated consultant" to advise non-expert engineers in the use of a general-purpose structural analysis program. The analysis program numerically simulates the behavior of a physical structure subjected to various mechanical loading conditions. The automated consultant, called SACON (Structural Analysis CONsultant), is based on a version of the MYCIN program [Shortliffe, 1974], originally developed to advise physicians in the diagnosis and treatment of infectious diseases. The domain-specific knowledge in MYCIN is represented as situation-action rules, and is kept independent of the "inference engine" that uses the rules.
By substituting structural engineering knowledge for the medical knowledge, the program was converted easily from the domain of infectious diseases to the domain of structural analysis. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/699/CS-TR-78-699.pdf %R CS-TR-78-702 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An O($n\cdot I \log^2 I$) maximum-flow algorithm %A Shiloach, Yossi %D December 1978 %X We present in this paper a new algorithm to find a maximum flow in a flow-network which has n vertices and m edges in time O($n\cdot I \log^2 I$), where I = m+n is the input size (up to a constant factor). This result improves the previous upper bound of Z. Galil [1978] which was O($I^{7/3}$) in the worst case. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/702/CS-TR-78-702.pdf %R CS-TR-78-663 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Software restyling in graphics and programming languages %A Grosse, Eric H. %D September 1978 %X The value of large software products can be cheaply increased by adding restyled interfaces that attract new users. As examples of this approach, a set of graphics primitives and a language precompiler for scientific computation are described. These two systems include a general user-defined coordinate system instead of numerous system settings, indention to specify block structure, a modified indexing convention for array parameters, a syntax for n-and-a-half-times-'round loops, and engineering format for real constants; most of all, they strive to be as small as possible. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/663/CS-TR-78-663.pdf %R CS-TR-78-697 %Z Thu, 22 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the linear least squares problem with a quadratic constraint %A Gander, Walter %D November 1978 %X In this paper we present the theory and practical computational aspects of the linear least squares problem with a quadratic constraint. New theorems characterizing properties of the solutions are given and extended for the problem of minimizing a general quadratic function subject to a quadratic constraint. For two important regularization methods we formulate dual equations which proved to be very useful for applications to the smoothing of data. The resulting algorithm is a numerically stable version of an algorithm proposed by Rutishauser. We show also how to choose a third order iteration method to solve the secular equations. However, we are still far from a foolproof, machine-independent algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/78/697/CS-TR-78-697.pdf %R CS-TR-79-703 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A polynomial time algorithm for solving systems of linear inequalities with two variables per inequality %A Aspvall, Bengt %A Shiloach, Yossi %D January 1979 %X We present a constructive algorithm for solving systems of linear inequalities (LI) with at most two variables per inequality. The algorithm is polynomial in the size of the input. The LI problem is of importance in complexity theory since it is polynomial time equivalent to linear programming. The subclass of LI treated in this paper is also of practical interest in mechanical verification systems, and we believe that the ideas presented can be extended to the general LI problem.
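The subclass of LI treated in CS-TR-79-703 above contains, as a still more special case, the classical difference constraints $x_i - x_j \leq c$, which reduce to a shortest-path computation. The Python sketch below checks satisfiability of that special case by Bellman-Ford; it is illustrative only and is not the paper's algorithm, which handles arbitrary coefficients on the two variables.

    def difference_constraints_satisfiable(n, constraints):
        """constraints: list of (i, j, c) meaning x_i - x_j <= c, 0 <= i, j < n.
        Returns a satisfying assignment as a list of floats, or None."""
        # Virtual source n with 0-weight edges to every variable, so every
        # vertex is reachable; each constraint x_i - x_j <= c becomes an
        # edge j -> i of weight c.
        edges = [(n, v, 0.0) for v in range(n)]
        edges += [(j, i, float(c)) for (i, j, c) in constraints]
        dist = [float("inf")] * (n + 1)
        dist[n] = 0.0
        for _ in range(n):                    # |V| - 1 = n relaxation passes
            for u, v, w in edges:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
        for u, v, w in edges:                 # further relaxation => negative cycle
            if dist[u] + w < dist[v]:
                return None                   # system is unsatisfiable
        return dist[:n]

    # x0 - x1 <= 1, x1 - x2 <= -2, x2 - x0 <= 2: satisfiable.
    print(difference_constraints_satisfiable(3, [(0, 1, 1), (1, 2, -2), (2, 0, 2)]))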
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/703/CS-TR-79-703.pdf %R CS-TR-79-704 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A survey of the state of software for partial differential equations %A Sweet, Roland A. %D January 1979 %X This paper surveys the state of general purpose software for the solution of partial differential equations. A discussion of the purported capabilities of twenty-one programs is presented. No testing of the routines was performed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/704/CS-TR-79-704.pdf %R CS-TR-79-706 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Graph 2-isomorphism is NP-complete %A Yao, F. Frances %D January 1979 %X Two graphs G and G' are said to be k-isomorphic if their edge sets can be partitioned into E(G) = $E_1 \cup E_2 \cup ... \cup E_k$ and E(G') = ${E'}_1 \cup {E'}_2 \cup ... \cup {E'}_k$ such that as graphs, $E_i$ and ${E'}_i$ are isomorphic for $1 \leq i \leq k$. In this note we show that it is NP-complete to decide whether two graphs are 2-isomorphic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/706/CS-TR-79-706.pdf %R CS-TR-79-707 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem-solving seminar %A Van Wyk, Christopher J. %A Knuth, Donald E. %D January 1979 %X This report contains edited transcripts of the discussions held in Stanford's course CS 204, Problem Seminar, during autumn quarter 1978. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms came up during the discussions, these notes may be of interest to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world." %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/707/CS-TR-79-707.pdf %R CS-TR-79-708 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An analysis of a memory allocation scheme for implementing stacks %A Yao, Andrew C. %D January 1979 %X Consider the implementation of two stacks by letting them grow towards each other in a table of size m. Suppose a random sequence of insertions and deletions is executed, with each instruction having a fixed probability p (0 < p < 1/2) to be a deletion. Let $A_p(m)$ denote the expected value of max{x,y}, where x and y are the stack heights when the table first becomes full. We shall prove that, as $m \rightarrow \infty$, $A_p(m) = m/2 + \sqrt{m/(2 \pi (1-2p))} + O((\log m)/\sqrt{m})$. This gives a solution to an open problem in Knuth [The Art of Computer Programming, Vol. 1, Exercise 2.2.2-13]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/708/CS-TR-79-708.pdf %R CS-TR-79-710 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical computation of the Schwarz-Christoffel transformation %A Trefethen, Lloyd N. %D March 1979 %X A program is described which computes Schwarz-Christoffel transformations that map the unit disk conformally onto the interior of a bounded or unbounded polygon in the complex plane. The inverse map is also computed. The computational problem is approached by setting up a nonlinear system of equations whose unknowns are essentially the "accessory parameters" $z_k$. This system is then solved with a packaged subroutine.
New features of this work include the evaluation of integrals within the disk rather than along the boundary, making possible the treatment of unbounded polygons; the use of a compound form of Gauss-Jacobi quadrature to evaluate the Schwarz-Christoffel integral, making possible high accuracy at reasonable cost; and the elimination of constraints in the nonlinear system by a simple change of variables. Schwarz-Christoffel transformations may be applied to solve the Laplace and Poisson equations and related problems in two-dimensional domains with irregular or unbounded (but not curved or multiply connected) geometries. Computational examples are presented. The time required to solve the mapping problem is roughly proportional to $N^3$, where N is the number of vertices of the polygon. A typical set of computations to 8-place accuracy with $N \leq 10$ takes 1 to 10 seconds on an IBM 370/168. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/710/CS-TR-79-710.pdf %R CS-TR-79-712 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The errata of computer programming %A Knuth, Donald E. %D January 1979 %X This report lists all corrections and changes of Volumes 1 and 3 of "The Art of Computer Programming," as of January 5, 1979. This updates the previous list in report CS551, May 1976. The second edition of Volume 2 has been delayed two years due to the fact that it was completely revised and put into the TEX typesetting language; since publication of this new edition is not far off, no changes to Volume 2 are listed here. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/712/CS-TR-79-712.pdf %R CS-TR-79-714 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T PCFORT: a Fortran-to-Pcode translator %A Castaneda, Fernando %A Chow, Frederick C. %A Nye, Peter %A Sleator, Daniel D. %A Wiederhold, Gio %D January 1979 %X PCFORT is a compiler for the FORTRAN language designed to fit as a building block into a PASCAL-oriented environment. It forms part of the programming systems being developed for the S-1 multiprocessor. It is written in PASCAL, and generates P-code, an intermediate language used by transportable PASCAL compilers to represent the program in a simple form. P-code is either compiled or interpreted depending upon the objectives of the programming system. A PASCAL-written FORTRAN compiler provides a bridge between the FORTRAN and PASCAL communities. The implementation allows PASCAL and FORTRAN generated code to be combined into one program. The FORTRAN language supported here is FORTRAN to the full 1966 standard, extended with those features commonly expected by available large scientific programs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/714/CS-TR-79-714.pdf %R CS-TR-79-715 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T S-1 architecture manual %A Hailpern, Brent T. %A Hitson, Bruce L. %D January 1979 %X This manual provides a complete description of the instruction-set architecture of the S-1 Uniprocessor (Mark IIA), exclusive of vector operations. It is assumed that the reader has a general knowledge of computer architecture. The manual was designed to be both a detailed introduction to the S-1 and an architecture reference manual. Also included are user manuals for the FASM Assembler and the S-1 Formal Description Syntax.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/715/CS-TR-79-715.pdf %R CS-TR-79-716 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A framework for control in production systems %A Georgeff, Michael P. %D January 1979 %X A formal model for representing control in production systems is defined. The formalism allows control to be directly specified independently of the conflict resolution scheme, and thus allows the issues of control and nondeterminism to be treated separately. Unlike previous approaches, it allows control to be examined within a uniform and consistent framework. It is shown that the formalism provides a basis for implementing control constructs which, unlike existing schemes, retain all the properties desired of a knowledge based system --- modularity, flexibility, extensibility and explanatory capacity. Most importantly, it is shown that these properties are not a function of the lack of control constraints, but of the type of information allowed to establish these constraints. Within the formalism it is also possible to provide a meaningful notion of the power of control constructs. This enables the types of control required in production systems to be examined and the capacity of various schemes to meet these requirements to be determined. Schemes for improving system efficiency and resolving nondeterminism are examined, and devices for representing such meta-level knowledge are described. In particular, the objectification of control information is shown to provide a better paradigm for problem solving and for talking about problem solving. It is also shown that the notion of control provides a basis for a theory of transformation of production systems, and that this provides a uniform and consistent approach to problems involving subgoal protection. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/716/CS-TR-79-716.pdf %R CS-TR-79-718 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T AL users' manual %A Mujtaba, Mohamed Shahid %A Goldman, Ron %D January 1979 %X This document describes the current state of the AL system now in operation at the Stanford Artificial Intelligence Laboratory, and teaches the reader how to use it. The system consists of AL, a high-level programming language for manipulator control useful in industrial assembly research; POINTY, an interactive system for specifying representation of parts; and ALAID, an interactive debugger for AL. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/718/CS-TR-79-718.pdf %R CS-TR-79-719 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Extrapolation of asymptotic expansions by a modified Aitken $\delta^2$-formula %A Bjorstad, Petter %A Dahlquist, Germund %A Grosse, Eric H. %D March 1979 %X A modified Aitken formula permits iterated extrapolations to efficiently estimate $s_\infty$ from $s_n$ when an asymptotic expansion $s_n = s_\infty + n^{-k} (c_0 + c_1 n^{-1} + c_2 n^{-2} + ... )$ holds for some (unknown) coefficients $c_j$. We study the truncation and irregular error and compare the method with other forms of extrapolation. 
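For orientation, the classical Aitken $\delta^2$ formula that CS-TR-79-719 above modifies can be iterated as in the Python sketch below; the report's modified formula and its treatment of truncation and irregular error are not reproduced here.

    def aitken(s):
        """One pass of the classical Aitken delta-squared transformation:
        s'_n = s_{n+2} - (s_{n+2} - s_{n+1})**2 / (s_{n+2} - 2*s_{n+1} + s_n)."""
        out = []
        for a, b, c in zip(s, s[1:], s[2:]):
            denom = c - 2.0 * b + a
            out.append(c if denom == 0.0 else c - (c - b) ** 2 / denom)
        return out

    # Example: partial sums of the Leibniz series 1 - 1/3 + 1/5 - ... -> pi/4.
    s, total = [], 0.0
    for n in range(12):
        total += (-1) ** n / (2.0 * n + 1.0)
        s.append(total)
    while len(s) >= 3:       # iterate the extrapolation
        s = aitken(s)
    print(s[-1])             # approximately pi/4 = 0.7853981...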
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/719/CS-TR-79-719.pdf %R CS-TR-79-720 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On grid optimization for boundary value problems %A Glowinski, Roland %D February 1979 %X We discuss in this report the numerical procedures which can be used to obtain the optimal grid when solving by a finite element method a model boundary value problem of elliptic type modelling the potential flow of an incompressible inviscid fluid. Results of numerical experiments are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/720/CS-TR-79-720.pdf %R CS-TR-79-721 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On fault-tolerant networks for sorting %A Yao, Andrew C. %A Yao, F. Frances %D February 1979 %X The study of constructing reliable systems from unreliable components goes back to the work of von Neumann, and of Moore and Shannon. The present paper studies the use of redundancy to enhance reliability for sorting and related networks built from unreliable comparators. Two models of fault-tolerant networks are discussed. The first model patterns after the concept of error-correcting codes in information theory, and the other follows the stochastic criterion used by von Neumann and Moore-Shannon. It is shown, for example, that an additional k(2n-3) comparators are sufficient to render a sorting network reliable, provided that no more than k of its comparators may be faulty. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/721/CS-TR-79-721.pdf %R CS-TR-79-722 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A structural model for database systems %A Wiederhold, Gio %A El-Masri, Ramez A. %D February 1979 %X This report presents a model to be used for database design. Because our motivation extends to providing guidance for the structured implementation of a database, we call our model the 'Structural Model.' We derive the design using criteria of correctness, relevance, and performance from semantic and operational specifications obtained from multiple sources. These sources typically correspond to prospective users or user groups of the database. The integration of such specifications is a central issue in the development of an integrated structural database model. The structural model is used for the design of the logical structures that represent a real-world situation. However, it is not meant to represent all possible real-world semantics, but a subset of the semantics which are important in database modelling. The model uses relations as building blocks, and hence can be considered as an extension of Codd's relational model [1970]. The main extensions to the relational model are the explicit representation of logical connections between relations, the inclusion of insertion-deletion constraints in the model itself, and the separation of relations into several structural types. Connections between relations are used to represent existence dependencies of tuples in different relations. These existence dependencies are important for the definition of semantics of relationships between classes of real-world entities. The connections between relations are used to specify these existence dependencies, and to ensure that they remain valid when the database is updated. 
Hence, connections implicitly define a basic, limited set of integrity constraints on the database, those that identify and maintain existence dependencies among tuples from different relations. Consequently, the rules for the maintenance of the structural integrity of the model under insertion and deletion of tuples are easy to specify. Structural relation types are used to specify how each relation may be connected to other relations in the model. Relations are classified into five types: primary relations, referenced relations, nest relations, association relations, and lexicon relations. The motivation behind the choice of these relation types is discussed, as is their use in data model design. A methodology for combining multiple, overlapping data models - also called user views in the literature - is associated with the structural model. The database model, or conceptual schema, which represents the integrated database, may thus be derived from the individual data models of the users. We believe that the structural model can be used to represent the data relationships within the conceptual schema of the ANSI/SPARC DBMS model since it can support database submodels, also called external schema, and maintain the integrity of the submodels with respect to the integrity constraints expressible in the structural model. We then briefly discuss the use of the structural model in database design and implementation. The structural model provides a tool to deal effectively with the complexity of large, real-world databases. We begin this report with a very short review of existing database models. In Chapter 2, we state the purpose of the model, and in Chapter 3 we describe the structural model, first informally and then using a formal framework based on extensions of the relational model. Chapter 4 defines the representations we use, and Chapter 5 covers the integration of data models that represent the different user specifications into an integrated database model. Formal descriptions and examples of the prevalent cases are given. The work is then placed into context first relative to other work (Chapter 6) and then briefly within our methodology for database design (Chapter 7). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/722/CS-TR-79-722.pdf %R CS-TR-79-726 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An analysis of (h,k,l)-shellsort %A Yao, Andrew Chi-Chih %D March 1979 %X One classical sorting algorithm, whose performance in many cases remains unanalyzed, is Shellsort. Let $\vec{h}$ be a t-component vector of positive integers. An $\vec{h}$-Shellsort will sort any given n elements in t passes, by means of comparisons and exchanges of elements. Let $S_j$($\vec{h}$;n) denote the average number of element exchanges in the j-th pass, assuming that all the n! initial orderings are equally likely. In this paper we derive asymptotic formulas of $S_j$($\vec{h}$;n) for any fixed $\vec{h}$ = (h,k,l), making use of a new combinatorial interpretation of $S_3$. For the special case $\vec{h}$ = (3,2,1), the analysis is further sharpened to yield exact expressions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/726/CS-TR-79-726.pdf %R CS-TR-79-728 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Union-member algorithms for non-disjoint sets %A Shiloach, Yossi %D January 1979 %X In this paper we deal with the following problem.
We are given a finite set U = {$u_1$,...,$u_n$} and a set $\mathcal{S}$ = {$S_1$,...,$S_m$} of subsets of U. We are also given m-1 UNION instructions that have the form UNION($S_i$,$S_j$) and mean "add the set $S_i \cup S_j$ to the collection and delete $S_i$ and $S_j$." Interspersed among the UNIONs are MEMBER(i,j) questions that mean "does $u_i$ belong to $S_j$?" We present two algorithms that exhibit the trade-off among the three interesting parameters of this problem, which are: 1. Time required to answer one membership question. 2. Time required to perform the m-1 UNIONs altogether. 3. Space. We also give an application of these algorithms to the problem of 5-coloring of planar graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/728/CS-TR-79-728.pdf %R CS-TR-79-729 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A unified approach to path problems %A Tarjan, Robert Endre %D April 1979 %X We describe a general method for solving path problems on directed graphs. Such path problems include finding shortest paths, solving sparse systems of linear equations, and carrying out global flow analysis of computer programs. Our method consists of two steps. First, we construct a collection of regular expressions representing sets of paths in the graph. This can be done by using any standard algorithm, such as Gaussian or Gauss-Jordan elimination. Next, we apply a natural mapping from regular expressions into the given problem domain. We exhibit the mappings required to find shortest paths, solve sparse systems of linear equations, and carry out global flow analysis. Our results provide a general-purpose algorithm for solving any path problem, and show that the problem of constructing path expressions is in some sense the most general path problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/729/CS-TR-79-729.pdf %R CS-TR-79-730 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Qualifying examinations in computer science, 1965-1978 %A Liang, Frank M. %D April 1979 %X Since 1965, the Stanford Computer Science Department has periodically given "qualifying examinations" as one of the requirements of its graduate program. These examinations are given in each of six subareas of computer science: Programming Languages and Systems, Artificial Intelligence, Numerical Analysis, Computer Design, Theory of Computation, and Analysis of Algorithms. This report presents the questions from these examinations, and also the associated reading lists.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/731/CS-TR-79-731.pdf %R CS-TR-79-732 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Notes on introductory combinatorics %A Woods, Donald R. %D April 1979 %X In the spring of 1978, Professors George Polya and Robert Tarjan teamed up to teach CS 150 - Introduction to Combinatorics. This report consists primarily of the class notes and other handouts produced by the author as teaching assistant for the course. Among the topics covered are elementary subjects such as combinations and permutations, mathematical tools such as generating functions and Polya's Theory of Counting, and analyses of specific problems such as Ramsey Theory, matchings, and Hamiltonian and Eulerian paths. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/732/CS-TR-79-732.pdf %R CS-TR-79-733 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A lower bound to finding convex hulls %A Yao, Andrew Chi-Chih %D April 1979 %X Given a set S of n distinct points {($x_i$,$y_i$) | 0 $\leq$ i < n}, the convex hull problem is to determine the vertices of the convex hull H(S). All the known algorithms for solving this problem have a worst-case running time of cn log n or higher, and employ only quadratic tests, i.e., tests of the form f($x_0$, $y_0$, $x_1$, $y_1$,...,$x_{n-1}$, $y_{n-1}$): 0 with f being any polynomial of degree not exceeding 2. In this paper, we show that any algorithm in the quadratic decision-tree model must make cn log n tests for some input. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/733/CS-TR-79-733.pdf %R CS-TR-79-734 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast algorithms for solving path problems %A Tarjan, Robert Endre %D April 1979 %X Let G = (V,E) be a directed graph with a distinguished source vertex s. The single-source path expression problem is to find, for each vertex v, a regular expression P(s,v) which represents the set of all paths in G from s to v. A solution to this problem can be used to solve shortest path problems, solve sparse systems of linear equations, and carry out global flow analysis. We describe a method to compute path expressions by dividing G into components, computing path expressions on the components by Gaussian elimination, and combining the solutions. This method requires O(m $\alpha$(m,n)) time on a reducible flow graph, where n is the number of vertices in G, m is the number of edges in G, and $\alpha$ is a functional inverse of Ackermann's function. The method makes use of an algorithm for evaluating functions defined on paths in trees. A simplified version of the algorithm, which runs in O(m log n) time on reducible flow graphs, is quite easy to implement and efficient in practice. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/734/CS-TR-79-734.pdf %R CS-TR-79-735 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Kronecker's canonical form and the QZ algorithm %A Wilkinson, James Hardy %D April 1979 %X In the QZ algorithm the eigenvalues of Ax = $\lambda$Bx are computed via a reduction to the form $\tilde{A}$x = $\lambda \tilde{B}$x where $\tilde{A}$ and $\tilde{B}$ are upper triangular. The eigenvalues are given by ${\lambda}_i$ = $a_{ii}$/$b_{ii}$. 
It is shown that when the pencil $\tilde{A}$ - $\lambda \tilde{B}$ is singular or nearly singular a value of ${\lambda}_i$ may have no significance even when $\tilde{a}_{ii}$ and $\tilde{b}_{ii}$ are of full size. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/735/CS-TR-79-735.pdf %R CS-TR-79-736 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Note on the practical significance of the Drazin inverse %A Wilkinson, James Hardy %D April 1979 %X The solution of the differential system B$\dot{x}$ = Ax + f where A and B are n x n matrices and A - $\lambda$B is not a singular pencil may be expressed in terms of the Drazin inverse. It is shown that there is a simple reduced form for the pencil A - $\lambda$B which is adequate for the determination of the general solution and that although the Drazin inverse could be determined efficiently from this reduced form it is inadvisable to do so. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/736/CS-TR-79-736.pdf %R CS-TR-79-737 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the average-case complexity of selecting the k-th best %A Yao, Andrew C. %A Yao, F. Frances %D April 1979 %X Let ${\bar{V}}_k$(n) be the minimum average number of pairwise comparisons needed to find the k-th largest of n numbers (k $\geq$ 2), assuming that all n! orderings are equally likely. D. W. Matula proved that, for some absolute constant c, ${\bar{V}}_k$(n)-n $\leq$ c k log log n as n $\rightarrow \infty$. In the present paper, we show that there exists an absolute constant c' > 0 such that ${\bar{V}}_k$(n)-n $\geq$ c' k log log n as n $\rightarrow \infty$, proving a conjecture of Matula. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/737/CS-TR-79-737.pdf %R CS-TR-79-738 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computations related to G-stability of linear multistep methods %A LeVeque, Randall J. %A Dahlquist, Germund %A Andree, Dan %D May 1979 %X In Dahlquist's recent proof of the equivalence of A-stability and G-stability, an algorithm was presented for calculating a G-stability matrix for any A-stable linear multistep method. Such matrices, and various quantities computable from them, are useful in many aspects of the study of the stability of a given method. For example, information may be gained as to the shape of the stability region, or the rate of growth of unstable solutions. We present a summary of the relevant theory and the results of some numerical calculations performed for several backward differentiation, Adams-Bashforth, and Adams-Moulton methods of low order.
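CS-TR-79-738 above concerns quantities, such as the shape of the stability region, computable from a G-stability matrix. A much simpler standard device for visualizing a stability region is the boundary locus $h\lambda = \rho(e^{i\theta})/\sigma(e^{i\theta})$; the Python sketch below traces it for the two-step Adams-Bashforth method ($\rho(z) = z^2 - z$, $\sigma(z) = (3z-1)/2$) and is illustrative only, not the report's G-stability computation.

    import cmath

    # Boundary locus of a linear multistep method with characteristic
    # polynomials rho and sigma: the boundary of the stability region is
    # traced by h*lambda = rho(z) / sigma(z) for z on the unit circle.
    def boundary_locus(rho, sigma, samples=8):
        points = []
        for k in range(samples):
            z = cmath.exp(2j * cmath.pi * k / samples)
            points.append(rho(z) / sigma(z))
        return points

    # Two-step Adams-Bashforth: rho(z) = z^2 - z, sigma(z) = (3z - 1)/2.
    rho = lambda z: z * z - z
    sigma = lambda z: (3.0 * z - 1.0) / 2.0
    for hl in boundary_locus(rho, sigma):
        print(f"{hl.real:+.4f} {hl.imag:+.4f}")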
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/739/CS-TR-79-739.pdf %R CS-TR-79-740 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The logic of aliasing %A Cartwright, Robert %A Oppen, Derek C. %D September 1979 %X We give a new version of Hoare's logic which correctly handles programs with aliased variables. The central proof rules of the logic (procedure call and assignment) are proved sound and complete. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/740/CS-TR-79-740.pdf %R CS-TR-79-748 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast algorithms for solving Toeplitz systems of equations and finding rational Hermite interpolants %A Yun, David Y. Y. %D July 1979 %X We present a new algorithm that reduces the computation for solving a Toeplitz system to O(n $\log^2$ n) and automatically resolves all degenerate cases of the past. Our fundamental results show that all rational Hermite interpolants, including Padé approximants, which are intimately related to this solution process, can be computed fast by an Euclidean algorithm. In this report we bring out all these relationships with mathematical justifications and mention important applications including decoding BCH codes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/748/CS-TR-79-748.pdf %R CS-TR-79-753 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Should tables be sorted? %A Yao, Andrew Chi-Chih %D July 1979 %X We examine optimality questions in the following information retrieval problem: Given a set S of n keys, store them so that queries of the form "Is x $\in$ S?" can be answered quickly. It is shown that, in a rather general model including all the commonly-used schemes, $\lceil$ lg(n+1) $\rceil$ probes to the table are needed in the worst case, provided the key space is sufficiently large. The effects of smaller key space and arbitrary encoding are also explored. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/753/CS-TR-79-753.pdf %R CS-TR-79-759 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Schema-shift strategies to understanding structured texts in natural language %A Bonnet, Alain %D August 1979 %X This report presents BAOBAB-2, a computer program built upon MYCIN [Shortliffe, 1974] that is used for understanding medical summaries describing the status of patients. Due both to the conventional way physicians present medical problems in these summaries and the constrained nature of medical jargon, these texts have a very strong structure. BAOBAB-2 takes advantage of this structure by using a model of this organization as a set of related schemas that facilitate the interpretation of these texts. Structures of the schemas and their relation to the surface structure are described. Issues relating to selection and use of these schemas by the program during interpretation of the summaries are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/759/CS-TR-79-759.pdf %R CS-TR-79-760 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some monotonicity properties of partial orders %A Graham, Ronald L. %A Yao, Andrew C. %A Yao, F. Frances %D September 1979 %X A fundamental quantity which arises in the sorting of n numbers $a_1$, $a_2$,..., $a_n$ is Pr($a_i$ < $a_j$ | P), the probability that $a_i$ < $a_j$ assuming that all linear extensions of the partial order P are equally likely.
In this paper we establish various properties of Pr($a_i$ < $a_j$ | P) and related quantities. In particular, it is shown that Pr($a_i$ < $b_j$ | P') $\geq$ Pr($a_i$ < $b_j$ | P), if the partial order P consists of two disjoint linearly ordered sets A = {$a_1$ < $a_2$ < ... < $a_m$}, B = {$b_1$ < $b_2$ < ... < $b_n$} and P' = P $\cup$ {any relations of the form $a_k$ < $b_l$}. These inequalities have applications in determining the complexity of certain sorting-like computations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/760/CS-TR-79-760.pdf %R CS-TR-79-761 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Gossiping without duplicate transmissions %A West, Douglas B. %D August 1979 %X n people have distinct bits of information, which they communicate via telephone calls in which they transmit everything they know. We require that no one ever hear the same piece of information twice. In the case 4 divides n, n $\geq$ 8, we provide a construction that transmits all information using only 9n/4-6 calls. Previous constructions used 1/2 n log n calls. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/761/CS-TR-79-761.pdf %R CS-TR-79-762 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T METAFONT: a system for alphabet design %A Knuth, Donald E. %D September 1979 %X This is the user's manual for METAFONT, a companion to the TEX typesetting system. The system makes it fairly easy to define high quality fonts of type in a machine-independent manner; a user writes "programs" in a new language developed for this purpose. By varying parameters of a design, an unlimited number of typefaces can be obtained from a single set of programs. The manual also sketches the algorithms used by the system to draw the character shapes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/762/CS-TR-79-762.pdf %R CS-TR-79-763 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A symmetric chain decomposition of L(4,n) %A West, Douglas B. %D August 1979 %X L(m,n) is the set of integer m-tuples ($a_1$,...,$a_m$) with $0\leq a_1 \leq ...\leq a_m \leq n$, ordered by $\underline{a} \leq \underline{b}$ when $a_i\leq b_i$ for all i. R. Stanley conjectured that L(m,n) is a symmetric chain order for all (m,n). We verify this by construction for m = 4. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/763/CS-TR-79-763.pdf %R CS-TR-79-764 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the time-space tradeoff for sorting with linear queries %A Yao, Andrew Chi-Chih %D August 1979 %X Extending a result of Borodin, et al., we show that any branching program using linear queries "$\sum_i \lambda_i x_i : c$" to sort n numbers $x_1$,$x_2$,...,$x_n$ must satisfy the time-space tradeoff relation TS = $\Omega(n^2)$. The same relation is also shown to be true for branching programs that use queries "min R = ?" where R is any subset of {$x_1$,$x_2$,...,$x_n$}. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/764/CS-TR-79-764.pdf %R CS-TR-79-765 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Relation between the complexity and the probability of large numbers %A Gacs, Peter %D September 1979 %X H(x), the negative logarithm of the a priori probability M(x), is Levin's variant of Kolmogorov's complexity of a natural number x.
Let $\alpha (n)$ be the minimum complexity of a number larger than n, s(n) the logarithm of the a priori probability of obtaining a number larger than n. It was known that $s(n) \leq \alpha(n) \leq s(n) + H(\lceil s(n) \rceil)$. We show that the second estimate is in some sense sharp. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/765/CS-TR-79-765.pdf %R CS-TR-79-767 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On Stewart's singular value decomposition for partitioned orthogonal matrices %A Van Loan, Charles %D September 1979 %X A variant of the singular value decomposition for orthogonal matrices due to G. W. Stewart is discussed. It is shown to be useful in the analysis of (a) the total least squares problem, (b) the Golub-Klema-Stewart subset selection algorithm, and (c) the algebraic Riccati equation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/767/CS-TR-79-767.pdf %R CS-TR-79-770 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pretty printing %A Oppen, Derek C. %D October 1979 %X An algorithm for pretty printing is given. For an input stream of length n and an output device with margin width m, the algorithm requires time O(n) and space O(m). The algorithm is described in terms of two parallel processes; the first scans the input stream to determine the space required to print logical blocks of tokens; the second uses this information to decide where to break lines of text; the two processes communicate by means of a buffer of size O(m). The algorithm does not wait for the entire stream to be input, but begins printing as soon as it has received a linefull of input. The algorithm is easily implemented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/770/CS-TR-79-770.pdf %R CS-TR-79-773 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Updating formulae and a pairwise algorithm for computing sample variances %A Chan, Tony F. %A Golub, Gene H. %A LeVeque, Randall J. %D November 1979 %X A general formula is presented for computing the sample variance for a sample of size m + n given the means and variances for two subsamples of sizes m and n. This formula is used in the construction of a pairwise algorithm for computing the variance. Other applications are discussed as well, including the use of updating formulae in a parallel computing environment. We present numerical results and rounding error analyses for several numerical schemes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf %R CS-TR-79-774 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Large scale geodetic least squares adjustment by dissection and orthogonal decomposition %A Golub, Gene H. %A Plemmons, Robert J. %D November 1979 %X Very large scale matrix problems currently arise in the context of accurately computing the coordinates of points on the surface of the earth. Here geodesists adjust the approximate values of these coordinates by computing least squares solutions to large sparse systems of equations which result from relating the coordinates to certain observations such as distances or angles between points. The purpose of this paper is to suggest an alternative to the formation and solution of the normal equations for these least squares adjustment problems. 
In particular, it is shown how a block-orthogonal decomposition method can be used in conjunction with a nested dissection scheme to produce an algorithm for solving such problems which combines efficient data management with numerical stability. As an indication of the magnitude that these least squares adjustment problems can sometimes attain, the forthcoming readjustment of the North American Datum in 1983 by the National Geodetic Survey is discussed. Here it becomes necessary to linearize and solve an overdetermined system of approximately 6,000,000 equations in 400,000 unknowns - a truly large-scale matrix problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/774/CS-TR-79-774.pdf %R CS-TR-79-775 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The analysis of sequential experiments with feedback to subjects %A Diaconis, Persi %A Graham, Ronald L. %D November 1979 %X A problem arising in taste testing, medical, and parapsychology experiments can be modeled as follows. A deck of n cards contains $c_i$ cards labeled i, $1 \leq i \leq r$. A subject guesses at the cards sequentially. After each guess the subject is told the card just guessed (or at least whether the guess was correct). We determine the optimal and worst case strategies for subjects and the distribution of the number of correct guesses under these strategies. We show how to use skill scoring to evaluate such experiments in a way which (asymptotically) does not depend on the subject's strategy. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/775/CS-TR-79-775.pdf %R CS-TR-79-777 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On constant weight codes and harmonious graphs %A Graham, Ronald L. %A Sloane, Neil J. A. %D December 1979 %X Very recently a new method has been developed for finding lower bounds on the maximum number of codewords possible in a code of minimum distance d and length n. This method has led in turn to a number of interesting questions in graph theory and additive number theory. In this brief survey we summarize some of these developments. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/777/CS-TR-79-777.pdf %R CS-TR-79-778 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A hierarchical associative architecture for the parallel evaluation of relational algebraic database primitives %A Shaw, David Elliot %D October 1979 %X Algorithms are described and analyzed for the efficient evaluation of the primitive operators of a relational algebra on a proposed non-von Neumann machine based on a hierarchy of associative storage devices. This architecture permits an O(log n) decrease in time complexity over the best known evaluation methods on a conventional computer system, without the use of redundant storage, and using currently available and potentially competitive technology. In many cases of practical import, the proposed architecture may also permit a significant improvement (by a factor roughly proportional to the capacity of the primary associative storage device) over the performance of previously implemented or proposed database machine architectures based on associative secondary storage devices. 
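For readers unfamiliar with the primitives in question, they can be stated as short sequential programs; the Python sketch below renders two of them, selection and natural join, using an illustrative list-of-dicts representation that is ours rather than the report's (the report evaluates these operators on associative hardware).

    # Sequential renderings of two relational-algebra primitives.
    # Representation and names are illustrative only.

    def select(relation, predicate):
        """sigma_p(R): keep the tuples satisfying the predicate."""
        return [t for t in relation if predicate(t)]

    def natural_join(r, s):
        """R join S: combine tuples agreeing on all shared attributes."""
        shared = set(r[0]) & set(s[0]) if r and s else set()
        index = {}
        for t in s:  # hash one relation on the shared attributes
            index.setdefault(tuple(t[a] for a in shared), []).append(t)
        return [{**t, **u}
                for t in r
                for u in index.get(tuple(t[a] for a in shared), [])]

    emps = [{"dept": 1, "name": "a"}, {"dept": 2, "name": "b"}]
    depts = [{"dept": 1, "floor": 3}]
    print(natural_join(select(emps, lambda t: t["dept"] == 1), depts))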
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/778/CS-TR-79-778.pdf %R CS-TR-79-781 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Exploring the use of domain knowledge for query processing efficiency %A King, Jonathan J. %D December 1979 %X An approach to query optimization is described that draws on two sources of knowledge: real world constraints on the values for the application domain served by the database; and knowledge about the current structure of the database and the cost of available retrieval processes. Real world knowledge is embodied in rules that are much like semantic integrity rules. The approach, called "query rephrasing", is to generate semantic equivalents of user queries that cost less to process than the original queries. The operation of a prototype system based on this approach is discussed in the context of simple queries which restrict a single file. The need for heuristics to limit the generation of equivalent queries is also discussed, and a method using "constraint thresholds" derived from a model of the retrieval process is proposed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/781/CS-TR-79-781.pdf %R CS-TR-79-816 %Z Mon, 19 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automating the study of clinical hypotheses on a time-oriented data base: the RX Project %A Blum, Robert L. %D November 1979 %X The existence of large chronic disease data bases offers the possibility of studying hypotheses of major medical importance. An objective of the RX Project is to assist a clinical researcher with the tasks of experimental design and statistical analysis. A major component of RX is a knowledge base of medicine and statistics, organized as a frame-based, taxonomic tree. RX determines confounding variables, study design, and analytic techniques. It then gathers data, analyzes it, and interprets results. The American Rheumatism Association Medical Information System is used. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/816/CS-TR-79-816.pdf %R CS-TR-77-588 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On computing the singular value decomposition %A Chan, Tony Fan C. %D February 1977 %X The most well-known and widely-used algorithm for computing the Singular Value Decomposition (SVD) of an m x n rectangular matrix A nowadays is the Golub-Reinsch algorithm [1971]. In this paper, it is shown that by (1) first triangularizing the matrix A by Householder transformations before bidiagonalizing it, and (2) accumulating some left transformations on an n x n array instead of on an m x n array, the resulting algorithm is often more efficient than the Golub-Reinsch algorithm, especially for matrices with considerably more rows than columns (m >> n), such as in least squares applications. The two algorithms are compared in terms of operation counts, and computational experiments that have been carried out verify the theoretical comparisons. The modified algorithm is more efficient even when m is only slightly greater than n, and in some cases can achieve as much as 50% savings when m >> n. If accumulation of left transformations is desired, then $n^2$ extra storage locations are required (relatively small if m >> n), but otherwise no extra storage is required. The modified algorithm uses only orthogonal transformations and is therefore numerically stable. 
In the Appendix, we give the Fortran code of a hybrid method which automatically selects the more efficient of the two algorithms to use depending upon the input values for m and n. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/588/CS-TR-77-588.pdf %R CS-TR-77-589 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A knowledge-based system for the interpretation of protein x-ray crystallographic data %A Engelmore, Robert S. %A Nii, H. Penny %D February 1977 %X The broad goal of this project is to develop intelligent computational systems to infer the three-dimensional structures of proteins from x-ray crystallographic data. The computational systems under development use both formal and judgmental knowledge from experts to select appropriate procedures and to constrain the space of plausible protein structures. The hypothesis generating and testing procedures operate upon a variety of representations of the data, and work with several different descriptions of the structure being inferred. The system consists of a number of independent but cooperating knowledge sources which propose, augment and verify a solution to the problem as it is incrementally generated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/589/CS-TR-77-589.pdf %R CS-TR-77-593 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Explanation capabilities of production-based consultation systems %A Scott, A. Carlisle %A Clancey, William J. %A Davis, Randall %A Shortliffe, Edward H. %D February 1977 %X A computer program that models an expert in a given domain is more likely to be accepted by experts in that domain, and by non-experts seeking its advice, if the system can explain its actions. An explanation capability not only adds to the system's credibility, but also enables the non-expert user to learn from it. Furthermore, clear explanations allow an expert to check the system's "reasoning", possibly discovering the need for refinements and additions to the system's knowledge base. In a developing system, an explanation capability can be used as a debugging aid to verify that additions to the system are working as they should. This paper discusses the general characteristics of explanation systems: what types of explanations they should be able to give, what types of knowledge will be needed in order to give these explanations, and how this knowledge might be organized. The explanation facility in MYCIN is discussed as an illustration of how the various problems might be approached. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/593/CS-TR-77-593.pdf %R CS-TR-77-596 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A review of knowledge based problem solving as a basis for a genetics experiment designing system %A Stefik, Mark J. %A Martin, Nancy %D February 1977 %X It is generally accepted that problem solving systems require a wealth of domain specific knowledge for effective performance in complex domains. This report takes the view that all domain specific knowledge should be expressed in a knowledge base. With this in mind, the ideas and techniques from problem solving and knowledge base research are reviewed and outstanding problems are identified. 
Finally, a task domain is characterized in terms of objects, actions, and control/strategy knowledge and suggestions are made for creating a uniform knowledge base management system to be used for knowledge acquisition, problem solving, and explanation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/596/CS-TR-77-596.pdf %R CS-TR-77-597 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Model-directed learning of production rules %A Buchanan, Bruce G. %A Mitchell, Tom M. %D March 1977 %X The Meta-DENDRAL program is described in general terms that are intended to clarify the similarities to, and differences from, other learning programs. Its approach of model-directed heuristic search through a complex space of possible rules appears well suited to many induction tasks. The use of a strong model of the domain to direct the rule search has been demonstrated for rule formation in two areas of chemistry. The high performance of programs which use the generated rules attests to the success of this learning strategy. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/597/CS-TR-77-597.pdf %R CS-TR-77-602 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The numerically stable reconstruction of a Jacobi matrix from spectral data %A Boor, Carl de %A Golub, Gene H. %D March 1977 %X We show how to construct, from certain spectral data, a discrete inner product for which the associated sequence of monic orthogonal polynomials coincides with the sequence of appropriately normalized characteristic polynomials of the left principal submatrices of the Jacobi matrix. The generation of these orthogonal polynomials via their three term recurrence relation, as popularized by Forsythe, then provides a stable means of computing the entries of the Jacobi matrix. The resulting algorithm might be of help in the approximate solution of inverse eigenvalue problems for Sturm-Liouville equations. Our construction provides, incidentally, very simple proofs of known results concerning existence and uniqueness of a Jacobi matrix satisfying given spectral data and its continuous dependence on that data. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/602/CS-TR-77-602.pdf %R CS-TR-77-603 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Reference machines require non-linear time to maintain disjoint sets %A Tarjan, Robert Endre %D March 1977 %X This paper describes a machine model intended to be useful in deriving realistic complexity bounds for tasks requiring list processing. As an example of the use of the model, the paper shows that any such machine requires non-linear time in the worst case to compute unions of disjoint sets on-line. All set union algorithms known to me are instances of the model and are thus subject to the derived bound. One of the known algorithms achieves the bound to within a constant factor. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/603/CS-TR-77-603.pdf %R CS-TR-77-604 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Control of the dissipativity of Lax-Wendroff type methods for first order systems of hyperbolic equations %A Chan, Tony Fan C. %A Oliger, Joseph %D March 1977 %X Lax-Wendroff methods for hyperbolic systems have two characteristics which are sometimes troublesome. 
They are sometimes too dissipative -- they may smooth the solution excessively -- and their dissipative behavior does not affect all modes of the solution equally. Both of these difficulties can be remedied by adding properly chosen accretive terms. We develop modifications of the Lax-Wendroff method which equilibrate the dissipativity over the fundamental modes of the solution and allow the magnitude of the dissipation to be controlled. We show that these methods are stable for the mixed initial boundary value problem and develop analogous formulations for the two-step Lax-Wendroff and MacCormack methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/604/CS-TR-77-604.pdf %R CS-TR-77-605 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A model for learning systems %A Smith, Reid G. %A Mitchell, Tom M. %A Chestek, Richard A. %A Buchanan, Bruce G. %D March 1977 %X A model for learning systems is presented, and representative AI, pattern recognition, and control systems are discussed in terms of its framework. The model details the functional components felt to be essential for any learning system, independent of the techniques used for its construction, and the specific environment in which it operates. These components are performance element, instance selector, critic, learning element, blackboard, and world model. Consideration of learning system design leads naturally to the concept of a layered system, each layer operating at a different level of abstraction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/605/CS-TR-77-605.pdf %R CS-TR-77-606 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming and problem-solving seminar %A Clancy, Michael J. %A Knuth, Donald E. %D April 1977 %X This report contains edited transcripts of the discussions held in Stanford's course CS 204, Problem Seminar, during autumn quarter 1976. Since the topics span a large range of ideas in computer science, and since most of the important research paradigms and programming paradigms came up during the discussions, the notes may be of use to graduate students of computer science at other universities, as well as to their professors and to professional people in the "real world". %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/606/CS-TR-77-606.pdf %R CS-TR-77-607 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Specifications and proofs for abstract data types in concurrent programs %A Owicki, Susan S. %D April 1977 %X Shared abstract data types, such as queues and buffers, are useful tools for building well-structured concurrent programs. This paper presents a method for specifying shared types in a way that simplifies concurrent program verification. The specifications describe the operations of the shared type in terms of their effect on variables of the process invoking the operation. This makes it possible to verify the processes independently, reducing the complexity of the proof. The key to defining such specifications is the concept of a private variable: a variable which is part of a shared object but belongs to just one process. Shared types can be implemented using an extended form of monitors; proof rules are given for verifying that a monitor correctly implements its specifications. Finally, it is shown how concurrent programs can be verified using the specifications of their shared types. 
The specification and proof techniques are illustrated with a number of examples involving a shared bounded buffer. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/607/CS-TR-77-607.pdf %R CS-TR-77-609 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity of combinatorial algorithms %A Tarjan, Robert Endre %D April 1977 %X This paper examines recent work on the complexity of combinatorial algorithms, highlighting the aims of the work, the mathematical tools used, and the important results. Included are sections discussing ways to measure the complexity of an algorithm, methods for proving that certain problems are very hard to solve, tools useful in the design of good algorithms, and recent improvements in algorithms for solving ten representative problems. The final section suggests some directions for future research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/609/CS-TR-77-609.pdf %R CS-TR-77-611 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The logic of computer programming %A Manna, Zohar %A Waldinger, Richard J. %D August 1977 %X Techniques derived from mathematical logic promise to provide an alternative to the conventional methodology for constructing, debugging, and optimizing computer programs. Ultimately, these techniques are intended to lead to the automation of many of the facets of the programming process. This paper provides a unified tutorial exposition of the logical techniques, illustrating each with examples. The strengths and limitations of each technique as a practical programming aid are assessed and attempts to implement these methods in experimental systems are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/611/CS-TR-77-611.pdf %R CS-TR-77-614 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The convergence of functions to fixedpoints of recursive definitions %A Manna, Zohar %A Shamir, Adi %D May 1977 %X The classical method for constructing the least fixedpoint of a recursive definition is to generate a sequence of functions whose initial element is the totally undefined function and which converges to the desired least fixedpoint. This method, due to Kleene, cannot be generalized to allow the construction of other fixedpoints. In this paper we present an alternate definition of convergence and a new fixedpoint access method of generating sequences of functions for a given recursive definition. The initial function of the sequence can be an arbitrary function, and the sequence will always converge to a fixedpoint that is "close" to the initial function. This defines a monotonic mapping from the set of partial functions onto the set of all fixedpoints of the given recursive definition. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/614/CS-TR-77-614.pdf %R CS-TR-77-615 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical methods for the first biharmonic equation and for the two-dimensional Stokes problem %A Glowinski, Roland %A Pironneau, Olivier %D May 1977 %X We describe in this report various methods, iterative and "almost direct," for solving the first biharmonic problem on general two-dimensional domains once the continuous problem has been approximated by an appropriate mixed finite element method. 
Using the approach described in this report we recover some well known methods for solving the first biharmonic equation as a system of coupled harmonic equations, but some of the methods discussed here are completely new, including a conjugate gradient type algorithm. In the last part of this report we discuss the extension of the above methods to the numerical solution of the two dimensional Stokes problem in p-connected domains (p $\geq$ 1) through the stream function-vorticity formulation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/615/CS-TR-77-615.pdf %R CS-TR-77-616 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stability of the Fourier method %A Kreiss, Heinz-Otto %A Oliger, Joseph %D August 1977 %X In this paper we develop a stability theory for the Fourier (or pseudo-spectral) method for linear hyperbolic and parabolic partial differential equations with variable coefficients. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/616/CS-TR-77-616.pdf %R CS-TR-77-618 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A production system for automatic deduction %A Nilsson, Nils J. %D August 1977 %X A new predicate calculus deduction system based on production rules is proposed. The system combines several developments in Artificial Intelligence and Automatic Theorem Proving research including the use of domain-specific inference rules and separate mechanisms for forward and backward reasoning. It has a clean separation between the data base, the production rules, and the control system. Goals and subgoals are maintained in an AND/OR tree to represent assertions. The production rules modify these structures until they "connect" in a fashion that proves the goal theorem. Unlike some previous systems that used production rules, ours is not limited to rules in Horn Clause form. Unlike previous PLANNER-like systems, ours can handle the full range of predicate calculus expressions including those with quantified variables, disjunctions and negations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/618/CS-TR-77-618.pdf %R CS-TR-77-619 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Time-space trade-offs in a pebble game %A Paul, Wolfgang J. %A Tarjan, Robert Endre %D July 1977 %X A certain pebble game on graphs has been studied in various contexts as a model for the time and space requirements of computations. In this note it is shown that there exists a family of directed acyclic graphs $G_n$ and constants $c_1$, $c_2$, $c_3$ such that (1) $G_n$ has n nodes and each node in $G_n$ has indegree at most 2. (2) Each graph $G_n$ can be pebbled with $c_1\sqrt{n}$ pebbles in n moves. (3) Each graph $G_n$ can also be pebbled with $c_2\sqrt{n}$ pebbles, $c_2$ < $c_1$, but every strategy which achieves this has at least $2^{c_3\sqrt{n}}$ moves. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/619/CS-TR-77-619.pdf %R CS-TR-77-621 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The art of artificial intelligence: I. Themes and case studies of knowledge engineering %A Feigenbaum, Edward A. %D August 1977 %X The knowledge engineer practices the art of bringing the principles and tools of AI research to bear on difficult applications problems requiring experts' knowledge for their solution. 
The technical issues of acquiring this knowledge, representing it, and using it appropriately to construct and explain lines-of-reasoning, are important problems in the design of knowledge-based systems. Various systems that have achieved expert level performance in scientific and medical inference illuminate the art of knowledge engineering and its parent science, Artificial Intelligence. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/621/CS-TR-77-621.pdf %R CS-TR-77-624 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Recent research in computer science. %A McCarthy, John %A Binford, Thomas O. %A Green, Cordell C. %A Luckham, David C. %A Manna, Zohar %A Winograd, Terry A. %A Earnest, Lester D. %D June 1977 %X This report summarizes recent accomplishments in six related areas: (1) basic AI research and formal reasoning, (2) image understanding, (3) mathematical theory of computation, (4) program verification, (5) natural language understanding, and (6) knowledge based programming. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/624/CS-TR-77-624.pdf %R CS-TR-77-625 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A fast merging algorithm %A Brown, Mark R. %A Tarjan, Robert Endre %D August 1977 %X We give an algorithm which merges sorted lists represented as balanced binary trees. If the lists have lengths m and n (m $\leq$ n), then the merging procedure runs in O(m log(n/m)) steps, which is the same order as the lower bound on all comparison-based algorithms for this problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/625/CS-TR-77-625.pdf %R CS-TR-77-626 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the loop switching addressing problem %A Yao, Andrew Chi-Chih %D October 1977 %X The following graph addressing problem was studied by Graham and Pollak in devising a routing scheme for Pierce's Loop Switching Network. Let G be a graph with n vertices. It is desired to assign to each vertex $v_i$ an address in $\{0,1,*\}^\ell$, such that the Hamming distance between the addresses of any two vertices agrees with their distance in G. Let N(G) be the minimum length $\ell$ for which an assignment is possible. It was shown by Graham and Pollak that $N(G) \leq m_G(n-1)$, where $m_G$ is the diameter of G. In the present paper, we shall prove that $N(G) \leq 1.09(\lg m_G)n + 8n$ by an explicit construction. This shows in particular that any graph has an addressing scheme of length O(n log n). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/626/CS-TR-77-626.pdf %R CS-TR-77-627 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A separator theorem for planar graphs %A Lipton, Richard J. %A Tarjan, Robert Endre %D October 1977 %X Let G be any n-vertex planar graph. We prove that the vertices of G can be partitioned into three sets A,B,C such that no edge joins a vertex in A with a vertex in B, neither A nor B contains more than 2n/3 vertices, and C contains no more than $2\sqrt{2}\sqrt{n}$ vertices. We exhibit an algorithm which finds such a partition A,B,C in O(n) time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/627/CS-TR-77-627.pdf %R CS-TR-77-628 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Applications of a planar separator theorem %A Lipton, Richard J. 
%A Tarjan, Robert Endre %D October 1977 %X Any n-vertex planar graph has the property that it can be divided into components of roughly equal size by removing only O($\sqrt{n}$) vertices. This separator theorem, in combination with a divide-and-conquer strategy, leads to many new complexity results for planar graph problems. This paper describes some of these results. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/628/CS-TR-77-628.pdf %R CS-TR-77-629 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The complexity of pattern matching for a random string %A Yao, Andrew Chi-Chih %D October 1977 %X We study the average-case complexity of finding all occurrences of a given pattern $\alpha$ in an input text string. Over an alphabet of q symbols, let c($\alpha$,n) be the minimum average number of characters that need to be examined in a random text string of length n. We prove that, for large m, almost all patterns $\alpha$ of length m satisfy c($\alpha$,n) = $\Theta(\lceil \log_q((n-m)/\ln m + 2) \rceil)$ if $m \leq n \leq 2m$, and c($\alpha$,n) = $\Theta((\lceil \log_q m \rceil / m) n)$ if n > 2m. This in particular confirms a conjecture raised in a recent paper by Knuth, Morris, and Pratt [1977]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/629/CS-TR-77-629.pdf %R CS-TR-77-631 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Inference rules for program annotation %A Dershowitz, Nachum %A Manna, Zohar %D October 1977 %X Methods are presented whereby an Algol-like program, given together with its specifications, can be documented automatically. The program is incrementally annotated with invariant relationships that hold between program variables at intermediate points in the program and explain the actual workings of the program regardless of whether the program is correct. Thus this documentation can be used for proving the correctness of the program or may serve as an aid in the debugging of an incorrect program. The annotation techniques are formulated as Hoare-like inference rules which derive invariants from the assignment statements, from the control structure of the program, or, heuristically, from suggested invariants. The application of these rules is demonstrated by two examples which have run on an experimental implementation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/631/CS-TR-77-631.pdf %R CS-TR-77-634 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A new proof of global convergence for the tridiagonal QL algorithm %A Hoffmann, Walter %A Parlett, Beresford N. %D October 1977 %X By exploiting the relation of the QL algorithm to inverse iteration we obtain a proof of global convergence which is more conceptual and less computational than previous analyses. The proof uses a new, but simple, error estimate for the first step of inverse iteration. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/634/CS-TR-77-634.pdf %R CS-TR-77-635 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A block Lanczos method to compute the singular values and corresponding singular vectors of a matrix %A Golub, Gene H. %A Luk, Franklin T. %A Overton, Michael L. %D October 1977 %X We present a block Lanczos method to compute the largest singular values and corresponding left and right singular vectors of a large sparse matrix. 
Our algorithm does not transform the matrix A but accesses it only through a user-supplied routine which computes AX or $A^t$X for a given matrix X. This paper also includes a thorough discussion of the various ways to compute the singular value decomposition of a banded upper triangular matrix; this problem arises as a subproblem to be solved during the block Lanczos procedure. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/635/CS-TR-77-635.pdf %R CS-TR-77-636 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T $C^m$ convergence of trigonometric interpolants %A Bube, Kenneth P. %D October 1977 %X For m $\geq$ 0, we obtain sharp estimates of the uniform accuracy of the m-th derivative of the n-point trigonometric interpolant of a function for two classes of periodic functions on R. As a corollary, the n-point interpolant of a function in $C^k$ uniformly approximates the function to order o($n^{1/2-k}$), improving the recent estimate of O($n^{1-k}$). These results remain valid if we replace the trigonometric interpolant by its K-th partial sum, replacing n by K in the estimates. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/636/CS-TR-77-636.pdf %R CS-TR-77-637 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the gap structure of sequences of points on a circle %A Ramshaw, Lyle H. %D November 1977 %X Considerable mathematical effort has gone into studying sequences of points in the interval (0,1) which are evenly distributed, in the sense that certain intervals contain roughly the correct percentages of the first n points. This paper explores the related notion in which a sequence is evenly distributed if its first n points split a given circle into intervals which are roughly equal in length, regardless of their relative positions. The sequence $x_k$ = ($\log_2$(2k-1) mod 1) was introduced in this context by de Bruijn and Erdős. We will see that the gap structure of this sequence is uniquely optimal in a certain sense, and optimal under a wide class of measures. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/637/CS-TR-77-637.pdf %R CS-TR-77-638 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A generalized conjugate gradient algorithm for solving a class of quadratic programming problems %A O'Leary, Dianne Prost %D December 1977 %X In this paper we apply matrix splitting techniques and a conjugate gradient algorithm to the problem of minimizing a convex quadratic form subject to upper and lower bounds on the variables. This method exploits sparsity structure in the matrix of the quadratic form. Choices of the splitting operator are discussed and convergence results are established. We present the results of numerical experiments showing the effectiveness of the algorithm on free boundary problems for elliptic partial differential equations, and we give comparisons with other algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/638/CS-TR-77-638.pdf %R CS-TR-77-639 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On program synthesis knowledge %A Green, Cordell C. %A Barstow, David R. %D November 1977 %X This paper presents a body of program synthesis knowledge dealing with array operations, space reutilization, the divide and conquer paradigm, conversion from recursive paradigms to iterative paradigms, and ordered set enumerations. 
Such knowledge can be used for the synthesis of efficient and in-place sorts including quicksort, mergesort, sinking sort, and bubble sort, as well as other ordered set operations such as set union, element removal, and element addition. The knowledge is explicated to a level of detail such that it is possible to codify this knowledge as a set of program synthesis rules for use by a computer-based synthesis system. The use and content of this set of programming rules is illustrated herein by the methodical synthesis of bubble sort, sinking sort, quicksort, and mergesort. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/639/CS-TR-77-639.pdf %R CS-TR-77-640 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Structured programming with recursion %A Manna, Zohar %A Waldinger, Richard J. %D January 1978 %X No abstract available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/640/CS-TR-77-640.pdf %R CS-TR-77-642 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On constructing minimum spanning trees in k-dimensional spaces and related problems %A Yao, Andrew Chi-Chih %D December 1977 %X The problem of finding a minimum spanning tree connecting n points in a k-dimensional space is discussed under three common distance metrics -- Euclidean, rectilinear, and $L_\infty$. By employing a subroutine that solves the post office problem, we show that, for fixed k $\geq$ 3, such a minimum spanning tree can be found in time O($n^{2-a(k)} {(log n)}^{1-a(k)}$), where a(k) = $2^{-(k+1)}$. The bound can be improved to O(${(n log n)}^{1.8}$) for points in the 3-dimensional Euclidean space. We also obtain o($n^2$) algorithms for finding a farthest pair in a set of n points and for other related problems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/642/CS-TR-77-642.pdf %R CS-TR-77-645 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Generalized nested dissection %A Lipton, Richard J. %A Rose, Donald J. %A Tarjan, Robert Endre %D December 1977 %X J. A. George has discovered a method, called nested dissection, for solving a system of linear equations defined on an n = k $\times$ k square grid in O(n log n) space and O($n^{3/2}$) time. We generalize this method without degrading the time and space bounds so that it applies to any system of equations defined on a planar or almost-planar graph. Such systems arise in the solution of two-dimensional finite element problems. Our method uses the fact that planar graphs have good separators. More generally, we show that sparse Gaussian elimination is efficient for any class of graphs which have good separators, and conversely that graphs without good separators (including almost all sparse graphs) are not amenable to sparse Gaussian elimination. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/645/CS-TR-77-645.pdf %R CS-TR-77-646 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fast decision algorithms based on congruence closure %A Nelson, Charles Gregory %A Oppen, Derek C. %D February 1978 %X We define the notion of the 'congruence closure' of a relation on a graph and give a simple algorithm for computing it. We then give decision procedures for the quantifier-free theory of equality and the quantifier-free theory of LISP list structure, both based on this algorithm. 
The procedures are fast enough to be practical in mechanical theorem proving: each procedure determines the satisfiability of a conjunction of length n of literals in time O($n^2$). We also show that if the axiomatization of the theory of list structure is changed slightly, the problem of determining the satisfiability of a conjunction of literals becomes NP-complete. We have implemented the decision procedures in our simplifier for the Stanford Pascal Verifier. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/646/CS-TR-77-646.pdf %R CS-TR-77-647 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A lower bound to palindrome recognition by probabilistic Turing machines %A Yao, Andrew Chi-Chih %D December 1977 %X We call attention to the problem of proving lower bounds on probabilistic Turing machine computations. It is shown that any probabilistic Turing machine recognizing the language L = {w$\phi$w | w $\in \{0,1\}^*$} with error $\lambda$ < 1/2 must take $\Omega$(n log n) time. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/647/CS-TR-77-647.pdf %R CS-TR-77-432 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A users manual for FOL. %A Weyhrauch, Richard W. %D July 1977 %X This manual explains how to use the proof checker FOL, and supersedes all previous manuals. FOL checks proofs of a natural deduction style formulation of first order functional calculus with equality augmented in the following ways: (i) it is a many-sorted first-order logic in which a partial order over the sorts may be specified; (ii) conditional expressions are allowed for forming terms; (iii) axiom schemata with predicate and function parameters are allowed; (iv) purely propositional deductions can be made in a single step; (v) a partial model of the language can be built in a LISP environment and some deductions can be made by direct computation in this model; (vi) there is a limited ability to make metamathematical arguments; (vii) there are many operational conveniences. A major goal of FOL is to create an environment where formal proofs can be carefully examined with the eventual aim of designing practical tools for manipulating proofs in pure mathematics and about the correctness of programs. This includes checking proofs generated by other programs. FOL is also a research tool in modeling common-sense reasoning including reasoning about knowledge and belief. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/77/432/CS-TR-77-432.pdf %R CS-TR-76-533 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations %A Concus, Paul %A Golub, Gene H. %A O'Leary, Dianne Prost %D January 1976 %X We consider a generalized conjugate gradient method for solving sparse, symmetric, positive-definite systems of linear equations, principally those arising from the discretization of boundary value problems for elliptic partial differential equations. The method is based on splitting off from the original coefficient matrix a symmetric, positive-definite one that corresponds to a more easily solvable system of equations, and then accelerating the associated iteration using conjugate gradients. Optimality and convergence properties are presented, and the relation to other methods is discussed. 
Several splittings for which the method seems particularly effective are also discussed, and for some, numerical examples are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/533/CS-TR-76-533.pdf %R CS-TR-76-535 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A generalized conjugate gradient method for nonsymmetric systems of linear equations %A Concus, Paul %A Golub, Gene H. %D January 1976 %X We consider a generalized conjugate gradient method for solving systems of linear equations having nonsymmetric coefficient matrices with positive-definite symmetric part. The method is based on splitting the matrix into its symmetric and skew-symmetric parts, and then accelerating the associated iteration using conjugate gradients, which simplifies in this case, as only one of the two usual parameters is required. The method is most effective for cases in which the symmetric part of the matrix corresponds to an easily solvable system of equations. Convergence properties are discussed, as well as an application to the numerical solution of elliptic partial differential equations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/535/CS-TR-76-535.pdf %R CS-TR-76-540 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Addition chains with multiplicative cost %A Graham, Ronald L. %A Yao, Andrew Chi-Chih %A Yao, F. Frances %D January 1976 %X If each step in an addition chain is assigned a cost equal to the product of the numbers added at that step, "binary" addition chains are shown to minimize total cost. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/540/CS-TR-76-540.pdf %R CS-TR-76-542 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The theoretical aspects of the optimal fixedpoint %A Manna, Zohar %A Shamir, Adi %D March 1976 %X In this paper we define a new type of fixedpoint of recursive definitions and investigate some of its properties. This optimal fixedpoint (which always uniquely exists) contains, in some sense, the maximal amount of "interesting" information which can be extracted from the recursive definition, and it may be strictly more defined than the program's least fixedpoint. This fixedpoint can be the basis for assigning a new semantics to recursive programs. This is a modified and extended version of part 1 of a paper presented at the Symposium on Theory of Computing, Albuquerque, New Mexico (May 1975). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/542/CS-TR-76-542.pdf %R CS-TR-76-543 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimal polyphase sorting %A Zave, Derek A. %D March 1976 %X A read-forward polyphase merge algorithm is described which performs the polyphase merge starting from an arbitrary string distribution. This algorithm minimizes the volume of information moved. Since this volume is easily computed, it is possible to construct dispersion algorithms which anticipate the merge algorithm. Two such dispersion techniques are described. The first algorithm requires that the number of strings to be dispersed be known in advance; this algorithm is optimal. The second algorithm makes no such requirement, but is not always optimal. In addition, performance estimates are derived and both algorithms are shown to be asymptotically optimal. 
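The perfect string distributions that such dispersion algorithms anticipate follow a generalized Fibonacci pattern and can be generated by running merge phases backwards from the final single-run state; the Python sketch below does this for an arbitrary number of tapes (a sketch of the classical construction only, with variable names of our choosing, not the report's algorithm).

    # Generate perfect polyphase run distributions by reversing merge
    # phases, starting from the final state of one sorted run.  For
    # three tapes the run totals follow the Fibonacci numbers.
    def perfect_distributions(tapes, levels):
        dist = [1] + [0] * (tapes - 1)
        out = [tuple(dist)]
        for _ in range(levels):
            j = dist.index(max(dist))     # output tape of the last phase
            m = dist[j]                   # number of runs it produced
            dist = [c + m for c in dist]  # un-merge: inputs regain m runs
            dist[j] = 0                   # the output tape was empty before
            out.append(tuple(dist))
        return out

    for d in perfect_distributions(3, 6):
        print(d, sum(d))   # totals: 1, 2, 3, 5, 8, 13, 21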
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/543/CS-TR-76-543.pdf %R CS-TR-76-544 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Removing trivial assignments from programs %A Mont-Reynaud, Bernard %D March 1976 %X An assignment X $\leftarrow$ Y in a program is "trivial" when both X and Y are simple program variables. The paper describes a transformation which removes all such assignments from a program P, producing a program P' which executes faster than P but usually has a larger size. The number of variables used by P' is also minimized. Worst-case analysis of the transformation algorithm leads to nonpolynomial bounds. Such inefficiency, however, does not arise in typical situations, and the technique appears to be of interest for practical compiler optimization. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/544/CS-TR-76-544.pdf %R CS-TR-76-545 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Space bounds for a game on graphs %A Paul, Wolfgang J. %A Tarjan, Robert Endre %A Celoni, James R. %D March 1976 %X We study a one-person game played by placing pebbles, according to certain rules, on the vertices of a directed graph. In [J. Hopcroft, W. Paul, and L. Valiant, "On time versus space and related problems," Proc. 16th Annual Symp. on Foundations of Computer Science (1975), pp.57-64] it was shown that for each graph with n vertices and maximum in-degree d, there is a pebbling strategy which requires at most c(d) n/log n pebbles. Here we show that this bound is tight to within a constant factor. We also analyze a variety of pebbling algorithms, including one which achieves the O(n/log n) bound. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/545/CS-TR-76-545.pdf %R CS-TR-76-547 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Iterative algorithms for global flow analysis %A Tarjan, Robert Endre %D March 1976 %X This paper studies iterative methods for the global flow analysis of computer programs. We define a hierarchy of global flow problem classes, each solvable by an appropriate generalization of the "node listing" method of Kennedy. We show that each of these generalized methods is optimum, among all iterative algorithms, for solving problems within its class. We give lower bounds on the time required by iterative algorithms for each of the problem classes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/547/CS-TR-76-547.pdf %R CS-TR-76-549 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic program verification V: verification-oriented proof rules for arrays, records and pointers %A Luckham, David C. %A Suzuki, Norihisa %D March 1976 %X A practical method is presented for automating in a uniform way the verification of Pascal programs that operate on the standard Pascal data structures ARRAY, RECORD, and POINTER. New assertion language primitives are introduced for describing computational effects of operations on these data structures. Axioms defining the semantics of the new primitives are given. Proof rules for standard Pascal operations on pointer variables are then defined in terms of the extended assertion language. Similar rules for records and arrays are special cases. An extensible axiomatic rule for the Pascal memory allocation operation, NEW, is also given. These rules have been implemented in the Stanford Pascal program verifier. 
Examples illustrating the verification of programs which operate on list structures implemented with pointers and records are discussed. These include programs with side-effects. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/549/CS-TR-76-549.pdf %R CS-TR-76-550 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finding a maximum independent set %A Tarjan, Robert Endre %A Trojanowski, Anthony E. %D June 1976 %X We present an algorithm which finds a maximum independent set in an n-vertex graph in O($2^{n/3}$) time. The algorithm can thus handle graphs roughly three times as large as could be analyzed using a naive algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/550/CS-TR-76-550.pdf %R CS-TR-76-551 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The State of the Art of Computer Programming %A Knuth, Donald E. %D June 1976 %X This report lists all corrections and changes to volumes 1 and 3 of "The Art of Computer Programming," as of May 14, 1976. The changes apply to the most recent printings of both volumes (February and March, 1975); if you have an earlier printing there have been many other changes not indicated here. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/551/CS-TR-76-551.pdf %R CS-TR-76-553 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity of monotone networks for computing conjunctions %A Tarjan, Robert Endre %D June 1976 %X Let $F_1$, $F_2$,..., $F_m$ be a set of Boolean functions of the form $F_i = \wedge \{x \mid x \in X_i\}$, where $\wedge$ denotes conjunction and each $X_i$ is a subset of a set X of n Boolean variables. We study the size of monotone Boolean networks for computing such sets of functions. We exhibit anomalous sets of conjunctions whose smallest monotone networks contain disjunctions. We show that if |$F_i$| is sufficiently small for all i, such anomalies cannot happen. We exhibit sets of m conjunctions in n unknowns which require $c_2 m \alpha(m,n)$ binary conjunctions, where $\alpha$(m,n) is a very slowly growing function related to a functional inverse of Ackermann's function. This class of examples shows that an algorithm given in [STAN-CS-75-512] for computing functions defined on paths in trees is optimum to within a constant factor. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/553/CS-TR-76-553.pdf %R CS-TR-76-555 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Monte Carlo simulation of tolerancing in discrete parts manufacturing and assembly %A Grossman, David D. %D May 1976 %X The assembly of discrete parts is strongly affected by imprecise components, imperfect fixtures and tools, and inexact measurements. It is often necessary to design higher precision into the manufacturing and assembly process than is functionally needed in the final product. Production engineers must trade off between alternative ways of selecting individual tolerances in order to achieve minimum cost while preserving product integrity. This paper describes a comprehensive Monte Carlo method for systematically analysing the stochastic implications of tolerancing and related forms of imprecision. The method is illustrated by four examples, one of which is chosen from the field of assembly by computer controlled manipulators. 
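The core of such a Monte Carlo tolerance analysis can be conveyed in a few lines; the Python sketch below estimates the defect rate of a hypothetical shaft-in-hole fit whose nominal dimensions, tolerances, and clearance spec are invented for illustration and do not come from the report.

    import random

    def clearance_defect_rate(trials=100_000):
        # Tolerance stack-up for a hypothetical two-part fit: dimensions
        # are drawn from normal distributions modeling process spread,
        # and we count assemblies whose clearance falls below spec.
        failures = 0
        for _ in range(trials):
            hole = random.gauss(10.00, 0.02)   # nominal 10.00, sigma 0.02
            shaft = random.gauss(9.95, 0.02)   # nominal 9.95, sigma 0.02
            if hole - shaft < 0.01:            # minimum usable clearance
                failures += 1
        return failures / trials

    print(clearance_defect_rate())   # estimated fraction of bad assemblies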
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/555/CS-TR-76-555.pdf %R CS-TR-76-558 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Is "sometime" sometimes better than "always"? Intermittent assertions in proving program correctness %A Manna, Zohar %A Waldinger, Richard J. %D March 1977 %X This paper explores a technique for proving the correctness and termination of programs simultaneously. This approach, which we call the intermittent-assertion method, involves documenting the program with assertions that must be true at some time when control passes through the corresponding point, but that need not be true every time. The method, introduced by Burstall, promises to provide a valuable complement to the more conventional methods. We first introduce the intermittent-assertion method with a number of examples of correctness and termination proofs. Some of these proofs are markedly simpler than their conventional counterparts. On the other hand, we show that a proof of correctness or termination by any of the conventional techniques can be rephrased directly as a proof using intermittent assertions. Finally, we show how the intermittent assertion method can be applied to prove the validity of program transformations and the correctness of continuously operating programs. This is a revised and simplified version of a previous paper with the same title (AIM-281, June 1976). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/558/CS-TR-76-558.pdf %R CS-TR-76-559 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Rank degeneracy and least squares problems %A Golub, Gene H. %A Klema, Virginia C. %A Stewart, Gilbert W. %D August 1976 %X This paper is concerned with least squares problems when the least squares matrix A is near a matrix that is not of full rank. A definition of numerical rank is given. It is shown that under certain conditions when A has numerical rank r there is a distinguished r dimensional subspace of the column space of A that is insensitive to how it is approximated by r independent columns of A. The consequences of this fact for the least squares problem are examined. Algorithms are described for approximating the stable part of the column space of A. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/559/CS-TR-76-559.pdf %R CS-TR-76-561 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Mathematical Programming Language -- user's guide %A Woods, Donald R. %D August 1976 %X Mathematical Programming Language (MPL) is a programming language specifically designed for the implementation of mathematical software and, in particular, experimental mathematical programming software. In the past there has been a wide gulf between the applied mathematicians who design mathematical algorithms (but often have little appreciation of the fine points of computing) and the professional programmer, who may have little or no understanding of the mathematics of the problem he is programming. The result is that a vast number of mathematical algorithms have been devised and published, with only a small fraction being actually implemented and experimentally compared on selected representative problems. MPL is designed to be as close as possible to the terminology used by the mathematician while retaining as far as possible programming sophistications which make for good software systems. The result is a programming language which (hopefully!) 
allows the writing of clear, concise, easily read programs, especially by persons who are not professional programmers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/561/CS-TR-76-561.pdf %R CS-TR-76-568 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Exploratory study of computer integrated assembly systems. Progress report 3, covering the period December 1, 1975 to July 31, 1976 %A Binford, Thomas O. %A Grossman, David D. %A Liu, C. Richard %A Bolles, Robert C. %A Finkel, Raphael A. %A Mujtaba, M. Shahid %A Roderick, Michael D. %A Shimano, Bruce E. %A Taylor, Russell H. %A Goldman, Ronald H. %A Jarvis, J. Pitts, III %A Scheinman, Victor D. %A Gafford, Thomas A. %D August 1976 %X The Computer Integrated Assembly Systems project is concerned with developing the software technology of programmable assembly devices, including computer controlled manipulators and vision systems. A complete hardware system has been implemented that includes manipulators with tactile sensors and TV cameras, tools, fixtures, and auxiliary devices, a dedicated minicomputer, and a time-shared large computer equipped with graphic display terminals. An advanced software system called AL has been developed that can be used to program assembly applications. Research currently underway includes refinement of AL, development of improved languages and interactive programming techniques for assembly and vision, extension of computer vision to areas which are currently infeasible, geometric modeling of objects and constraints, assembly simulation, control algorithms, and adaptive methods of calibration. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/568/CS-TR-76-568.pdf %R CS-TR-76-569 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Calculation of interpolating natural spline functions using de Boor's package for calculating with B-splines %A Herriot, John G. %D October 1976 %X A FORTRAN subroutine is described for finding interpolating natural splines of odd degree for an arbitrary set of data points. The subroutine makes use of several of the subroutines in de Boor's package for calculating with B-splines. An Algol W translation of the interpolating natural spline subroutine and of the required subroutines of the de Boor package is also given. Timing tests and accuracy tests for the routines are described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/569/CS-TR-76-569.pdf %R CS-TR-76-572 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An FOL primer %A Filman, Robert E. %A Weyhrauch, Richard W. %D September 1976 %X This primer is an introduction to FOL, an interactive proof checker for first order logic. Its examples can be used to learn the FOL system, or read independently for a flavor of our style of interactive proof checking. Several example proofs are presented, successively increasing in the complexity of the FOL commands employed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/572/CS-TR-76-572.pdf %R CS-TR-76-573 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The stationary p-tree forest %A Jonassen, Arne T. %D October 1976 %X This paper contains a theoretical analysis of the conditions of a priority queue strategy after an infinite number of alternating insert/remove steps. Expected insertion time, expected length, etc. are found.
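The interpolating-natural-spline computation of CS-TR-76-569 above can be sketched for the cubic case with a modern library (scipy here stands in for de Boor's FORTRAN package; the data points are hypothetical):

    # Natural cubic spline interpolation: zero second derivative at both ends.
    import numpy as np
    from scipy.interpolate import CubicSpline

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.sin(x)                                  # hypothetical data points
    spline = CubicSpline(x, y, bc_type='natural')  # natural end conditions
    print(spline(2.5), np.sin(2.5))                # interpolant vs. true value

The report treats natural splines of arbitrary odd degree; the cubic case shown is the most common instance.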
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/573/CS-TR-76-573.pdf %R CS-TR-76-574 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SAIL %A Reiser, John F. %D August 1976 %X Sail is a high-level programming language for the PDP-10 computer. It includes an extended ALGOL 60 compiler and a companion set of execution-time routines. In addition to ALGOL, the language features: (1) flexible linking to hand-coded machine language algorithms, (2) complete access to the PDP-10 I/O facilities, (3) a complete system of compile-time arithmetic and logic as well as a flexible macro system, (4) a high-level debugger, (5) records and references, (6) sets and lists, (7) an associative data structure, (8) independent processes, (9) procedure variables, (10) user modifiable error handling, (11) backtracking, and (12) interrupt facilities. This manual describes the Sail language and the execution-time routines for the typical Sail user: a non-novice programmer with some knowledge of ALGOL. It lies somewhere between being a tutorial and a reference manual. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/574/CS-TR-76-574.pdf %R CS-TR-76-575 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SAIL tutorial %A Smith, Nancy W. %D October 1976 %X This tutorial is designed for a beginning user of Sail, an ALGOL-like language for the PDP-10. The first part covers the basic statements and expressions of the language; remaining topics include macros, records, conditional compilation, and input/output. Detailed examples of Sail programming are included throughout, and only a minimum of programming background is assumed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/575/CS-TR-76-575.pdf %R CS-TR-76-578 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Theoretical and practical aspects of some initial-boundary value problems in fluid dynamics %A Oliger, Joseph %A Sundstroem, Arne %D November 1976 %X Initial-boundary value problems for several systems of partial differential equations from fluid dynamics are discussed. Both rigid wall and open boundary problems are treated. Boundary conditions are formulated and shown to yield well-posed problems for the Eulerian equations for gas dynamics, the shallow-water equations, and linearized constant coefficient versions of the incompressible, anelastic equations. The "primitive" hydrostatic meteorological equations are shown to be ill-posed with any specification of local, pointwise boundary conditions. Analysis of simplified versions of this system illustrates the mechanism responsible for ill-posedness. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/578/CS-TR-76-578.pdf %R CS-TR-76-579 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The A0 inversion model of program paging behavior %A Baskett, Forest %A Rafii, Abbas %D November 1976 %X When the parameters of a simple stochastic model of the memory referencing behavior of computer programs are carefully selected, the model is able to mimic the paging behavior of a set of actual programs. The mimicry is successful using several different page replacement algorithms and a wide range of real memory sizes in a virtual memory environment. The model is based on the independent reference model with a new procedure for determining the page reference probabilities, the parameters of the model.
We call the result the A0 inversion independent reference model. Since the fault rate (or miss ratio) is one aspect of program behavior that the model is able to capture for many different memory sizes, the model should be especially useful for evaluating multilevel memory organizations based on newly emerging memory technologies. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/579/CS-TR-76-579.pdf %R CS-TR-76-580 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards a procedural understanding of semantics %A Winograd, Terry A. %D November 1976 %X The term "procedural semantics" has been used in a variety of ways, not all compatible, and not all comprehensible. In this paper, I have chosen to apply the term to a broad paradigm for studying semantics (and in fact, all of linguistics). This paradigm has developed in a context of writing computer programs which use natural language, but it is not a theory of computer programs or programming techniques. It is "procedural" because it looks at the underlying structure of language as fundamentally shaped by the nature of processes for language production and comprehension. It is based on the belief that there is a level of explanation at which there are significant similarities between the psychological processes of human language use and the computational processes in computer programs we can construct and study. Its goal is to develop a body of theory at this level. This approach necessitates abandoning or modifying several currently accepted doctrines, including the way in which distinctions have been drawn between "semantics" and "pragmatics" and between "performance" and "competence". The paper has three major sections. It first lays out the paradigm assumptions which guide the enterprise, and elaborates a model of cognitive processing and language use. It then illustrates how some specific semantic problems might be approached from a procedural perspective, and contrasts the procedural approach with formal structural and truth conditional approaches. Finally, it discusses the goals of linguistic theory and the nature of the linguistic explanation. Much of what is presented here is a speculation about the nature of a paradigm yet to be developed. This paper is an attempt to be evocative rather than definitive; to convey intuitions rather than to formulate crucial arguments which justify this approach over others. It will be successful if it suggests some ways of looking at language which lead to further understanding. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/580/CS-TR-76-580.pdf %R CS-TR-76-581 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An overview of KRL, a Knowledge Representation Language %A Bobrow, Daniel G. %A Winograd, Terry A. %D November 1976 %X This paper describes KRL, a Knowledge Representation Language designed for use in understander systems. It outlines both the general concepts which underlie our research and the details of KRL-0, an experimental implementation of some of these concepts. KRL is an attempt to integrate procedural knowledge with a broad base of declarative forms. These forms provide a variety of ways to express the logical structure of the knowledge, in order to give flexibility in associating procedures (for memory and reasoning) with specific pieces of knowledge, and to control the relative accessibility of different facts and descriptions. 
The formalism for declarative knowledge is based on structured conceptual objects with associated descriptions. These objects form a network of memory units with several different sorts of linkages, each having well-specified implications for the retrieval process. Procedures can be associated directly with the internal structure of a conceptual object. This procedural attachment allows the steps for a particular operation to be determined by characteristics of the specific entities involved. The control structure of KRL is based on the belief that the next generation of intelligent programs will integrate data-directed and goal-directed processing by using multi-processing. It provides for a priority-ordered multi-process agenda with explicit (user-provided) strategies for scheduling and resource allocation. It provides procedure directories which operate along with process frameworks to allow procedural parameterization of the fundamental system processes for building, comparing, and retrieving memory structures. Future development of KRL will include integrating procedure definition with the descriptive formalism. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/581/CS-TR-76-581.pdf %R CS-TR-76-583 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Determining the stability number of a graph %A Chvatal, Vaclav %D December 1976 %X We formalize certain rules for deriving upper bounds on the stability number of a graph. The resulting system is powerful enough to (i) encompass the algorithms of Tarjan's type and (ii) provide very short proofs on graphs for which the stability number equals the clique-covering number. However, our main result shows that for almost all graphs with a (sufficiently large) linear number of edges, proofs within our system must have at least exponential length. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/583/CS-TR-76-583.pdf %R CS-TR-76-585 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical solution of nonlinear elliptic partial differential equations by a generalized conjugate gradient method %A Concus, Paul %A Golub, Gene H. %A O'Leary, Dianne Prost %D December 1976 %X We have previously studied a generalized conjugate gradient method for solving sparse positive-definite systems of linear equations arising from the discretization of elliptic partial-differential boundary-value problems. Here, extensions to the nonlinear case are considered. We split the original discretized operator into the sum of two operators, one of which corresponds to a more easily solvable system of equations, and accelerate the associated iteration based on this splitting by (nonlinear) conjugate gradients. The behavior of the method is illustrated for the minimal surface equation with splittings corresponding to nonlinear SSOR, to approximate factorization of the Jacobian matrix, and to elliptic operators suitable for use with fast direct methods. The results of numerical experiments are given as well for a mildly nonlinear example, for which, in the corresponding linear case, the finite termination property of the conjugate gradient algorithm is crucial.
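The acceleration step in CS-TR-76-585 above applies nonlinear conjugate gradients to the iteration induced by an operator splitting. A minimal sketch of plain nonlinear conjugate gradients (Fletcher-Reeves with a backtracking line search; the splitting and preconditioning of the report are omitted, and the quadratic test problem is hypothetical):

    import numpy as np

    def nonlinear_cg(f, grad, x, iters=50):
        g = grad(x)
        d = -g
        for _ in range(iters):
            if np.linalg.norm(g) < 1e-12:        # already at a stationary point
                break
            if g.dot(d) >= 0:                    # safeguard: restart with steepest descent
                d = -g
            t = 1.0                              # Armijo backtracking line search
            while f(x + t * d) > f(x) + 1e-4 * t * g.dot(d):
                t *= 0.5
            x = x + t * d
            g_new = grad(x)
            beta = g_new.dot(g_new) / g.dot(g)   # Fletcher-Reeves coefficient
            d = -g_new + beta * d
            g = g_new
        return x

    # Hypothetical test problem: minimize f(x) = x'Ax/2 - b'x, i.e. solve Ax = b.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    f = lambda x: 0.5 * x.dot(A @ x) - b.dot(x)
    grad = lambda x: A @ x - b
    print(nonlinear_cg(f, grad, np.zeros(2)), np.linalg.solve(A, b))

On a quadratic this reduces to (preconditioned) linear conjugate gradients, which is the finite-termination case the abstract mentions.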
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/585/CS-TR-76-585.pdf %R CS-TR-76-586 %Z Tue, 04 Jul 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The evolution of programs: a system for automatic program modification %A Dershowitz, Nachum %A Manna, Zohar %D December 1976 %X An attempt is made to formulate techniques of program modification, whereby a program that achieves one result can be transformed into a new program that uses the same principles to achieve a different goal. For example, a program that uses the binary search paradigm to calculate the square-root of a number may be modified to divide two numbers in a similar manner, or vice versa. Program debugging is considered as a special case of modification: if a program computes wrong results, it must be modified to achieve the intended results. The application of abstract program schemata to concrete problems is also viewed from the perspective of modification techniques. We have embedded this approach in a running implementation; our methods are illustrated with several examples that have been performed by it. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/586/CS-TR-76-586.pdf %R CS-TR-76-405 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford Computer Science Department research report. %A Davis, Randall %A Wright, Margaret H. %D January 1976 %X This collection of reports is divided into two sections. The first contains the research summaries for individual faculty members and research associates in the Computer Science Department. Two professors from Electrical Engineering are included as "Affiliated Faculty" because their interests are closely related to those of the Department. The second section gives an overview of the activities of research groups in the Department. "Group" here is taken to imply many different things, including people related by various degrees of intellectual interests, physical proximity, or funding considerations. We have tried to describe any group whose scope of interest is greater than that of one person. The list of recent publications for each is not intended to be comprehensive, but rather to give a feeling for the range of topics considered. This collection of reports has been assembled to provide a reasonably comprehensive review of research activities in the Department. We hope that it will be widely useful -- in particular, students in the Department may find it helpful in discovering interesting projects and possible thesis topics. We expect also that it will be of interest to many other people, both within and outside the Department. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/76/405/CS-TR-76-405.pdf %R CS-TR-74-404 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A catalog of quadri/trivalent graphs. %A Sridharan, Natesa S. %D January 1974 %X In a previous report [1973] a method for computer generation of quadri/trivalent "vertex-graphs" was presented in detail. This report is a catalog of 13 classes of graphs generated by using this method. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/404/CS-TR-74-404.pdf %R CS-TR-74-405 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford Computer Science Department research report. %A Davis, Randall %A Wright, Margaret H. %D January 1974 %X This collection of reports is divided into two sections.
The first contains the research summaries for individual faculty members and research associates in the Computer Science Department. Two professors from Electrical Engineering are included as "Affiliated Faculty" because their interests are closely related to those of the Department, while Professors George Dantzig and Roger Schank do not appear because they were on leave and unavailable when the summaries were prepared. The second section gives an overview of the activities of research groups in the Department. "Group" here is taken to imply many different things, including people related by various degrees of intellectual interests, physical proximity, or funding considerations. We have tried to describe any group whose scope of interest is greater than that of one person. The list of recent publications for each is not intended to be comprehensive, but rather to give a feeling for the range of topics considered. This collection of reports has been assembled to provide a reasonably comprehensive review of research activities in the Department. We hope that it will be widely useful -- in particular, students in the Department may find it helpful in discovering interesting projects and possible thesis topics. We expect also that it will be of interest to many other people, both within and outside the Department. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/405/CS-TR-74-405.pdf %R CS-TR-74-406 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Memory model for a robot. %A Perkins, W. A. %D January 1974 %X A memory model for a robot has been designed and tested in a simple toy-block world for which it has shown clarity, efficiency, and generality. In a constrained pseudo-English one can ask the program to manipulate objects and query it about the present, past, and possible future states of its world. The program has a good understanding of its world and gives intelligent answers in reasonably good English. Past and hypothetical states of the world are handled by changing the state of the world in an imaginary context. Procedures interrogate and modify two global databases, one which contains the present representation of the world and another which contains the past history of events, conversations, etc. The program has the ability to create, destroy, and even resurrect objects in its world. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/406/CS-TR-74-406.pdf %R CS-TR-74-407 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T FAIL. %A Wright, F. H. G., II %A Gorin, Ralph E. %D April 1974 %X This is a reference manual for FAIL, a fast, one-pass assembler for PDP-10 and PDP-6 machine language. FAIL statements, pseudo-operations, macros, and conditional assembly features are described. Although FAIL uses substantially more main memory than MACRO-10, it assembles typical programs about five times faster. FAIL assembles the entire Stanford time-sharing operating system (two million characters) in less than four minutes of CPU time on a KA-10 processor. FAIL permits an ALGOL-type block structure which provides a way of localizing the usage of some symbols to certain parts of the program, such that the same symbol name can be used to mean different things in different blocks.
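The block-structured symbol facility described for FAIL above amounts to a stack of symbol tables searched innermost-first. A toy sketch of that discipline (illustrative only, not the assembler's actual code):

    # ALGOL-style block scoping: a stack of dictionaries, innermost wins.
    class SymbolTable:
        def __init__(self):
            self.scopes = [{}]              # outermost block

        def begin_block(self):
            self.scopes.append({})

        def end_block(self):
            self.scopes.pop()               # locals vanish with their block

        def define(self, name, value):
            self.scopes[-1][name] = value   # bind in the innermost block

        def lookup(self, name):
            for scope in reversed(self.scopes):
                if name in scope:
                    return scope[name]
            raise KeyError(name)

    syms = SymbolTable()
    syms.define('X', 1)
    syms.begin_block(); syms.define('X', 2)
    print(syms.lookup('X'))                 # 2: inner X shadows outer X
    syms.end_block()
    print(syms.lookup('X'))                 # 1: outer meaning restored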
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/407/CS-TR-74-407.pdf %R CS-TR-74-409 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Final report: the first ten years of artificial intelligence research at Stanford. %A Earnest, Lester D. %A McCarthy, John %A Feigenbaum, Edward A. %A Lederberg, Joshua %D July 1973 %X The first ten years of research in artificial intelligence and related fields at Stanford University have yielded significant results in computer vision and control of manipulators, speech recognition, heuristic programming, representation theory, mathematical theory of computation, and modeling of organic chemical processes. This report summarizes the accomplishments and provides bibliographies in each research area. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/409/CS-TR-74-409.pdf %R CS-TR-74-411 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T After Leibniz...: discussions on philosophy and artificial intelligence. %A Anderson, D. Bruce %A Binford, Thomas O. %A Thomas, Arthur J. %A Weyhrauch, Richard W. %A Wilks, Yorick A. %D March 1974 %X This is an edited transcript of informal conversations which we have had over recent months, in which we looked at some of the issues which seem to arise when artificial intelligence and philosophy meet. Our aim was to see what might be some of the fundamental principles of attempts to build intelligent machines. The major topics covered are the relationship of AI and philosophy and what help they might be to each other: the mechanisms of natural inference and deduction; the question of what kind of theory of meaning would be involved in a successful natural language understanding program, and the nature of models in AI research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/411/CS-TR-74-411.pdf %R CS-TR-74-414 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T GEOMED - a geometric editor. %A Baumgart, Bruce G. %D May 1974 %X GEOMED is a system for doing 3-D geometric modeling; used from a keyboard, it is an interactive drawing program; used as a package of SAIL or LISP accessible subroutines, it is a graphics language. With GEOMED, arbitrary polyhedra can be constructed, moved about and viewed in perspective with hidden lines eliminated. In addition to polyhedra, camera and image models are provided so that simulators relevant to computer vision, problem solving, and animation may be constructed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/414/CS-TR-74-414.pdf %R CS-TR-74-417 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some thoughts on proving clean termination of programs. %A Sites, Richard L. %D May 1974 %X Proof of clean termination is a useful sub-goal in the process of proving that a program is totally correct. Clean termination means that the program terminates (no infinite loops) and that it does so normally, without any execution-time semantic errors (integer overflow, use of undefined variables, subscript out of range, etc.). In contrast to proofs of correctness, proof of clean termination requires no extensive annotation of a program by a human user, but the proof says nothing about the results calculated by the program, just that whatever it does, it terminates cleanly. Two example proofs are given, of previously published programs: TREESORT3 by Robert Floyd, and SELECT by Ronald L. Rivest and Robert Floyd. 
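The clean-termination checks discussed in CS-TR-74-417 above can be made concrete with runtime assertions: a nonnegative variant that strictly decreases each iteration rules out infinite loops, and explicit range checks rule out subscript errors. A hypothetical illustration (the report's examples are TREESORT3 and SELECT, not this search):

    def binary_search(a, key):
        lo, hi = 0, len(a)
        while lo < hi:
            variant_before = hi - lo
            mid = (lo + hi) // 2
            assert 0 <= mid < len(a)                # no subscript error
            if a[mid] < key:
                lo = mid + 1
            else:
                hi = mid
            assert 0 <= hi - lo < variant_before    # variant decreases: loop terminates
        return lo

    print(binary_search([1, 3, 5, 7], 5))           # 2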
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/417/CS-TR-74-417.pdf %R CS-TR-74-420 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Partially self-checking circuits and their use in performing logical operations. %A Wakerly, John F. %D August 1973 %X A new class of circuits called partially self-checking circuits is described. These circuits have one mode of operation called secure mode in which they have the properties of totally self-checking circuits; that is, every fault is tested during normal operation and no fault can cause an undetected error. They also have an insecure mode of operation with the property that any fault which affects a result in insecure mode is tested by some input in secure mode; however, undetected errors may occur in insecure mode. One application of these circuits is in the arithmetic and logic unit of a computer with data encoded in an error-detecting code. While there is no code simpler than duplication which detects single errors in logical operations such as AND and OR, it is shown that there exist partially self-checking networks to perform these operations. A commercially available MSI chip, the 74181 4-bit ALU, can be used in a partially self-checking network to perform arithmetic and logical operations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/420/CS-TR-74-420.pdf %R CS-TR-74-423 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Asymptotic representation of the average number of active modules in an n-way interleaved memory. %A Rao, Gururaj S. %D April 1974 %X In an n-way interleaved memory the effective bandwidth depends on the average number of concurrently active modules. Using a model for the memory which does not permit queueing on busy modules and which assumes an infinite stream of calls on the modules, where the elements in the stream occur with equal probability, the average number is a combinatorial quantity. Hellerman has previously approximated this quantity by $n^{0.56}$. We show in this paper that the average number is asymptotically equal to $\sqrt{\frac{\pi n}{2}} - \frac{1}{3}$. The method is due to Knuth and expresses the combinatorial quantity in terms of the incomplete gamma function and its derivatives. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/423/CS-TR-74-423.pdf %R CS-TR-74-431 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pattern-matching rules for the recognition of natural language dialogue expressions. %A Colby, Kenneth Mark %A Parkison, Roger C. %A Faught, William S. %D June 1974 %X Man-machine dialogues using everyday conversational English present problems for computer processing of natural language. Grammar-based parsers which perform a word-by-word, parts-of-speech analysis are too fragile to operate satisfactorily in real time interviews allowing unrestricted English. In constructing a simulation of paranoid thought processes, we designed an algorithm capable of handling the linguistic expressions used by interviewers in teletyped diagnostic psychiatric interviews. The algorithm uses pattern-matching rules which attempt to characterize the input expressions by progressively transforming them into patterns which match, completely or fuzzily, abstract stored patterns. The power of this approach lies in its ability to ignore recognized and unrecognized words and still grasp the meaning of the message.
The methods utilized are general and could serve any "host" system which takes natural language input. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/431/CS-TR-74-431.pdf %R CS-TR-74-433 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On automating the construction of programs. %A Buchanan, Jack R. %A Luckham, David C. %D May 1974 %X An experimental system for automatically generating certain simple kinds of programs is described. The programs constructed are expressed in a subset of ALGOL containing assignments, function calls, conditional statements, while loops, and non-recursive procedure calls. The input is an environment of primitive programs and programming methods specified in a language currently used to define the semantics of the output programming language. The system has been used to generate programs for symbolic manipulation, robot control, everyday planning, and computing arithmetical functions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/433/CS-TR-74-433.pdf %R CS-TR-74-435 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Balanced computer systems. %A Price, Thomas G. %D April 1974 %X We use the central server model to extend Buzen's results on balance and bottlenecks. We develop two measures which appear to be useful for evaluating and improving computer system performance. The first measure, called the balance index, is useful for balancing requests to the peripheral processors. The second quantity, called the sensitivity index, indicates which processing rates have the most effect on overall system performance. We define the capacity of a central server model as the maximum throughput as we vary the peripheral processor probabilities. We show that the reciprocal of the CPU utilization is a convex function of the peripheral processor probabilities and that a necessary and sufficient condition for the peripheral processor probabilities to achieve capacity is that the balance indexes are equal for all peripheral processors. We give a method to calculate capacity using classical optimization techniques. Finally, we consider the problem of balancing the processing rates of the processors. Two conditions for "balance" are derived. The first condition maximizes our uncertainty about the next state of the system. This condition has several desirable properties concerning throughput, utilizations, overlap, and resistance to changes in job mix. The second condition is based on obtaining the most throughput for a given cost. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/435/CS-TR-74-435.pdf %R CS-TR-74-436 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Natural language understanding systems within the AI paradigm: a survey and some comparisons. %A Wilks, Yorick A. %D December 1974 %X The paper surveys the major projects on the understanding of natural language that fall within what may now be called the artificial intelligence paradigm for natural language systems. Some space is devoted to arguing that the paradigm is now a reality and different in significant respects from the generative paradigm of present day linguistics. The comparisons between systems center around questions of the relative perspicuity of procedural and static representations; the advantages and disadvantages of developing systems over a period of time.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/436/CS-TR-74-436.pdf %R CS-TR-74-439 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the solution of large, structured linear complementarity problems: III. %A Cottle, Richard W. %A Golub, Gene H. %A Sacher, Richard S. %D August 1974 %X This paper addresses the problem of solving a class of specially-structured linear complementarity problems of potentially very large size. An efficient method which couples a modification of the block successive overrelaxation technique and several techniques discussed by the authors in previous papers is proposed. Problems of the type considered arise, for example, in solving approximations to both the free boundary problem for finite-length journal bearings and percolation problems in porous dams by numerical methods. These applications and our computational experience with the method are presented here. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/439/CS-TR-74-439.pdf %R CS-TR-74-442 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Estimating the efficiency of backtrack programs. %A Knuth, Donald E. %D August 1974 %X One of the chief difficulties associated with the so-called backtracking technique for combinatorial problems has been our inability to predict the efficiency of a given algorithm, or to compare the efficiencies of different approaches, without actually writing and running the programs. This paper presents a simple method which produces reasonable estimates for most applications, requiring only a modest amount of hand calculation. The method should prove to be of considerable utility in connection with D. H. Lehmer's branch-and-bound approach to combinatorial optimization. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/442/CS-TR-74-442.pdf %R CS-TR-74-444 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Progress report on program-understanding systems. %A Green, C. Cordell %A Waldinger, Richard J. %A Barstow, David R. %A Elschlager, Robert A. %A Lenat, Douglas B. %A McCune, Brian P. %A Shaw, David E. %A Steinberg, Louis I. %D August 1974 %X This progress report covers the first year and one half of work by our automatic-programming research group at the Stanford Artificial Intelligence Laboratory. Major emphasis has been placed on methods of program specification, codification of programming knowledge, and implementation of pilot systems for program writing and understanding. List processing has been used as the general problem domain for this work. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/444/CS-TR-74-444.pdf %R CS-TR-74-446 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T LCFsmall: an implementation of LCF. %A Aiello, Luigia %A Weyhrauch, Richard W. %D August 1974 %X This is a report on a computer program implementing a simplified version of LCF. It is written (with minor exceptions) entirely in pure LISP and has none of the user oriented features of the implementation described by Milner. We attempt to represent directly in code the metamathematical notions necessary to describe LCF. We hope that the code is simple enough and the metamathematics is clear enough so that properties of this particular program (e.g. its correctness) can eventually be proved. The program is reproduced in full. 
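The estimation method of CS-TR-74-442 above can be sketched directly: follow one random path down the backtrack tree, multiplying the branching factors met along the way; the running sum is an unbiased estimate of the tree's node count, and averaging several probes sharpens it. A small illustration on the n-queens search tree (the choice of problem is hypothetical; the estimator is Knuth's):

    import random

    def children(partial, n):
        # columns safe for the next row, given queens already placed
        return [c for c in range(n)
                if all(c != q and abs(c - q) != len(partial) - r
                       for r, q in enumerate(partial))]

    def probe(n):
        partial, nodes, weight = [], 1, 1
        while len(partial) < n:
            cs = children(partial, n)
            if not cs:
                break                     # dead end: this probe stops
            weight *= len(cs)             # branching factor at this level
            nodes += weight               # estimated nodes one level down
            partial.append(random.choice(cs))
        return nodes

    n, trials = 6, 10_000
    print(sum(probe(n) for _ in range(trials)) / trials)  # ~ size of the 6-queens tree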
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/446/CS-TR-74-446.pdf %R CS-TR-74-447 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The semantics of PASCAL in LCF. %A Aiello, Luigia %A Aiello, Mario %A Weyhrauch, Richard W. %D August 1974 %X We define a semantics for the arithmetic part of PASCAL by giving it an interpretation in LCF, a language based on the typed $\lambda$-calculus. Programs are represented in terms of their abstract syntax. We show sample proofs, using LCF, of some general properties of PASCAL and the correctness of some particular programs. A program implementing the McCarthy Airline reservation system is proved correct. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/447/CS-TR-74-447.pdf %R CS-TR-74-455 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Edge-disjoint spanning trees, dominators, and depth-first search. %A Tarjan, Robert Endre %D September 1974 %X This paper presents an algorithm for finding two edge-disjoint spanning trees rooted at a fixed vertex of a directed graph. The algorithm uses depth-first search, an efficient method for computing disjoint set unions, and an efficient method for computing dominators. It requires O(V log V + E) time and O(V + E) space to analyze a graph with V vertices and E edges. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/455/CS-TR-74-455.pdf %R CS-TR-74-456 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T AL, a programming system for automation. %A Finkel, Raphael A. %A Taylor, Russell H. %A Bolles, Robert C. %A Paul, Richard P. %A Feldman, Jerome A. %D November 1974 %X AL is a high-level programming system for specification of manipulatory tasks such as assembly of an object from parts. AL includes an ALGOL-like source language, a translator for converting programs into runnable code, and a runtime system for controlling manipulators and other devices. The system includes advanced features for describing individual motions of manipulators, for using sensory information, and for describing assembly algorithms in terms of common domain-specific primitives. This document describes the design of AL, which is currently being implemented as a successor to the Stanford WAVE system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/456/CS-TR-74-456.pdf %R CS-TR-74-457 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Ten criticisms of PARRY. %A Colby, Kenneth Mark %D September 1974 %X Some major criticisms of a computer simulation of paranoid processes (PARRY) are reviewed and discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/457/CS-TR-74-457.pdf %R CS-TR-74-460 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Random insertion into a priority queue structure. %A Porter, Thomas %A Simon, Istvan %D October 1974 %X The average number of levels that a new element moves up when inserted into a heap is investigated. Two probabilistic models, under which such an average might be computed, are proposed. A "lemma of conservation of ignorance" is formulated and used in the derivation of an exact formula for the average in one of these models. It is shown that this average is bounded by a constant and its asymptotic behavior is discussed. Numerical data for the second model is also provided and analyzed.
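The quantity analyzed in CS-TR-74-460 above is easy to measure empirically. A sketch under one simple model (heaps built by repeated random insertions of uniform keys; the report's two probabilistic models need not coincide with this one):

    import random

    def sift_up_levels(heap, key):
        # insert key into a binary min-heap, returning levels moved up
        heap.append(key)
        i, levels = len(heap) - 1, 0
        while i > 0 and heap[(i - 1) // 2] > heap[i]:
            parent = (i - 1) // 2
            heap[parent], heap[i] = heap[i], heap[parent]
            i, levels = parent, levels + 1
        return levels

    def average_levels(n=255, trials=2_000):
        total = 0
        for _ in range(trials):
            heap = []
            for _ in range(n):            # grow a random heap by insertions
                sift_up_levels(heap, random.random())
            total += sift_up_levels(heap, random.random())
        return total / trials

    print(average_levels())               # the average stays bounded as n grows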
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/460/CS-TR-74-460.pdf %R CS-TR-74-462 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A fast, feature-driven stereo depth program. %A Pingle, Karl K. %A Thomas, Arthur J. %D May 1975 %X In this paper we describe a fast, feature-driven program for extracting depth information from stereoscopic sets of digitized TV images. This is achieved by two means: in the simplest case, by statistically correlating variable-sized windows on the basis of visual texture, and in the more complex case by pre-processing the images to extract significant visual features such as corners, and then using these features to control the correlation process. The program runs on the PDP-10 but uses a PDP-11/45 and an SPS-41 Signal Processing Computer as subsidiary processors. The use of the two small, fast machines for the performance of simple but often-repeated computations effects an increase in speed sufficient to allow us to think of using this program as a fast 3-dimensional segmentation method, preparatory to more complex image processing. It is also intended for use in visual feedback tasks involved in hand-eye coordination and automated assembly. The current program is able to calculate the three-dimensional positions of 10 points to within 5 millimeters, using 5 seconds of computation for extracting features, 1 second per image for correlation, and 0.1 second for the depth calculation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/462/CS-TR-74-462.pdf %R CS-TR-74-466 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Recent research in artificial intelligence, heuristic programming, and network protocols. %A Earnest, Lester D. %A McCarthy, John %A Feigenbaum, Edward A. %A Lederberg, Joshua %A Cerf, Vinton G. %D July 1974 %X This is a progress report for ARPA-sponsored research projects in computer science for the period July 1973 to July 1974. Accomplishments are reported in artificial intelligence (especially heuristic programming, robotics, theorem proving, automatic programming, and natural language understanding), mathematical theory of computation, and protocol development for computer communication networks. References to recent publications are provided for each topic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/466/CS-TR-74-466.pdf %R CS-TR-74-467 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Checking proofs in the metamathematics of first order logic. %A Aiello, Mario %A Weyhrauch, Richard W. %D August 1974 %X This is a report on some of the first experiments of any size carried out using the new first order proof checker FOL. We present two different first order axiomatizations of the metamathematics of the logic which FOL itself checks and show several proofs using each one. The difference between the axiomatizations is that one defines the metamathematics in a many sorted logic, the other does not. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/467/CS-TR-74-467.pdf %R CS-TR-74-468 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A combinatorial base for some optimal matroid intersection algorithms. %A Krogdahl, Stein %D November 1974 %X E. Lawler has given an algorithm for finding maximum weight intersections for a pair of matroids, using linear programming concepts and constructions to prove its correctness. 
In this paper another theoretical base for this algorithm is given which depends only on the basic properties of matroids, and which involves no linear programming concepts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/468/CS-TR-74-468.pdf %R CS-TR-74-469 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Molecular structure elucidation III. %A Brown, Harold %D December 1974 %X A computer implemented algorithm to solve the following graph theoretical problem is presented: given the empirical formula for a molecule and one or more non-overlapping substructural fragments of the molecule, determine all the distinct molecular structures based on the formula and containing the fragments. That is, given a degree sequence of labeled nodes and one or more connected multigraphs, determine a representative set of the isomorphism classes of the connected multigraphs based on the degree sequence and containing the given multigraphs as non-overlapping subgraphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/469/CS-TR-74-469.pdf %R CS-TR-74-470 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stable sorting and merging with optimal space and time bounds. %A Trabb-Pardo, Luis I. %D December 1974 %X This work introduces two algorithms for stable merging and stable sorting of files. The algorithms have optimal worst case time bounds, the merge is linear and the sort is of order n log n. Extra storage requirements are also optimal, since both algorithms make use of a fixed number of pointers. Files are handled only by means of the primitives exchange and comparison of records and basic pointer transformations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/470/CS-TR-74-470.pdf %R CS-TR-74-471 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The interaction of inferences, affects, and intentions, in a model of paranoia. %A Faught, William S. %A Colby, Kenneth Mark %A Parkison, Roger C. %D December 1974 %X The analysis of natural language input into its underlying semantic content is but one of the tasks necessary for a system (human or non-human) to use natural language. Responding to natural language input requires performing a number of tasks: 1) deriving facts about the input and the situation in which it was spoken; 2) attending to the system's needs, desires, and interests; 3) choosing intentions to fulfill these interests; 4) deriving and executing actions from these intentions. We describe a series of processes in a model of paranoia which performs these tasks. We also describe the modifications made by the paranoid processes to the normal processes. A computer program has been constructed to test this theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/471/CS-TR-74-471.pdf %R CS-TR-74-472 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford automatic photogrammetry research. %A Quam, Lynn H. %A Hannah, Marsha Jo %D December 1974 %X This report documents the feasibility study done at Stanford University's Artificial Intelligence Laboratory on the problem of computer automated aerial/orbital photogrammetry. The techniques investigated were based on correlation matching of small areas in digitized pairs of stereo images taken from high altitude or planetary orbit, with the objective of deriving a 3-dimensional model for the surface of a planet.
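The correlation matching at the heart of CS-TR-74-472 above can be sketched in a few lines: slide a small window from one image along the corresponding row of the other and keep the horizontal shift with the highest normalized cross-correlation. The synthetic image pair below is hypothetical (a pure horizontal shift), standing in for real stereo imagery:

    import numpy as np

    def best_disparity(left, right, row, col, half=3, max_disp=10):
        w = left[row-half:row+half+1, col-half:col+half+1].astype(float)
        w = (w - w.mean()) / (w.std() + 1e-9)       # zero-mean, unit-variance window
        best, best_score = 0, -np.inf
        for d in range(max_disp + 1):
            c = col - d
            if c - half < 0:
                break
            v = right[row-half:row+half+1, c-half:c+half+1].astype(float)
            v = (v - v.mean()) / (v.std() + 1e-9)
            score = (w * v).mean()                  # normalized cross-correlation
            if score > best_score:
                best, best_score = d, score
        return best                                 # disparity, inversely related to depth

    rng = np.random.default_rng(0)
    right = rng.random((40, 40))
    left = np.roll(right, 4, axis=1)                # synthetic pair: true disparity 4
    print(best_disparity(left, right, row=20, col=20))   # -> 4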
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/472/CS-TR-74-472.pdf %R CS-TR-74-473 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic program verification II: verifying programs by algebraic and logical reduction. %A Suzuki, Norihisa %D December 1974 %X Methods for verifying programs written in a higher level programming language are devised and implemented. The system can verify programs written in a subset of PASCAL, which may have data structures and control structures such as WHILE, REPEAT, FOR, PROCEDURE, FUNCTION and COROUTINE. The process of creation of verification conditions is an extension of the work done by Igarashi, London and Luckham which is based on the deductive theory by Hoare. Verification conditions are proved using specialized simplification and proof techniques, which consist of an arithmetic simplifier, equality replacement rules, a fast algorithm for simplifying formulas using propositional truth value evaluation, and a depth first proof search process. The deduction mechanism used in this prover is based on a Gentzen-type formal system. Several sorting programs including Floyd's TREESORT3 and Hoare's FIND are verified. It is shown that the resulting array is not only well-ordered but also a permutation of the input array. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/473/CS-TR-74-473.pdf %R CS-TR-74-474 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic program verification III: a methodology for verifying programs. %A von Henke, Friedrich W. %A Luckham, David C. %D December 1974 %X The paper investigates methods for applying an on-line interactive verification system designed to prove properties of PASCAL programs. The methodology is intended to provide techniques for developing a debugged and verified version starting from a program, that (a) is possibly unfinished in some respects, (b) may not satisfy the given specifications, e.g., may contain bugs, (c) may have incomplete documentation, (d) may be written in non-standard ways, e.g., may depend on user-defined data structures. The methodology involves (i) interactive application of a verification condition generator, an algebraic simplifier and a theorem-prover; (ii) techniques for describing data structures, type constraints, and properties of programs and subprograms (i.e. lower level procedures); (iii) the use of (abstract) data types in structuring programs and proofs. Within each unit (i.e. segment of a problem), the interactive use is aimed at reducing verification conditions to manageable proportions so that the non-trivial factors may be analysed. Analysis of verification conditions attempts to localize errors in the program logic, to extend assertions inside the program, to spotlight additional assumptions on program subfunctions (beyond those already specified by the programmer), and to generate appropriate lemmas that allow a verification to be completed. Methods for structuring correctness proofs are discussed that are similar to those of "structured programming". A detailed case study of a pattern matching algorithm illustrating the various aspects of the methodology (including the role played by the user) is given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/74/474/CS-TR-74-474.pdf %R CS-TR-75-476 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A hypothetical dialogue exhibiting a knowledge base for a program-understanding system.
%A Green, C. Cordell %A Barstow, David R. %D January 1975 %X A hypothetical dialogue with a fictitious program-understanding system is presented. In the interactive dialogue the computer carries out a detailed synthesis of a simple insertion sort program for linked lists. The content, length and complexity of the dialogue reflect the underlying programming knowledge which would be required for a system to accomplish this task. The nature of the knowledge is discussed and the codification of such programming knowledge is suggested as a major research area in the development of program-understanding systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/476/CS-TR-75-476.pdf %R CS-TR-75-477 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Longest common subsequences of two random sequences. %A Chvatal, Vaclav %A Sankoff, David %D January 1975 %X Given two random k-ary sequences of length n, what is f(n,k), the expected length of their longest common subsequence? This problem arises in the study of molecular evolution. We calculate f(n,k) for all k, where n $\leq$ 5, and f(n,2) where n $\leq$ 10. We study the limiting behavior of $n^{-1}$f(n,k) and derive upper and lower bounds on these limits for all k. Finally we estimate by Monte-Carlo methods f(100,k), f(1000,2) and f(5000,2). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/477/CS-TR-75-477.pdf %R CS-TR-75-478 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Ill-conditioned eigensystems and the computation of the Jordan canonical form. %A Golub, Gene H. %A Wilkinson, James H. %D February 1975 %X The solution of the complete eigenvalue problem for a non-normal matrix A presents severe practical difficulties when A is defective or close to a defective matrix. However, in the presence of rounding errors one cannot even determine whether or not a matrix is defective. Several of the more stable methods for computing the Jordan canonical form are discussed together with the alternative approach of computing well-defined bases (usually orthogonal) of the relevant invariant subspaces. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/478/CS-TR-75-478.pdf %R CS-TR-75-479 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Error bounds in the approximation of eigenvalues of differential and integral operators. %A Chatelin, Francois %A Lemordant, J. %D February 1975 %X Various methods of approximating the eigenvalues and invariant subspaces of nonself-adjoint differential and integral operators are unified in a general theory. Error bounds are given, from which most of the error bounds in the literature can be derived. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/479/CS-TR-75-479.pdf %R CS-TR-75-481 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hybrid difference methods for the initial boundary-value problem for hyperbolic equations. %A Oliger, Joseph E. %D February 1975 %X The use of lower order approximations in the neighborhood of boundaries coupled with higher order interior approximations is examined for the mixed initial boundary-value problem for hyperbolic partial differential equations. Uniform error can be maintained using smaller grid intervals with the lower order approximations near the boundaries.
Stability results are presented for approximations to the initial boundary-value problem for the model equation $u_t + cu_x = 0$ which are fourth order in space and second order in time in the interior and second order in both space and time near the boundaries. These results are generalized to a class of methods of this type for hyperbolic systems. Computational results are presented and comparisons are made with other methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/481/CS-TR-75-481.pdf %R CS-TR-75-483 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On packing squares with equal squares. %A Erdoes, Paul %A Graham, Ronald L. %D March 1975 %X The following problem arises in connection with certain multi-dimensional stock cutting problems: How many non-overlapping open unit squares may be packed into a large square of side $\alpha$? Of course, if $\alpha$ is a positive integer, it is trivial to see that unit squares can be successfully packed. However, if $\alpha$ is not an integer, the problem becomes much more complicated. Intuitively, one feels that for $\alpha$ = N + 1/100, say (where N is an integer), one should pack $N^2$ unit squares in the obvious way and surrender the uncovered border area (which is about $\alpha$/50) as unusable waste. After all, how could it help to place the unit squares at all sorts of various skew angles? In this note, we show how it helps. In particular, we prove that we can always keep the amount of uncovered area down to at most proportional to ${\alpha}^{7/11}$, which for large $\alpha$ is much less than the linear waste produced by the "natural" packing above. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/483/CS-TR-75-483.pdf %R CS-TR-75-484 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On subgraph number independence in trees. %A Graham, Ronald L. %A Szemeredi, Endre %D March 1975 %X For finite graphs F and G, let $N_F$(G) denote the number of occurrences of F in G, i.e., the number of subgraphs of G which are isomorphic to F. If ${\cal F}$ and ${\cal G}$ are families of graphs, it is natural to ask whether or not the quantities $N_F$(G), $F \in {\cal F}$, are linearly independent when G is restricted to ${\cal G}$. For example, if ${\cal F}$ = {$K_1$,$K_2$} (where $K_n$ denotes the complete graph on n vertices) and ${\cal G}$ is the family of all (finite) $\underline{trees}$ then of course $N_{K_{1}}$(T) - $N_{K_{2}}$(T) = 1 for all $T \in {\cal G}$. Slightly less trivially, if ${\cal F}$ = {$S_n$ : n = 1,2,3,...} (where $S_n$ denotes the $\underline{star}$ on n edges) and ${\cal G}$ again is the family of all trees then $\sum_{n=1}^{\infty} {(-1)}^{n+1} N_{S_n}(T) = 1$ for all $T \in {\cal G}$. It will be proved that such a linear dependence can $\underline{never}$ occur if ${\cal F}$ is finite, no $F \in {\cal F}$ has an isolated point and ${\cal G}$ contains all trees. This result has important applications in recent work of L. Lovasz and one of the authors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/484/CS-TR-75-484.pdf %R CS-TR-75-485 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On multiplicative representations of integers. %A Erdoes, Paul %A Szemeredi, Endre %D March 1975 %X In 1969 it was shown by P. Erdoes that if 0 < $a_1$ < $a_2$ < ...
< $a_k \leq x$ is a sequence of integers for which the products $a_i a_j$ are all distinct then the maximum possible value of k satisfies $\pi$(x) + $c_2$ $x^{3/4}$/${(log x)}^{3/2}$ < max k < $\pi$(x) + $c_1$ $x^{3/4}$/$(log x)^{3/2}$ where $\pi$(x) denotes the number of primes not exceeding x and $c_1$ and $c_2$ are absolute constants. In this paper we will be concerned with similar results of the following type. Suppose 0 < $a_1$ < ... < $a_k \leq x$, 0 < $b_1$ < ... < $b_{\ell} \leq x$ are sequences of integers. Let g(n) denote the number of representations of n in the form $a_i b_j$. Then we prove: (i) If g(n) $\leq$ 1 for all n then for some constant $c_3$, k$\ell$ < $c_3 x^2$/log x. (ii) For every c there is an f(c) so that if g(n) $\leq$ c for all n then for some constant $c_4$, k$\ell$ < $c_4 x^2$/log x ${(log log x)}^{f(c)}$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/485/CS-TR-75-485.pdf %R CS-TR-75-486 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Eigenproblems for matrices associated with periodic boundary conditions. %A Bjorck, Ake %A Golub, Gene H. %D March 1975 %X A survey of algorithms for solving the eigenproblem for a class of matrices of nearly tridiagonal form is given. These matrices arise from eigenvalue problems for differential equations where the solution is subject to periodic boundary conditions. Algorithms both for computing selected eigenvalues and eigenvectors and for solving the complete eigenvalue problem are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/486/CS-TR-75-486.pdf %R CS-TR-75-488 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On complete subgraphs of r-chromatic graphs. %A Bollobas, Bela %A Erdoes, Paul %A Szemeredi, Endre %D April 1975 %X Denote by G(p,q) a graph of p vertices and q edges. $K_r$ = G(r, ${r \choose 2}$) is the complete graph with r vertices and $K_r$(t) is the complete r-chromatic (i.e., r-partite) graph with t vertices in each color class. $G_r$(n) denotes an r-chromatic graph, and $\delta$(G) is the minimal degree of a vertex of graph G. Furthermore denote by $f_r$(n) the smallest integer so that every $G_r$(n) with $\delta(G_r(n)) > f_r(n)$ contains a $K_r$. It is easy to see that $\lim_{n \rightarrow \infty} f_r(n)/n = c_r$ exists. We show that $c_4 \geq$ 2 + 1/9 and $c_r \geq$ r-2 + 1/2 - $\frac{1}{2(r-2)}$ for r > 4. We prove that if $\delta(G_3(n)) \geq n+t$ then G contains at least $t^3$ triangles but does not have to contain more than 4$t^3$ of them. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/488/CS-TR-75-488.pdf %R CS-TR-75-489 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Regular partitions of graphs. %A Szemeredi, Endre %D April 1975 %X A crucial lemma in recent work of the author (showing that k-term arithmetic progression-free sets of integers must have density zero) stated (approximately) that any large bipartite graph can be decomposed into relatively few "nearly regular" bipartite subgraphs. In this note we generalize this result to arbitrary graphs, at the same time strengthening and simplifying the original bipartite result. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/489/CS-TR-75-489.pdf %R CS-TR-75-490 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical experiments with the spectral test. %A Gosper, R.
William %D May 1975 %X Following Marsaglia and Dieter, the spectral test for linear congruential random number generators is developed from the grid or lattice point model rather than the Fourier transform model. Several modifications to the published algorithms were tried. One of these refinements, which uses results from lesser dimensions to compute higher dimensional ones, was found to decrease the computation time substantially. A change in the definition of the spectral test is proposed in the section entitled "A Question of Independence." %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/490/CS-TR-75-490.pdf %R CS-TR-75-493 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Describing automata in terms of languages associated with their peripheral devices. %A Kurki-Suonio, Reino %D May 1975 %X A unified approach is presented to deal with automata having different kinds of peripheral devices. This approach is applied to pushdown automata and Turing machines, leading to elementary proofs of several well-known theorems concerning transductions, relationship between pushdown automata and context-free languages, as well as homomorphic characterization and undecidability questions. In general, this approach leads to homomorphic characterization of language families generated by a single language by finite transduction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/493/CS-TR-75-493.pdf %R CS-TR-75-500 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards better structured definitions of programming languages. %A Kurki-Suonio, Reino %D September 1975 %X The use of abstract syntax and a behavioral model is discussed from the viewpoint of structuring the complexity in definitions of programming languages. A formalism for abstract syntax is presented which reflects the possibility of having one defining occurrence and an arbitrary number of applied occurrences of objects. Attributes can be associated with such a syntax for restricting the set of objects generated, and for defining character string representations and semantic interpretations for the objects. A system of co-operating automata, described by another abstract syntax, is proposed as a behavioral model for semantic definition. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/500/CS-TR-75-500.pdf %R CS-TR-75-501 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Procedural events as software interrupts. %A Pettersen, Odd %D June 1975 %X The paper deals with procedural events, providing a basis for synchronization and scheduling, particularly applied to real-time program systems of multiple parallel activities ("multi-task"). There is a great need for convenient scheduling mechanisms for minicomputer systems as used in process control, but so far mechanisms somewhat similar to those proposed here are found only in PL/I among the generally known high-level languages. PL/I, however, is not very common on computers of this size. Also, the mechanisms in PL/I seem more restricted, as compared to those proposed here. A new type of boolean program variable, the EVENTMARK, is proposed. Eventmarks represent events of any kind that may occur within a computational process and are believed to give very efficient and convenient activation and scheduling of program modules in a real-time system.
An eventmark is declared similar to a procedure, and the proposed feature could easily be amended as an extension to existing languages, as well as incorporated in future language designs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/501/CS-TR-75-501.pdf %R CS-TR-75-502 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Synchronization of concurrent processes. %A Pettersen, Odd %D July 1975 %X The paper gives an overview of commonly used synchronization primitives and literature, and presents a new form of primitive expressing conditional critical regions. A new solution is presented to the problem of "readers and writers", utilizing the proposed synchronization primitive. The solution is simpler and shorter than other known algorithms. The first sections of the paper give a tutorial introduction into established methods, in order to provide a suitable background for the remaining parts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/502/CS-TR-75-502.pdf %R CS-TR-75-503 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The macro processing system STAGE2: transfer of comments to the generated text. %A Pettersen, Odd %D July 1975 %X This paper is a short description of a small extension of STAGE2, providing possibilities to copy comments etc. from the source text to the generated text. The description presupposes familiarity with the STAGE2 system: its purpose, use and descriptions. Only section 3 of this paper requires knowledge of the internal structures and working of the system, and that section is unnecessary for the plain use of the described feature. The extension, if not used, is completely invisible to the user: No rules, as described in the original literature, are changed. A user, unaware of the extension, will see no difference from the original version. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/503/CS-TR-75-503.pdf %R CS-TR-75-504 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On sparse graphs with dense long paths. %A Erdoes, Paul %A Graham, Ronald L. %A Szemeredi, Endre %D September 1975 %X The following problem was raised by H.-J. Stoss in connection with certain questions related to the complexity of Boolean functions. An acyclic directed graph G is said to have property P(m,n) if for any set X of m vertices of G, there is a directed path of length n in G which does not intersect X. Let f(m,n) denote the minimum number of edges a graph with property P(m,n) can have. The problem is to estimate f(m,n). For the remainder of the paper, we shall restrict ourselves to the case m = n. We shall prove (1) $c_1$n log n/log log n < f(n,n) < $c_2$n log n (where $c_1$,$c_2$,..., will hereafter denote suitable positive constants). In fact, the graph we construct in order to establish the upper bound on f(n,n) in (1) will have just $c_3$n vertices. In this case the upper bound in (1) is essentially best possible since it will also be shown that for $c_4$ sufficiently large, every graph on $c_4$n vertices having property P(n,n) must have at least $c_5$n log n edges. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/504/CS-TR-75-504.pdf %R CS-TR-75-505 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some linear programming aspects of combinatorics.
%A Chvatal, Vaclav %D September 1975 %X This is the text of a lecture given at the Conference on Algebraic Aspects of Combinatorics at the University of Toronto in January 1975. The lecture was expository, aimed at an audience with no previous knowledge of linear programming. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/505/CS-TR-75-505.pdf %R CS-TR-75-506 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Operational reasoning and denotational semantics. %A Gordon, Michael J. C. %D August 1975 %X "Obviously true" properties of programs can be hard to prove when meanings are specified with a denotational semantics. One cause of this is that such a semantics usually abstracts away from the running process - thus properties which are obvious when one thinks about this lose the basis of their obviousness in the absence of it. To enable process-based intuitions to be used in constructing proofs one can associate with the semantics an abstract interpreter so that reasoning about the semantics can be done by reasoning about computations on the interpreter. This technique is used to prove several facts about a semantics of pure LISP. First a denotational semantics and an abstract interpreter are described. Then it is shown that the denotation of any LISP form is correctly computed by the interpreter. This is used to justify an inference rule - called "LISP-induction" - which formalises induction on the size of computations on the interpreter. Finally LISP-induction is used to prove a number of results. In particular it is shown that the function eval is correct relative to the semantics - i.e. that it denotes a mapping which maps forms (coded as S-expressions) on to their correct values. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/506/CS-TR-75-506.pdf %R CS-TR-75-507 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards a semantic theory of dynamic binding. %A Gordon, Michael J. C. %D August 1975 %X The results in this paper contribute to the formulation of a semantic theory of dynamic binding (fluid variables). The axioms and theorems are language independent in that they don't talk about programs - i.e., syntactic objects - but just about elements in certain domains. Firstly the equivalence (in the circumstances where it's true) of "tying a knot" through the environment (elaborated in the paper) and taking a least fixed point is shown. This is central in proving the correctness of LISP "eval" type interpreters. Secondly the relation which must hold between two environments if a program is to have the same meaning in both is established. It is shown how the theory can be applied to LISP to yield previously known facts. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/507/CS-TR-75-507.pdf %R CS-TR-75-508 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On computing the transitive closure of a relation. %A Eve, James %D September 1975 %X An algorithm is presented for computing the transitive closure of an arbitrary relation which is based upon a variant of Tarjan's algorithm [1972] for finding the strongly connected components of a directed graph. This variant leads to a more compact statement of Tarjan's algorithm. If V is the number of vertices in the directed graph representing the relation then the worst case behavior of the proposed algorithm involves O($V^3$) operations.
In this respect it is inferior to existing algorithms which require O($V^3$/log V) and O($V^{{log}_2 7}$ log V) operations respectively. The best case behavior involves only O($V^2$) operations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/508/CS-TR-75-508.pdf %R CS-TR-75-509 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finding the maximal incidence matrix of a large graph. %A Overton, Michael L. %A Proskurowski, Andrzej %D September 1975 %X This paper deals with the computation of two canonical representations of a graph. A computer program is presented which searches for "the maximal incidence matrix" of a large connected graph without multiple edges or self-loops. The use of appropriate algorithms and data structures is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/509/CS-TR-75-509.pdf %R CS-TR-75-511 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Software implementation of a new method of combinatorial hashing. %A Dubost, Pierre %A Trousse, Jean-Michel %D September 1975 %X This is a study of the software implementation of a new method of searching with retrieval on secondary keys. A new family of partial match file designs is presented, the 'worst case' is determined, a detailed algorithm and program are given and the average execution time is studied. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/511/CS-TR-75-511.pdf %R CS-TR-75-512 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Applications of path compression on balanced trees. %A Tarjan, Robert Endre %D August 1975 %X We devise a method for computing functions defined on paths in trees. The method is based on tree manipulation techniques first used for efficiently representing equivalence relations. It has an almost-linear running time. We apply the method to give O(m $\alpha$(m,n)) algorithms for two problems. A. Verifying a minimum spanning tree in an undirected graph (best previous bound: O(m log log n) ). B. Finding dominators in a directed graph (best previous bound: O(n log n + m) ). Here n is the number of vertices and m the number of edges in the problem graph, and $\alpha$(m,n) is a very slowly growing function which is related to a functional inverse of Ackermann's function. The method is also useful for solving, in O(m $\alpha$(m,n)) time, certain kinds of pathfinding problems on reducible graphs. Such problems occur in global flow analysis of computer programs and in other contexts. A companion paper will discuss this application. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/512/CS-TR-75-512.pdf %R CS-TR-75-513 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A survey of techniques for fixed radius near neighbor searching. %A Bentley, Jon Louis %D August 1975 %X This paper is a survey of techniques used for searching in a multidimensional space. Though we consider specifically the problem of searching for fixed radius near neighbors (that is, all points within a fixed distance of a given point), the structures presented here are applicable to many different search problems in multidimensional spaces. The orientation of this paper is practical; no theoretical results are presented. Many areas open for further research are mentioned. 
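To make the cell (grid) technique surveyed in CS-TR-75-513 concrete: bucket the points into cells whose side equals the search radius, and a fixed-radius query then only has to examine the 3^k cells surrounding the query point. The following Python sketch is illustrative only; the names and parameters are not taken from the report.

    import math
    from collections import defaultdict
    from itertools import product

    def build_grid(points, r):
        # Bucket each k-dimensional point into a cell of side r.
        grid = defaultdict(list)
        for p in points:
            grid[tuple(int(math.floor(c / r)) for c in p)].append(p)
        return grid

    def near_neighbors(grid, q, r):
        # Report all stored points within distance r of the query point q
        # by scanning only the 3^k cells adjacent to q's home cell.
        k = len(q)
        home = tuple(int(math.floor(c / r)) for c in q)
        found = []
        for offset in product((-1, 0, 1), repeat=k):
            cell = tuple(h + o for h, o in zip(home, offset))
            for p in grid.get(cell, []):
                if math.dist(p, q) <= r:
                    found.append(p)
        return found

    pts = [(0.1, 0.2), (0.5, 0.5), (2.0, 2.0)]
    g = build_grid(pts, r=1.0)
    print(near_neighbors(g, (0.4, 0.4), r=1.0))   # [(0.1, 0.2), (0.5, 0.5)]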
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/513/CS-TR-75-513.pdf %R CS-TR-75-514 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A microprogram control unit based on a tree memory. %A Tokura, Nobuki %D August 1975 %X A modularized control unit for microprocessors is proposed that implements ancestor tree programs. This leads to a reduction of storage required for address information. The basic architecture is extended to paged tree memory to enhance the memory space usage. Finally, the concept of an ancestor tree with shared subtrees is introduced, and the existence of an efficient algorithm to find sharable subtrees is shown. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/514/CS-TR-75-514.pdf %R CS-TR-75-517 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Distances in orientations of graphs. %A Chvatal, Vaclav %A Thomassen, Carsten %D August 1975 %X We prove that there is a function h(k) such that every undirected graph G admits an orientation H with the following property: if an edge uv belongs to a cycle of length k in G, then uv or vu belongs to a directed cycle of length at most h(k) in H. Next, we show that every undirected bridgeless graph of radius r admits an orientation of radius at most $r^2$+r, and this bound is best possible. We consider the same problem with radius replaced by diameter. Finally, we show that the problem of deciding whether an undirected graph admits an orientation of diameter (resp. radius) two belongs to a class of problems called NP-hard. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/517/CS-TR-75-517.pdf %R CS-TR-75-518 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Aggregation of inequalities in integer programming. %A Chvatal, Vaclav %A Hammer, Peter L. %D August 1975 %X Given an m $\times$ n zero-one matrix $A$ we ask whether there is a single linear inequality $ax \leq b$ whose zero-one solutions are precisely the zero-one solutions of $Ax \leq e$. We develop an algorithm for answering this question in O(m$n^2$) steps and investigate other related problems. Our results may be interpreted in terms of graph theory and threshold logic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/518/CS-TR-75-518.pdf %R CS-TR-75-520 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the representation of data structures in LCF with applications to program generation. %A von Henke, Friedrich W. %D September 1975 %X In this paper we discuss techniques of exploiting the obvious relationship between program structure and data structure for program generation. We develop methods of program specification that are derived from a representation of recursive data structures in the Logic for Computable Functions (LCF). As a step towards a formal problem specification language we define definitional extensions of LCF. These include a calculus for (computable) homogeneous sets and restricted quantification. Concepts that are obtained by interpreting data types as algebras are used to derive function definition schemes from an LCF term representing a data structure; they also lead to techniques for the simplification of expressions in the extended language. The specification methods are illustrated with a detailed example.
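The idea in CS-TR-75-520 of interpreting a data type as an algebra and deriving function definition schemes from it can be suggested in miniature: each constructor of a recursive type contributes one argument to a generic "fold", and particular functions are obtained by instantiating those arguments. The sketch below is a loose analogy in Python, not LCF.

    from dataclasses import dataclass
    from typing import Callable, Union

    @dataclass
    class Leaf:                 # one constructor of the recursive type
        value: int

    @dataclass
    class Node:                 # the other constructor, two recursive positions
        left: 'Tree'
        right: 'Tree'

    Tree = Union[Leaf, Node]

    def fold(t: Tree, on_leaf: Callable, on_node: Callable):
        # The definition scheme derived from the type: one function per
        # constructor, applied with the recursive positions already folded.
        if isinstance(t, Leaf):
            return on_leaf(t.value)
        return on_node(fold(t.left, on_leaf, on_node),
                       fold(t.right, on_leaf, on_node))

    t = Node(Leaf(1), Node(Leaf(2), Leaf(3)))
    print(fold(t, lambda v: v, lambda a, b: a + b))   # sum of leaves: 6
    print(fold(t, lambda v: 1, lambda a, b: a + b))   # number of leaves: 3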
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/520/CS-TR-75-520.pdf %R CS-TR-75-521 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Depth perception in stereo computer vision. %A Thompson, Clark %D October 1975 %X This report describes a stereo vision approach to depth perception; the author has built upon a set of programs that decompose the problem in the following way: 1) Production of a camera model: the position and orientation of the cameras in 3-space. 2) Generation of matching point-pairs: loci of corresponding features in the two pictures. 3) Computation of the point in 3-space for each point-pair. 4) Presentation of the resultant depth information. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/521/CS-TR-75-521.pdf %R CS-TR-75-522 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic program verification IV: proof of termination within a weak logic of programs. %A Luckham, David C. %A Suzuki, Norihisa %D October 1975 %X A weak logic of programs is a formal system in which statements that mean "the program halts" cannot be expressed. In order to prove termination, we would usually have to use a stronger logical system. In this paper we show how we can prove termination of both iterative and recursive programs within a weak logic by adding pieces of code and placing restrictions on loop invariants and entry conditions. Thus, most of the existing verifiers which are based on a weak logic of programs can be used to prove termination of programs without any modification. We give examples of proofs of termination and of accurate bounds on computation time that were obtained using the Stanford Pascal program verifier. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/522/CS-TR-75-522.pdf %R CS-TR-75-523 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T BAIL: a debugger for SAIL. %A Reiser, John F. %D October 1975 %X BAIL is a debugging aid for SAIL programs, where SAIL is an extended dialect of ALGOL60 which runs on the PDP-10 computer. BAIL consists of a breakpoint package and an expression interpreter which allow the user to stop his program at selected points, examine and change the values of variables, and evaluate general SAIL expressions. In addition, BAIL can display text from the source file corresponding to the current location in the program. In many respects BAIL is like DDT or RAID, except that BAIL is oriented towards SAIL and knows about SAIL data types, primitive operations, and procedure implementation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/523/CS-TR-75-523.pdf %R CS-TR-75-526 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Graph theory and Gaussian elimination. %A Tarjan, Robert Endre %D November 1975 %X This paper surveys graph-theoretic ideas which apply to the problem of solving a sparse system of linear equations by Gaussian elimination. Included are a discussion of bandwidth, profile, and general sparse elimination schemes, and of two graph-theoretic partitioning methods. Algorithms based on these ideas are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/526/CS-TR-75-526.pdf %R CS-TR-75-527 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Center for Reliable Computing: current research. %A McCluskey, Edward J. %A Wakerly, John F. %A Ogus, Roy C.
%D October 1975 %X This report summarizes the research work which has been performed, and is currently active in the Center for Reliable Computing in the Digital Systems Laboratory, Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/527/CS-TR-75-527.pdf %R CS-TR-75-528 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Solving path problems on directed graphs. %A Tarjan, Robert Endre %D October 1975 %X This paper considers path problems on directed graphs which are solvable by a method similar to Gaussian elimination. The paper gives an axiom system for such problems which is a weakening of Salomaa's axioms for a regular algebra. The paper presents a general solution method which requires O($n^3$) time for dense graphs with n vertices and considerably less time for sparse graphs. The paper also presents a decomposition method which solves a path problem by breaking it into subproblems, solving each subproblem by elimination, and combining the solutions. This method is a generalization of the "reducibility" notion of data flow analysis, and is a kind of single-element "tearing". Efficiently implemented, the method requires O(m $\alpha$(m,n)) time plus time to solve the subproblems, for problem graphs with n vertices and m edges. Here $\alpha$(m,n) is a very slowly growing function which is a functional inverse of Ackermann's function. The paper considers variants of the axiom system for which the solution methods still work, and presents several applications including solving simultaneous linear equations and analyzing control flow in computer programs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/528/CS-TR-75-528.pdf %R CS-TR-75-530 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An adaptive finite difference solver for nonlinear two point boundary problems with mild boundary layers. %A Lentini, M. %A Pereyra, Victor %D November 1975 %X A variable order variable step finite difference algorithm for approximately solving m-dimensional systems of the form y' = f(t,y), t $\in$ [a,b] subject to the nonlinear boundary conditions g(y(a),y(b)) = 0 is presented. A program, PASVAR, implementing these ideas has been written and the results on several test runs are presented together with comparisons with other methods. The main features of the new procedure are: a) Its ability to produce very precise global error estimates, which in turn allow a very fine control between desired tolerance and actual output precision. b) Non-uniform meshes allow an economical and accurate treatment of boundary layers and other sharp changes in the solutions. c) The combination of automatic variable order (via deferred corrections) and automatic (adaptive) mesh selection produces, as in the case of initial value problem solvers, a versatile, robust, and efficient algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/530/CS-TR-75-530.pdf %R CS-TR-75-531 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Algorithmic aspects of vertex elimination on directed graphs. %A Rose, Donald J. %A Tarjan, Robert Endre %D November 1975 %X We consider a graph-theoretic elimination process which is related to performing Gaussian elimination on sparse systems of linear equations.
We give efficient algorithms to: (1) calculate the fill-in produced by any elimination ordering; (2) find a perfect elimination ordering if one exists; and (3) find a minimal elimination ordering. We also show that problems (1) and (2) are at least as time-consuming as testing whether a directed graph is transitive, and that the problem of finding a minimum ordering is NP-complete. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/531/CS-TR-75-531.pdf %R CS-TR-75-532 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bibliography of Computer Science Department technical reports. %A Jacobs, Patricia E. %D November 1975 %X This report lists, in chronological order, all reports from the Stanford Computer Science series (STAN-CS-xx-xxx), Artificial Intelligence Memos (AIM), Digital Systems Laboratory Technical reports (TR) and Technical Notes (TN), plus Stanford Linear Accelerator Center publications (SLACP) and reports (SLACR). Also, for the first time, we have provided an author index for these reports (at the end of the report listings). The bibliography issued in October of 1973 is hereby brought up to date. Each report is identified by title, author's name, National Technical Information Service (NTIS) retrieval number, date, number of pages and the computer science areas treated. Subsequent journal publication (when known) is also indicated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/532/CS-TR-75-532.pdf %R CS-TR-75-536 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interactive generation of object models with a manipulator. %A Grossman, David D. %A Taylor, Russell H. %D December 1975 %X Manipulator programs in a high level language consist of manipulation procedures and object model declarations. As higher level languages are developed, the procedures will shrink while the declarations will grow. This trend makes it desirable to develop means for automating the generation of these declarations. A system is proposed which would permit users to specify certain object models interactively, using the manipulator itself as a measuring tool in three dimensions. A preliminary version of the system has been tested. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/536/CS-TR-75-536.pdf %R CS-TR-75-537 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Verification Vision within a programmable assembly system: an introductory discussion. %A Bolles, Robert C. %D December 1975 %X This paper defines a class of visual feedback tasks called "Verification Vision" which includes a significant portion of the feedback tasks required within a programmable assembly system. It characterizes a set of general-purpose capabilities which, if implemented, would provide a user with a system in which to write programs to perform such tasks. Example tasks and protocols are used to motivate these semantic capabilities. Of particular importance are the tools required to extract as much information as possible from planning and/or training sessions. Four different levels of verification systems are discussed. They range from a straightforward interactive system which could handle a subset of the verification vision tasks, to a completely automatic system which could plan its own strategies and handle the total range of verification tasks. Several unsolved problems in the area are discussed.
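The fill-in of problem (1) in CS-TR-75-531 can be computed by directly simulating the elimination process; the rough sketch below does this for the symmetric (undirected) case, and is quadratic per vertex rather than anything like the report's efficient algorithms.

    def fill_in(n, edges, order):
        # Eliminating a vertex joins all pairs of its not-yet-eliminated
        # neighbors; the edges so added are the fill-in of the ordering.
        adj = {v: set() for v in range(n)}
        for u, v in edges:
            adj[u].add(v); adj[v].add(u)
        fill, eliminated = set(), set()
        for v in order:
            nbrs = [w for w in adj[v] if w not in eliminated]
            for i in range(len(nbrs)):
                for j in range(i + 1, len(nbrs)):
                    a, b = nbrs[i], nbrs[j]
                    if b not in adj[a]:
                        adj[a].add(b); adj[b].add(a)
                        fill.add((min(a, b), max(a, b)))
            eliminated.add(v)
        return fill

    # On the 4-cycle 0-1-2-3, eliminating 0 first joins its neighbors 1 and 3.
    print(fill_in(4, [(0, 1), (1, 2), (2, 3), (3, 0)], [0, 1, 2, 3]))  # {(1, 3)}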
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/537/CS-TR-75-537.pdf %R CS-TR-75-539 %Z Wed, 23 Aug 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A new approach to recursive programs. %A Manna, Zohar %A Shamir, Adi %D December 1975 %X In this paper we critically evaluate the classical least-fixed-point approach towards recursive programs. We suggest a new approach which extracts the maximal amount of valuable information embedded in the programs. The presentation is informal, with emphasis on examples. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/539/CS-TR-75-539.pdf %R CS-TR-75-482 %Z Wed, 10 Jun 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Algorithm for Finding Best Matches in Logarithmic Expected Time %A Friedman, Jerome %A Bentley, Jon Louis %A Finkel, Raphael Ari %D July 1976 %X An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to a given query record. The computation required to organize the file is proportional to kN log N. The expected number of records examined in each search is independent of the file size. The expected computation to perform each search is proportional to log N. Empirical evidence suggests that except for very small files, this algorithm is considerably faster than other methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/482/CS-TR-75-482.pdf %R CS-TR-73-330 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Axioms and theorems for integers, lists and finite sets in LCF. %A Newey, Malcolm C. %D January 1973 %X LCF (Logic for Computable Functions) is being promoted as a formal language suitable for the discussion of various problems in the Mathematical Theory of Computation (MTC). To this end, several examples of MTC problems have been formalised and proofs have been exhibited using the LCF proof-checker. However, in these examples, there has been a certain amount of ad-hoc-ery in the proofs; namely, many mathematical theorems have been assumed without proof and no axiomatisation of the mathematical domains involved was given. This paper describes a suitable mathematical environment for future LCF experiments and its axiomatic basis. The environment developed, deemed appropriate for such experiments, consists of a large body of theorems from the areas of integer arithmetic, list manipulation and finite set theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/330/CS-TR-73-330.pdf %R CS-TR-73-331 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The computing time of the Euclidean algorithm. %A Collins, George E. %D January 1973 %X The maximum, minimum and average computing times of the classical Euclidean algorithm for the greatest common divisor of two integers are derived, to within codominance, as functions of the lengths of the two inputs and the output. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/331/CS-TR-73-331.pdf %R CS-TR-73-332 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Models of LCF. %A Milner, Robin %D January 1973 %X LCF is a deductive system for computable functions proposed by D. Scott in 1969 in an unpublished memorandum. The purpose of the present paper is to demonstrate the soundness of the system with respect to certain models, which are partially ordered domains of continuous functions.
This demonstration was supplied by Scott in his memorandum; the present paper is merely intended to make this work more accessible. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/332/CS-TR-73-332.pdf %R CS-TR-73-333 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the power of programming features. %A Chandra, Ashok K. %A Manna, Zohar %D January 1973 %X We consider the power of several programming features such as counters, pushdown stacks, queues, arrays, recursion and equality. In this study program schemas are used as the model for computation. The relations between the powers of these features are completely described by a comparison diagram. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/333/CS-TR-73-333.pdf %R CS-TR-73-334 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T URAND: a universal random number generator. %A Malcolm, Michael A. %A Moler, Cleve B. %D January 1973 %X A subroutine for generating uniformly-distributed floating-point numbers in the interval [0,1) is presented in ANSI standard Fortran. The subroutine, URAND, is designed to be relatively machine independent. URAND has undergone minimal testing on various machines and is thought to work properly on any machine having binary integer number representation, integer multiplication modulo m and integer addition either modulo m or yielding at least ${log}_2$ (m) significant bits, where m is some integral power of 2. Upon the first call of URAND, the value of m is automatically determined and appropriate constants for a linear congruential generator are computed following the suggestions of D. E. Knuth, volume 2. URAND is guaranteed to have a full-length cycle. Readers are invited to apply their favorite statistical tests to URAND, using any binary machine, and report the results to the authors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/334/CS-TR-73-334.pdf %R CS-TR-73-335 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computation of the stationary distribution of an infinite Markov matrix. %A Golub, Gene H. %A Seneta, Eugene %D January 1973 %X An algorithm is presented for computing the unique stationary distribution of an infinite stochastic matrix possessing at least one column whose elements are bounded away from zero. Elementwise convergence rate is discussed by means of two examples. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/335/CS-TR-73-335.pdf %R CS-TR-73-337 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Aesthetics systems. %A Gips, James %A Stiny, George %D January 1973 %X The formal structure of aesthetics systems is defined. Aesthetics systems provide for the essential tasks of interpretation and evaluation in aesthetic analysis. Kolmogorov's formulation of information theory is applicable. An aesthetics system for a class of non-representational, geometric paintings and its application to three actual paintings is described in the Appendix. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/337/CS-TR-73-337.pdf %R CS-TR-73-338 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A finite basis theorem revisited. %A Klarner, David A. %D February 1973 %X Let S denote a set of k-dimensional boxes each having integral sides. Let $\Gamma$(S) denote the set of all boxes which can be filled completely with translates of elements of S.
It is shown here that S contains a finite subset B such that $\Gamma$(B) = $\Gamma$(S). This result was proved for k = 1,2 in an earlier paper, but the proof for k > 2 contained an error. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/338/CS-TR-73-338.pdf %R CS-TR-73-339 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computation of the limited information maximum likelihood estimator. %A Dent, Warren T. %A Golub, Gene H. %D February 1973 %X Computation of the Limited Information Maximum Likelihood Estimator (LIMLE) of the set of coefficients in a single equation of a system of interdependent relations is sufficiently complicated to detract from other potentially interesting properties. Although for finite samples the LIMLE has no moments, asymptotically it remains normally distributed and retains other properties associated with maximum likelihood. The most extensive application of the estimator has been made in the Brookings studies. We believe that current methods of estimation are clumsy, and present a numerically stable estimation schema based on Householder transformations and the singular value decomposition. The analysis permits a convenient demonstration of equivalence with the Two Stage Least Squares Estimator (TSLSE) in the instance of just identification. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/339/CS-TR-73-339.pdf %R CS-TR-73-340 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Notes on a problem involving permutations as subsequences. %A Newey, Malcolm C. %D March 1973 %X The problem (attributed to R. M. Karp by Knuth) is to describe the sequences of minimum length which contain, as subsequences, all the permutations of an alphabet of n symbols. This paper catalogs some of the easy observations on the problem and proves that the minimum lengths for n=5, n=6 & n=7 are 19, 28 and 39 respectively. Also presented is a construction which yields (for n>2) many appropriate sequences of length $n^2$-2n+4 so giving an upper bound on length of minimum strings which matches exactly all known values. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/340/CS-TR-73-340.pdf %R CS-TR-73-341 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A heuristic approach to program verification. %A Katz, Shmuel M. %A Manna, Zohar %D March 1973 %X We present various heuristic techniques for use in proving the correctness of computer programs. The techniques are designed to obtain automatically the "inductive assertions" attached to the loops of the program which previously required human "understanding" of the program's performance. We distinguish between two general approaches: one in which we obtain the inductive assertion by analyzing predicates which are known to be true at the entrances and exits of the loop ($\underline{top-down}$ approach), and another in which we generate the inductive assertion directly from the statements of the loop ($\underline{bottom-up}$ approach). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/341/CS-TR-73-341.pdf %R CS-TR-73-342 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Matroid partitioning. %A Knuth, Donald E. %D March 1973 %X This report discusses a modified version of Edmonds's algorithm for partitioning of a set into subsets independent in various given matroids.
If ${\cal M}_1$,...,${\cal M}_k$ are matroids defined on a finite set E, the algorithm yields a simple necessary and sufficient condition for whether or not the elements of E can be colored with k colors such that (i) all elements of color j are independent in ${\cal M}_j$, and (ii) the number of elements of color j lies between given limits, $n_j \leq \| E_j \| \leq {n'}_j$. The algorithm either finds such a coloring or it finds a proof that none exists, after making at most $n^3$ + $n^2$k tests of independence in the given matroids, where n is the number of elements in E. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/342/CS-TR-73-342.pdf %R CS-TR-73-344 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The fourteen primitive actions and their inferences. %A Schank, Roger C. %D March 1973 %X In order to represent the conceptual information underlying a natural language sentence, a conceptual structure has been established that uses the basic actor-action-object framework. It was the intent that these structures have only one representation for one meaning, regardless of the semantic form of the sentence being represented. Actions were reduced to their basic parts so as to effect this. It was found that only fourteen basic actions were needed as building blocks by which all verbs can be represented. Each of these actions has a set of actions or states which can be inferred when they are present. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/344/CS-TR-73-344.pdf %R CS-TR-73-345 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The minimum root separation of a polynomial. %A Collins, George E. %A Horowitz, Ellis %D April 1973 %X The minimum root separation of a complex polynomial A is defined as the minimum of the distances between distinct roots of A. For polynomials with Gaussian integer coefficients and no multiple roots, three lower bounds are derived for the root separation. In each case the bound is a function of the degree, n, of A and the sum, d, of the absolute values of the coefficients of A. The notion of a semi-norm for a commutative ring is defined, and it is shown how any semi-norm can be extended to polynomial rings and matrix rings, obtaining a very general analogue of Hadamard's determinant theorem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/345/CS-TR-73-345.pdf %R CS-TR-73-347 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multidimensional analysis in evaluating a simulation of paranoid thought. %A Colby, Kenneth Mark %A Hilf, Franklin Dennis %D May 1973 %X The limitations of Turing's Test as an evaluation procedure are reviewed. More valuable are tests which ask expert judges to make ratings along multiple dimensions essential to the model. In this way the model's weaknesses become clarified and the model builder learns where the model must be improved. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/347/CS-TR-73-347.pdf %R CS-TR-73-346 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The rationale for computer based treatment of language difficulties in nonspeaking autistic children. %A Colby, Kenneth Mark %D March 1973 %X The principles underlying a computer-based treatment method for language acquisition in nonspeaking autistic children are described. The main principle involves encouragement of exploratory learning with minimum adult interference. 
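The quantity bounded in CS-TR-73-345 is simple to evaluate numerically, which makes the definition concrete; the sketch below merely computes the minimum root separation of a sample polynomial and says nothing about the report's lower bounds.

    import numpy as np
    from itertools import combinations

    def min_root_separation(coeffs):
        # Minimum distance between distinct roots of the polynomial whose
        # coefficients are given from the highest degree down.
        roots = np.roots(coeffs)
        return min(abs(a - b) for a, b in combinations(roots, 2))

    # x^2 - 3x + 2 = (x - 1)(x - 2): the roots are distance 1 apart.
    print(min_root_separation([1, -3, 2]))   # approximately 1.0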
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/346/CS-TR-73-346.pdf %R CS-TR-73-348 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T High order finite difference solution of differential equations. %A Pereyra, Victor %D April 1973 %X These seminar notes give a detailed treatment of finite difference approximations to smooth nonlinear two-point boundary value problems for second order differential equations. Consistency, stability, convergence, and asymptotic expansions are discussed. Most results are stated in such a way as to indicate extensions to more general problems. Successive extrapolations and deferred corrections are described and their implementations are explored thoroughly. A very general deferred correction generator is developed and it is employed in the implementation of a variable order, variable (uniform) step method. Complete FORTRAN programs and extensive numerical experiments and comparisons are included together with a set of 48 references. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/348/CS-TR-73-348.pdf %R CS-TR-73-349 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two papers on the selection problem: Time Bounds for Selection [by Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan] and Expected Time Bounds for Selection [by Robert W. Floyd and Ronald L. Rivest]. %A Blum, Manuel %A Floyd, Robert W. %A Pratt, Vaughan R. %A Rivest, Ronald L. %A Tarjan, Robert Endre %D April 1973 %X (1) The number of comparisons required to select the i-th smallest of n numbers is shown to be at most a linear function of n by analysis of a new selection algorithm -- PICK. Specifically, no more than 5.4305 n comparisons are ever required. This bound is improved for extreme values of i, and a new lower bound on the requisite number of comparisons is also proved. (2) A new selection algorithm is presented which is shown to be very efficient on the average, both theoretically and practically. The number of comparisons used to select the i-th smallest of n numbers is n + min(i,n-i) + o(n). A lower bound within 9% of the above formula is also derived. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/349/CS-TR-73-349.pdf %R CS-TR-73-350 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An almost-optimal algorithm for the assembly line scheduling problem. %A Kaufman, Marc T. %D January 1973 %X This paper considers a solution to the multiprocessor scheduling problem for the case where the ordering relation between tasks can be represented as a tree. Assume that we have n identical processors, and a number of tasks to perform. Each task $T_i$ requires an amount of time ${\mu}_i$ to complete, 0 < ${\mu}_i \leq$ k, so that k is an upper bound on task length. Tasks are indivisible, so that a processor once assigned must remain assigned until the task completes (no preemption). Then the "longest path" scheduling method is almost-optimal in the following sense: Let $\omega$ be the total time required to process all of the tasks by the "longest path" algorithm. Let ${\omega}_o$ be the minimal time in which all of the tasks can be processed. Let ${\omega}_p$ be the minimal time to process all of the tasks if arbitrary preemption of processors is allowed. Then: ${\omega}_p \leq {\omega}_o \leq \omega \leq {\omega}_p$ + k - k/n, where n is the number of processors available to any of the algorithms.
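The "longest path" rule of CS-TR-73-350 always runs the ready tasks that head the longest remaining chains. The sketch below specializes it to unit-time tasks on an in-tree (a parent runs after its children); the encoding and names are invented for illustration, and the report's bound $\omega \leq {\omega}_p$ + k - k/n concerns general task lengths.

    def longest_path_schedule(parent, n):
        # parent[t] is the task that must run after t (None for the root).
        # Each step runs at most n ready tasks, preferring those whose
        # chain to the root is longest.
        tasks = list(parent)
        level = {}
        def depth(t):
            if t not in level:
                level[t] = 0 if parent[t] is None else 1 + depth(parent[t])
            return level[t]
        pending, done, schedule = set(tasks), set(), []
        while pending:
            ready = [t for t in pending
                     if all(c in done for c in tasks if parent[c] == t)]
            step = sorted(ready, key=depth, reverse=True)[:n]
            schedule.append(sorted(step))
            done |= set(step)
            pending -= set(step)
        return schedule

    # Tasks 4 and 5 precede 1; tasks 1 and 2 precede the root 3.
    print(longest_path_schedule({1: 3, 2: 3, 3: None, 4: 1, 5: 1}, n=2))
    # [[4, 5], [1, 2], [3]] -- three steps, which is optimal here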
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/350/CS-TR-73-350.pdf %R CS-TR-73-351 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Performance of an I/O channel with multiple paging drums (digest edition). %A Fuller, Samuel H. %D August 1972 %X For rotating storage units, a paging drum organization is known to offer substantially better response time to I/O requests than is a more conventional (file) organization [Abate and Dubner, 1969; Fuller and Baskett, 1972]. When several, asynchronous paging drums are attached to a single I/O channel, however, much of the gain in response time due to the paging organization is lost; this article investigates the reasons for this loss in performance. A model of an I/O channel with multiple paging drums is presented and we embed into the model a Markov chain that closely approximates the behavior of the I/O channel. The analysis then leads to the moment generating function of sector queue size and the Laplace-Stieltjes transform of the waiting time. A significant observation is that the expected waiting time for an I/O request to a drum can be divided into two terms: one independent of the load of I/O requests to the drum and another that monotonically increases with increasing load. Moreover, the load varying term of the waiting time is nearly proportional to (2 - 1/k) where k is the number of drums connected to the I/O channel. The validity of the Markov chain approximation is examined in several cases by a comparison of the analytic results to the actual performance of an I/O channel with several paging drums. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/351/CS-TR-73-351.pdf %R CS-TR-73-352 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The expected difference between the SLTF and MTPT drum scheduling disciplines (digest edition). %A Fuller, Samuel H. %D August 1972 %X This report is a sequel to an earlier report [Fuller, 1971] that develops a minimal-total-processing-time (MTPT) drum scheduling algorithm. A quantitative comparison between MTPT schedules and shortest-latency-time-first (SLTF) schedules, commonly acknowledged as good schedules for drum-like storage units, is presented here. The analysis develops an analogy to random walks and proves several asymptotic properties of collections of records on drums. These properties are specialized to the MTPT and SLTF algorithms and it is shown that for sufficiently large sets of records, the expected processing time of a SLTF schedule is longer than a MTPT schedule by the expected record length. The results of a simulation study are also presented to show the difference in MTPT and SLTF schedules for small sets of records and for situations not covered in the analytic discussion. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/352/CS-TR-73-352.pdf %R CS-TR-73-353 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Random arrivals and MTPT disk scheduling disciplines. %A Fuller, Samuel H. %D August 1972 %X This article investigates the application of minimal-total-processing-time (MTPT) scheduling disciplines to rotating storage units when random arrival of requests is allowed. Fixed-head drum and moving-head disk storage units are considered and particular emphasis is placed on the relative merits of the MTPT scheduling discipline with respect to the shortest-latency-time-first (SLTF) scheduling discipline.
The data presented are the results of simulation studies. Situations are discovered in which the MTPT discipline is superior to the SLTF discipline, and situations are also discovered in which the opposite is true. An implementation of the MTPT scheduling algorithm is presented and the computational requirements of the algorithm are discussed. It is shown that the sorting procedure is the most time consuming phase of the algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/353/CS-TR-73-353.pdf %R CS-TR-73-354 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The number of SDR's in certain regular systems. %A Klarner, David A. %D April 1973 %X Let ($a_1$,...,$a_k$) = $\bar{a}$ denote a vector of numbers, and let C($\bar{a}$,n) denote the n $\times$ n cyclic matrix having ($a_1$,...,$a_k$,0,...,0) as its first row. It is shown that the sequences (det C($\bar{a}$,n): n = k,k+1,...) and (per C($\bar{a}$,n): n = k,k+1,...) satisfy linear homogeneous difference equations with constant coefficients. The permanent, per C, of a matrix C is defined like the determinant except that one forgets about ${(-1)}^{sign \pi}$ where $\pi$ is a permutation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/354/CS-TR-73-354.pdf %R CS-TR-73-355 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An analysis of central processor scheduling in multiprogrammed computer systems (digest edition). %A Price, Thomas G. %D October 1972 %X A simple finite source model is used to gain insight into the effect of central processor scheduling in multiprogrammed computer systems. CPU utilization is chosen as the measure of performance and this decision is discussed. A relation between CPU utilization and flow time is developed. It is shown that the shortest-remaining-processing-time discipline maximizes both CPU utilization and I/O utilization for the queueing model M/G/1/N. An exact analysis of processor utilization using shortest-remaining-processing-time scheduling for systems with two jobs is given and it is observed that the processor utilization is independent of the form of the processing time distribution. The effect of the CPU processing time distribution on performance is discussed. For first-come-first-served scheduling, it is shown that distributions with the same mean and variance can yield significantly different processor utilizations and that utilization may or may not significantly decrease with increasing variance. The results are used to compare several scheduling disciplines of practical interest. An approximate expression for CPU utilization using shortest-remaining-processing-time scheduling in systems with N jobs is given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/355/CS-TR-73-355.pdf %R CS-TR-73-356 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MLISP2. %A Smith, David Canfield %A Enea, Horace J. %D May 1973 %X MLISP2 is a high-level programming language based on LISP. Features: 1. The notation of MLISP. 2. Extensibility---the ability to extend the language and to define new languages. 3. Pattern matching---the ability to match input against context free or sensitive patterns. 4. Backtracking--the ability to set decision points, manipulate contexts and backtrack. 
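The decision points and backtracking that CS-TR-73-356 builds into MLISP2 can be imitated in miniature with a generator-based pattern matcher: each segment variable is a decision point, and exhausting one binding backtracks to try the next. This is only a Python analogy, not MLISP2 syntax.

    def match(pattern, tokens):
        # '?name' pattern elements are segment variables that may bind to
        # any run of tokens; other elements must match literally.
        def go(p, t, env):
            if not p:
                if not t:
                    yield dict(env)
                return
            head, rest = p[0], p[1:]
            if isinstance(head, str) and head.startswith('?'):
                for cut in range(len(t) + 1):          # a decision point
                    env[head] = t[:cut]
                    yield from go(rest, t[cut:], env)  # backtrack on failure
                env.pop(head, None)
            elif t and head == t[0]:
                yield from go(rest, t[1:], env)
        return go(list(pattern), list(tokens), {})

    for env in match(['the', '?adj', 'dog'], ['the', 'big', 'brown', 'dog']):
        print(env)   # {'?adj': ['big', 'brown']}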
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/356/CS-TR-73-356.pdf %R CS-TR-73-357 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A conceptually based sentence paraphraser. %A Goldman, Neil M. %A Riesbeck, Christopher K. %D May 1973 %X This report describes a system of programs which performs natural language processing based on an underlying language free (conceptual) representation of meaning. This system is used to produce sentence paraphrases which demonstrate a form of understanding with respect to a given context. Particular emphasis has been placed on the major subtasks of language analysis (mapping natural language into conceptual structures) and language generation (mapping conceptual structures into natural language), and on the interaction between these processes and a conceptual memory model. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/357/CS-TR-73-357.pdf %R CS-TR-73-358 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Inference and the computer understanding of natural language. %A Schank, Roger C. %A Rieger, Charles J., III %D May 1973 %X The notion of computer understanding of natural language is examined relative to inference mechanisms designed to function in a language-free deep conceptual base (Conceptual Dependency). The conceptual analysis of a natural language sentence into this conceptual base, and the nature of the memory which stores and operates upon these conceptual structures are described from both theoretical and practical standpoints. The various types of inferences which can be made during and after the conceptual analysis of a sentence are defined, and a functioning program which performs these inference tasks is described. Actual computer output is included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/358/CS-TR-73-358.pdf %R CS-TR-73-360 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Open, closed, and mixed networks of queues with different classes of customers. %A Muntz, Richard R. %A Baskett, Forest, III %D August 1972 %X We derive the joint equilibrium distribution of queue sizes in a network of queues containing N service centers and R classes of customers. The equilibrium state probabilities have the general form: P(S) = Cd(S) $f_1$($x_1$)$f_2$($x_2$)...$f_N$($x_N$) where S is the state of the system, $x_i$ is the configuration of customers at the ith service center, d(S) is a function of the state of the model, $f_i$ is a function that depends on the type of the ith service center, and C is a normalizing constant. We consider four types of service centers to model central processors, data channels, terminals, and routing delays. The queueing disciplines associated with these service centers include first-come-first-served, processor sharing, no queueing, and last-come-first-served. Each customer belongs to a single class of customers while awaiting or receiving service at a service center but may change classes and service centers according to fixed probabilities at the completion of a service request. For open networks we consider state dependent arrival processes. Closed networks are those with no arrivals. A network may be closed with respect to some classes of customers and open with respect to other classes of customers.
At three of the four types of service centers, the service times of customers are governed by probability distributions having rational Laplace transforms, different classes of customers having different distributions. At first-come-first-served type service centers the service time distribution must be identical and exponential for all classes of customers. Many of the network results of Jackson on arrival and service rate dependencies, of Posner and Bernholtz on different classes of customers, and of Chandy on different types of service centers are combined and extended in this paper. The results become special cases of the model presented here. An example shows how different classes of customers can affect models of computer systems. Finally, we show that an equivalent model encompassing all of the results involves only classes of customers with identical exponentially distributed service times. All of the other structure of the first model can be absorbed into the fixed probabilities governing the change of class and change of service center of each class of customers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/360/CS-TR-73-360.pdf %R CS-TR-73-361 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An algorithm for the construction of the graphs of organic molecules. %A Brown, Harold %A Masinter, Larry M. %D May 1973 %X A description and a formal proof of an efficient computer implemented algorithm for the construction of graphs is presented. This algorithm, which is part of a program for the automated analysis of organic compounds, constructs all of the non-isomorphic, connected multi-graphs based on a given degree sequence of nodes and which arise from a relatively small "catalog" of certain canonical graphs. For the graphs of the more common organic molecules, a catalog of most of the canonical graphs is known, and the algorithm can produce all of the distinct valence isomers of these organic molecules. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/361/CS-TR-73-361.pdf %R CS-TR-73-364 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Estimation of probability density using signature tables for applications to pattern recognition. %A Thosar, Ravindra B. %D May 1973 %X The signature table training method consists of cumulative evaluation of a function (such as a probability density) at pre-assigned co-ordinate values of input parameters to the table. The training is conditional: based on a binary valued "learning" input to a table which is compared to the label attached to each training sample. Interpretation of an unknown sample vector is then equivalent to a table look-up, i.e. extraction of the function value stored at the proper co-ordinates. Such a technique is very useful when a large number of samples must be interpreted as in the case of speech recognition and the time required for the training as well as for the recognition is at a premium. However, this method is limited by prohibitive storage requirements, even for a moderate number of parameters, when their relative independence cannot be assumed. This report investigates the conditions under which the higher dimensional probability density function can be decomposed so that the density estimate is obtained by a hierarchy of signature tables with consequent reduction in the storage requirement. Practical utility of the theoretical results obtained in the report is demonstrated by a vowel recognition experiment.
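A toy version of the signature-table training just described in CS-TR-73-364: training accumulates, at quantized co-ordinates of the input parameters, how often the training label was positive, and interpretation is a single table look-up. The bin width and names below are invented for illustration.

    from collections import defaultdict

    class SignatureTable:
        def __init__(self, bin_width):
            self.bin_width = bin_width
            self.counts = defaultdict(lambda: [0, 0])   # cell -> [positive, total]

        def cell(self, params):
            # Pre-assigned co-ordinates: quantize each input parameter.
            return tuple(int(p // self.bin_width) for p in params)

        def train(self, params, label):
            c = self.counts[self.cell(params)]
            c[0] += 1 if label else 0
            c[1] += 1

        def lookup(self, params):
            # Interpretation of an unknown sample is a table look-up.
            pos, total = self.counts[self.cell(params)]
            return pos / total if total else 0.0

    table = SignatureTable(bin_width=0.5)
    for sample, label in [((0.1, 0.2), True), ((0.2, 0.1), True), ((0.9, 0.9), False)]:
        table.train(sample, label)
    print(table.lookup((0.15, 0.15)))   # 1.0: both samples in this cell were True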
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/364/CS-TR-73-364.pdf %R CS-TR-73-365 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic program verification I: a logical basis and its implementation. %A Igarashi, Shigeru %A London, Ralph L. %A Luckham, David C. %D May 1973 %X Defining the semantics of programming languages by axioms and rules of inference yields a deduction system within which proofs may be given that programs satisfy specifications. The deduction system herein is shown to be consistent and also deduction complete with respect to Hoare's system. A subgoaler for the deduction system is described whose input is a significant subset of Pascal programs plus inductive assertions. The output is a set of verification conditions or lemmas to be proved. Several non-trivial arithmetic and sorting programs have been shown to satisfy specifications by using an interactive theorem prover to automatically generate proofs of the verification conditions. Additional components for a more powerful verification system are under construction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/365/CS-TR-73-365.pdf %R CS-TR-73-368 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The goals of linguistic theory revisited. %A Schank, Roger C. %A Wilks, Yorick A. %D May 1973 %X We examine the original goals of generative linguistic theory. We suggest that these goals were well defined but misguided with respect to their avoidance of the problem of modelling performance. With developments such as Generative Semantics, it is no longer clear that the goals are clearly defined. We argue that it is vital for linguistics to concern itself with the procedures that humans use in language. We then introduce a number of basic human competencies in the field of language understanding, such as understanding in context and the use of inferential information, and argue that the modelling of these aspects of language understanding requires procedures of a sort that cannot be easily accommodated within the dominant paradigm. In particular, we argue that the procedures that will be required in these cases ought to be linguistic, and that the simple-minded importation of techniques from logic may create a linguistics in which there cannot be procedures of the required sort. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/368/CS-TR-73-368.pdf %R CS-TR-73-369 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The development of conceptual structures in children. %A Schank, Roger C. %D May 1973 %X Previous papers by the author have hypothesized that it is possible to represent the meaning of natural language sentences using a framework which has only fourteen primitive ACTs. This paper addresses the problem of when and how these ACTs might be learned by children. The speech of a child of age 2 is examined for possible knowledge of the primitive ACTs as well as the conceptual relations underlying language. It is shown that there is evidence that the conceptual structures underlying language are probably complete by age 2. Next a child is studied from birth to age 1. The emergence of the primitive ACTs and the conceptual relations is traced. The hypothesis is made that the structures that underlie and are necessary for language are present by age 1.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/369/CS-TR-73-369.pdf %R CS-TR-73-371 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A review of "Structured Programming". %A Knuth, Donald E. %D June 1973 %X The recent book $\underline{Structured Programming}$ by O. J. Dahl, E. W. Dijkstra, and C. A. R. Hoare promises to have a significant impact on computer science. This report contains a detailed review of the topics treated in that book, in the form of three informal "open letters" to the three authors. It is hoped that circulation of these letters to a wider audience at this time will help to promote useful discussion of the important issues. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/371/CS-TR-73-371.pdf %R CS-TR-73-373 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SAIL user manual. %A VanLehn, Kurt A. %D July 1973 %X SAIL is a high-level programming language for the PDP-10 computer. It includes an extended ALGOL 60 compiler and a companion set of execution-time routines. In addition to ALGOL, the language features: (1) flexible linking to hand-coded machine language algorithms, (2) complete access to the PDP-10 I/O facilities, (3) a complete system of compile-time arithmetic and logic as well as a flexible macro system, (4) user modifiable error handling, (5) backtracking, and (6) interrupt facilities. Furthermore, a subset of the SAIL language, called LEAP, provides facilities for (1) sets and lists, (2) an associative data structure, (3) independent processes, and (4) procedure variables. The LEAP subset of SAIL is an extension of the LEAP language, which was designed by J. Feldman and P. Rovner, and implemented on Lincoln Laboratory's TX-2 (see [Feldman & Rovner, "An Algol-Based Associative Language," Communications of the ACM, v.12, no. 8 (Aug. 1969), pp.439-449]). The extensions to LEAP are partially described in "Recent Developments in SAIL" (see [Feldman et al., Proceedings of the AFIPS Fall Joint Computer Conference, 1972, pp. 1193-1202]). This manual describes the SAIL language and the execution-time routines for the typical SAIL user: a non-novice programmer with some knowledge of ALGOL. It lies somewhere between being a tutorial and a reference manual. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/373/CS-TR-73-373.pdf %R CS-TR-73-376 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lower estimates for the error of best uniform approximation. %A Meinardus, Guenter %A Taylor, Gerald D. %D July 1973 %X In this paper the lower bounds of de La Vallee Poussin and Remes for the error of best uniform approximation from a linear subspace are generalized to give analogous estimates based on k points, k = 1,...,n. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/376/CS-TR-73-376.pdf %R CS-TR-73-378 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The optimum comb method of pitch period analysis of continuous digitized speech. %A Moorer, James Anderson %D July 1973 %X A new method of tracking the fundamental frequency of voiced speech is described. The method is shown to be of accuracy similar to that of the Cepstrum technique. Since the method involves only additions and no multiplications, it is shown to be faster than the SIFT algorithm.
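The addition-only claim in CS-TR-73-378 above can be made concrete with a generic comb-style period estimator (a sketch in the same spirit, not Moorer's optimum comb): each candidate period is scored by how well the waveform lines up with itself under that shift, using only sums of absolute differences.

def comb_pitch_period(x, min_p, max_p):
    # Generic comb-style pitch period estimate: score each candidate
    # period p by summing |x[n] - x[n+p]| (additions only, plus one
    # divide per candidate for normalization) and pick the best alignment.
    assert len(x) > 2 * max_p, "window should cover at least two max periods"
    best_p, best_score = min_p, float("inf")
    for p in range(min_p, max_p + 1):
        score = sum(abs(x[n] - x[n + p]) for n in range(len(x) - p))
        score /= len(x) - p
        if score < best_score:
            best_p, best_score = p, score
    return best_p  # period in samples; fundamental frequency = rate / p

As with any comb method, a perfectly periodic input also aligns at multiples of the true period, so the search range and the strict inequality above matter.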
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/378/CS-TR-73-378.pdf %R CS-TR-73-382 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Axiomatic approach to total correctness of programs. %A Manna, Zohar %A Pnueli, Amir %D July 1973 %X We present here an axiomatic approach which enables one to prove by formal methods that his program is "totally correct" (i.e., it terminates and is logically correct -- does what it is supposed to do). The approach is similar to Hoare's approach for proving that a program is "partially correct" (i.e., that whenever it terminates it produces correct results). Our extension to Hoare's method lies in the possibility of proving correctness $\underline{and}$ termination at once, and in the enlarged scope of properties that can be proved by it. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/382/CS-TR-73-382.pdf %R CS-TR-73-383 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Natural language inference. %A Wilks, Yorick A. %D August 1973 %X The paper describes the way in which a Preference Semantics system for natural language analysis and generation tackles a difficult class of anaphoric inference problems (finding the correct referent for an English pronoun in context): those requiring either analysis (conceptual) knowledge of a complex sort, or requiring weak inductive knowledge of the course of events in the real world. The method employed converts all available knowledge to a canonical template form and endeavors to create chains of non-deductive inferences from the unknowns to the possible referents. Its method of selecting among possible chains of inferences is consistent with the overall principle of "semantic preference" used to set up the original meaning representation, of which these anaphoric inference procedures are a manipulation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/383/CS-TR-73-383.pdf %R CS-TR-73-384 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The generation of French from a semantic representation. %A Herskovits, Annette %D August 1973 %X The report contains first a brief description of Preference Semantics, a system of representation and analysis of the meaning structure of natural language. The analysis algorithm which transforms phrases into semantic items called templates has been considered in detail elsewhere, so this report concentrates on the second phase of analysis, which binds templates together into a higher level semantic block corresponding to an English paragraph, and which, in operation, interlocks with the French generation procedure. During this phase, the semantic relations between templates are extracted, pronoun references are resolved, and those word disambiguations are done that require the context of a whole paragraph. These tasks require items called PARAPLATES which are attached to keywords such as prepositions, subjunctions and relative pronouns. The system chooses the representation which maximises a carefully defined "semantic density." A system for the generation of French sentences is then described, based on the recursive evaluation of procedural generation patterns called STEREOTYPES. The stereotypes are semantically context sensitive, are attached to each sense of English words and keywords and are carried into the representation by the analysis procedure.
The representation of the meaning of words, and the versatility of the stereotype format, allow for fine meaning distinctions to appear in the French, and for the construction of French differing radically from the English original. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/384/CS-TR-73-384.pdf %R CS-TR-73-385 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Recognition of continuous speech: segmentation and classification using signature table adaptation. %A Thosar, Ravindra B. %D September 1973 %X This report explores the possibility of using a set of features for segmentation and recognition of continuous speech. The features are not necessarily "distinctive" or minimal, in the sense that they do not divide the phonemes into mutually exclusive subsets, and can have high redundancy. This concept of feature can thus avoid a priori binding between the phoneme categories to be recognized and the set of features defined in a particular system. An adaptive technique is used to find the probability of the presence of a feature. Each feature is treated independently of other features. An unknown utterance is thus represented by a feature graph with associated probabilities. It is hoped that such a representation would be valuable for a hypothesize-test paradigm as opposed to one which operates on a linear symbolic input. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/385/CS-TR-73-385.pdf %R CS-TR-73-386 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A corner finder for visual feedback. %A Perkins, W. A. %A Binford, Thomas O. %D August 1973 %X In visual-feedback work, a model of an object and its approximate location are often known, and it is only necessary to determine its location and orientation more accurately. The purpose of the program described herein is to provide such information for the case in which the model is an edge or corner. Given a model of a line or a corner with two or three edges, the program searches a TV window of arbitrary size looking for one or all corners which match the model. A model-driven program directs the search. It calls on another program to find all lines inside the window. Then it looks at these lines and eliminates lines which cannot match any of the model lines. It next calls on a program to form vertices and then checks for a matching vertex. If this simple procedure fails, the model-driver has two backup procedures. First it works with the lines that it has and tries to form a matching vertex (corner). If this fails, it matches parts of the model with vertices and lines that are present and then takes a careful look in a small region in which it expects to find a missing line. The program often finds weak contrast edges in this manner. Lines are found by a global method after the entire window has been scanned with the Hueckel edge operator. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/386/CS-TR-73-386.pdf %R CS-TR-73-387 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Analysis of behavior of chemical molecules: rule formation on non-homogeneous classes of objects. %A Buchanan, Bruce G. %A Sridharan, Natesa S. %D August 1973 %X An information processing model of some important aspects of inductive reasoning is presented within the context of one scientific discipline.
Given a collection of experimental (mass spectrometry) data from several chemical molecules, the computer program described here separates the molecules into "well-behaved" subclasses and selects from the space of all explanatory processes the "characteristic" processes for each subclass. The definitions of "well-behaved" and "characteristic" embody several heuristics which are discussed. Some results of the program are discussed which have been useful to chemists and which lend credibility to this approach. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/387/CS-TR-73-387.pdf %R CS-TR-73-388 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interconnections for parallel memories to unscramble p-ordered vectors. %A Swanson, Roger C. %D May 1973 %X Several methods are being considered for storing arrays in a parallel memory system so that various useful partitions of an array can be fetched from the memory with a single access. Some of these methods fetch vectors in an order scrambled from that required for a computation. This paper considers the problem of unscrambling such vectors when the vectors belong to a class called p-ordered vectors and the memory system consists of a prime number of modules. Pairs of interconnections are described that can unscramble p-ordered vectors in a number of steps that grows as the square root of the number of memories. Lower and upper bounds are given for the number of steps to unscramble the worst case vector. The upper bound calculation that is derived also provides an upper bound on the minimum diameter of a star polygon with a fixed number of nodes and two interconnections. An algorithm is given that has produced optimal pairs of interconnections for all sizes of memory that have been tried. The algorithm appears to find optimal pairs for all memory sizes, but no proof has yet been found. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/388/CS-TR-73-388.pdf %R CS-TR-73-390 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A construction for the inverse of a Turing machine. %A Gips, James %D August 1973 %X A direct construction for the inverse of a Turing machine is presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/390/CS-TR-73-390.pdf %R CS-TR-73-391 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Search strategies for the task of organic chemical synthesis. %A Sridharan, Natesa S. %D October 1973 %X A computer program has been written that successfully discovers syntheses for complex organic chemical molecules. The definition of the search space and strategies for heuristic search are described in this paper. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/391/CS-TR-73-391.pdf %R CS-TR-73-392 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sorting and Searching - errata and addenda. %A Knuth, Donald E. %D October 1973 %X This report lists all the typographical errors in $\underline{The Art of Computer Programming}$, Volume 3, that are presently known to its author. Several recent developments and references to the literature, which will be incorporated in the second printing, are also included in an attempt to keep the book up-to-date. Several dozen corrections to the second (1971) printing of volume two are also included.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/392/CS-TR-73-392.pdf %R CS-TR-73-394 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel programming: an axiomatic approach. %A Hoare, C. A. R. %D October 1973 %X This paper develops some ideas expounded in [C.A.R. Hoare. "Towards a Theory of Parallel Programming," in $\underline{Operating Systems Techniques}$, ed. C.A.R. Hoare and R.H. Perrott. Academic Press. 1972]. It distinguishes a number of ways of using parallelism, including disjoint processes, competition, cooperation, communication and "colluding". In each case an axiomatic proof rule is given. Some light is thrown on traps or ON conditions. Warning: the program structuring methods described here are not suitable for the construction of operating systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/394/CS-TR-73-394.pdf %R CS-TR-73-396 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The use of sensory feedback in a programmable assembly system. %A Bolles, Robert C. %A Paul, Richard P. %D October 1973 %X This article describes an experimental, automated assembly system which uses sensory feedback to control an electro-mechanical arm and TV camera. Visual, tactile, and force feedback are used to improve positional information, guide manipulations, and perform inspections. The system has two phases: a 'planning' phase in which the computer is programmed to assemble some object, and a 'working' phase in which the computer controls the arm and TV camera in actually performing the assembly. The working phase is designed to be run on a mini-computer. The system has been used to assemble a water pump, consisting of a base, gasket, top, and six screws. This example is used to explain how the sensory data is incorporated into the control system. A movie showing the pump assembly is available from the Stanford Artificial Intelligence Laboratory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/396/CS-TR-73-396.pdf %R CS-TR-73-398 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Image contouring and comparing. %A Baumgart, Bruce G. %D October 1973 %X A contour image representation is stated and an algorithm for converting a set of digital television images into this representation is explained. The algorithm consists of five steps: digital image thresholding, binary image contouring, polygon nesting, polygon smoothing, and polygon comparing. An implementation of the algorithm is the main routine of a program called CRE; auxiliary routines provide cart and turn table control, TV camera input, image display, and xerox printer output. A serendip application of CRE to type font construction is explained. Details about the intended application of CRE to the perception of physical objects will appear in sequels to this paper. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/398/CS-TR-73-398.pdf %R CS-TR-73-401 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Monitors: an operating system structuring concept. %A Hoare, C. A. R. %D November 1973 %X This paper develops Brinch-Hansen's concept of a monitor as a method of structuring an operating system. It introduces a form of synchronization, describes a possible method of implementation in terms of semaphores, and gives a suitable proof rule.
Illustrative examples include a single resource scheduler, a bounded buffer, an alarm clock, a buffer pool, a disc head optimizer, and a version of the problem of readers and writers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/401/CS-TR-73-401.pdf %R CS-TR-73-403 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hints on programming language design. %A Hoare, C. A. R. %D December 1973 %X This paper (based on a keynote address presented at the SIGACT/SIGPLAN Symposium on Principles of Programming Languages, Boston, October 1-3, 1973) presents the view that a programming language is a tool which should assist the programmer in the most difficult aspects of his art, namely program design, documentation, and debugging. It discusses the objective criteria for evaluating a language design, and illustrates them by application to language features of both high level languages and machine code programming. It concludes with an annotated reading list, recommended for all intending language designers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/403/CS-TR-73-403.pdf %R CS-TR-73-379 %Z Mon, 25 Sep 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The heterodyne filter as a tool for analysis of transient waveforms. %A Moorer, James Anderson %D July 1973 %X A method of analysis of transient waveforms is discussed. Its properties and limitations are presented in the context of musical tones. The method is shown to be useful when the risetimes of the partials of the tone are not too short. An extension to inharmonic partials and polyphonic musical sound is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/73/379/CS-TR-73-379.pdf %R CS-TR-72-252 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Large-scale linear programming using the Cholesky factorization. %A Saunders, Michael A. %D January 1972 %X A variation of the revised simplex method is proposed for solving the standard linear programming problem. The method is derived from an algorithm recently proposed by Gill and Murray, and is based upon the orthogonal factorization B = LQ or, equivalently, upon the Cholesky factorization ${BB}^T = {LL}^T$ where B is the usual square basis, L is lower triangular and Q is orthogonal. We wish to retain the favorable numerical properties of the orthogonal factorization, while extending the work of Gill and Murray to the case of linear programs which are both large and sparse. The principal property exploited is that the Cholesky factor L depends only on $\underline{which}$ variables are in the basis, and not upon the $\underline{order}$ in which they happen to enter. A preliminary ordering of the rows of the full data matrix therefore promises to ensure that L will remain sparse throughout the iterations of the simplex method. An initial (in-core) version of the algorithm has been implemented in Algol W on the IBM 360/91 and tested on several medium-scale problems from industry (up to 930 constraints). While performance has not been especially good on problems of high density, the method does appear to be efficient on problems which are very sparse, and on structured problems which have either generalized upper bounding, block-angular, or staircase form.
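The property CS-TR-72-252 above exploits, that the Cholesky factor L depends only on which variables are in the basis and not on the order in which they entered, follows from the fact that permuting the columns of B leaves ${BB}^T$ unchanged, and it is easy to confirm numerically. A small illustrative check (not from the report), in Python with NumPy:

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))   # a dense toy stand-in for a simplex basis
Bp = B[:, rng.permutation(6)]     # same basis columns, entered in another order

# Column permutation is right-multiplication by a permutation matrix P,
# and P P^T = I, so Bp Bp^T = B B^T and the Cholesky factor is unchanged.
L1 = np.linalg.cholesky(B @ B.T)
L2 = np.linalg.cholesky(Bp @ Bp.T)
print(np.allclose(L1, L2))        # True

This order-independence is what lets a single preliminary ordering of the rows keep L sparse throughout the simplex iterations.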
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/252/CS-TR-72-252.pdf %R CS-TR-72-253 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Total complexity and the inference of best programs. %A Feldman, Jerome A. %A Shields, Paul C. %D April 1972 %X Axioms for a total complexity measure for abstract programs are presented. Essentially, they require that total complexity be an unbounded increasing function of the Blum time and size measures. Algorithms for finding the best program on a finite domain are presented, and their limiting behaviour for infinite domains described. For total complexity, there are important senses in which a machine $\underline{can}$ find the best program for a large class of functions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/253/CS-TR-72-253.pdf %R CS-TR-72-254 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Von Neumann's comparison method for random sampling from the normal and other distributions. %A Forsythe, George E. %D January 1972 %X The author presents a generalization he worked out in 1950 of von Neumann's method of generating random samples from the exponential distribution by comparisons of uniform random numbers on (0,1). It is shown how to generate samples from any distribution whose probability density function is piecewise both absolutely continuous and monotonic on ($-\infty$,$\infty$). A special case delivers normal deviates at an average cost of only 4.036 uniform deviates each. This seems more efficient than the Center-Tail method of Dieter and Ahrens, which uses a related, but different, method of generalizing the von Neumann idea to the normal distribution. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/254/CS-TR-72-254.pdf %R CS-TR-72-255 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic programming. %A Feldman, Jerome A. %D February 1972 %X The revival of interest in Automatic Programming is considered. The research is divided into direct efforts and theoretical developments and the successes and prospects of each are described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/255/CS-TR-72-255.pdf %R CS-TR-72-256 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Edmonds polyhedra and weakly hamiltonian graphs. %A Chvatal, Vaclav %D January 1972 %X Jack Edmonds developed a new way of looking at extremal combinatorial problems and applied his technique with great success to the problems of the maximal-weight degree-constrained subgraphs. Professor C. St. J. A. Nash-Williams suggested using Edmonds' approach in the context of hamiltonian graphs. In the present paper, we determine a new set of inequalities (the "comb inequalities") which are satisfied by the characteristic functions of hamiltonian circuits but are not explicit in the straightforward integer programming formulation. A direct application of the linear programming duality theorem then leads to a new necessary condition for the existence of hamiltonian circuits; this condition appears to be stronger than the previously known ones. Relating linear programming to hamiltonian circuits, the present paper can also be seen as a continuation of the work of Dantzig, Fulkerson and Johnson on the travelling salesman problem.
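The comparison method that CS-TR-72-254 above generalizes is compact enough to state in code. A sketch in Python of von Neumann's original exponential sampler (the report's generalization to other densities and its normal-deviate special case are not reproduced here): uniforms are drawn while they keep decreasing; an odd-length run accepts the first deviate, an even-length run rejects it and shifts the result by one.

import random

def exponential_von_neumann(rng=random):
    # Sample from Exp(1) using only uniform deviates and comparisons.
    # A decreasing run starting at u has odd length with probability
    # exp(-u), so accepting on odd runs gives density e^-u on (0,1);
    # each rejection adds 1, extending the density to (0, infinity).
    shift = 0
    while True:
        first = u = rng.random()
        run_length = 1
        while True:
            v = rng.random()
            if v > u:              # the decreasing run has ended
                break
            u = v
            run_length += 1
        if run_length % 2 == 1:    # odd run: accept
            return shift + first
        shift += 1                 # even run: reject, try the next interval

The abstract's figure of 4.036 uniform deviates per normal sample gives a sense of the cost of the generalized version.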
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/256/CS-TR-72-256.pdf %R CS-TR-72-257 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On "PASCAL," code generation, and the CDC 6000 computer. %A Wirth, Niklaus %D February 1972 %X "PASCAL" is a general purpose programming language with characteristics similar to ALGOL 60, but with an enriched set of program- and data structuring facilities. It has been implemented on the CDC 6000 computer. This paper discusses selected topics of code generation, in particular the selection of instruction sequences to represent simple operations on arithmetic, Boolean, and powerset operands. Methods to implement recursive procedures are briefly described, and it is hinted that the more sophisticated solutions are not necessarily also the best. The CDC 6000 architecture appears as a frequent source of pitfalls and nuisances, and its main trouble spots are scrutinized and discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/257/CS-TR-72-257.pdf %R CS-TR-72-258 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some basic machine algorithms for integral order computations. %A Brown, Harold %D February 1972 %X Three machine implemented algorithms for computing with integral orders are described. The algorithms are: 1. For an integral order R given in terms of its left regular representation relative to any basis, compute the nil radical J(R) and a left regular representation of R/J(R). 2. For a semisimple order R given in terms of its left regular representation relative to any basis, compute a new basis for R and the associated left regular representation of R such that the first basis element of the transformed basis is an integral multiple of the identity element in $Q \otimes R$. 3. Relative to any fixed Z-basis for R, compute a unique canonical form for any given finitely generated Z-submodule of $Q \otimes R$ described in terms of that basis. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/258/CS-TR-72-258.pdf %R CS-TR-72-261 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The differentiation of pseudoinverses and nonlinear least squares problems whose variables separate. %A Golub, Gene H. %A Pereyra, Victor %D February 1972 %X For given data $(t_i, y_i)$, $i = 1, \ldots, m$, we consider the least squares fit of nonlinear models of the form $F(a, \alpha; t) = \sum_{j=1}^{n} g_j(a) \varphi_j(\alpha; t)$, $a \in R^s$, $\alpha \in R^k$. For this purpose we study the minimization of the nonlinear functional $r(a, \alpha) = \sum_{i=1}^{m} (y_i - F(a, \alpha; t_i))^2$. It is shown that by defining the matrix $\{\Phi(\alpha)\}_{i,j} = \varphi_j(\alpha; t_i)$, and the modified functional $r_2(\alpha) = \| y - \Phi(\alpha) \Phi^+(\alpha) y \|_2^2$, it is possible to optimize first with respect to the parameters $\alpha$, and then to obtain, a posteriori, the optimal parameters $\hat{a}$.
The matrix $\Phi^+(\alpha)$ is the Moore-Penrose generalized inverse of $\Phi(\alpha)$, and we develop formulas for its Frechet derivative under the hypothesis that $\Phi(\alpha)$ is of constant (though not necessarily full) rank. From these formulas we readily obtain the derivatives of the orthogonal projectors associated with $\Phi(\alpha)$, and also that of the functional $r_2(\alpha)$. Detailed algorithms are presented which make extensive use of well-known reliable linear least squares techniques, and numerical results and comparisons are given. These results are generalizations of those of H. D. Scolnik [1971]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/261/CS-TR-72-261.pdf %R CS-TR-72-263 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A procedure for improving the upper bound for the number of n-ominoes. %A Klarner, David A. %A Rivest, Ronald L. %D February 1972 %X An n-omino is a plane figure composed of n unit squares joined together along their edges. Every n-omino is generated by joining the edge of a unit square to the edge of a unit square in some (n-1)-omino so that the new square does not overlap any squares. Let t(n) denote the number of n-ominoes; then it is known that the sequence $({t(n)}^{1/n} : n = 1,2,\ldots)$ increases to a limit $\Theta$, and 3.72 < $\Theta$ < 6.75. A procedure exists for computing an increasing sequence of numbers bounded above by $\Theta$. (Chandra recently showed that the limit of this sequence is $\Theta$.) In the present work we give a procedure for computing a sequence of numbers bounded below by $\Theta$. Whether or not the limit of this sequence is $\Theta$ remains an open question. By computing the first ten terms of our sequence, we have shown that $\Theta$ < 4.65. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/263/CS-TR-72-263.pdf %R CS-TR-72-264 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An artificial intelligence approach to machine translation. %A Wilks, Yorick A. %D February 1972 %X The paper describes a system of semantic analysis and generation, programmed in LISP 1.5 and designed to pass from paragraph length input in English to French via an interlingual representation. A wide class of English input forms will be covered, but the vocabulary will initially be restricted to one of a few hundred words. With this subset working, and during the current year (71-72), it is also hoped to map the interlingual representation onto some predicate calculus notation so as to make possible the answering of very simple questions about the translated matter. The specification of the translation system itself is complete, and its main points of interest that distinguish it from other systems are: i) It translates phrase by phrase -- with facilities for reordering phrases and establishing essential semantic connectivities between them -- by mapping complex semantic structures of "message" onto each phrase. These constitute the interlingual representation to be translated. This matching is done without the explicit use of a conventional syntax analysis, by taking as the appropriate matched structure the "most dense" of the alternative structures derived. This method has been found highly successful in earlier versions of this analysis system. ii) The French output strings are generated without the explicit use of a generative grammar.
That is done by means of STEREOTYPES: strings of French words, and functions evaluating to French words, which are attached to English word senses in the dictionary and built into the interlingual representation by the analysis routines. The generation program thus receives an interlingual representation that already contains both French output and implicit procedures for assembling the output, since the stereotypes are in effect recursive procedures specifying the content and production of the output word strings. Thus the generation program at no time consults a word dictionary or inventory of grammar rules. It is claimed that the system of notation and translation described is a convenient one for expressing and handling the items of semantic information that are ESSENTIAL to any effective MT system. I discuss in some detail the semantic information needed to ensure the correct choice of output prepositions in French, a vital matter inadequately treated by virtually all previous formalisms and projects. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/264/CS-TR-72-264.pdf %R CS-TR-72-265 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Primitive concepts underlying verbs of thought. %A Schank, Roger C. %A Goldman, Neil M. %A Rieger, Charles J. %A Riesbeck, Christopher K. %D February 1972 %X In order to create conceptual structures that will uniquely and unambiguously represent the meaning of an utterance, it is necessary to establish 'primitive' underlying actions and states into which verbs can be mapped. This paper presents analyses of the most common mental verbs in terms of such primitive actions and states. In order to represent the way people speak about their mental processes, it was necessary to add to the usual ideas of memory structure the notion of Immediate Memory. It is then argued that there are only three primitive mental ACTs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/265/CS-TR-72-265.pdf %R CS-TR-72-267 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Mathematical Programming Language: an appraisal based on practical experiments. %A Bonzon, Pierre E. %D March 1972 %X The newly proposed Mathematical Programming Language is approached from the user's point of view. To demonstrate its facility of use, three programs are presented which solve large scale linear programming problems with the generalized upper-bounding structure. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/267/CS-TR-72-267.pdf %R CS-TR-72-268 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Degrees and matchings. %A Chvatal, Vaclav %D March 1972 %X Let n, b, d be positive integers. D. Hanson proposed to evaluate f(n,b,d), the largest possible number of edges in a graph with n vertices having no vertex of degree greater than d and no set of more than b independent edges. Using the alternating path method, he found partial results in this direction. We complete Hanson's work; our proof technique has a linear programming flavor and uses Berge's matching formula. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/268/CS-TR-72-268.pdf %R CS-TR-72-269 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Arithmetic properties of certain recursively defined sets. %A Klarner, David A.
%A Rado, Richard %D March 1972 %X Let R denote a set of linear operations defined on the set P of positive integers; for example, a typical element of R has the form $\rho(x_1, \ldots, x_r) = m_0 + m_1 x_1 + \ldots + m_r x_r$ where $m_0, \ldots, m_r$ denote certain integers. Given a set A of positive integers, there is a smallest set of positive integers, denoted <R : A>, which contains A as a subset and is closed under every operation in R. The set <R : A> can be constructed recursively as follows: Let $A_0$ = A, and define $A_{k+1} = A_k \cup \{\rho (\bar{a}): \rho \in R,\bar{a}\in A_k \times \ldots \times A_k\}$ (k = 0,1,$\ldots$); then it can be shown that <R : A> = $A_0 \cup A_1 \cup \ldots$. The sets sometimes have an elegant form; for example, the set <2x + 3y: 1> consists of all positive numbers congruent to 1 or 5 modulo 12. The objective is to give an arithmetic characterization of the elements of a set <R : A>, and this paper is a report on progress made on this problem last year. Many of the questions left open here have since been resolved by one of us (Klarner). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/269/CS-TR-72-269.pdf %R CS-TR-72-270 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Lanczos algorithm for the symmetric Ax = $\lambda$Bx problem. %A Golub, Gene H. %A Underwood, Richard R. %A Wilkinson, James H. %D March 1972 %X The problem of computing the eigensystem of Ax = $\lambda$Bx when A and B are symmetric and B is positive definite is considered. A generalization of the Lanczos algorithm for reducing the problem to a symmetric tridiagonal eigenproblem is given. A numerically stable variant of the algorithm is described. The new algorithm depends heavily upon the computation of elementary Hermitian matrices. An ALGOL W procedure and a numerical example are also given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/270/CS-TR-72-270.pdf %R CS-TR-72-272 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fixpoint approach to the theory of computation. %A Manna, Zohar %A Vuillemin, Jean %D March 1972 %X Following the fixpoint theory of Scott, we propose to define the semantics of computer programs in terms of the least fixpoints of recursive programs. This allows one not only to justify all existing verification techniques, but also to extend them to handle various properties of computer programs, including correctness, termination and equivalence, in a uniform manner. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/272/CS-TR-72-272.pdf %R CS-TR-72-273 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Chromatic automorphisms of graphs. %A Chvatal, Vaclav %A Sichler, Jiri %D March 1972 %X The coloring group and the full automorphism group of an n-chromatic graph are independent if and only if n is an integer $\geq$ 3. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/273/CS-TR-72-273.pdf %R CS-TR-72-274 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Linear combinations of sets of consecutive integers. %A Klarner, David A. %A Rado, Richard %D March 1972 %X Let k-1, $m_1, \ldots, m_k$ denote non-negative integers, and suppose the greatest common divisor of $m_1, \ldots, m_k$ is 1. We show that if $S_1, \ldots, S_k$ are sufficiently long blocks of consecutive integers, then the set $m_1 S_1 + \ldots + m_k S_k$ contains a sizable block of consecutive integers.
For example, if m and n are relatively prime natural numbers, and u, U, v, V are integers with U-u $\geq$ n-1, V-v $\geq$ m-1, then the set m{u,u+1, $\ldots$,U} + n{v,v+1, $\ldots$,V} contains the set {mu + nv - $\sigma$(m,n), $\ldots$, mU + nV - $\sigma$(m,n)} where $\sigma$(m,n) = (m-1)(n-1) is the largest number such that $\sigma$(m,n)-1 cannot be expressed in the form mx + ny with x and y non-negative integers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/274/CS-TR-72-274.pdf %R CS-TR-72-275 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Sets generated by iteration of a linear operation. %A Klarner, David A. %D March 1972 %X This note is a continuation of the paper "Arithmetic properties of certain recursively defined sets," written in collaboration with Richard Rado. Here the sets under consideration are those having the form S = <$m_1 x_1 + \ldots + m_r x_r : 1$> where $m_1,\ldots ,m_r$ are given natural numbers with greatest common divisor 1. The set S is the smallest set of natural numbers which contains 1 and is closed under the operation $m_1 x_1 + \ldots + m_r x_r$. Also, S can be constructed by iterating the operation $m_1 x_1 + \ldots + m_r x_r$ over the set {1}. For example, <2x + 3y: 1> = {1, 5, 13, 17, 25, $\ldots$} = (1 + 12N) $\cup$ (5 + 12N) where N = {0,1,2,$\ldots$}. It is shown in this note that S contains an infinite arithmetic progression for all natural numbers r-1, $m_1,\ldots ,m_r$. Furthermore, if ($m_1,\ldots ,m_r$) = ($m_1\ldots m_r, m_1 + \ldots + m_r$) = 1, then S is a per-set; that is, S is a finite union of infinite arithmetic progressions. In particular, this implies <mx + ny: 1> is a per-set for all pairs {m,n} of relatively prime natural numbers. It is an open question whether S is a per-set when ($m_1,\ldots ,m_r$) = 1, but ($m_1\ldots m_r, m_1 + \ldots + m_r$) > 1. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/275/CS-TR-72-275.pdf %R CS-TR-72-278 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Use of fast direct methods for the efficient numerical solution of nonseparable elliptic equations. %A Concus, Paul %A Golub, Gene H. %D April 1972 %X We study an iterative technique for the numerical solution of strongly elliptic equations of divergence form in two dimensions with Dirichlet boundary conditions on a rectangle. The technique is based on the repeated solution by a fast direct method of a discrete Helmholtz equation on a uniform rectangular mesh. The problem is suitably scaled before iteration, and Chebyshev acceleration is applied to improve convergence. We show that convergence can be exceedingly rapid and independent of mesh size for smooth coefficients. Extensions to other boundary conditions, other equations, and irregular mesh spacings are discussed, and the performance of the technique is illustrated with numerical examples. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/278/CS-TR-72-278.pdf %R CS-TR-72-279 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Topics in optimization. %A Osborne, Michael R. %D April 1972 %X These notes are based on a course of lectures given at Stanford, and cover three major topics relevant to optimization theory. First an introduction is given to those results in mathematical programming which appear to be most important for the development and analysis of practical algorithms. Next unconstrained optimization problems are considered.
The main emphasis is on that subclass of descent methods which (a) requires the evaluation of first derivatives of the objective function, and (b) has a family connection with the conjugate direction methods. Numerical results obtained using a program based on this material are discussed in an Appendix. In the third section, penalty and barrier function methods for mathematical programming problems are studied in some detail, and possible methods for accelerating their convergence indicated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/279/CS-TR-72-279.pdf %R CS-TR-72-281 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computer interactive picture processing. %A Quam, Lynn H. %A Liebes, Sidney, Jr. %A Tucker, Robert B. %A Hannah, Marsha Jo %A Eross, Botond G. %D April 1972 %X This report describes work done in image processing using an interactive computer system. Techniques for image differencing are described and examples using images returned from Mars by the Mariner Nine spacecraft are shown. Also described are techniques for stereo image processing. Stereo processing for both conventional camera systems and the Viking 1975 Lander camera system is reviewed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/281/CS-TR-72-281.pdf %R CS-TR-72-282 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient compilation of linear recursive programs. %A Chandra, Ashok K. %D April 1972 %X We consider the class of linear recursive programs. A linear recursive program is a set of procedures where each procedure can make at most one recursive call. The conventional stack implementation of recursion requires time and space both proportional to n, the depth of recursion. It is shown that in order to implement linear recursion so as to execute in time n one doesn't need space proportional to n: $n^\epsilon$ for arbitrarily small $\epsilon$ will do. It is also known that with constant space one can implement linear recursion in time $n^2$. We show that one can do much better: $n^{1+\epsilon}$ for arbitrarily small $\epsilon$. We also describe an algorithm that lies between these two: it takes time n log(n) and space log(n). It is shown that several problems are closely related to the linear recursion problem, for example, the problem of reversing an input tape given a finite automaton with several one-way heads. By casting all these problems into a canonical form, efficient solutions are obtained simultaneously for all. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/282/CS-TR-72-282.pdf %R CS-TR-72-284 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Edmonds polyhedra and a hierarchy of combinatorial problems. %A Chvatal, Vaclav %D May 1972 %X Let S be a set of linear inequalities that determine a bounded polyhedron P. The closure of S is the smallest set of inequalities that contains S and is closed under two operations: (i) taking linear combinations of inequalities, (ii) replacing an inequality $\sum a_j x_j \leq a_0$, where $a_1, a_2, \ldots, a_n$ are integers, by the inequality $\sum a_j x_j \leq a$ with $a \geq [a_0]$. Obviously, if integers $x_1, x_2, \ldots, x_n$ satisfy all the inequalities in S then they satisfy also all the inequalities in the closure of S. Conversely, let $\sum c_j x_j \leq c_0$ hold for all choices of integers $x_1, x_2, \ldots, x_n$ that satisfy all the inequalities in S.
Then we prove that $\sum c_j x_j \leq c_0$ belongs to the closure of S. To each integer linear programming problem, we assign a nonnegative integer, called its rank. (The rank is the minimum number of iterations of the operation (ii) that are required in order to eliminate the integrality constraint.) We prove that there is no upper bound on the rank of problems arising from the search for largest independent sets in graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/284/CS-TR-72-284.pdf %R CS-TR-72-286 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the solution of Moser's problem in four dimensions, and related issues. A collection of two papers: On the solution of Moser's problem in four dimensions and Independent permutations as related to a problem of Moser and a theorem of Polya. %A Chandra, Ashok K. %D May 1972 %X The problem of finding the largest set of nodes in a d-cube of side 3 such that no three nodes are collinear was proposed by Moser. Small values of d (viz., $d \leq 3$) resulted in elegant symmetric solutions. It is shown that this does not remain the case in 4 dimensions where at most 43 nodes can be chosen, and these must not include the center node. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/286/CS-TR-72-286.pdf %R CS-TR-72-288 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Logic for Computable Functions: description of a machine implementation. %A Milner, Robin %D May 1972 %X This paper is primarily a user's manual for LCF, a proof-checking program for a logic of computable functions proposed by Dana Scott in 1969 but unpublished by him. We use the name LCF also for the logic itself, which is presented at the start of the paper. The proof-checking program is designed to allow the user interactively to generate formal proofs about computable functions and functionals over a variety of domains, including those of interest to the computer scientist - for example, integers, lists and computer programs and their semantics. The user's task is alleviated by two features: a subgoaling facility and a powerful simplification mechanism. Applications include proofs of program correctness and in particular of compiler correctness; these applications are not discussed herein, but are illustrated in the papers referenced in this introduction. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/288/CS-TR-72-288.pdf %R CS-TR-72-289 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lakoff on linguistics and natural logic. %A Wilks, Yorick A. %D June 1972 %X The paper examines and criticises Lakoff's notions of a natural logic and of a generative semantics described in terms of logic. I argue that the relationship of these notions to logic as normally understood is unclear, but I suggest, in the course of the paper, a number of possible interpretations of his thesis of generative semantics. I argue further that on these interpretations the thesis (of Generative Semantics) is false, unless it be taken as a mere notational variant of Chomskyan theory. I argue, too, that Lakoff's work may provide a service in that it constitutes a reductio ad absurdum of the derivational paradigm of modern linguistics; and shows, inadvertently, that only a system with the ability to reconsider its own inferences can do the job that Lakoff sets up for linguistic enquiry -- that is to say, only an "artificial intelligence" system.
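Operation (ii) in CS-TR-72-284 above is what is now called Chvatal-Gomory rounding, and a textbook one-step instance (not taken from the report) shows how it strengthens the independent-set formulation the abstract closes with. For the triangle graph, the edge inequalities $x_1 + x_2 \leq 1$, $x_1 + x_3 \leq 1$, $x_2 + x_3 \leq 1$ admit the fractional point $x_1 = x_2 = x_3 = 1/2$. Combining them with multipliers 1/2 each gives $x_1 + x_2 + x_3 \leq 3/2$, whose left-hand coefficients are integers, so operation (ii) replaces the right-hand side by $[3/2] = 1$. The resulting inequality $x_1 + x_2 + x_3 \leq 1$ has rank one and already cuts off the fractional point; the paper's theorem is that for largest-independent-set problems in general, no fixed number of such rounds suffices.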
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/289/CS-TR-72-289.pdf %R CS-TR-72-290 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Adverbs and belief. %A Schank, Roger C. %D June 1972 %X The treatment of a certain class of adverbs in conceptual representation is given. Certain adverbs are shown to be representative of complex belief structures. These adverbs serve as pointers that explain where the sentence that they modify belongs in a belief structure. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/290/CS-TR-72-290.pdf %R CS-TR-72-291 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some combinatorial lemmas. %A Knuth, Donald E. %D June 1972 %X This report consists of several short papers which are completely independent of each other: 1. "Wheels Within Wheels." Every finite strongly connected digraph is either a single point or a set of n smaller strongly connected digraphs joined by an oriented cycle of length n. This result is proved in somewhat stronger form, and two applications are given. 2. "An Experiment in Optimal Sorting." An unsuccessful attempt to sort 13 or 14 elements in fewer comparisons than the Ford-Johnson algorithm is described. (Coauthor: E.B. Kaehler.) 3. "Permutations With Nonnegative Partial Sums." A sequence of s positive and t negative real numbers, whose sum is zero, can be arranged in at least (s+t-1)! and at most (s+t)!/(max(s,t)+1) < 2(s+t-1)! ways such that the partial sums $x_1 + \ldots + x_j$ are nonnegative for $1 \leq j \leq s+t$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/291/CS-TR-72-291.pdf %R CS-TR-72-292 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Selected combinatorial research problems. %A Chvatal, Vaclav %A Klarner, David A. %A Knuth, Donald E. %D June 1972 %X Thirty-seven research problems are described, covering a wide range of combinatorial topics. Unlike Hilbert's problems, most of these are not especially famous and they might be "do-able" in the next few years. (Problems 1-16 were contributed by Klarner, 17-26 by Chvatal, 27-37 by Knuth. All cash awards are Chvatal's responsibility.) %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/292/CS-TR-72-292.pdf %R CS-TR-72-299 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Semantic categories of nominals for conceptual dependency analysis of natural language. %A Russell, Sylvia Weber %D July 1972 %X A system for the semantic categorization of conceptual objects (nominals) is provided. The system is intended to aid computer understanding of natural language. Specific implementations for "noun-pairs" and prepositional phrases are offered. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/299/CS-TR-72-299.pdf %R CS-TR-72-300 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Counterexample to a conjecture of Fujii, Kasami and Ninomiya. %A Kaufman, Marc T. %D June 1972 %X In a recent paper [1], Fujii, Kasami and Ninomiya presented a procedure for the optimal scheduling of a system of unit length tasks represented as a directed acyclic graph on two identical processors. The authors conjecture that the algorithm can be extended to the case where more than two processors are employed. This note presents a counterexample to that conjecture. [1] Fujii, M., T. Kasami and K. Ninomiya, "Optimal Sequencing of Two Equivalent Processors," SIAM J. Appl. Math., Vol.
17, No. 4, July 1969, pp. 784-789. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/300/CS-TR-72-300.pdf %R CS-TR-72-301 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Product form of the Cholesky factorization for large-scale linear programming. %A Saunders, Michael A. %D August 1972 %X A variation of Gill and Murray's version of the revised simplex algorithm is proposed, using the Cholesky factorization ${BB}^T = {LDL}^T$ where B is the usual basis, D is diagonal and L is unit lower triangular. It is shown that during change of basis L may be updated in product form. As with standard methods using the product form of inverse, this allows use of sequential storage devices for accumulating updates to L. In addition, the favorable numerical properties of Gill and Murray's algorithm are retained. Close attention is given to efficient out-of-core implementation. In the case of large-scale block-angular problems, the updates to L will remain very sparse for all iterations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/301/CS-TR-72-301.pdf %R CS-TR-72-304 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Richardson's non-stationary matrix iterative procedure. %A Anderssen, Robert S. %A Golub, Gene H. %D August 1972 %X Because of its simplicity, Richardson's non-stationary iterative scheme is a potentially powerful method for the solution of (linear) operator equations. However, its general application has more or less been blocked by (a) the problem of constructing polynomials, which deviate least from zero on the spectrum of the given operator, and which are required for the determination of the iteration parameters of the non-stationary method, and (b) the instability of this scheme with respect to rounding error effects. Recently, these difficulties were examined in two Russian papers. In the first, Lebedev [1969] constructed polynomials which deviate least from zero on a set of subintervals of the real axis which contains the spectrum of the given operator. In the second, Lebedev and Finogenov [1971] gave an ordering for the iteration parameters of the non-stationary Richardson scheme which makes it a stable numerical process. Translations of these two papers appear as Appendices 1 and 2, respectively, in this report. The body of the report represents an examination of the properties of Richardson's non-stationary scheme and the pertinence of the two mentioned papers along with the results of numerical experimentation testing the actual implementation of the procedures given in them. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/304/CS-TR-72-304.pdf %R CS-TR-72-306 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A bibliography on computer graphics. %A Pollack, Bary W. %D August 1972 %X This bibliography includes the most important works describing the software aspects of generative computer graphics. As such it will be of most usefulness to researchers, system designers and programmers whose interests and responsibilities include the development of software systems for interactive graphical input/output. The bibliography does include a short section on hardware systems. Image analysis, pattern recognition and picture processing and related fields are rather poorly represented here.
The interested researcher is referred to journals in this field and to the reports of Azriel Rosenfeld, University of Maryland, which include excellent bibliographic references. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/306/CS-TR-72-306.pdf %R CS-TR-72-307 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Hadamard transform for speech wave analysis. %A Tanaka, Hozumi %D August 1972 %X Two methods of speech wave analysis using the Hadamard transform are discussed. The first method is a direct application of the Hadamard transform for speech waves. The reason this method yields poor results is discussed. The second method is the application of the Hadamard transform to a log-magnitude frequency spectrum. After the application of the Fourier transform the Hadamard transform is applied to detect a pitch period or to get a smoothed spectrum. This method shows some positive aspects of the Hadamard transform for the analysis of a speech wave with regard to the reduction of processing time required for smoothing, but at the cost of precision. A formant tracking program for voiced speech is implemented by using this method and an edge-following technique used in scene analysis. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/307/CS-TR-72-307.pdf %R CS-TR-72-308 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Recent developments in SAIL, an ALGOL-based language for artificial intelligence. %A Feldman, Jerome A. %A Low, James R. %A Swinehart, Daniel C. %A Taylor, Russell H. %D November 1972 %X New features added to SAIL, an ALGOL-based language for the PDP-10, are discussed. The features include: procedure variables; multiple processes; coroutines; a limited form of backtracking; an event mechanism for inter-process communication; and matching procedures, a new way of searching the LEAP associative data base. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/308/CS-TR-72-308.pdf %R CS-TR-72-310 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Anomalies in scheduling unit-time tasks. %A Kaufman, Marc T. %D June 1972 %X In this paper we examine the problem of scheduling a set of tasks on a system with a number of identical processors. Several timing anomalies are known to exist for the general case, in which the execution time can increase when inter-task constraints are removed or processors are added. It is shown that these anomalies also exist when tasks are restricted to be of equal (unit) length. Several increasingly restrictive heuristic scheduling algorithms are reviewed. The "added processor" anomaly is shown to persist through all of them, though in successively weaker form. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/310/CS-TR-72-310.pdf %R CS-TR-72-317 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An analysis of drum storage units. %A Fuller, Samuel H. %A Baskett, Forest %D August 1972 %X This article discusses the modeling and analysis of drum-like storage units. Two common forms of drum organizations and two common scheduling disciplines are considered: the file drum and the paging drum; first-in-first-out (FIFO) scheduling and shortest-latency-time-first (SLTF) scheduling. The modeling of the I/O requests to the drum is an important aspect of this analysis.
Measurements are presented to indicate that it is realistic to model requests for records, or blocks of information to a file drum, as requests that have starting addresses uniformly distributed around the circumference of the drum and transfer times that are exponentially distributed with a mean of 1/2 to 1/3 of a drum revolution. The arrival of I/O requests is first assumed to be a Poisson process and then generalized to the case of a computer system with a finite degree of multiprogramming. An exact analysis of all the models except the SLTF file drum is presented; in this case the complexity of the drum organization has forced us to accept an approximate analysis. In order to examine the error introduced into the analysis of the SLTF file drum by our approximations, the results of the analytic models are compared to a simulation model of the SLTF file drum. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/317/CS-TR-72-317.pdf %R CS-TR-72-318 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Constructive graph labeling using double cosets. %A Brown, Harold %A Masinter, Larry M. %A Hjelmeland, Larry %D October 1972 %X Two efficient computer-implemented algorithms are presented for explicitly constructing all distinct labelings of a graph G with a set of (not necessarily distinct) labels L, given the symmetry group B of G. Two recursive reductions of the problem and a precomputation involving certain orbits of stabilizer subgroups are the techniques used by the algorithms. Moreover, for each labeling, the subgroup of B which preserves that labeling is calculated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/318/CS-TR-72-318.pdf %R CS-TR-72-319 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On a characterization of the best $\ell_2$ scaling of a matrix. %A Golub, Gene H. %A Varah, James M. %D October 1972 %X This paper is concerned with best two-sided scaling of a general square matrix, and in particular with a certain characterization of that best scaling: namely that the first and last singular vectors (on left and right) of the scaled matrix have components of equal modulus. Necessity, sufficiency, and its relation with other characterizations are discussed. Then the problem of best scaling for rectangular matrices is introduced and a conjecture made regarding a possible best scaling. The conjecture is verified for some special cases. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/319/CS-TR-72-319.pdf %R CS-TR-72-320 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Winged edge polyhedron representation. %A Baumgart, Bruce G. %D October 1972 %X A winged edge polyhedron representation is stated and a set of primitives that preserve Euler's $F - E + V = 2$ equation is explained. Present use of this representation in artificial intelligence for computer graphics and world modeling is illustrated and its intended future application to computer vision is described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/320/CS-TR-72-320.pdf %R CS-TR-72-322 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Methods for modifying matrix factorizations. %A Gill, Philip E. %A Golub, Gene H. %A Murray, Walter A. %A Saunders, Michael A. %D November 1972 %X In recent years several algorithms have appeared for modifying the factors of a matrix following a rank-one change.
These methods have always been given in the context of specific applications and this has probably inhibited their use over a wider field. In this report several methods are described for modifying Cholesky factors. Some of these have been published previously while others appear for the first time. In addition, a new algorithm is presented for modifying the complete orthogonal factorization of a general matrix, from which the conventional QR factors are obtained as a special case. A uniform notation has been used and emphasis has been placed on illustrating the similarity between different methods. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/322/CS-TR-72-322.pdf %R CS-TR-72-323 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A fast method for solving a class of tri-diagonal linear systems. %A Malcolm, Michael A. %A Palmer, John %D November 1972 %X The solution of linear systems having real, symmetric, diagonally dominant, tridiagonal coefficient matrices with constant diagonals is considered. It is proved that the diagonals of the LU decomposition of the coefficient matrix rapidly converge to full floating-point precision. It is also proved that the computed LU decomposition converges when floating-point arithmetic is used and that the limits of the LU diagonals using floating point are roughly within machine precision of the limits using real arithmetic. This fact is exploited to reduce the number of floating-point operations required to solve a linear system from 8n-7 to 5n+2k-3, where k is much less than n, the order of the matrix. If the elements of the sub- and superdiagonals are 1, then only 4n+2k-3 operations are needed. The entire LU decomposition takes k words of storage, and considerable savings in array subscripting are achieved. Upper and lower bounds on k are obtained in terms of the ratio of the coefficient matrix diagonal constants and parameters of the floating-point number system. Various generalizations of these results are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/323/CS-TR-72-323.pdf %R CS-TR-72-325 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Review of Hubert Dreyfus' What Computers Can't Do: a Critique of Artificial Reason (Harper & Row, New York, 1972). %A Buchanan, Bruce G. %D November 1972 %X The recent book $\underline{What Computers Can't Do}$ by Hubert Dreyfus is an attack on artificial intelligence research. This review takes the position that the philosophical content of the book is interesting, but that the attack on artificial intelligence is not well reasoned. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/325/CS-TR-72-325.pdf %R CS-TR-72-326 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Can expert judges, using transcripts of teletyped psychiatric interviews, distinguish human paranoid patients from a computer simulation of paranoid processes? %A Colby, Kenneth Mark %A Hilf, Franklin Dennis %D December 1972 %X Expert judges (psychiatrists and computer scientists) could not correctly distinguish a simulation model of paranoid processes from actual paranoid patients. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/326/CS-TR-72-326.pdf %R CS-TR-72-328 %Z Mon, 16 Oct 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An efficient implementation of Edmonds' maximum matching algorithm. %A Gabow, Harold N. 
%D June 1972 %X A matching in a graph is a collection of edges, no two of which share a vertex. A maximum matching contains the greatest number of edges possible. This paper presents an efficient implementation of Edmonds' algorithm for finding maximum matchings. The computation time is proportional to $V^3$, where V is the number of vertices; previous algorithms have computation time proportional to $V^4$. The implementation avoids Edmonds' blossom reduction by using pointers to encode the structure of alternating paths. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/72/328/CS-TR-72-328.pdf %R CS-TR-71-188 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The translation of 'go to' programs to 'while' programs %A Ashcroft, Edward A. %A Manna, Zohar %D January 1971 %X In this paper we show that every flowchart program can be written without $\underline{go to}$ statements by using $\underline{while}$ statements. The main idea is to introduce new variables to preserve the values of certain variables at particular points in the program; or alternatively, to introduce special Boolean variables to keep information about the course of the computation. The 'while' programs produced yield the same final results as the original flowchart program but need not perform computations in exactly the same way. However, the new programs do preserve the 'topology' of the original flowchart program, and are of the same order of efficiency. We also show that this cannot be done in general without adding variables. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/188/CS-TR-71-188.pdf %R CS-TR-71-189 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Mathematical theory of partial correctness %A Manna, Zohar %D January 1971 %X In this work we show that it is possible to express most properties regularly observed in algorithms in terms of 'partial correctness' (i.e., the property that the final results of the algorithm, if any, satisfy some given input-output relation). This result is of special interest since 'partial correctness' has already been formulated in predicate calculus and in partial function logic for many classes of algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/189/CS-TR-71-189.pdf %R CS-TR-71-190 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An n log n algorithm for minimizing states in a finite automaton %A Hopcroft, John E. %D January 1971 %X An algorithm is given for minimizing the number of states in a finite automaton or for determining if two finite automata are equivalent. The asymptotic running time of the algorithm is bounded by k n log n where k is some constant and n is the number of states. The constant k depends linearly on the size of the input alphabet. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/190/CS-TR-71-190.pdf %R CS-TR-71-191 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An introduction to the direct emulation of control structures by a parallel micro-computer %A Lesser, Victor R. %D January 1971 %X This paper is an investigation of the organization of a parallel micro-computer designed to emulate a wide variety of sequential and parallel computers. This micro-computer allows tailoring of its control structure so that it is appropriate for the particular computer to be emulated.
The control structure of this micro-computer is dynamically modified by changing the organization of its data structure for control. The micro-computer contains six primitive operators which dynamically manipulate and generate a tree-type data structure for control. This data structure for control is used as a syntactic framework within which particular implementations of control concepts, such as iteration, recursion, co-routines, parallelism, interrupts, etc., can be easily expressed. The major features of the control data structure and the primitive operators are: (1) once the fixed control and data linkages among processes have been defined, they need not be rebuilt on subsequent executions of the control structure; (2) micro-programs may be written so that they execute independently of the number of physical processors present and still take advantage of available processors; (3) control structures for I/O processes, data-accessing processes, and computational processes are expressed in a single uniform framework. An emulator programmed on this micro-computer works as an iterative two-step process similar to the process of dynamic compilation or run-time macro-expansion. This dynamic compilation approach to emulation differs considerably from the conventional approach to emulation, and provides a unifying approach to the emulation of a wide variety of sequential and parallel computers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/191/CS-TR-71-191.pdf %R CS-TR-71-192 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An n log n algorithm for isomorphism of planar triply connected graphs %A Hopcroft, John E. %D January 1971 %X It is shown that the isomorphism problem for triply connected planar graphs can be reduced to the problem of minimizing states in a finite automaton. By making use of an n log n algorithm for minimizing the number of states in a finite automaton, an algorithm for determining whether two planar triply connected graphs are isomorphic is developed. The asymptotic running time of the algorithm grows as n log n, where n is the number of vertices in the graph. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/192/CS-TR-71-192.pdf %R CS-TR-71-193 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Intention, memory, and computer understanding %A Schank, Roger C. %D January 1971 %X Procedures are described for discovering the intention of a speaker by relating the Conceptual Dependency representation of the speaker's utterance to the computer's world model such that simple implications can be made. These procedures function at levels higher than that of the sentence by allowing for predictions based on context and the structure of the memory. Computer understanding of natural language is shown to consist of the following parts: assigning a conceptual representation to an input; relating that representation to the memory so as to extract the intention of the speaker; and selecting the correct response type triggered by such an utterance according to the situation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/193/CS-TR-71-193.pdf %R CS-TR-71-195 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The direct solution of the discrete Poisson equation on irregular regions %A Buzbee, B. L. %A Dorr, Fred W. %A George, John Alan %A Golub, Gene H.
%D December 1970 %X There are several very fast direct methods which can be used to solve the discrete Poisson equation on rectangular domains. We show that these methods can also be used to treat problems on irregular regions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/195/CS-TR-71-195.pdf %R CS-TR-71-197 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MIX/360 user's guide %A Knuth, Donald E. %A Sites, Richard L. %D March 1971 %X MIX/360 is an assembler and simulator for the hypothetical MIX machine, which is described for example in Knuth's $\underline{The Art of Computer Programming}$, Section 1.3.1. The system contains several debugging aids to help program construction and verification. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/197/CS-TR-71-197.pdf %R CS-TR-71-201 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Planarity testing in V log V steps: extended abstract %A Hopcroft, John E. %A Tarjan, Robert Endre %D February 1971 %X An efficient algorithm is presented for determining whether or not a given graph is planar. If V is the number of vertices in the graph, the algorithm requires time proportional to V log V and space proportional to V when run on a random-access computer. The algorithm constructs the facial boundaries of a planar representation without backup, using extensive list-processing features to speed computation. The theoretical time bound improves on that of previously published algorithms. Experimental evidence indicates that graphs with a few thousand edges can be tested within seconds. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/201/CS-TR-71-201.pdf %R CS-TR-71-202 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Communicating semaphores %A Saal, Harry J. %A Riddle, William E. %D February 1971 %X This paper describes two extensions to the semaphore operators originally introduced by Dijkstra. These extensions can be used to reduce: 1) the number of semaphore references; 2) the time spent in critical sections; and 3) the number of distinct semaphores required for proper synchronization without greatly increasing the time required for semaphore operations. Communicating semaphores may be utilized not only for synchronization but also for message switching, resource allocation from pools and as general queueing mechanisms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/202/CS-TR-71-202.pdf %R CS-TR-71-203 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Heuristic DENDRAL program for explaining empirical data %A Buchanan, Bruce G. %A Lederberg, Joshua %D February 1971 %X The Heuristic DENDRAL program uses an information processing model of scientific reasoning to explain experimental data in organic chemistry. This report summarizes the organization and results of the program for computer scientists. The program is divided into three main parts: planning, structure generation, and evaluation. The planning phase infers constraints on the search space from the empirical data input to the system. The structure generation phase searches a tree whose termini are models of chemical molecules using pruning heuristics of various kinds. The evaluation phase tests the candidate structures against the original data. Results of the program's analyses of some test data are discussed. 
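The plan-generate-test organization described in the Heuristic DENDRAL abstract above can be sketched in a few lines. The following Python fragment is a toy illustration of that three-phase control flow only, not DENDRAL itself; the candidate "structures" (orderings of atoms), the single constraint inferred by the planner, and the scoring function are all invented stand-ins.

    from itertools import permutations

    # Toy plan / generate / test cycle in the style described above.
    def plan(data):
        # Planning: infer a constraint on the search space from the data.
        return {"first_atom": data["heaviest_atom"]}

    def generate(atoms, constraints):
        # Generation: enumerate candidate structures, pruning by the plan.
        for cand in permutations(atoms):
            if cand[0] == constraints["first_atom"]:
                yield cand

    def evaluate(cand, data):
        # Evaluation: test a candidate against the original data (toy score).
        return sum(1 for atom in cand if atom in data["observed_fragments"])

    data = {"heaviest_atom": "C", "observed_fragments": {"C", "O"}}
    print(max(generate(["C", "H", "O"], plan(data)),
              key=lambda c: evaluate(c, data)))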
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/203/CS-TR-71-203.pdf %R CS-TR-71-204 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T FETE: a Fortran execution time estimator %A Ingalls, Daniel H. H. %D February 1971 %X If you want to live cheaply, you must make a list of how much money is spent on each thing every day. This enumeration will quickly reveal the principal areas of waste. The same method works for saving computer time. Originally, one had to put his own timers and counters into a program to determine the distribution of time spent in each part. Recently, several automated systems have appeared which either insert counters automatically or interrupt the program during its execution to produce the tallies. FETE is a system of the former type which has two outstanding characteristics: it is very easy to implement and it is very easy to use. By demonstrating such convenience, it should establish execution timing as a standard tool in program development. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/204/CS-TR-71-204.pdf %R CS-TR-71-205 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An algebraic definition of simulation between programs %A Milner, Robin %D February 1971 %X A simulation relation between programs is defined which is a quasi-ordering. Mutual simulation is then an equivalence relation, and by dividing out by it we abstract from a program such details as how the sequencing is controlled and how data is represented. The equivalence classes are approximations to the algorithms which are realized, or expressed, by their member programs. A technique is given and illustrated for proving simulation and equivalence of programs; there is an analogy with Floyd's technique for proving correctness of programs. Finally, necessary and sufficient conditions for simulation are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/205/CS-TR-71-205.pdf %R CS-TR-71-207 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient algorithms for graph manipulation %A Hopcroft, John E. %A Tarjan, Robert Endre %D March 1971 %X Efficient algorithms are presented for partitioning a graph into connected components, biconnected components and simple paths. The algorithm for partitioning a graph into simple paths is iterative and each iteration produces a new path between two vertices already on paths. (The start vertex can be specified dynamically.) If V is the number of vertices and E is the number of edges, each algorithm requires time and space proportional to max(V,E) when executed on a random-access computer. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/207/CS-TR-71-207.pdf %R CS-TR-71-209 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Project technical report %A McCarthy, John %A Samuel, Arthur L. %A Feigenbaum, Edward A. %A Lederberg, Joshua %D March 1971 %X An overview is presented of current research at Stanford in artificial intelligence and heuristic programming. This report is largely the text of a proposal to the Advanced Research Projects Agency for fiscal years 1972-73. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/209/CS-TR-71-209.pdf %R CS-TR-71-210 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ACCESS: a program for the catalog and access of information %A Purdy, J.
Gerry %D March 1971 %X ACCESS is a program for the catalog and access of information. The program is primarily designed for and intended to handle a personal library, although larger applications are possible. ACCESS produces a listing of all entries by locator code (so one knows where to find the entry in his library), a listing of entry titles by user-specified category codes, and a keyword-in-context (KWIC) listing (each keyword specified by the user). ACCESS is presently programmed in FORTRAN and operates on any IBM System/360 under OS (it uses the IBM SORT/MERGE package). It is anticipated that a machine-language version (soon to be implemented) will greatly decrease the running time of the program. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/210/CS-TR-71-210.pdf %R CS-TR-71-211 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Algorithms to reveal properties of floating-point arithmetic %A Malcolm, Michael A. %D March 1971 %X Two algorithms are presented in the form of Fortran subroutines. Each subroutine computes the radix and number of digits of the floating-point numbers and whether rounding or chopping is done by the machine on which it is run. The methods are shown to work on any "reasonable" floating-point computer. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/211/CS-TR-71-211.pdf %R CS-TR-71-212 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Time and memory requirements for solving linear systems %A Morgana, Maria Aurora %D March 1971 %X The Computer Science Department program library contains a number of ALGOL W procedures and FORTRAN subroutines which can be used to solve systems of linear equations. This report describes the results of tests to determine the amount of time and memory required to solve systems of various orders. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/212/CS-TR-71-212.pdf %R CS-TR-71-213 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The switchyard problem: sorting using networks of queues and stacks %A Tarjan, Robert Endre %D April 1971 %X The problem of sorting a sequence of numbers using a network of queues and stacks is presented. A characterization of sequences sortable using parallel queues is given, along with partial characterizations of sequences sortable using parallel stacks and networks of queues. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/213/CS-TR-71-213.pdf %R CS-TR-71-215 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T PL360 (revised): a programming language for the IBM 360 %A Malcolm, Michael A. %D May 1972 %X In 1968, N. Wirth (Jan. JACM) published a formal description of PL360, a programming language designed specifically for the IBM 360. PL360 has an appearance similar to that of Algol, but it provides the facilities of a symbolic machine language. Since 1968, numerous extensions and modifications have been made to the PL360 compiler which was originally designed and implemented by N. Wirth and J. Wells. Interface and input-output subroutines have been written which allow the use of PL360 under OS, DOS, MTS and Orvyl. A formal description of PL360 as it is presently implemented is given. The description of the language is followed by sections on the use of PL360 under various operating systems, namely OS, DOS and MTS.
Instructions on how to use the PL360 compiler and PL360 programs in an interactive mode under the Orvyl time-sharing monitor are also included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/215/CS-TR-71-215.pdf %R CS-TR-71-217 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Decidable properties of monadic functional schemas %A Ashcroft, Edward A. %A Manna, Zohar %A Pnueli, Amir %D July 1971 %X We define a class of (monadic) functional schemas which properly includes 'Ianov' flowchart schemas. We show that the termination, divergence and freedom problems for functional schemas are decidable. Although it is possible to translate a large class of non-free functional schemas into equivalent free functional schemas, we show that this cannot be done in general. We show also that the equivalence problem for free functional schemas is decidable. Most of the results are obtained from well-known results in Formal Languages and Automata Theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/217/CS-TR-71-217.pdf %R CS-TR-71-221 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A heuristic programming study of theory formation in science %A Buchanan, Bruce G. %A Feigenbaum, Edward A. %A Lederberg, Joshua %D July 1971 %X The Meta-DENDRAL program is a vehicle for studying problems of theory formation in science. The general strategy of Meta-DENDRAL is to reason from data to plausible generalizations and then to organize the generalizations into a unified theory. Three main subproblems are discussed: (1) explain the experimental data for each individual chemical structure, (2) generalize the results from each structure to all structures, and (3) organize the generalizations into a unified theory. The program is built upon the concepts and programmed routines already available in the Heuristic DENDRAL performance program, but goes beyond the performance program in attempting to formulate the theory which the performance program will use. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/221/CS-TR-71-221.pdf %R CS-TR-71-224 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Parallel programming %A Ershov, Andrei P. %D July 1971 %X This report is based on lectures given at Stanford University by Dr. Ershov in November, 1970. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/224/CS-TR-71-224.pdf %R CS-TR-71-225 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical methods for computing angles between linear subspaces %A Bjoerck, Ake %A Golub, Gene H. %D July 1971 %X Assume that two subspaces F and G of unitary space are defined as the ranges (or nullspaces) of given rectangular matrices A and B. Accurate numerical methods are developed for computing the principal angles $\theta_k(F,G)$ and orthogonal sets of principal vectors $u_k \in F$ and $v_k \in G$, $k = 1, 2, \ldots, q$, where $q = \dim(G) \leq \dim(F)$. An important application in statistics is computing the canonical correlations $\sigma_k = \cos \theta_k$ between two sets of variates. A perturbation analysis shows that the condition number for $\theta_k$ is essentially $\max(\kappa(A), \kappa(B))$, where $\kappa$ denotes the condition number of a matrix. The algorithms are based on a preliminary QR-factorization of A and B (or $A^H$ and $B^H$), for which either the method of Householder transformations (HT) or the modified Gram-Schmidt method (MGS) is used.
Then $\cos \theta_k$ and $\sin \theta_k$ are computed as the singular values of certain related matrices. Experimental results are given, which indicate that MGS gives $\theta_k$ with equal precision and fewer arithmetic operations than HT. However, HT gives principal vectors that are orthogonal to working accuracy, which is not in general true for MGS. Finally, the case when A and/or B are rank deficient is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/225/CS-TR-71-225.pdf %R CS-TR-71-226 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T SIMPLE: a simple precedence translator writing system %A George, James E. %D July 1971 %X SIMPLE is a translator writing system composed of a simple precedence syntax analyzer and a semantic constructor and is implemented in PL/I. It provides an error diagnostic and recovery mechanism for any system implemented using SIMPLE. The removal of precedence conflicts is discussed in detail with several examples. The utilization of SIMPLE is illustrated by defining a command-language meta-system for the construction of scanners for a wide variety of command-oriented languages. This meta-system is illustrated by defining commands from several text editors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/226/CS-TR-71-226.pdf %R CS-TR-71-228 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Function minimization and automatic therapeutic control %A Kaufman, Linda C. %D July 1971 %X No abstract available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/228/CS-TR-71-228.pdf %R CS-TR-71-229 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Variational study of nonlinear spline curves %A Lee, Erastus H. %A Forsythe, George E. %D August 1971 %X This is an exposition of the variational and differential properties of nonlinear spline curves, based on the Euler-Bernoulli theory for the bending of thin beams or elastica. For both open and closed splines through prescribed nodal points in the Euclidean plane, various types of nodal constraints are considered, and the corresponding algebraic and differential equations relating curvature, angle, arc length, and tangential force are derived in a simple manner. The results for closed splines are apparently new, and they cannot be derived by the consideration of a constrained conservative system. There is a survey of the scanty recent literature. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/229/CS-TR-71-229.pdf %R CS-TR-71-230 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ALGOL W reference manual %A Sites, Richard L. %D February 1972 %X "A Contribution to the Development of ALGOL" by Niklaus Wirth and C. A. R. Hoare was the basis for a compiler developed for the IBM 360 at Stanford University. This report is a description of the implemented language, ALGOL W. Historical background and the goals of the language may be found in the Wirth and Hoare paper. This manual refers to the version of the Algol W compiler dated 16 January 1972. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/230/CS-TR-71-230.pdf %R CS-TR-71-234 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Some modified eigenvalue problems %A Golub, Gene H. %D August 1971 %X We consider the numerical calculation of several eigenvalue problems which require some manipulation before the standard algorithms may be used.
This includes finding the stationary values of a quadratic form subject to linear constraints and determining the eigenvalues of a matrix which is modified by a matrix of rank one. We also consider several inverse eigenvalue problems. This includes the problem of computing the Gauss-Radau and Gauss-Lobatto quadrature rules. In addition, we study several eigenvalue problems which arise in least squares. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/234/CS-TR-71-234.pdf %R CS-TR-71-236 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical computations for univariate linear models %A Golub, Gene H. %A Styan, George P. H. %D September 1971 %X We consider the usual univariate linear model E($y$) = $X\gamma$, V($y$) = $\sigma^2 I$. In Part One of this paper $X$ has full column rank. Numerically stable and efficient computational procedures are developed for the least squares estimation of $\gamma$ and the error sum of squares. We employ an orthogonal triangular decomposition of $X$ using Householder transformations. A lower bound for the condition number of $X$ is immediately obtained from this decomposition. Similar computational procedures are presented for the usual F-test of the general linear hypothesis $L'\gamma = 0$; the hypothesis $L'\gamma = m$ is also considered for $m \neq 0$. Updating techniques are given for adding to or removing from $(X, y)$ a row, a set of rows or a column. In Part Two, $X$ has less than full rank. Least squares estimates are obtained using generalized inverses. The function $L'\gamma$ is estimable whenever it admits an unbiased estimator linear in $y$. We show how to computationally verify estimability of $L'\gamma$ and the equivalent testability of $L'\gamma = 0$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/236/CS-TR-71-236.pdf %R CS-TR-71-237 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A generalization of the divide-sort-merge strategy for sorting networks %A Van Voorhis, David C. %D August 1971 %X With a few notable exceptions the best sorting networks known have employed a "divide-sort-merge" strategy. That is, the N inputs are divided into 2 groups -- normally of size $\lceil \frac{1}{2} N\rceil$ and $\lfloor \frac{1}{2} N\rfloor$ [Here $\lceil x\rceil$ denotes the smallest integer greater than or equal to x, whereas $\lfloor x\rfloor$ denotes the largest integer less than or equal to x] -- that are sorted independently and then "merged" together to form a single sorted sequence. An N-sorter network that uses this strategy consists of 2 smaller sorting networks followed by a merge network. The best merge networks known are also constructed recursively, using 2 smaller merge networks followed by a simple arrangement of $\lceil \frac{1}{2} N\rceil - 1$ comparators. We consider a generalization of the divide-sort-merge strategy in which the N inputs are divided into g $\geq$ 2 disjoint groups that are sorted independently and then merged together.
The merge network that combines these g sorted groups uses d $\geq$ 2 smaller merge networks as an initial subnetwork. The two parameters g and d together define what we call a "[g,d]" strategy. A [g,d] N-sorter network consists of g smaller sorting networks followed by a [g,d] merge network. The initial portion of the [g,d] merge network consists of d smaller merge networks; the final portion, which we call the "f-network," includes whatever additional comparators are required to complete the merge. When g = d = 2, the f-network is a simple arrangement of $\lceil \frac{1}{2} N\rceil - 1$ comparators; however, for larger g, d the structure of the [g,d] f-network becomes increasingly complicated. In this paper we describe how to construct [g,d] f-networks for arbitrary g, d. For N > 8 the resulting [g,d] N-sorter networks are more economical than any previous networks that use the divide-sort-merge strategy; for N > 34 the resulting networks are more economical than previous networks of any construction. The [4,4] N-sorter network described in this paper requires $\frac{1}{4} N(\log_2 N)^2 - \frac{1}{3} N(\log_2 N) + O(N)$ comparators, which represents an asymptotic improvement of $\frac{1}{12} N(\log_2 N)$ comparators over the best previous N-sorter. We indicate that special constructions (not described in this paper) have been found for [$2^r, 2^r$] f-networks, which lead to an N-sorter network that requires only $.25 N(\log_2 N)^2 - .372 N(\log_2 N) + O(N)$ comparators. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/237/CS-TR-71-237.pdf %R CS-TR-71-238 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A lower bound for sorting networks that use the divide-sort-merge strategy %A Van Voorhis, David C. %D August 1971 %X Let $M_g (g^{k+1})$ represent the minimum number of comparators required by a network that merges g sorted multisets containing $g^k$ members each. In this paper we prove that $M_g(g^{k+1}) \geq g M_g(g^k) + g^{k-1} \sum_{\ell=2}^{g} \lfloor (\ell-1)g/\ell \rfloor$. From this relation we are able to show that an N-sorter network which uses the g-way divide-sort-merge strategy must contain at least order $N(\log_2 N)^2$ comparators. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/238/CS-TR-71-238.pdf %R CS-TR-71-239 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Large [g,d] sorting networks %A Van Voorhis, David C. %D August 1971 %X With only a few exceptions the minimum-comparator N-sorter networks employ the generalized "divide-sort-merge" strategy. That is, the N inputs are divided among g $\geq$ 2 smaller sorting networks -- of size $N_1,N_2,...,N_g$, where $N = \sum_{k=1}^{g} N_k$ -- that comprise the initial portion of the N-sorter network. The remainder of the N-sorter is a comparator network that merges the outputs of the $N_1-, N_2-, ...,$ and $N_g$-sorter networks into a single sorted sequence. The most economical merge networks yet designed, known as the "[g,d]" merge networks, consist of d smaller merge networks -- where d is a common divisor of $N_1,N_2,...,N_g$ -- followed by a special comparator network labeled a "[g,d] f-network." In this paper we describe special constructions for $[2^r,2^r]$ f-networks, r > 1, which enable us to reduce the number of comparators required by a large N-sorter network from $.25N(\log_2 N)^2 - .25N(\log_2 N) + O(N)$ to $.25N(\log_2 N)^2 - .37N(\log_2 N) + O(N)$.
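The g = d = 2 case of the divide-sort-merge strategy discussed in the three Van Voorhis abstracts above is realized by Batcher's odd-even construction: sort the two halves, merge their even and odd subsequences with two half-size merges, and finish with the final rank of comparators. The Python sketch below (written for this note, with N restricted to a power of two; it is Batcher's network, not Van Voorhis's [g,d] construction) generates the comparator list, checks that it sorts, and tabulates its size against the $.25N(\log_2 N)^2$ leading term.

    import random
    from math import log2

    def oddeven_merge(lo, hi, r):
        # Batcher odd-even merge on index range [lo, hi] (inclusive),
        # comparing elements r apart; yields comparator pairs (i, j).
        step = r * 2
        if step < hi - lo:
            yield from oddeven_merge(lo, hi, step)        # even subsequence
            yield from oddeven_merge(lo + r, hi, step)    # odd subsequence
            for i in range(lo + r, hi - r, step):         # final clean-up rank
                yield (i, i + r)
        else:
            yield (lo, lo + r)

    def oddeven_mergesort(lo, hi):
        # Divide-sort-merge with g = d = 2: sort both halves, then merge.
        if hi - lo >= 1:
            mid = lo + (hi - lo) // 2
            yield from oddeven_mergesort(lo, mid)
            yield from oddeven_mergesort(mid + 1, hi)
            yield from oddeven_merge(lo, hi, 1)

    def apply_network(network, xs):
        # Run the comparator list over a concrete input sequence.
        xs = list(xs)
        for i, j in network:
            if xs[i] > xs[j]:
                xs[i], xs[j] = xs[j], xs[i]
        return xs

    for k in range(2, 9):
        n = 2 ** k
        net = list(oddeven_mergesort(0, n - 1))
        xs = [random.random() for _ in range(n)]
        assert apply_network(net, xs) == sorted(xs)
        print(n, len(net), round(0.25 * n * log2(n) ** 2))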
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/239/CS-TR-71-239.pdf %R CS-TR-71-240 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Correctness of two compilers for a Lisp subset %A London, Ralph L. %D October 1971 %X Using mainly structural induction, proofs of correctness of each of two running Lisp compilers for the PDP-10 computer are given. Included are the rationale for presenting these proofs, a discussion of the proofs, and the changes needed to the second compiler to complete its proof. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/240/CS-TR-71-240.pdf %R CS-TR-71-242 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The frame problem and related problems in artificial intelligence %A Hayes, Patrick J. %D November 1971 %X The frame problem arises in considering the logical structure of a robot's beliefs. It has been known for some years, but only recently has much progress been made. The problem is described and discussed. Various suggested methods for its solution are outlined, and described in a uniform notation. Finally, brief consideration is given to the problem of adjusting a belief system in the face of evidence which contradicts beliefs. It is shown that a variation on the situation notation of McCarthy and Hayes (1969) permits an elegant approach, and relates this problem to the frame problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/242/CS-TR-71-242.pdf %R CS-TR-71-246 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A resemblance test for the validation of a computer simulation of paranoid processes %A Colby, Kenneth Mark %A Hilf, Franklin Dennis %A Weber, Sylvia %A Kraemer, Helena C. %D November 1971 %X A computer simulation of paranoid processes in the form of a dialogue algorithm was subjected to a validation study using an experimental resemblance test in which judges rated degrees of paranoia present in initial psychiatric interviews of both paranoid patients and of versions of the paranoid model. The statistical results indicate a satisfactory degree of resemblance between the two groups of interviews. It is concluded that the model provides a successful simulation of naturally occurring paranoid processes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/246/CS-TR-71-246.pdf %R CS-TR-71-247 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T One small head -- some remarks on the use of 'model' in linguistics %A Wilks, Yorick A. %D December 1971 %X I argue that the present situation in formal linguistics, where much new work is presented as being a "model of the brain", or of "human language behavior", is an undesirable one. My reason for this judgement is not the conservative (Braithwaitian) one that the entities in question are not really models but theories. It is rather that they are called models because they cannot be theories of the brain at the present stage of brain research, and hence that the use of "model" in this context is not so much aspirational as resigned about our total ignorance of how the brain stores and processes linguistic information. The reason such explanatory entities cannot be theories is that this ignorance precludes any "semantic ascent" up the theory; i.e., interpreting the items of the theory in terms of observables.
And the brain items, whatever they may be, are not, as Chomsky has sometimes claimed, in the same position as the "occult entities" of Physics like Gravitation; for the brain items are not theoretically unreachable, merely unreached. I then examine two possible alternate views of what linguistic theories should be proffered as theories of: theories of sets of sentences, and theories of a particular class of algorithms. I argue for a form of the latter view, and that its acceptance would also have the effect of making Computational Linguistics a central part of Linguistics, rather than the poor relation it is now. I examine a distinction among "linguistic models" proposed recently by Mey, who was also arguing for the self-sufficiency of Computational Linguistics, though as a "theory of performance". I argue that his distinction is a bad one, partly for the reasons developed above and partly because he attempts to tie it to Chomsky's inscrutable competence-performance distinction. I conclude that the independence and self-sufficiency of Computational Linguistics are better supported by the arguments of this paper. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/247/CS-TR-71-247.pdf %R CS-TR-71-249 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An annotated bibliography on the construction of compilers %A Pollack, Bary W. %D December 1971 %X This bibliography is divided into 9 sections: 1. General Information on Compiling Techniques 2. Syntax- and Base-Directed Parsing 3. Parsing in General 4. Resource Allocation 5. Errors - Detection and Correction 6. Compiler Implementation in General 7. Details of Compiler Construction 8. Additional Topics 9. Miscellaneous Related References. Within each section the entries are alphabetical by author. Keywords describing each entry are set off by pound signs (#). Some amount of cross-referencing has been done; e.g., entries which fall into Section 3 as well as Section 7 will generally be found in both sections. However, entries will be found listed only under the principal or first author's name. "Computing Reviews" citations are given following the annotation when available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/249/CS-TR-71-249.pdf %R CS-TR-71-250 %Z Wed, 01 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Program schemas with equality %A Chandra, Ashok K. %A Manna, Zohar %D December 1971 %X We discuss the class of program schemas augmented with equality tests, that is, tests of equality between terms. In the first part of the paper we discuss and illustrate the "power" of equality tests. It turns out that the class of program schemas with equality is more powerful than the "maximal" classes of schemas suggested by other investigators. In the second part of the paper we discuss the decision problems of program schemas with equality. It is shown for example that while the decision problems normally considered for schemas (such as halting, divergence, equivalence, isomorphism and freedom) are solvable for Ianov schemas, they all become unsolvable if general equality tests are added. We suggest, however, limited equality tests which can be added to certain subclasses of program schemas while preserving their solvable properties.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/250/CS-TR-71-250.pdf %R CS-TR-70-146 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Roundoff error analysis of the fast Fourier transform %A Ramos, George U. %D February 1970 %X This paper presents an analysis of roundoff errors occurring in the floating-point computation of the fast Fourier transform. Upper bounds are derived for the ratios of the root-mean-square (RMS) and maximum roundoff errors in the output data to the RMS value of the input data for both single and multidimensional transformations. These bounds are compared experimentally with actual roundoff errors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/146/CS-TR-70-146.pdf %R CS-TR-70-147 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pitfalls in computation, or why a math book isn't enough %A Forsythe, George E. %D January 1970 %X The floating-point number system is contrasted with the real numbers. The author then illustrates the variety of computational pitfalls that await a person who merely translates information gained from pure mathematics courses into computer programs. Examples include summing a Taylor series, solving a quadratic equation, solving linear algebraic systems, solving ordinary and partial differential equations, and finding polynomial zeros. It is concluded that mathematics courses should be taught with a greater awareness of automatic computation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/147/CS-TR-70-147.pdf %R CS-TR-70-150 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Elementary proof of the Wielandt-Hoffman Theorem and of its generalization %A Wilkinson, James H. %D January 1970 %X An elementary proof is given of the Wielandt-Hoffman Theorem for normal matrices and of a generalization of this theorem. The proof makes no direct appeal to results from linear-programming theory. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/150/CS-TR-70-150.pdf %R CS-TR-70-151 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T "On the Properties of the Derivatives of the Solutions of Laplace's Equation and the Errors of the Method of Finite Differences for Boundary Values in $C_2$ and $C_{1,1}$" by E. A. Volkov %A Volkov, E. A. %A Forsythe, George E. %D January 1970 %X If a function u is harmonic in a circular disk and its boundary values are twice continuously differentiable, u need not have bounded second derivatives in the open disk. For the Dirichlet problem for Laplace's equation in a more general two-dimensional region the discretization error of the ordinary method of finite differences is studied when Collatz's method of linear interpolation is used at the boundary. If the boundary of the region has a tangent line whose angle satisfies a Lipschitz condition, and if the boundary values have a first derivative satisfying a Lipschitz condition, then the discretization error is shown to be of order $h^2 \ln h^{-1}$. This bound is shown to be sharp. By a different method of interpolation at the boundary one can improve the bound to $o(h^2)$. There are other similar results. Translated by G. E. Forsythe.
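The "ordinary method of finite differences" analyzed in the Volkov paper above replaces Laplace's equation by the five-point average at each interior mesh point. Below is a minimal Python sketch on the unit square, where the boundary falls on mesh points and Collatz's interpolation step is therefore not needed; the boundary data g(x,y) = x^2 - y^2 is a hypothetical choice, harmonic so the exact solution is known and the printed figure is essentially iteration error.

    import numpy as np

    # Five-point finite-difference method for Laplace's equation on the
    # unit square, solved by Jacobi sweeps. The exact solution x^2 - y^2
    # is discretely harmonic, so the converged error is near round-off.
    n = 33                                    # mesh width h = 1/(n-1)
    xs = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    exact = X**2 - Y**2
    u = exact.copy()
    u[1:-1, 1:-1] = 0.0                       # unknown interior values
    for _ in range(4000):                     # each sweep replaces every
        u[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]   # interior value
                                + u[1:-1, 2:] + u[1:-1, :-2]) # by the 4-point mean
    print(np.abs(u - exact).max())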
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/151/CS-TR-70-151.pdf %R CS-TR-70-155 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The method of odd/even reduction and factorization with application to Poisson's equation, part II %A Buzbee, B. L. %A Golub, Gene H. %A Nielson, C. W. %D March 1970 %X In this paper, we derive and generalize the methods of Buneman for solving elliptic partial difference equations in a rectangular region. We show why the Buneman methods lead to numerically accurate solutions whereas the CORF algorithm may be numerically unstable. Several numerical examples are given and discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/155/CS-TR-70-155.pdf %R CS-TR-70-156 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On a model for computing round-off error of a sum %A Dantzig, George B. %D March 1970 %X No abstract available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/156/CS-TR-70-156.pdf %R CS-TR-70-157 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Algorithms for matrix multiplication %A Brent, Richard P. %D March 1970 %X Strassen's and Winograd's algorithms for matrix multiplication are investigated and compared with the normal algorithm. Floating-point error bounds are obtained, and it is shown that scaling is essential for numerical accuracy using Winograd's method. In practical cases Winograd's method appears to be slightly faster than the other two methods, but the gain is, at most, about 20%. Finally, an attempt to generalize Strassen's method is described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/157/CS-TR-70-157.pdf %R CS-TR-70-159 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The use of direct methods for the solution of the discrete Poisson equation on non-rectangular regions %A George, John Alan %D June 1970 %X Some direct and iterative schemes are presented for solving a standard finite-difference scheme for Poisson's equation on a two-dimensional bounded region R with Dirichlet conditions specified on the boundary $\partial R$. These procedures make use of special-purpose direct methods for solving rectangular Poisson problems. The region is imbedded in a rectangle and a uniform mesh is superimposed on it. The usual five-point Poisson difference operator is applied over the whole rectangle, yielding a block-tridiagonal system of equations. The original problem, however, determines only the elements of the right-hand side which correspond to grid points lying within $\partial R$; the remaining elements can be treated as parameters. The iterative algorithms construct a sequence of right-hand sides in such a way that the corresponding sequence of solutions on the rectangle converges to the solution of the imbedded problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/159/CS-TR-70-159.pdf %R CS-TR-70-160 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A model for parallel computer systems %A Bredt, Thomas H. %A McCluskey, Edward J. %D April 1970 %X A flow table model is defined for parallel computer systems. In this model, fundamental-mode flow tables are used to describe the operation of system components, which may be programs or circuits. Components communicate by changing the values on interconnecting lines which carry binary-level signals.
It is assumed that there is no bound on the time for value changes to propagate over the interconnecting lines. Given this delay assumption, it is necessary to specify a mode of operation for system components such that input changes which arrive while a component is unstable do not affect the operation of the component. Such a mode of operation is specified. Using the flow table model, a new control algorithm for the two-process mutual exclusion problem is designed. This algorithm does not depend on the exclusive execution of any primitive operations used in its implementation. A circuit implementation of the control algorithm is described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/160/CS-TR-70-160.pdf %R CS-TR-70-162 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical techniques in mathematical programming %A Bartels, Richard H. %A Golub, Gene H. %A Saunders, Michael A. %D May 1970 %X The application of numerically stable matrix decompositions to minimization problems involving linear constraints is discussed and shown to be feasible without undue loss of efficiency. Part A describes computation and updating of the product-form of the LU decomposition of a matrix and shows it can be applied to solving linear systems at least as efficiently as standard techniques using the product-form of the inverse. Part B discusses orthogonalization via Householder transformations, with applications to least squares and quadratic programming algorithms based on the principal pivoting method of Cottle and Dantzig. Part C applies the singular value decomposition to the nonlinear least squares problem and discusses related eigenvalue problems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/162/CS-TR-70-162.pdf %R CS-TR-70-163 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T An algorithm for floating-point accumulation of sums with small relative error %A Malcolm, Michael A. %D June 1970 %X A practical algorithm for floating-point accumulation is presented. Through the use of multiple accumulators, errors due to cancellation are avoided. An example in Fortran is included. An error analysis providing a sharp bound on the relative error is also given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/163/CS-TR-70-163.pdf %R CS-TR-70-164 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T "Estimates of the Roundoff Error in the Solution of a System of Conditional Equations" by V. I. Gordonova %A Gordonova, V. I. %A Kaufman, Linda C. %D June 1970 %X Using backward error analysis, this paper compares the roundoff error in the least-squares solution of a system of conditional equations $Ax = f$ by two different methods. The first one entails solving the normal equations $A^T A x = A^T f$ and the second is one proposed by Faddeev, Faddeeva, and Kublanovskaya in 1966. This latter method involves multiplying the system by orthogonal matrices to transform the matrix A into upper triangular form. Translated by Linda Kaufman. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/164/CS-TR-70-164.pdf %R CS-TR-70-165 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The scheduling of n tasks with m operations on two processors %A Bauer, Henry R. %A Stone, Harold S. %D July 1970 %X The job shop problem is one scheduling problem for which no efficient algorithm exists.
That is, no algorithm is known in which the number of computational steps grows algebraically as the problem enlarges. This paper presents a discussion of the problem of scheduling N tasks on two processors when each task consists of three operations. The operations of each task must be performed in order and are divided among the processors. We analyze this problem through four sub-problems. Johnson's scheduling algorithm is generalized to solve two of these sub-problems, and functional equation algorithms are used to solve the remaining two problems. Except for one case, the algorithms are efficient. The exceptional case has been labelled the "core" problem and the difficulties are described. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/165/CS-TR-70-165.pdf %R CS-TR-70-170 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Analysis and synthesis of concurrent sequential programs %A Bredt, Thomas H. %D May 1970 %X This paper presents analysis and synthesis procedures for a class of sequential programs. These procedures aid in the design of programs for parallel computer systems. In particular, the interactions of a given program with other programs or circuits in a system can be described precisely. The basis for this work is a model for parallel computer systems in which the operation of each component is described by a flow table and the components interact by changing values on interconnecting lines. The details of this model are discussed in another paper [Stanford University Department of Computer Science report STAN-CS-70-160]. The analysis procedure produces a flow table description of a program. In program synthesis, a flow table description is converted to a sequential program. Using flow table design procedures, a control program for the two-program mutual exclusion problem is produced. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/170/CS-TR-70-170.pdf %R CS-TR-70-171 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A survey of models for parallel computing %A Bredt, Thomas H. %D August 1970 %X The work of Adams, Karp and Miller, Luconi, and Rodriguez on formal models for parallel computations and computer systems is reviewed. A general definition of a parallel schema is given so that the similarities and differences of the models can be discussed. Primary emphasis is on the control structures used to achieve parallel operation and on properties of the models such as determinacy and equivalence. Decidable and undecidable properties are summarized. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/171/CS-TR-70-171.pdf %R CS-TR-70-172 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Analysis of parallel systems %A Bredt, Thomas H. %D August 1970 %X A formal analysis procedure for parallel computer systems is presented. The flow table model presented in an earlier paper [Stanford University Department of Computer Science report STAN-CS-70-160] is used to describe a system. Each component of the system is described by a completely specified fundamental-mode flow table. All delays in a parallel system are assumed to be finite. Component delays are assumed to be bounded and line delays unbounded. The concept of an output hazard is introduced to account for the effects of line delay and the lack of synchronization among components. Necessary and sufficient conditions for the absence of output hazards are given.
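(Editorial aside, not part of the abstract: the testing of restrictions against the system state graph described next amounts, in modern terms, to a reachability check over the product of component states; a compressed sketch, all names mine:)

    from collections import deque

    def reachable(initial, successors):
        # Breadth-first construction of the system state graph from a
        # given initial system state; successors(s) enumerates the
        # states reachable from s in one transition.
        seen, queue = {initial}, deque([initial])
        while queue:
            s = queue.popleft()
            for t in successors(s):
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
        return seen

    # A forbidden-state restriction (e.g., both components in their
    # critical sections at once) is violated exactly when
    # reachable(s0, successors) intersects the forbidden set.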
The state of a parallel system is defined by the present internal state and input state of each component. The operation of the system is described by a system state graph which specifies all possible state transitions for a specified initial system state. A procedure for constructing the system state graph is given. The analysis procedure may be summarized as follows. A problem is stated in terms of restrictions on system operation. A parallel system is said to operate correctly with respect to the given problem if the associated restrictions are always satisfied. The restrictions specify either forbidden system states, which are never to be entered during the operation of the system, or forbidden system state sequences, which must never appear during system operation. The restrictions are tested by examining the system state graph. A parallel system for the two-process mutual exclusion problem is analyzed and the system is shown to operate correctly with respect to this problem. Finally, the conditions of determinacy and output functionality, which have been used in other models of parallel computing, are discussed as they relate to correct solutions to the mutual exclusion problem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/172/CS-TR-70-172.pdf %R CS-TR-70-173 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The mutual exclusion problem %A Bredt, Thomas H. %D August 1970 %X This paper discusses how n components, which may be programs or circuits, in a computer system can be controlled so that (1) at most one component may perform a designated "critical" operation at any instant and (2) if one component wants to perform its critical operation, it is eventually allowed to do so. This control problem is known as the mutual exclusion or interlock problem. A summary of the flow table model [Stanford University Department of Computer Science report STAN-CS-70-160] for computer systems is given. In this model, a control algorithm is represented by a flow table. The number of internal states in the control flow table is used as a measure of the complexity of control algorithms. A lower bound of n + 1 internal states is shown to be necessary if the mutual exclusion problem is to be solved. Procedures to generate control flow tables for the mutual exclusion problem which require the minimum number of internal states are described and it is proved that these procedures give correct control solutions. Other so-called "unbiased" algorithms are described which require 2·n! internal states but break ties in the case of multiple requests in favor of the component that least recently executed its critical operation. The paper concludes with a discussion of the tradeoffs between central and distributed control algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/173/CS-TR-70-173.pdf %R CS-TR-70-174 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards automatic program synthesis %A Manna, Zohar %A Waldinger, Richard J. %D July 1970 %X An elementary outline of the theorem-proving approach to automatic program synthesis is given, without dwelling on technical details. The method is illustrated by the automatic construction of both recursive and iterative programs operating on natural numbers, lists, and trees. In order to construct a program satisfying certain specifications, a theorem induced by those specifications is proved, and the desired program is extracted from the proof.
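(Editorial aside, not drawn from the report: the flavor of extracting a program from an inductive proof can be suggested with the quotient-remainder theorem; a constructive proof by induction on a that q and r exist with a = q*b + r and 0 <= r < b dictates the recursive program, names mine:)

    def divide(a: int, b: int) -> tuple[int, int]:
        # Assumes a >= 0 and b > 0, as in the theorem's hypotheses.
        # Base case of the induction (a < b): take q = 0, r = a.
        if a < b:
            return 0, a
        # Induction step: from (q', r') for a - b, return (q' + 1, r').
        q, r = divide(a - b, b)
        return q + 1, r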
The same technique is applied to transform recursively defined functions into iterative programs, frequently with a major gain in efficiency. It is emphasized that in order to construct a program with loops or with recursion, the principle of mathematical induction must be applied. The relation between the version of the induction rule used and the form of the program constructed is explored in some detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/174/CS-TR-70-174.pdf %R CS-TR-70-175 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A description and comparison of subroutines for computing Euclidean inner products on the IBM 360 %A Malcolm, Michael A. %D October 1970 %X Several existing subroutines and an Algol W procedure for computing inner products on the IBM 360, using more precision than long, are described and evaluated. Error bounds (when they exist) and execution timing tests are included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/175/CS-TR-70-175.pdf %R CS-TR-70-176 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T On generality and problem solving: a case study using the DENDRAL program %A Feigenbaum, Edward A. %A Buchanan, Bruce G. %A Lederberg, Joshua %D August 1970 %X Heuristic DENDRAL is a computer program written to solve problems of inductive inference in organic chemistry. This paper will use the design of Heuristic DENDRAL and its performance on different problems for a discussion of the following topics: 1. the design for generality; 2. the performance problems attendant upon too much generality; 3. the coupling of expertise to the general problem solving processes; 4. the symbiotic relationship between generality and expertness, and the implications of this symbiosis for the study and design of problem solving systems. We conclude the paper with a view of the design for a general problem solver that is a variant of the "big switch" theory of generality. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/176/CS-TR-70-176.pdf %R CS-TR-70-178 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Research in the Computer Science Department and selected other research in computing at Stanford University %A Forsythe, George E. %A Miller, William F. %D October 1970 %X The research program of the Computer Science Department can perhaps be best summarized in terms of its research projects. The chart on page ii lists the projects and the participation by faculty and students. The sections following the chart provide descriptions of the individual projects. There are a number of projects in other schools or departments which are making significant contributions to computer science; and these add to the total computer environment. Descriptions of a few of these projects are also included with this report. This list of projects outside of Computer Science does not purport to be complete or even representative. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/178/CS-TR-70-178.pdf %R CS-TR-70-179 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MLISP %A Smith, David Canfield %D October 1970 %X MLISP is a high level list-processing and symbol-manipulation language based on the programming language LISP. MLISP programs are translated into LISP programs and then executed or compiled. 
MLISP exists for two purposes: (1) to facilitate the writing and understanding of LISP programs; (2) to remedy certain important deficiencies in the list-processing ability of LISP. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/179/CS-TR-70-179.pdf %R CS-TR-70-183 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Machine learning through signature trees: applications to human speech %A White, George M. %D October 1970 %X Signature-tree "machine learning" pattern-recognition heuristics are investigated for the specific problem of computer recognition of human speech. When the data base of given utterances is insufficient to establish trends with confidence, a large number of feature extractors must be employed and "recognition" of an unknown pattern made by comparing its feature values with those of known patterns. When the data base is replete, a "signature" tree can be constructed and recognition can be achieved by the evaluation of a select few features. Learning results from selecting an optimal minimal set of features to achieve recognition. Properties of signature trees and the heuristics for this type of learning are of primary interest in this exposition. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/183/CS-TR-70-183.pdf %R CS-TR-70-184 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A note on a conjecture of L. J. Mordell %A Malcolm, Michael A. %D November 1970 %X A computer proof is described for a previously unsolved problem concerning the inequality $\sum_{i=1}^{n} x_i/(x_{i+1} + x_{i+2}) \geq \frac{n}{2}$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/184/CS-TR-70-184.pdf %R CS-TR-70-185 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Graph program simulation %A Nelson, Edward C. %D October 1970 %X This report describes the simulation of a parallel processing system based on a directed graph representation of parallel computations. The graph representation is based on the model developed by Duane Adams in which programs are written as directed graphs whose nodes represent operations and whose edges represent data flow. The first part of the report describes a simulator which interprets these graph programs. The second part describes the use of the simulator in a hypothetical environment which has an unlimited number of processors and an unlimited amount of memory. Three programs, a trapezoidal quadrature, a sort and a matrix multiplication, were used to study the effect of varying the relative speed of primitive operations on computation time as the problem size grows. The system was able to achieve a high degree of parallelism. For example, the simulator multiplied two n by n matrices in a simulated time proportional to n. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/185/CS-TR-70-185.pdf %R CS-TR-70-187 %Z Mon, 06 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MPL, Mathematical Programming Language: specification manual for Committee review %A Eisenstat, Stanley C. %A Magnanti, Thomas L. %A Maier, Steven F. %A McGrath, Michael B. %A Nicholson, Vincent J. %A Riedl, Christiane %A Dantzig, George B. %D November 1970 %X Mathematical Programming Language (MPL) is intended as a highly readable, user-oriented programming tool for use in the writing and testing of mathematical algorithms, in particular experimental algorithms for solving large-scale linear programs.
It combines the simplicity of standard mathematical notation with the power of complex data structures. Variables may be implicitly introduced into a program by their use in the statement in which they first appear. No formal defining statement is necessary. Statements of the "let" and "where" type are part of the language. Included within the allowable data structures of MPL are matrices, partitioned matrices, and multidimensional arrays. Ordered sets are included as vectors with their constructs closely paralleling those found in set theory. Allocation of storage is dynamic, thereby eliminating the need for a data manipulating subset of the language, as is characteristic of most high level scientific programming languages. This report summarizes the progress that has been made to date in developing MPL. It contains a specification manual, examples of the application of the language, and the future directions and goals of the project. A version of MPL, called MPL/70, has been implemented using PL/I as a translator. This will be reported separately. Until fully implemented, MPL is expected to serve primarily as a highly readable communication language for mathematical algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/70/187/CS-TR-70-187.pdf %R CS-TR-69-120 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MUTANT 0.5: an experimental programming language %A Satterthwaite, Edwin H. %D February 1969 %X A programming language which continues the extension and simplification of ALGOL 60 in the direction suggested by EULER is defined and described. Techniques used in an experimental implementation of that language, called MUTANT 0.5, are briefly summarized. The final section of this report is an attempt to assess the potential value of the approach to procedural programming language design exemplified by MUTANT 0.5. Implementation and use of the experimental system have indicated a sufficient number of conceptual and practical problems to suggest that the general approach is of limited value; however, a number of specific features were found to be convenient, useful, and adaptable to other philosophies of language design. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/120/CS-TR-69-120.pdf %R CS-TR-69-121 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Accurate bounds for the eigenvalues of the Laplacian and applications to rhombical domains %A Moler, Cleve B. %D February 1969 %X We deal with the eigenvalues and eigenfunctions of Laplace's differential operator on a bounded two-dimensional domain G with zero values on the boundary. The paper describes a new technique for determining the coefficients in the expansion of an eigenfunction in terms of particular eigenfunctions of the differential operator. The coefficients are chosen to make the sum of the expansion come close to satisfying the boundary conditions. As an example, the eigenvalues and eigenfunctions are determined for a rhombical membrane. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/121/CS-TR-69-121.pdf %R CS-TR-69-122 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Heuristic analysis of numerical variants of the Gram-Schmidt orthonormalization process %A Mitchell, William C. %A McCraith, Douglas L. %D February 1969 %X The Gram-Schmidt orthonormalization process is a fundamental formula of analysis which is notoriously unstable computationally. 
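(Editorial aside, not part of the abstract: the instability in question is conventionally exhibited by contrasting the classical update, which projects each original column against the computed basis, with the modified variant, which removes each new direction from all remaining columns immediately; a minimal NumPy sketch, all names mine:)

    import numpy as np

    def gram_schmidt(A, modified=True):
        # Orthonormalize the columns of A. The modified variant loses
        # far less orthogonality to round-off than the classical one.
        A = A.astype(float)
        n = A.shape[1]
        Q = np.zeros_like(A)
        for k in range(n):
            v = A[:, k].copy()
            if not modified:
                # Classical: subtract projections of the original column.
                for j in range(k):
                    v -= (Q[:, j] @ A[:, k]) * Q[:, j]
            Q[:, k] = v / np.linalg.norm(v)
            if modified:
                # Modified: withdraw q_k from the columns still pending.
                for j in range(k + 1, n):
                    A[:, j] -= (Q[:, k] @ A[:, j]) * Q[:, k]
        return Q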
This report provides a heuristic analysis of the process, which shows why the method is unstable. Formulas are derived which describe the propagation of round-off error through the process. These formulas are supported by numerical experiments. These formulas are then applied to a computational variant of a basic method proposed by John R. Rice, and this method is shown to offer significant improvement over the basic algorithm. This finding is also supported by numerical experiment. The formulas for the error propagation are then used to produce a linear corrector for the basic Gram-Schmidt process, which shows significant improvement over both previous methods, but at the cost of slightly more computation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/122/CS-TR-69-122.pdf %R CS-TR-69-124 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Matrix decompositions and statistical calculations %A Golub, Gene H. %D March 1969 %X Several matrix decompositions which are of some interest in statistical calculations are presented. An accurate method for calculating the canonical correlation is given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/124/CS-TR-69-124.pdf %R CS-TR-69-125 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Grammatical complexity and inference %A Feldman, Jerome A. %A Gips, James %A Horning, James J. %A Reder, Stephen %D June 1969 %X The problem of inferring a grammar for a set of symbol strings is considered and a number of new decidability results obtained. Several notions of grammatical complexity and their properties are studied. The question of learning the least complex grammar for a set of strings is investigated leading to a variety of positive and negative results. This work is part of a continuing effort to study the problems of representation and generalization through the grammatical inference question. Appendices A and B and Section 2a.0 are primarily the work of Reder, Sections 2b and 3d of Horning, Section 4 and Appendix C of Gips, and the remainder the responsibility of Feldman. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/125/CS-TR-69-125.pdf %R CS-TR-69-126 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complementary spanning trees %A Dantzig, George B. %D March 1969 %X Given a network G whose arcs partition into non-overlapping 'clubs' (sets) $R_i$, D. Ray Fulkerson has considered the problem of constructing a spanning tree such that no two of its arcs belong to (represent) the same club and has stated necessary and sufficient conditions for such trees to exist. When each club $R_i$ consists of exactly two arcs, we shall refer to each arc of the pair as the 'complement' of the other, and the representative tree as a complementary tree. Our objective is to prove the following theorem: If there exists one complementary tree, there exist at least two. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/126/CS-TR-69-126.pdf %R CS-TR-69-128 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The method of odd/even reduction and factorization with application to Poisson's equation %A Buzbee, B. L. %A Golub, Gene H. %A Nielson, C. W. %D April 1969 %X Several algorithms are presented for solving block tridiagonal systems of linear algebraic equations when the matrices on the diagonal are equal to each other and the matrices on the subdiagonals are all equal to each other.
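(Editorial aside, not from the abstract: for the scalar analogue of such a constant-coefficient tridiagonal system, one odd/even reduction step eliminates half the unknowns and leaves a half-size system of the same form; a recursive sketch assuming n = 2**k - 1 unknowns and zero boundary values, names mine:)

    def odd_even_reduction(a, d, b):
        # Solve a*x[i-1] + d*x[i] + a*x[i+1] = b[i], i = 0..n-1, with
        # x[-1] = x[n] = 0 and n = 2**k - 1.
        n = len(b)
        if n == 1:
            return [b[0] / d]
        # Combining rows i-1, i, i+1 eliminates the even-indexed
        # unknowns and couples x[1], x[3], ... in the same form:
        a2, d2 = -a * a, d * d - 2 * a * a
        b2 = [d * b[i] - a * (b[i - 1] + b[i + 1]) for i in range(1, n, 2)]
        x = [0.0] * n
        x[1::2] = odd_even_reduction(a2, d2, b2)
        # Back-substitute the remaining unknowns from their own rows.
        for i in range(0, n, 2):
            left = x[i - 1] if i > 0 else 0.0
            right = x[i + 1] if i + 1 < n else 0.0
            x[i] = (b[i] - a * (left + right)) / d
        return x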
It is shown that these matrices arise from the finite difference approximation to certain elliptic partial differential equations on rectangular regions. Generalizations are derived for higher order equations and non-rectangular regions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/128/CS-TR-69-128.pdf %R CS-TR-69-129 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Research in the Computer Science Department, Stanford University %A Miller, William F. %D April 1969 %X The research program of the Computer Science Department can perhaps be best summarized in terms of its research projects. The chart on the following page lists the projects and the participation by faculty and students. Two observations should be made to complete the picture. Within the Artificial Intelligence Project, the Stanford Computation Center, the SLAC Computation Group, and the INFO project, there are a large number of highly competent professional computer scientists who add greatly to the total capability of the campus. Also, there are a number of projects in other schools or departments which are making significant contributions to computer science. These, too, add to the total computer environment. Summarized by Professor W. F. Miller. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/129/CS-TR-69-129.pdf %R CS-TR-69-134 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Linear least squares and quadratic programming %A Golub, Gene H. %A Saunders, Michael A. %D May 1969 %X Several algorithms are presented for solving linear least squares problems; the basic tools are orthogonalization techniques. A highly accurate algorithm is presented for solving least squares problems with linear inequality constraints. A method is also given for finding the least squares solution when there is a quadratic constraint on the solution. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/134/CS-TR-69-134.pdf %R CS-TR-69-135 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T CIL: Compiler Implementation Language %A Gries, David %D May 1969 %X This report is a manual for the proposed Compiler Implementation Language, CIL. It is not an expository paper on the subject of compiler writing or compiler-compilers. The language definition may change as work progresses on the project. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/135/CS-TR-69-135.pdf %R CS-TR-69-137 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Fixed points of analytic functions %A Henrici, Peter %D July 1969 %X A continuous mapping of a simply connected, closed, bounded set of the euclidean plane into itself is known to have at least one fixed point. It is shown that the usual condition for the fixed point to be unique, and for convergence of the iteration sequence to the fixed point, can be relaxed if the mapping is defined by an analytic function of a complex variable. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/137/CS-TR-69-137.pdf %R CS-TR-69-141 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Bounds for the error of linear systems of equations using the theory of moments %A Dahlquist, Germund %A Eisenstat, Stanley C. %A Golub, Gene H.
%D October 1969 %X Consider the system of linear equations $Ax = b$ where A is an $n \times n$ real symmetric, positive definite matrix and $b$ is a known vector. Suppose we are given an approximation $\xi$ to $x$, and we wish to determine upper and lower bounds for $\Vert x - \xi \Vert$ where $\Vert \cdot \Vert$ indicates the euclidean norm. Given the sequence of vectors $\{ r_i \}_{i=0}^{k}$ where $r_i = A r_{i-1}$ and $r_0 = b - A\xi$, it is shown how to construct a sequence of upper and lower bounds for $\Vert x - \xi \Vert$ using the theory of moments. In addition, consider the Jacobi algorithm for solving the system $x = Mx + b$, viz. $x_{i+1} = M x_i + b$. It is shown that by examining $\delta_i = x_{i+1} - x_i$, it is possible to construct upper and lower bounds for $\Vert x_i - x \Vert$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/141/CS-TR-69-141.pdf %R CS-TR-69-142 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stationary values of the ratio of quadratic forms subject to linear constraints %A Golub, Gene H. %A Underwood, Richard R. %D November 1969 %X Let A be a real symmetric matrix of order n, B a real symmetric positive definite matrix of order n, and C an $n \times p$ matrix of rank r with r $\leq$ p < n. We wish to determine vectors $x$ for which $x^T A x / x^T B x$ is stationary and $C^T x = \Theta$, the null vector. An algorithm is given for generating a symmetric eigensystem whose eigenvalues are the stationary values and for determining the vectors $x$. Several Algol procedures are included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/142/CS-TR-69-142.pdf %R CS-TR-69-144 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The maximum and minimum of a positive definite quadratic polynomial on a sphere are convex functions of the radius %A Forsythe, George E. %D July 1969 %X It is proved that in euclidean n-space the maximum $M(\rho)$ and minimum $m(\rho)$ of a fixed positive definite quadratic polynomial Q on spheres with fixed center are both convex functions of the radius $\rho$ of the sphere. In the proof, which uses elementary calculus and a result of Forsythe and Golub, $m''(\rho)$ and $M''(\rho)$ are shown to exist and lie in the interval $[2\lambda_1, 2\lambda_n]$, where $\lambda_i$ are the eigenvalues of the quadratic form of Q. Hence $m''(\rho) > 0$ and $M''(\rho) > 0$. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/144/CS-TR-69-144.pdf %R CS-TR-69-145 %Z Mon, 27 Nov 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Methods of search for solving polynomial equations %A Henrici, Peter %D December 1969 %X The problem of determining a zero of a given polynomial with guaranteed error bounds, using an amount of work that can be estimated a priori, is attacked here by means of a class of algorithms based on the idea of systematic search.
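(Editorial aside, not from the report: one simple proximity test of the kind discussed follows from the logarithmic derivative $p'/p = \sum_i 1/(z - z_i)$, which gives $\min_i |z - z_i| \leq n |p(z)/p'(z)|$; a sketch with my own naming:)

    def proximity_test(coeffs, z, r):
        # True guarantees some zero of p lies in the disk |w - z| <= r.
        # coeffs lists complex coefficients, highest degree first.
        n = len(coeffs) - 1
        p, dp = 0, 0
        for c in coeffs:            # Horner's rule for p(z) and p'(z)
            dp = dp * z + p
            p = p * z + c
        if p == 0:
            return True
        if dp == 0:
            return False            # inconclusive at a critical point
        return n * abs(p / dp) <= r

    # e.g. proximity_test([1, 0, -2], 1.5, 0.2) is True: a zero of
    # z**2 - 2 (namely 2**0.5) lies within 0.2 of 1.5.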
Lehmer's "machine method" for solving polynomial equations is a special case. The use of the Schur-Cohn algorithm in Lehmer's method is replaced by a more general proximity test which reacts positively if applied at a point close to a zero of a polynomial. Various such tests are described, and the work involved in their use is estimated. The optimality and non-optimality of certain methods, both on a deterministic and on a probabilistic basis, are established. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/69/145/CS-TR-69-145.pdf %R CS-TR-68-83 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Iterative refinements of linear least squares solutions by Householder transformations %A Bjorck, Ake %A Golub, Gene H. %D January 1968 %X An algorithm is presented in ALGOL for iteratively refining the solution to a linear least squares problem with linear constraints. Numerical results presented show that a high degree of accuracy is obtained. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/83/CS-TR-68-83.pdf %R CS-TR-68-84 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A computer system for transformational grammar %A Friedman, Joyce %D January 1968 %X A comprehensive system for transformational grammar has been designed and is being implemented on the IBM 360/67 computer. The system deals with the transformational model of syntax, along the lines of Chomsky's "Aspects of the Theory of Syntax." The major innovations include a full and formal description of the syntax of a transformational grammar, a directed random phrase structure generator, a lexical insertion algorithm, and a simple problem-oriented programming language in which the algorithm for application of transformations can be expressed. In this paper we present the system as a whole, first discussing the philosophy underlying the development of the system, then outlining the system and discussing its more important special features. References are given to papers which consider particular aspects of the system in detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/84/CS-TR-68-84.pdf %R CS-TR-68-85 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computer-aided language development in nonspeaking mentally disturbed children %A Colby, Kenneth Mark %D December 1967 %X Experience with a computer-based method for aiding language development in nonspeaking mentally disturbed children is described. Out of a group of 10 children 8 improved linguistically while 2 were unimproved. Problems connected with the method and its future prospects are briefly discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/85/CS-TR-68-85.pdf %R CS-TR-68-86 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ALGOL W %A Bauer, Henry R. %A Becker, Sheldon I. %A Graham, Susan L. %D January 1968 %X The textbook "Introduction to Algol" by Baumann, Feliciano, Bauer, and Samelson describes the internationally recognized language ALGOL 60 for algorithm communication. ALGOL W can be viewed as an extension of ALGOL. This document consists of (1) "Algol W Notes for Introductory Computer Science Courses" [by Henry R. Bauer, Sheldon Becker, and Susan L. Graham] which describes the differences between ALGOL 60 and ALGOL W and presents the new features of ALGOL W; (2) "Deck Set-Up"; (3) "Algol W Language Description" [by Henry R. Bauer, Sheldon Becker, and Susan L. 
Graham], a complete syntactic and semantic description of the language; (4) "Unit Record Equipment"; and (5) "Error Message." %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/86/CS-TR-68-86.pdf %R CS-TR-68-87 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T CS139 lecture notes. Part I: Sections 1 thru 21. Preliminary version %A Ehrman, John R. %D June 1968 %X These notes are meant to provide an introduction to the IBM System/360 which will help the reader to understand and to make effective use of the capabilities of both the machinery and some of its associated service programs. They are largely self-contained, and in general the reader should need to make only occasional reference to the "System/360 Principles of Operation" manual (IBM File No. S360-01, Form A22-6821) and to the "Operating System/360 Assembler Language" manual (IBM File No. S360-21, Form C28-6514). %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/87/CS-TR-68-87.pdf %R CS-TR-68-88 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Relaxation methods for convex problems %A Schechter, Samuel %D February 1968 %X Extensions and simplifications are made for convergence proofs of relaxation methods for nonlinear systems arising from the minimization of strictly convex functions. This work extends these methods to group relaxation, which includes an extrapolated form of Newton's method, for various orderings. A relatively simple proof is given for cyclic orderings, sometimes referred to as nonlinear overrelaxation, and for residual orderings where an error estimate is given. A less restrictive choice of relaxation parameter is obtained than that given previously. Applications are indicated primarily to the solution of nonlinear elliptic boundary problems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/88/CS-TR-68-88.pdf %R CS-TR-68-89 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ALGOL W (revised) %A Bauer, Henry R. %A Becker, Sheldon I. %A Graham, Susan L. %A Forsythe, George E. %A Satterthwaite, Edwin H. %D March 1968 %X The textbook "Introduction to Algol" by Baumann, Feliciano, Bauer, and Samelson describes the internationally recognized language ALGOL 60 for algorithm communication. ALGOL W can be viewed as an extension of ALGOL. This document consists of (1) "Algol W Deck Set-Up" [by E.H. Satterthwaite, Jr.]; (2) "Algol W Language Description" [by Henry R. Bauer, Sheldon Becker, and Susan L. Graham], a complete syntactic and semantic description of the language; (3) "Algol W Error Messages" [by Henry R. Bauer, Sheldon Becker, and Susan L. Graham]; (4) "Algol W Notes for Introductory Computer Science Courses" [by Henry R. Bauer, Sheldon Becker, and Susan L. Graham] which describes the differences between ALGOL 60 and ALGOL W and presents the new features of ALGOL W; and (5) "Notes on Number Representation on System/360 and relations to Algol W" [by George E. Forsythe]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/89/CS-TR-68-89.pdf %R CS-TR-68-90 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A multi-level computer organization designed to separate data-accessing from the computation %A Lesser, Victor R. %D March 1968 %X The computer organization to be described in this paper has been developed to overcome the inflexibility of computers designed around a few fixed data structures, and only binary operations.
This has been accomplished by separating the data-accessing procedures from the computational algorithm. By this separation, a new and different language may be used to express data-accessing procedures. The new language has been designed to allow the programmer to define the procedures for generating the names of the operands for each computation, and locating the value of an operand given its name. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/90/CS-TR-68-90.pdf %R CS-TR-68-91 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The PL360 system %A Wirth, Niklaus E. %A Wells, Joseph W. %A Satterthwaite, Edwin H. %D April 1968 %X This report describes the use of two operating systems which serve as environments for the PL360 language defined in the companion report [Niklaus Wirth, "A Programming Language for the 360 Computers," Stanford University Computer Science Department report CS 53 (revised), June 1967]. Some additions to that language, not described in CS 53, are documented in the Appendix. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/91/CS-TR-68-91.pdf %R CS-TR-68-98 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ALGOL W implementation %A Bauer, Henry R. %A Becker, Sheldon I. %A Graham, Susan L. %D May 1968 %X In writing a compiler of a new language (ALGOL W) for a new machine (IBM System/360) we were forced to deal with many unforeseen problems in addition to the problems we expected to encounter. This report describes the final version of the compiler. The implemented language ALGOL W is based on the Wirth/Hoare proposal for a successor to ALGOL 60. The major differences from that proposal are in string definition and operations and in complex number representation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/98/CS-TR-68-98.pdf %R CS-TR-68-100 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A computer model of information processing in children %A Bredt, Thomas H. %D June 1968 %X A model of cognitive information processing has been constructed on the basis of a protocol gathered from a child taking an object association test. The basic elements of the model are a graph-like data base and strategy. The data base contains facts that relate objects in the experiment. The graph distance that separates two objects in the data base is the measure of how well a relation is known. The strategy used in searching for facts that relate two objects is sequential in nature. The model has been programmed for computer testing in the LISP programming language. The responses of the computer model and the original subject are compared. To aid in the model evaluation a revised test was defined and administered to two children. The results were modeled and the correspondence of model and subject performance is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/100/CS-TR-68-100.pdf %R CS-TR-68-92 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MLISP %A Enea, Horace J. %D March 1968 %X Mlisp is an Algol-like list processing language based on Lisp 1.5. It is currently implemented on the IBM 360/67 at the Stanford Computation Center, and is being implemented on the DEC PDP-6 at the Stanford Artificial Intelligence Project. The balance of this paper is a very informal presentation of the language so that the reader will be able to run programs in Mlisp with a minimum of effort. 
The language has an extremely simple syntax which is presented in Appendix I. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/92/CS-TR-68-92.pdf %R CS-TR-68-95 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A formal syntax for transformational grammar %A Friedman, Joyce %A Doran, Robert W. %D March 1968 %X A formal definition of the syntax of a transformational grammar is given using a modified Backus Naur Form as the metalanguage. Syntax constraints and interpretation are added in English. The underlying model is that presented by Chomsky in "Aspects of the Theory of Syntax." Definitions are given for the basic concepts of tree, analysis, restriction, complex symbol, and structural change, as well as for the major components of a transformational grammar, phrase structure, lexicon, and transformations. The syntax was developed as a specification of input formats for the computer system for transformational grammar described in [Joyce Friedman, "A Computer System for Transformational Grammar," Stanford University Computer Science Department report CS-84, January 1968]. It includes as a subcase a fairly standard treatment of transformational grammar, but has been generalized in many respects. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/95/CS-TR-68-95.pdf %R CS-TR-68-96 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interval arithmetic determinant evaluation and its use in testing for a Chebyshev system %A Smith, Lyle B. %D April 1968 %X Two recent papers by Hansen and by Hansen and R. R. Smith have shown how interval arithmetic (I.A.) can be used effectively to bound errors in matrix computations. This paper compares a method proposed by Hansen and R. R. Smith to straightforward use of I.A. in determinant evaluation. Computational results show what accuracy and running times can be expected when using I.A. for determinant evaluation. An application using I.A. determinants in a program to test a set of functions to see if they form a Chebyshev system is then presented. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/96/CS-TR-68-96.pdf %R CS-TR-68-113 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T The impact of storage management on plex processing language implementation %A Hansen, Wilfred J. %D July 1969 %X A plex processing system is implemented within a set of environments whose relationships are vital to the system's time/space efficiency: the data environment (stack structures, data structures); the subroutine environment (routine linkage, variable binding); and the storage management environment (memory organization for allocation, storage control). This paper discusses these environments and their relationships in detail. For each environment there is some discussion of alternative implementation techniques, the dependence of the implementation on the hardware, and the dependence of the environment on the language design. In particular, two language features are shown to affect substantially the environment design: variable length plexes and 'release' of active plexes. Storage management is complicated by the requirement for variable length plexes, but they can substantially reduce memory requirements. If inactive plexes are released, a garbage collector can be avoided; but considerable tedious programming may be required to maintain the status of each plex.
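(Editorial aside, not from the paper: the explicit-release discipline just described can be caricatured with per-size free lists; a toy sketch, every name mine:)

    class PlexStore:
        # Toy storage manager in the spirit described: variable-length
        # records ("plexes") carved from one array, with an explicit
        # release feeding per-size free lists in place of a garbage
        # collector.
        def __init__(self, size):
            self.heap = [None] * size
            self.next_free = 0
            self.free_lists = {}            # plex length -> offsets

        def allocate(self, length):
            if self.free_lists.get(length):
                return self.free_lists[length].pop()
            offset = self.next_free
            if offset + length > len(self.heap):
                raise MemoryError("heap exhausted")
            self.next_free = offset + length
            return offset

        def release(self, offset, length):
            # The caller must know the plex is inactive -- the tedious
            # bookkeeping the abstract warns about lives here.
            self.free_lists.setdefault(length, []).append(offset)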
Many plex processing systems store numbers in strange formats and compile arithmetic operations as subroutine calls, thus handicapping the computer on the only operations it does well. Careful coordination of the system environments can permit direct numeric computation, that is, a single instruction for each arithmetic operation. This paper considers, with each environment, the requirements for direct numeric computation. To explore the techniques discussed, a collection of environments called Swym was implemented. This system permits variable length plexes and compact lists. The latter is a list representation requiring less space than chained lists because pointers to the elements are stored in consecutive words. In Swym, a list can be partly compact and partly chained. The garbage collector converts chained lists into compact lists when possible. Swym has careful provision for direct numeric computation, but no compiler has been built. To illustrate Swym, an interpreter was implemented for a small language similar to LISP 1.5. Details of Swym and the language are in a series of appendices. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/113/CS-TR-68-113.pdf %R CS-TR-68-115 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Programmers manual for a computer system for transformational grammar %A Friedman, Joyce %A Bredt, Thomas H. %A Doran, Robert W. %A Martner, Theodore S. %A Pollack, Bary W. %D August 1968 %X This volume provides programming notes on a computer system for transformational grammar. The important ideas of the system have been presented in a series of reports which are listed in Appendix B; this document is the description of the system as a program. It is intended for programmers who might wish to maintain, modify or extend the system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/115/CS-TR-68-115.pdf %R CS-TR-68-102 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Integer programming over a cone %A Pnueli, Amir %D July 1968 %X The properties of a special form integer programming problem are discussed. We restrict ourselves to optimization over a cone (a set of n constraints in n unconstrained variables) with a square matrix of positive diagonal and non-positive off-diagonal elements (called a bounding form by F. Glover [1964]). It is shown that a simple iterative process gives the optimal integer solution in a finite number of steps. It is then shown that any cone problem with bounded rational solution can be transformed to the bounding form and hence solved by the outlined method. Some extensions to more than n constraints are discussed and a numerical example of solving a larger problem is shown. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/102/CS-TR-68-102.pdf %R CS-TR-68-103 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lexical insertion in transformational grammar %A Friedman, Joyce %A Bredt, Thomas H. %D June 1968 %X In this paper, we describe the lexical insertion process for generative transformational grammars. We also give detailed descriptions of many of the concepts in transformational theory. These include the notions of complex symbol, syntactic feature (particularly contextual feature), redundancy rule, tests for pairs of complex symbols, and change operations that may be applied to complex symbols.
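(Editorial aside, not from the paper: a pairwise test of the sort meant here can be pictured as a clash check over shared features; a two-line sketch, with feature names invented:)

    def compatible(cs1: dict, cs2: dict) -> bool:
        # Complex symbols as feature -> value maps; compatible when no
        # feature carried by both is assigned conflicting values.
        return all(cs2[f] == v for f, v in cs1.items() if f in cs2)

    # compatible({"N": True, "Common": True}, {"Common": True})  -> True
    # compatible({"Animate": True}, {"Animate": False})          -> False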
Because of our general interpretation of redundancy rules, we define a new complex symbol test known as compatibility. This test replaces the old notion of nondistinctness. The form of a lexicon suitable for use with a generative grammar is specified. In lexical insertion, vocabulary words and associated complex symbols are selected from a lexicon and inserted at lexical category nodes in the tree. Complex symbols are lists of syntactic features. The compatibility of a pair of complex symbols and the analysis procedure used for contextual features are basic in determining suitable items for insertion. Contextual features (subcategorization and selectional) have much in common with the structural description for a transformation and we use the same analysis procedure for both. A problem encountered in the insertion of a complex symbol that contains selectional features is side effects. We define the notion of side effects and describe how these effects are to be treated. The development of the structure of the lexicon and the lexical insertion algorithm has been aided by a system of computer programs that enable the linguist to study transformational grammar. In the course of this development, a computer program to perform lexical insertion was written. Results obtained using this program with fragments of transformational grammar are presented. The paper concludes with suggestions for extensions of this work and a discussion of interpretations of transformational theory that do not fit immediately into our framework. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/103/CS-TR-68-103.pdf %R CS-TR-68-107 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A three-stage variable-shift iteration for polynomial zeros and its relation to generalized Rayleigh iteration %A Jenkins, M. A. %A Traub, Joseph F. %D August 1968 %X We introduce a new three-stage process for calculating the zeros of a polynomial with complex coefficients. The algorithm is similar in spirit to the two-stage algorithms studied by Traub in a series of papers. The algorithm is restriction free, that is, it converges for any distribution of zeros. A proof of global convergence is given. Zeros are calculated in roughly increasing order of magnitude to avoid deflation instability. Shifting is incorporated in a natural and stable way to break equimodularity and speed convergence. The three stages use no shift, a fixed shift, and a variable shift, respectively. To obtain additional insight we recast the problem and algorithm into matrix form. The third stage is inverse iteration with the companion matrix, followed by generalized Rayleigh iteration. A program implementing the algorithm was written in a dialect of ALGOL 60 and run on Stanford University's IBM 360/67. The program has been extensively tested and testing is continuing. For polynomials with complex coefficients and of degrees ranging from 20 to 50, the time required to calculate all zeros averages $8n^2$ milliseconds. Timing information and a numerical example are provided. A description of the implementation, an analysis of the effects of finite-precision arithmetic, an ALGOL 60 program, the results of extensive testing, and a second program which clusters the zeros and provides a posteriori error bounds will appear elsewhere.
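(Editorial aside, not from the report: the matrix form mentioned in the abstract rests on the companion matrix, whose eigenvalues are exactly the polynomial's zeros; a NumPy sketch under my own naming, and essentially the approach numpy.roots takes today:)

    import numpy as np

    def companion(coeffs):
        # Companion matrix of a polynomial (highest-degree coefficient
        # first, made monic); eigvals(companion(c)) are the zeros of c.
        c = np.asarray(coeffs, dtype=complex)
        c = c / c[0]
        n = len(c) - 1
        C = np.zeros((n, n), dtype=complex)
        C[1:, :-1] = np.eye(n - 1)
        C[:, -1] = -c[:0:-1]
        return C

    # np.linalg.eigvals(companion([1, 0, -2])) -> approximately +-sqrt(2)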
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/107/CS-TR-68-107.pdf %R CS-TR-68-109 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A computer system for writing and testing transformational grammars: final report %A Friedman, Joyce %D September 1968 %X A comprehensive system for transformational grammar has been designed and is being implemented on the IBM 360/67 computer. The system deals with the transformational model of syntax, along the lines of Chomsky's "Aspects of the Theory of Syntax." The major innovations include a full and formal description of the syntax of a transformational grammar, a directed random phrase structure generator, a lexical insertion algorithm, and a simple problem-oriented programming language in which the algorithm for application of transformations can be expressed. In this paper we present the system as a whole, first discussing the philosophy underlying the development of the system, then outlining the system and discussing its more important special features. References are given to papers which consider particular aspects of the system in detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/109/CS-TR-68-109.pdf %R CS-TR-68-111 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Analysis in transformational grammar %A Friedman, Joyce %A Martner, Theodore S. %D August 1968 %X In generating sentences by means of a transformational grammar, it is necessary to analyze trees, testing for the presence or absence of various structures. This analysis occurs at two stages in the generation process -- during insertion of lexical items (more precisely, in testing contextual features), and during the transformation process, when individual transformations are being tested for applicability. In this paper we describe a formal system for the definition of tree structure of sentences. The system consists of a formal language for partial or complete definition of the tree structure of a sentence, plus an algorithm for comparison of such a definition with a tree. It represents a significant generalization of Chomsky's notion of "proper analysis", and is flexible enough to be used within any transformational grammar which we have seen. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/111/CS-TR-68-111.pdf %R CS-TR-68-112 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T A control language for transformational grammar %A Friedman, Joyce %A Pollack, Bary W. %D August 1968 %X Various orders of application of transformations have been considered in transformational grammar, ranging from unordered application to cyclical orders involving notions of "lowest sentence" and of numerical indices on depth of embedding. The general theory of transformational grammar does not yet offer a uniform set of "traffic rules" which are accepted by most linguists. Thus, in designing a model of transformational grammar, it seems advisable to allow the specification of the order and point of application of transformations to be a proper part of the grammar. In this paper we present a simple control language designed to be used by linguists for this specification. In the control language the user has the ability to: 1. Group transformations into ordered sets and apply transformations either individually or by transformation set. 2. Specify the order in which the transformation sets are to be considered. 3.
Specify the subtrees in which a transformation set is to be applied. 4. Allow the order of application to depend on which transformations have previously modified the tree. 5. Apply a transformation set either once or repeatedly. In addition, since the control language has been implemented as part of a computer system, the behavior of the transformations may be monitored giving additional information on their operation. In this paper we present the control language and examples of its use. Discussion of the computer implementation will be found in [Pollack, B.W. The Control Program and Associated Subroutines. Stanford University. Computer Science Department. Computational Linguistics Project. Report no. AF-28. June 1968.]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/112/CS-TR-68-112.pdf %R CS-TR-68-110 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T ALGOL W (revised) %A Bauer, Henry R. %A Becker, Sheldon I. %A Graham, Susan L. %A Floyd, Robert W. %A Forsythe, George E. %A Satterthwaite, Edwin H. %D September 1969 %X "A Contribution to the Development of ALGOL" by Niklaus Wirth and C. A. R. Hoare [Comm. ACM, v.9, no. 6 (June 1966), pp. 413-431] was the basis for a compiler developed for the IBM 360 at Stanford University. This report is a description of the implemented language, ALGOL W. Historical background and the goals of the language may be found in the Wirth and Hoare paper. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/110/CS-TR-68-110.pdf %R CS-TR-68-114 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T Calgen - an interactive picture calculus generation system %A George, James E. %D December 1968 %X A sub-set of the Picture Calculus was implemented on the IBM 360/75 to experiment with the proposed data structure, to study the capability of PL/1 for implementing the Picture Calculus and to evaluate the usefulness of drawing pictures with this formalized language. The system implemented is referred to as Calgen. Like many other drawing programs, Calgen utilizes a graphic display console; however, it differs from previous drawing systems in one major area, namely, Calgen retains structure information. Since the Picture Calculus is highly structured, Calgen retains structure information and saves scope images only where convenient; further, the scope images saved may be altered by changing the structure information. The only reason scope images are saved by Calgen is to avoid regeneration of a previously generated picture. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/114/CS-TR-68-114.pdf %R CS-TR-68-119 %Z Wed, 20 Dec 95 00:00:00 GMT %I Stanford University, Department of Computer Science %T MPL: Mathematical Programming Language %A Bayer, Rudolf %A Bigelow, James H. %A Dantzig, George B. %A Gries, David J. %A McGrath, Michael B. %A Pinsky, Paul D. %A Schuck, Stephen K. %A Witzgall, Christoph %D May 1968 %X The purpose of MPL is to provide a language for writing mathematical programming algorithms that will be easier to write, to read, and to modify than those written in currently available computer languages. It is believed that the writing, testing, and modification of codes for solving large-scale linear programs will be a less formidable undertaking once MPL becomes available. It is hoped that by the Fall of 1968, work on a compiler for MPL will be well underway.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/119/CS-TR-68-119.pdf %R CS-TR-67-54 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A generalized Bairstow algorithm %A Golub, Gene H. %A Robertson, Thomas N. %D January 1967 %X This report discusses convergence and applications for the generalized Bairstow algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/54/CS-TR-67-54.pdf %R CS-TR-67-55 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A stopping criterion for polynomial root finding %A Adams, Duane A. %D February 1967 %X When solving for the roots of a polynomial, it is generally difficult to know just when to terminate the iteration process. In this paper an algorithm is derived and discussed which allows one to terminate the iteration process on the basis of calculated bounds for the roundoff error. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/55/CS-TR-67-55.pdf %R CS-TR-67-56 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T QD-method with Newton shift %A Bauer, Friedrich L. %D March 1967 %X Theoretically, for symmetric matrices, a QR-step is equivalent to two successive LR-steps, and the LR-transformation for a tridiagonal matrix is, apart from organizational details, identical with the qd-method. For non-positive definite matrices, however, the LR-transformation cannot be guaranteed to be numerically stable unless pivotal interchanges are made. This has led to preference for the QR-transformation, which is always numerically stable. If, however, some of the smallest or some of the largest eigenvalues are wanted, then the QR-transformation will not necessarily give only these, and bisection might seem too slow with its fixed convergence rate of 1/2. In this situation, Newton's method would be fine if the Newton correction can be computed sufficiently simply, since it will always tend monotonically to the nearest root starting from a point outside the spectrum. Consequently, if one always worked with positive (or negative) definite matrices, there would be no objection to using the now stable qd-algorithm. The report shows that for a qd-algorithm, the Newton correction can very easily be calculated, and accordingly a shift which avoids under-shooting, or a lower bound. Since the last diagonal element gives an upper bound, the situation is quite satisfactory with respect to bounds. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/56/CS-TR-67-56.pdf %R CS-TR-67-57 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The use of transition matrices in compiling %A Gries, David %D March 1967 %X The construction of efficient parsing algorithms for programming languages has been the subject of many papers in the last few years. Techniques for efficient parsing and algorithms which generate the parser from a grammar or phrase structure system have been derived. Some of the well-known methods are the precedence techniques of Floyd, and Wirth and Weber, and the production language of Feldman. Perhaps the first such discussion was by Samelson and Bauer. There the concept of the push-down stack was introduced, along with the idea of a transition matrix. A transition matrix is just a switching table which lets one determine from the top element of the stack (denoting a row of the table) and the next symbol of the program to be processed (represented by a column of the table) exactly what should be done.
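(Editorial aside, not from the report: a toy transition matrix for the grammar E -> a | ( E ), where each (top of stack, next symbol) entry dictates the shift or reduce action the abstract goes on to name; all code mine:)

    def parse(tokens):
        # Transition "matrix": (stack top, lookahead) -> action.
        table = {
            ("$", "a"): "shift", ("$", "("): "shift",
            ("(", "a"): "shift", ("(", "("): "shift",
            ("E", ")"): "shift", ("E", "$"): "accept",
        }
        stack, toks, i = ["$"], list(tokens) + ["$"], 0
        while True:
            top = stack[-1]
            if top == "a":                  # reduce E -> a
                stack[-1] = "E"
            elif top == ")":                # reduce E -> ( E )
                del stack[-3:]
                stack.append("E")
            else:
                action = table.get((top, toks[i]))
                if action == "accept":
                    return stack == ["$", "E"]
                if action != "shift":
                    return False
                stack.append(toks[i])
                i += 1

    # parse(["(", "a", ")"]) -> True;  parse(["(", "a"]) -> False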
Either a reduction is made in the stack, or the incoming symbol is pushed onto the stack. Considering its efficiency, the transition matrix technique does not seem to have received much attention, probably because it was not sufficiently well-defined. The purpose of this paper is to define the concept more formally, to illustrate that the technique is very efficient, and to describe an algorithm which generates a transition matrix from a suitable grammar. The report also describes other uses of transition matrices besides the usual ones of syntax checking and compiling. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/57/CS-TR-67-57.pdf %R CS-TR-67-59 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Almost diagonal matrices with multiple or close eigenvalues %A Wilkinson, James H. %D April 1967 %X If A = D + E where D is the matrix of diagonal elements of A, then when A has some multiple or very close eigenvalues, E has certain characteristic properties. These properties are considered both for hermitian and non-hermitian A. The properties are important in connexion with several algorithms for diagonalizing matrices by similarity transformations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/59/CS-TR-67-59.pdf %R CS-TR-67-60 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two algorithms based on successive linear interpolation %A Wilkinson, James H. %D April 1967 %X The method of successive linear interpolation has a very satisfactory asymptotic rate of convergence but the behavior in the early steps may lead to divergence. The regula falsi has the advantage of being safe but its asymptotic behavior is unsatisfactory. Two modified algorithms are described here which overcome these weaknesses. Although neither is new, discussions of their main features do not appear to be readily available in the literature. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/60/CS-TR-67-60.pdf %R CS-TR-67-61 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the asymptotic directions of the s-dimensional optimum gradient method %A Forsythe, George E. %D April 1967 %X The optimum s-gradient method for minimizing a positive definite quadratic function f(x) on $E_n$ has long been known to converge for s $\geq$ 1. For these $\underline{s}$ the author studies the directions from which the iterates $x_k$ approach their limit, and extends to s > 1 a theory proved by Akaike for s = 1. It is shown that f($x_k$) can never converge to its minimum value faster than linearly, except in degenerate cases where it attains the minimum in one step. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/61/CS-TR-67-61.pdf %R CS-TR-67-62 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Varying length floating point arithmetic: a necessary tool for the numerical analyst %A Tienari, Martti %D April 1967 %X The traditional floating point arithmetic of scientific computers is biased towards fast and easy production of numerical results without enough provision to enable the programmer to control and solve problems connected with numerical accuracy and cumulative round-off errors. The author suggests varying length floating point arithmetic as a general purpose solution for most of these problems. Some general philosophies are outlined for applications of this feature in numerical analysis.
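The general idea behind such varying-length arithmetic, a working precision the programmer can raise at will to expose and control round-off, can be imitated today with Python's decimal module. The sketch below is only an analogy, not Tienari's proposal, and the cancellation-prone series example is invented.

    from decimal import Decimal, getcontext

    def exp_series(x, terms=200):
        # plain Taylor series for e**x, summed in the current working precision
        s = term = Decimal(1)
        for n in range(1, terms):
            term *= x / n
            s += term
        return s

    # Summing the series at x = -20 is badly cancellation-prone; re-running it
    # at ever higher precision shows how much of a short-precision result was
    # pure round-off (the true value is about 2.061E-9).
    for prec in (16, 32, 64):
        getcontext().prec = prec
        print(prec, +exp_series(Decimal(-20)))   # unary + rounds to `prec`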
The idea is analyzed further, discussing hardware and software implementations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/62/CS-TR-67-62.pdf %R CS-TR-67-63 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Graeffe's method for eigenvalues %A Polya, George %D April 1967 %X Let an entire function F(z) of finite genus have infinitely many zeros which are all positive, and take real values for real z. Then it is shown how to give two-sided bounds for all the zeros of F in terms of the coefficients of the power series of F, and of coefficients obtained by Graeffe's algorithm applied to F. A simple numerical illustration is given for a Bessel function. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/63/CS-TR-67-63.pdf %R CS-TR-67-64 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Floating-point number representations: base choice versus exponent range %A Richman, Paul L. %D April 1967 %X A digital computer whose memory words are composed of r-state devices is considered. The choice of the base, $\beta$, for the internal floating-point numbers on such a computer is discussed. Larger values of $\beta$ necessitate the use of more r-state devices for the mantissa, in order to preserve some "minimum accuracy," leaving fewer r-state devices for the exponent of $\beta$. As $\beta$ increases, the exponent range may increase for a short period, but it must ultimately decrease to zero. Of course, this behavior depends on what definition of accuracy is used. This behavior is analyzed for a recently proposed definition of accuracy which specifies when it is to be said that the set of q-digit base $\beta$ floating-point numbers is accurate to p-digits base t. The only case of practical importance today is t=10 and r=2; and in this case we find that $\beta$ = 2 is always best. However, the analysis is done to cover all cases. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/64/CS-TR-67-64.pdf %R CS-TR-67-65 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On certain basic concepts of programming languages %A Wirth, Niklaus %D May 1967 %X Recent developments of programming languages have led to the emergence of languages whose growth showed cancerous symptoms: the proliferation of new elements defied every control exercised by the designers, and the nature of the new cells often proved to be incompatible with the existing body. In order that a language be free from such symptoms, it is necessary that it be built upon basic concepts which are sound and mutually independent. The rules governing the language must be simple, generally applicable and consistent. In order that simplicity and consistency can be achieved, the fundamental concepts of a language must be well-chosen and defined with utmost clarity. In practice, it turns out that there exists an optimum in the number of basic concepts, below which not only implementability of these concepts on actual computers, but also their appeal to human intuition becomes questionable because of their high degree of generalization. These informal notes do not abound with ready-made solutions, but it is hoped they shed some light on several related subjects and inherent difficulties. They are intended to summarize and interrelate various ideas which are partly present in existing languages, partly debated within the IFIP Working Group 2.1, and partly new.
While emphasis is put on clarification of conceptual issues, consideration of notation cannot be ignored. However, no formal or concise definitions of notation (syntax) will be given or used; the concepts will instead be illustrated by examples, using notation based on Algol as far as possible. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/65/CS-TR-67-65.pdf %R CS-TR-67-67 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computational considerations regarding the calculation of Chebyshev solutions for overdetermined linear equation systems by the exchange method %A Bartels, Richard H. %A Golub, Gene H. %D June 1967 %X An implementation, using Gaussian LU decomposition with row interchanges, of Stiefel's exchange algorithm for determining a Chebyshev solution to an overdetermined system of linear equations is presented. The implementation is computationally more stable than those usually given in the literature. A generalization of Stiefel's algorithm is developed which permits the occasional exchange of two equations simultaneously. Finally, some experimental comparisons are offered. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/67/CS-TR-67-67.pdf %R CS-TR-67-69 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Translator writing systems %A Feldman, Jerome A. %A Gries, David %D June 1967 %X Compiler writing has long been a glamour field within programming and has a well developed folklore. More recently, the attention of researchers has been directed toward various schemes for automating different parts of the compiler writer's task. This paper contains neither a history of nor an introduction to these developments; the references at the end of this section provide what introductory material there is in the literature. Although we will make comparisons between individual systems and between various techniques, this is certainly not a consumer's guide to translator writing systems. Our intended purpose is to carefully consider the existing work in an attempt to form a unified scientific basis for future research. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/69/CS-TR-67-69.pdf %R CS-TR-67-70 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On computation of flow patterns of compressible fluids in the transonic region %A Bergman, Stefan %A Herriot, John G. %A Richman, Paul L. %D July 1967 %X The first task in devising a numerical procedure for solving a given problem is that of finding a constructive mathematical solution to the problem. But even after such a solution is found there is much to be done. Mathematical solutions normally involve infinite processes such as integration and differentiation as well as infinitely precise arithmetic and functions defined in arbitrarily involved ways. Numerical procedures suitable for a computer can involve only finite processes, fixed or at least bounded length arithmetic and rational functions. Thus one must find efficient methods which yield approximate solutions. Of interest here are the initial and boundary value problems for compressible fluid flow. Constructive solutions to these problems can be found in [Bergman, S., "On representation of stream functions of subsonic and supersonic flows of compressible fluids," Journal of Rational Mechanics and Analysis, v.4 (1955), no. 6, pp. 883-905].
As presented there, solution of the boundary value problem is limited to the subsonic region, and is given symbolically as a linear combination of orthogonal functions. A numerical continuation of this (subsonic) solution into the supersonic region can be done by using the (subsonic) solution and its derivative to set up an initial value problem. The solution to the initial value problem may then be valid in (some part of) the supersonic region. Whether this continuation will lead to a closed, meaningful flow is an open question. In this paper, we deal with the numerical solution of the initial value problem. We are currently working on the rest of the procedure described above. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/70/CS-TR-67-70.pdf %R CS-TR-67-75 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Theory of norms %A Bauer, Friedrich L. %D August 1967 %X These notes are based on lectures given during the winter of 1967 as CS 233, Computer Science Department, Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/75/CS-TR-67-75.pdf %R CS-TR-67-68 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The PL360 system %A Wirth, Niklaus %D June 1967 %X This report describes the use and the organization of the operating system which serves as the environment of the PL360 language defined in the companion report, CS 53 [Niklaus Wirth, "A Programming Language for the 360 Computers," Stanford University Department of Computer Science, June 1967]. Edited by Niklaus Wirth. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/68/CS-TR-67-68.pdf %R CS-TR-67-72 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Chebyshev approximation of continuous functions by a Chebyshev system of functions %A Golub, Gene H. %A Smith, Lyle B. %D July 1967 %X The second algorithm of Remez can be used to compute the minimax approximation to a function, f(x), by a linear combination of functions, ${\{Q_i(x)\}}_{0}^{N}$, which form a Chebyshev system. The only restriction on the function to be approximated is that it be continuous on a finite interval [a,b]. An Algol 60 procedure is given which will accomplish the approximation. This implementation of the second algorithm of Remez is quite general in that the continuity of f(x) is all that is required whereas previous implementations have required differentiability, that the end points of the interval be "critical points," and that the number of "critical points" be exactly N+2. Discussion of the method used and its numerical properties is given as well as some computational examples of the use of the algorithm. The use of orthogonal polynomials (which change at each iteration) as the Chebyshev system is also discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/72/CS-TR-67-72.pdf %R CS-TR-67-76 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Collectively compact operator approximations. %A Anselone, Phillip M. %D September 1967 %X This report consists of notes based on lectures presented July-August 1967. The notes were prepared by Lyle Smith. A general approximation theory for linear and nonlinear operators on Banach spaces is presented. It is applied to numerical integration approximations of integral operators. Convergence of the operator approximations is pointwise rather than uniform on bounded sets, which is assumed in other theories.
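For a concrete instance of a numerical-integration approximation to an integral operator, the standard Nystrom construction replaces (Ku)(x) = integral of k(x,t)u(t)dt by a quadrature sum and solves the resulting finite linear system. The Python sketch below uses an invented kernel and right-hand side and is not taken from these notes.

    import numpy as np

    # Nystrom-type discretization of u(x) - (Ku)(x) = f(x) on [0,1]:
    # the integral operator is replaced by a trapezoidal quadrature sum.
    n = 64
    t = np.linspace(0.0, 1.0, n)
    w = np.full(n, 1.0 / (n - 1))
    w[0] = w[-1] = 0.5 / (n - 1)                   # trapezoidal weights

    k = lambda x, s: 0.5 * np.exp(-np.abs(x - s))  # invented smooth kernel
    f = lambda x: np.sin(np.pi * x)                # invented right-hand side

    A = np.eye(n) - k(t[:, None], t[None, :]) * w[None, :]
    u = np.linalg.solve(A, f(t))                   # solution values at the nodes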
The operator perturbations form a collectively compact set, i.e., they map each bounded set into a single compact set. In the nonlinear case, Frechet differentiability conditions are also imposed. Principal results include convergence and error bounds for approximate solutions and, for linear operators, results on spectral approximations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/76/CS-TR-67-76.pdf %R CS-TR-67-77 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T What to do till the computer scientist comes %A Forsythe, George E. %D September 1967 %X The potential impact of computer science departments in the field of education is discussed. This is an expanded version of a presentation to a panel session before the Mathematical Association of America, Toronto, 30 August 1967. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/77/CS-TR-67-77.pdf %R CS-TR-67-78 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Machine utilization of the natural language word 'good' %A Colby, Kenneth Mark %A Enea, Horace J. %D September 1967 %X Using the term 'good' as an example, the effect of natural language input on an interviewing computer program is described. The program utilizes syntactic and semantic information to generate relevant plausible inferences from which statements for a goal-directed man-machine dialogue can be constructed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/78/CS-TR-67-78.pdf %R CS-TR-67-79 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T 360 O.S. FORTRAN IV free field input/output subroutine package %A Doran, Robert W. %D October 1967 %X Programmers dealing with aspects of natural language processing have a difficult task in choosing a computer language which enables them to program easily, produce efficient code and accept as data freely written sentences with words of arbitrary length. List processing languages such as LISP are reasonably easy to program in but do not execute very quickly. Other, formula oriented, languages like FORTRAN are not provided with free field input. The Computational Linguistics group at the Stanford University Computer Science Department is writing a system for testing transformational grammars. As these grammars are generally large and complicated, it is important to make the system as efficient as possible, so we are using FORTRAN IV (O.S. on IBM 360-65) as our language. To enable us to handle free field input we have developed a subroutine package which we describe here in the hope that it will be useful to others embarking on natural language tasks. The package consists of two main programs, a free field reader and a free field writer, together with a number of utility routines and constant COMMON blocks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/79/CS-TR-67-79.pdf %R CS-TR-67-80 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Directed random generation of sentences %A Friedman, Joyce %D October 1967 %X The problem of producing sentences of a transformational grammar by using a random generator to create phrase structure trees for input to the lexical insertion and transformational phases is discussed. A purely random generator will produce base trees which will be blocked by the transformations, and which are frequently too long to be of practical interest.
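The difficulty is easy to reproduce: unconstrained random expansion of recursive rules has no bound on its output, so some restriction on the derivation is needed. The following is a hypothetical Python sketch; the toy grammar and the crude depth cutoff are mine for illustration, not the report's mechanism of restricted subtrees.

    import random

    # Invented toy phrase-structure grammar; NP and VP are recursive, so a
    # purely random derivation can grow without bound.
    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["det", "noun"], ["NP", "PP"]],
        "VP": [["verb", "NP"], ["VP", "PP"]],
        "PP": [["prep", "NP"]],
    }

    def generate(symbol, depth=0, max_depth=6):
        if symbol not in GRAMMAR:
            return [symbol]                   # terminal symbol
        rules = GRAMMAR[symbol]
        if depth >= max_depth:
            rules = rules[:1]                 # past the cap, force the
        out = []                              # non-recursive first alternative
        for s in random.choice(rules):
            out.extend(generate(s, depth + 1, max_depth))
        return out

    print(" ".join(generate("S")))

Friedman's restricted subtrees play the role the crude max_depth plays here, but in addition steer the derivation toward trees the transformational phase will accept.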
A solution is offered in the form of a computer program which allows the user to constrain and direct the generation by the simple but powerful device of restricted subtrees. The program is a directed random generator which accepts as input a subtree with restrictions and produces around it a tree which satisfies the restrictions and is ready for the next phase of the grammar. The underlying linguistic model is that of Noam Chomsky, as presented in "Aspects of the Theory of Syntax." The program is written in Fortran IV for the IBM 360/67 and is part of the Stanford Transformational Grammar Testing System. It is currently being used with several partial grammars of English. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/80/CS-TR-67-80.pdf %R CS-TR-67-81 %Z Wed, 03 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Calculation of Gauss quadrature rules %A Golub, Gene H. %A Welsch, John H. %D November 1967 %X Most numerical integration techniques consist of approximating the integrand by a polynomial in a region or regions and then integrating the polynomial exactly. Often a complicated integrand can be factored into a non-negative 'weight' function and another function better approximated by a polynomial, thus $\int_{a}^{b} g(t)dt = \int_{a}^{b} \omega (t)f(t)dt \approx \sum_{i=1}^{N} w_i f(t_i)$. Hopefully, the quadrature rule ${\{w_j, t_j\}}_{j=1}^{N}$ corresponding to the weight function $\omega$(t) is available in tabulated form, but more likely it is not. We present here two algorithms for generating the Gaussian quadrature rule defined by the weight function when: a) the three term recurrence relation is known for the orthogonal polynomials generated by $\omega$(t), and b) the moments of the weight function are known or can be calculated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/67/81/CS-TR-67-81.pdf %R CS-TR-66-34 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Eigenvectors of a real matrix by inverse iteration %A Varah, James M. %D February 1966 %X This report contains the description and listing of an ALGOL 60 program which calculates the eigenvectors of an arbitrary real matrix, using the technique of inverse iteration. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/34/CS-TR-66-34.pdf %R CS-TR-66-37 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T COGENT 1.2 operations manual %A Reynolds, John C. %D April 1966 %X This document is an addendum to the COGENT Programming Manual (Argonne National Laboratory, ANL-7022, March 1965, hereafter referred to as CPM) which describes a specific implementation of the COGENT system, COGENT 1.2, written for the Control Data 3600 Computer. Chapters I and II describe a variety of features available in COGENT 1.2 which are not mentioned in CPM; these chapters parallel the material in Chapters II and III of CPM. Chapter III of this report gives various operational details concerning the assembly and loading of both COGENT-compiled programs and the compiler itself. Chapter IV describes system and error messages. Familiarity with the contents of CPM is assumed throughout this report. In addition, a knowledge of the 3600 operating system SCOPE, and the assembler COMPASS is assumed in Chapter III. 
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/37/CS-TR-66-37.pdf %R CS-TR-66-39 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A university's educational program in computer science %A Forsythe, George E. %D May 1966 %X After a review of the power of contemporary computers, computer science is defined in several ways. The objectives of computer science education are stated, and it is asserted that in a U.S. university these will be achieved only through a computer science department. The program at Stanford University is reviewed as an example. The appendix includes syllabi of Ph.D. qualifying examinations for Stanford's Computer Science Department. This is a revision of a previous Stanford Computer Science Department report, CS 26. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/39/CS-TR-66-39.pdf %R CS-TR-66-40 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T How do you solve a quadratic equation? %A Forsythe, George E. %D June 1966 %X The nature of the floating-point number system of digital computers is explained to a reader whose university mathematical background is very limited. The possibly large errors in using mathematical algorithms blindly with floating-point computation are illustrated by the formula for solving a quadratic equation. An accurate way of solving a quadratic is outlined. A few general remarks are made about computational mathematics, including the backwards analysis of rounding error. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/40/CS-TR-66-40.pdf %R CS-TR-66-41 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Accurate eigenvalues of a symmetric tri-diagonal matrix %A Kahan, William %D July 1966 %X Having established tight bounds for the quotient of two different lub-norms of the same tri-diagonal matrix J, the author observes that these bounds could be of use in an error-analysis provided a suitable algorithm were found. Such an algorithm is exhibited, and its errors are thoroughly accounted for, including the effects of scaling, over/underflow and roundoff. A typical result is that, on a computer using rounded floating point binary arithmetic, the biggest eigenvalue of J can be computed easily to within 2.5 units in its last place, and the smaller eigenvalues will suffer absolute errors which are no larger. These results are somewhat stronger than had been known before. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/41/CS-TR-66-41.pdf %R CS-TR-66-42 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T When to neglect off-diagonal elements of symmetric tri-diagonal matrices %A Kahan, William %D July 1966 %X Given a tolerance $\epsilon$ > 0, we seek a criterion by which an off-diagonal element of the symmetric tri-diagonal matrix J may be deleted without changing any eigenvalue of J by more than $\epsilon$. The criterion obtained here permits the deletion of elements of order $\sqrt{\epsilon }$ under favorable circumstances, without requiring any prior knowledge about the separation between the eigenvalues of J. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/42/CS-TR-66-42.pdf %R CS-TR-66-43 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Two working algorithms for the eigenvalues of a symmetric tridiagonal matrix %A Kahan, William %A Varah, James M. 
%D August 1966 %X Two tested programs are supplied to find the eigenvalues of a symmetric tridiagonal matrix. One program uses a square-root-free version of the QR algorithm. The other uses a compact kind of Sturm sequence algorithm. These programs are faster and more accurate than the other comparable programs published previously with which they have been compared. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/43/CS-TR-66-43.pdf %R CS-TR-66-44 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Relaxation methods for an eigenproblem %A Kahan, William %D August 1966 %X A theory is developed to account for the convergence properties of certain relaxation iterations which have been widely used to solve the eigenproblem $(A - \lambda B) \underline{x} = 0$, $\underline{x} \neq 0$, with large symmetric matrices A and B and positive definite B. These iterations always converge, and almost always converge to the right answer. Asymptotically, the theory is essentially that of the relaxation iteration applied to a semi-definite linear system discussed in the author's previous report [Stanford University Computer Science Department report CS45, 1966]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/44/CS-TR-66-44.pdf %R CS-TR-66-45 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Relaxation methods for semi-definite systems %A Kahan, William %D August 1966 %X Certain non-stationary relaxation iterations, which are commonly applied to positive definite symmetric systems of linear equations, are also applicable to a semi-definite system provided that system is consistent. Some of the convergence theory of the former application is herein extended to the latter application. The effects of rounding errors and of inconsistency are discussed too, but with few helpful conclusions. Finally, the application of these relaxation iterations to an indefinite system is shown here to be ill-advised because these iterations will almost certainly diverge exponentially. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/45/CS-TR-66-45.pdf %R CS-TR-66-47 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T An interpreter for "Iverson notation" %A Abrams, Philip S. %D August 1966 %X Kenneth E. Iverson's book, "A Programming Language" [New York: Wiley, 1962], presented a highly elegant language for the description and analysis of algorithms. Although not widely acclaimed at first, "Iverson notation" (referred to as "the language" in this report) is coming to be recognized as an important tool by computer scientists and programmers. The current report contains an up-to-date definition of a subset of the language, based on recent work by Iverson and his colleagues. Chapter III describes an interpreter for the language, written jointly by the author and Lawrence M. Breed of IBM. The remainder of the paper consists of critiques of the implementation and the language, with suggestions for improvement. This report was originally submitted in fulfillment of a Computer Science 239 project supervised by Professor Niklaus Wirth, Stanford University, May 30, 1966. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/47/CS-TR-66-47.pdf %R CS-TR-66-52 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Lecture notes on a course in systems programming %A Shaw, Alan C.
%D December 1966 %X These notes are based on the lectures of Professor Niklaus Wirth which were given during the winter and spring of 1965/66 as CS 236a and part of CS 236b, Computer Science Department, Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/52/CS-TR-66-52.pdf %R CS-TR-66-53 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming language for the 360 computers %A Wirth, Niklaus %D December 1966 %X A programming language for the IBM 360 computers and its implementation are described. The language, called PL360, provides the facilities of a symbolic machine language, but displays a structure defined by a recursive syntax. The compiler, consisting of a precedence syntax analyser and a set of interpretation rules with strict one-to-one correspondence to the set of syntactic rules, directly reflects the definition of the language: the k-th syntax rule $S_0 ::= S_1 S_2 \ldots S_n$ is paired with the k-th interpretation rule $V_0 := f_k (V_1, V_2, \ldots, V_n)$. PL360 was designed to improve the readability of programs which must take into account specific characteristics and limitations of a particular computer. It represents an attempt to further the state of the art of programming by encouraging and even forcing the programmer to improve his style of exposition and his principles and discipline in program organization, and not by merely providing a multitude of "new" features and facilities. The language is therefore particularly well suited for tutorial purposes. The attempt to present a computer as a systematically organized entity is also hoped to be of interest to designers of future computers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/66/53/CS-TR-66-53.pdf %R CS-TR-65-16 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Maximizing a second-degree polynomial on the unit sphere %A Forsythe, George E. %A Golub, Gene H. %D February 1965 %X Let A be a hermitian matrix of order n, and b a known vector in $C^n$. The problem is to determine which vectors make $\Phi (x) = {(x-b)}^H A(x-b)$ a maximum or minimum on the unit sphere U = {x : $x^H$x = 1}. The problem is reduced to the determination of a finite point set, the spectrum of (A,b). The theory reduces to the usual theory of hermitian forms when b = 0. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/16/CS-TR-65-16.pdf %R CS-TR-65-17 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automatic grading programs %A Forsythe, George E. %A Wirth, Niklaus %D February 1965 %X Two ALGOL grader programs are presented for the computer evaluation of student ALGOL programs. One is for a beginner's program; it furnishes random data and checks answers. The other provides a searching test of the reliability and efficiency of a rootfinding procedure. There is a statement of the essential properties of a computer system, in order that grader programs can be effectively used. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/17/CS-TR-65-17.pdf %R CS-TR-65-18 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The difference correction method for non-linear two-point boundary value problems %A Pereyra, Victor %D February 1965 %X The numerical solution of non-linear two-point boundary value problems is discussed.
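To fix ideas, the kind of basic finite-difference approximation whose order a difference correction would then raise can be sketched as follows: standard second-order central differences on a uniform mesh, applied to an invented linear test problem with exact solution sin(pi x).

    import numpy as np

    # Central-difference approximation to u''(x) = g(x), u(0) = u(1) = 0;
    # solving the tridiagonal system gives an O(h^2) approximation, the
    # starting point that a difference correction improves.
    n = 50
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)

    A = (np.diag(-2.0 * np.ones(n - 1))
         + np.diag(np.ones(n - 2), 1)
         + np.diag(np.ones(n - 2), -1)) / h**2
    g = -np.pi**2 * np.sin(np.pi * x[1:-1])    # invented test problem

    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(A, g)
    print(abs(u - np.sin(np.pi * x)).max())    # error is O(h^2)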
It is shown that for a certain class of finite difference approximations the a posteriori use of a difference correction raises the order of the approximation by at least two. The difference correction itself involves only the solution of one system of linear equations. If Newton's method is used in the early stage, then it is shown that the matrices in both processes are identical, which is a useful feature in coding the method for an automatic computer. Several numerical examples are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/18/CS-TR-65-18.pdf %R CS-TR-65-23 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Convex polynomial approximation %A Rudin, Bernard D. %D June 1965 %X Let f(t) be a continuous function on [0,1], or let it be discretely defined on a finite point set in [0,1]. The problem is the following: among all polynomials p(t) of degree n or less which are convex on [0,1], find one which minimizes the functional $\|p(t)-f(t)\|$, where $\|\cdot\|$ is a suitably defined norm (in particular, the $L^p$, ${\ell}^p$, and Chebyshev norms). The problem is treated by showing it to be a particular case of a more general problem: let f be an element of a real normed linear space V; let $x_{1}(z),...,x_{k}(z)$ be continuous functions from a subset S of the Euclidean space $E^n$ into V such that for each $z_o$ in S the set {$x_{1}(z_{o}),...,x_{k}(z_{o})$} is linearly independent in V; let $(y_{1},...,y_{k})$ denote an element of the Euclidean space $E^k$ and let H be a subset of $E^k$; then among all (y,z) in H $\times$ S, find one which minimizes the functional $\|y_{1}x_{1}(z)+ ... +y_{k}x_{k}(z) - f\|$. It is shown that solutions to this problem exist when H is closed and S is compact. Conditions for uniqueness and location of solutions on the boundary of H $\times$ S are also given. Each polynomial of degree n + 2 or less which is convex on [0,1] is shown to be uniquely representable in the form $y_{0}+y_{1}t+y_{2}\int\int p(z,t)dt^2$, where p(z,t) is a certain representation of the polynomials positive on [0,1], $y_{2} \geq 0$, and z is constrained to lie in a certain convex hyperpolyhedron. With this representation, the convex polynomial approximation problem can be treated by the theory mentioned above. It is reduced to a problem of minimizing a functional subject to linear constraints. Computation of best least squares convex polynomial approximation is illustrated in the continuous and discrete cases. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/23/CS-TR-65-23.pdf %R CS-TR-65-25 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Yield-point load determination by nonlinear programming %A Hodge, Philip G. Jr. %D June 1965 %X The determination of the yield-point load of a perfectly plastic structure can be formulated as a nonlinear programming problem by means of the theorems of limit analysis. This formulation is discussed in general terms and then applied to the problem of a curved beam. Recent results in the theory of nonlinear programming are called upon to solve typical problems for straight and curved beams. The theory of limit analysis enables intermediate answers to be given a physical interpretation in terms of upper and lower bounds on the yield-point load. The paper closes with some indication of how the method may be generalized to more complex problems of plastic yield-point load determination.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/25/CS-TR-65-25.pdf %R CS-TR-65-26 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Stanford University's Program in Computer Science %A Forsythe, George E. %D June 1965 %X This report discusses the nature and objectives of Stanford University's Program in Computer Science. Listings of course offerings and syllabi for Ph.D. examinations are given in appendices. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/26/CS-TR-65-26.pdf %R CS-TR-65-28 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Matrix theorems for partial differential and difference equations %A Miller, John J. H. %A Strang, Gilbert %D July 1965 %X We extend the work of Kreiss and Morton to prove: for some constant K(m), where m is the order of the matrix A, $|A^{n}v| \leq C(v)$ for all $n \geq 0$ and $|v| = 1$ implies that $|{SAS}^{-1}| \leq 1$ for some S with $|S^{-1}| \leq 1$ and $|Sv| \leq K(m)C(v)$. We establish the analogue for exponentials $e^{Pt}$, and use it to construct the minimal Hilbert norm dominating $L_2$ in which a given partial differential equation with constant coefficients is well-posed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/28/CS-TR-65-28.pdf %R CS-TR-65-29 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On improving an approximate solution of a functional equation by deferred corrections %A Pereyra, Victor %D August 1965 %X The improvement of discretization algorithms for the approximate solution of nonlinear functional equations is considered. Extensions to the method of difference corrections by Fox are discussed and some general results are proved. Applications to nonlinear boundary problems and numerical examples are given in some detail. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/29/CS-TR-65-29.pdf %R CS-TR-65-31 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the approximation of weak solutions of linear parabolic equations by a class of multistep difference methods %A Raviart, Pierre Arnaud %D December 1965 %X We consider evolution equations of the form (1) du(t)/dt + A(t)u(t) = f(t), $0 \leq t \leq T$, f given, with the initial condition (2) u(0) = $u_0$, $u_0$ given, where each A(t) is an unbounded linear operator in a Hilbert space H, which is in practice an elliptic partial differential operator subject to appropriate boundary conditions. Let $V_h$ be a Hilbert space which depends on the parameter h. Let k be the time-step such that m = $\frac{T}{k}$ is an integer. We approximate the solution u of (1), (2) by the solution $u_{h,k}$ ($u_{h,k}$ = {$u_{h,k}(rk) \in V_{h}$, r = 0,1,...,m-1}) of the multistep difference scheme (3) $\frac{u_{h,k}(rk) - u_{h,k}((r-1)k)}{k} + \sum_{{\ell}=0}^{p} {\gamma}_{\ell} A_{h}((r-{\ell})k) u_{h,k}((r-{\ell})k) = \sum_{{\ell}=0}^{p} {\gamma}_{\ell} f_{h,k}((r-{\ell})k), r = p,...,m-1$ (4) $u_{h,k}(0),...,u_{h,k}((p-1)k)$ given, where each $A_{h}(rk)$ is a linear continuous operator from $V_h$ into $V_h$, $f_{h,k}(rk)$ (r = 0,1,...,m-1) are given, and ${\gamma}_{\ell}$ (${\ell}=0,...,p$) are given complex numbers. Our paper is mainly concerned with the study of the stability of the approximation. The methods used here are very closely related to those developed in the author's thesis and we shall refer to the thesis frequently. In Sections 1 and 2, we define the continuous and approximate problems in precise terms.
In Section 4, we find sufficient conditions for $u_{h,k}$ to satisfy some a priori estimates. The definition of the stability is given in Section 5 and we use the a priori estimates for proving a general stability theorem. In Section 6 we prove that the stability conditions may be weakened when A(t) is a self-adjoint operator (or when only the principal part of A(t) is self-adjoint). We give in Section 7 a weak convergence theorem. Section 8 is concerned with regularity properties. We apply our abstract analysis to a class of parabolic partial differential equations with variable coefficients in Section 9. Strong convergence theorems can be obtained as in the author's thesis (via compactness arguments) or as in the thesis of J.P. Aubin. We do not study here the discretization error (see author's thesis). For the study of the stability of multistep difference methods in the case of the Cauchy problem for parabolic differential operators, we refer to Kreiss [1959] and Widlund [1965]. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/31/CS-TR-65-31.pdf %R CS-TR-65-32 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Minimum multiplication Fourier analysis %A Hockney, Roger W. %D December 1965 %X Fourier analysis and synthesis is a frequently used tool in applied mathematics but is found to be a time-consuming process to apply on a digital computer, and this fact may prevent the practical application of the technique. This paper describes an algorithm which uses the symmetries of the sine and cosine functions to reduce the number of arithmetic operations by a factor between 10 and 30. The algorithm is applicable to a finite Fourier (or harmonic) analysis on $12 \times 2^q$ values, where q is any integer $\geq$ 0, and to a variety of end conditions. A complete and tested B5000 Algol program known as FOURIER12 is included. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/32/CS-TR-65-32.pdf %R CS-TR-65-33 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A programming language for the 360 computers %A Wirth, Niklaus %D December 1965 %X This paper is a preliminary definition of a programming language which is specifically designed for use on IBM 360 computers, and is therefore appropriately called PL360. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/33/CS-TR-65-33.pdf %R CS-TR-65-20 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T EULER: a generalization of ALGOL, and its formal definition %A Wirth, Niklaus %A Weber, Helmut %D April 1965 %X A method for defining programming languages is developed which introduces a rigorous relationship between structure and meaning. The structure of a language is defined by a phrase structure syntax, the meaning in terms of the effects which the execution of a sequence of interpretation rules exerts upon a fixed set of variables, called the Environment. There exists a one-to-one correspondence between syntactic rules and interpretation rules, and the sequence of executed interpretation rules is determined by the sequence of corresponding syntactic reductions which constitute a parse. The individual interpretation rules are explained in terms of an elementary and obvious algorithmic notation. A constructive method for evaluating a text is provided, and for certain decidable classes of languages their unambiguity is proven.
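The one-to-one pairing of syntactic rules with interpretation rules, the same organization described for PL360 above, has a compact modern rendering: each reduction by rule k applies a semantic function f_k to the values of the right-hand-side symbols. The following is a hypothetical Python sketch with an invented three-rule grammar, not the EULER definition itself.

    # Each production is paired with one interpretation rule f_k that maps
    # the values V_1..V_n of the right-hand side to the value V_0 of the left.
    RULES = {
        "num": lambda v: int(v[0]),        # factor ::= number
        "mul": lambda v: v[0] * v[2],      # term   ::= term '*' factor
        "add": lambda v: v[0] + v[2],      # expr   ::= expr '+' term
    }

    def evaluate(node):
        # node is either a token string or (rule_name, children)
        if isinstance(node, str):
            return node
        rule, children = node
        return RULES[rule]([evaluate(c) for c in children])

    # parse tree for "2 + 3 * 4", in the order the reductions would build it
    tree = ("add", [("num", ["2"]), "+",
                    ("mul", [("num", ["3"]), "*", ("num", ["4"])])])
    print(evaluate(tree))                  # 14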
As an example, a generalization of ALGOL is described in full detail to demonstrate that concepts like block-structure, procedures, parameters, etc. can be defined adequately and precisely by this method. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/20/CS-TR-65-20.pdf %R CS-TR-65-21 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Vectorcardiographic analysis by digital computer, selected results %A Fisher, Donald D. %A Groeben, Jobst von der %A Toole, J. Gerald %D May 1965 %X Instrumentation, recording devices and digital computers may now be combined to obtain detailed statistical measures of physiological phenomena. Computers make it possible to study several models of a system in depth as well as breadth. This report is concerned with methods employed in a detailed statistical study of some 600 vectorcardiograms from different "normal" individuals which were recorded on analog magnetic tape using two different orthogonal lead systems (Helm, Frank) giving a total of 1200 cardiograms. A "normal" individual is defined as one in whom no abnormal heart condition was detected by either medical history or physical examination. One heartbeat in a train of 15 or 20 was selected for digitization. An average of 1.2 seconds' worth of data was digitized from each of the three vector leads simultaneously at a rate of 1000 samples per second for each lead, giving a total of over $4 \times 10^6$ values. Statistical models by sex and lead system of the P wave and QRS complex (at 1 millisecond intervals) and T wave (normalized to 60 points in time) were obtained for 43 age groups from age 19 to 61 in rectangular coordinates, polar coordinates and ellipsoidal fit (F-test) coordinates. Several programs were written to perform the analyses on an IBM 7090. Two of the programs used 300000+ words of disk storage to collect the necessary statistics. Various aspects of the study are presented in this report. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/65/21/CS-TR-65-21.pdf %R CS-TR-64-6 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A fast direct solution of Poisson's equation using Fourier analysis %A Hockney, Roger W. %D April 1964 %X The demand for rapid procedures to solve Poisson's equation has led to the development of a direct method of solution involving Fourier analysis which can solve Poisson's equation in a square region covered by a 48 x 48 mesh in 0.9 seconds on the IBM 7090. This compares favorably with the best iterative methods which would require about 10 seconds to solve the same problem. The method is applicable to rectangular regions with simple boundary conditions and the maximum observed error in the potential for several random charge distributions is $5 \times 10^{-7}$ of the maximum potential change in the region. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/64/6/CS-TR-64-6.pdf %R CS-TR-64-9 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The QD-algorithm as a method for finding the roots of a polynomial equation when all roots are positive %A Andersen, Christian %D June 1964 %X The Quotient-Difference (QD)-scheme, symmetric functions and some results from the theory of Hankel determinants are treated. Some well known relations expressing the elements of the QD-scheme by means of the Hankel determinants are presented. The question of convergence of the columns of the QD-scheme is treated.
An exact expression for $q_{n}^{k}$ is developed for the case of different roots. It is proved that the columns of the QD-scheme will converge not only in the well known case of different roots, but in all cases where the roots are positive. A detailed examination of the convergence to the smallest root is presented. An exact expression for $q_{n}^{N}$ is developed. This expression is correct in all cases of multiple positive roots. It is shown that the progressive form of the QD-algorithm is only 'mildly unstable'. Finally, some ALGOL programs and some results obtained by means of these are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/64/9/CS-TR-64-9.pdf %R CS-TR-64-11 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Elastic-plastic analysis of trusses by the gradient projection method %A Nakamura, Tsuneyoshi %A Rosen, Judah Ben %D July 1964 %X The gradient projection method has been applied to the problem of obtaining the elastic-plastic response of a perfectly plastic ideal truss with several degrees of redundancy to several independently varying sets of quasi-static loads. It is proved that the minimization of stress rate intensity subject to a set of yield inequalities is equivalent to the maximization process of the gradient projection method. This equivalence proof establishes the basis of the computational method. The technique is applied to the problem of investigating the possibilities of shakedown and to limit analysis. A closed convex "safe load domain" is defined to represent the load carrying capacity characteristics of a truss subjected to various combinations of the several sets of loads. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/64/11/CS-TR-64-11.pdf %R CS-TR-64-12 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Numerical methods for solving linear least squares problems (by G. Golub); An Algol procedure for finding linear least squares solutions (by Peter Businger) %A Golub, Gene H. %A Businger, Peter A. %D August 1964 %X A common problem in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of areas and in a variety of contexts. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their very nature. In this paper, we shall consider stable numerical methods for handling these problems. Our basic tool is a matrix decomposition based on orthogonal Householder transformations. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/64/12/CS-TR-64-12.pdf %R CS-TR-64-13 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Computation of the pseudoinverse of a matrix of unknown rank %A Pereyra, Victor %A Rosen, Judah Ben %D September 1964 %X A program is described which computes the pseudoinverse, and other related quantities, of an m $\times$ n matrix A of unknown rank. The program obtains least squares solutions to singular and/or inconsistent linear systems Ax = B, where m $\leq$ n or m > n and the rank of A may be less than min(m,n). A complete description of the program and its use is given, including computational experience on a variety of problems.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/64/13/CS-TR-64-13.pdf %R CS-TR-63-2 %Z Fri, 19 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T The solution of large systems of algebraic equations %A Pavkovich, John M. %D December 1963 %X The solution of a system of linear algebraic equations using a computer is not a difficult problem as long as the equations are not ill-conditioned and all of the coefficients can be stored in the computer. However, when the number of coefficients is so large that supplemental means of storage, such as magnetic tape, are required, the difficulty of solving the system efficiently increases considerably. This paper describes a method of solution whereby such systems of equations can be solved in an efficient manner. The problems associated with ill-conditioned systems of equations are not discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/63/2/CS-TR-63-2.pdf %R CS-TR-96-1563 %Z Wed, 21 Feb 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Database Research: Achievements and Opportunities into the 21st Century %A Silberschatz, Avi %A Stonebraker, Michael %A Ullman, Jeffrey D. %D February 1996 %X In May 1995, an NSF workshop on the future of database management systems research was convened. This paper reports the conclusions of that meeting. Among the most important directions for future DBMS research recommended by the panel are: support for multimedia objects; managing distributed and loosely coupled information, as on the world-wide web; supporting new database applications such as data mining and warehousing; workflow and other complex transaction-management problems; and enhancing the ease-of-use of DBMS's for both users and system managers. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1563/CS-TR-96-1563.pdf %R CS-TR-96-1564 %Z Fri, 01 Mar 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Medical Applications of Neural Networks: Connectionist Models of Survival %A Ohno-Machado, Lucila %D March 1996 %X Although neural networks have been applied to medical problems in recent years, their applicability has been limited for a variety of reasons. One of those barriers has been the problem of recognizing rare categories. In this dissertation, I demonstrate, and prove the utility of, a new method for tackling this problem. In particular, I have developed a method that allows the recognition of rare categories with high sensitivity and specificity, and will show that it is practical and robust. This method involves the construction of sequential neural networks. Rare categories occur and must be learned if practical application of neural-network technology is to be achieved. Survival analysis is one area in which this problem appears. In this work, I test the hypotheses that (1) sequential systems of neural networks produce results that are more accurate (in terms of calibration and resolution) than nonsequential neural networks; and (2) in certain circumstances, sequential neural networks produce more accurate estimates of survival time than Cox proportional hazards and logistic regression models. I use two sets of data to test the hypotheses: (1) a data set of HIV+ patients; and (2) a data set of patients followed prospectively for the development of cardiac conditions. I show that a neural network model can predict death due to AIDS more accurately than a Cox proportional hazards model.
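One schematic reading of the "sequential" construction (my illustration, not the dissertation's architecture) is a two-stage cascade: a first model screens with a deliberately low threshold so that sensitivity on the rare class stays high, and a second model is trained only on the screened cases, where the class is no longer rare. With ordinary logistic models standing in for the neural stages, and entirely invented data:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Synthetic data with a rare positive class (roughly 8%).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 5))
    y = (X[:, 0] + 0.5 * X[:, 1]
         + rng.normal(scale=2.0, size=2000) > 3.2).astype(int)

    stage1 = LogisticRegression().fit(X, y)
    keep = stage1.predict_proba(X)[:, 1] > 0.02   # low cut: high sensitivity

    # The second stage sees only screened cases, so the rare class is a much
    # larger fraction of its training set.
    stage2 = LogisticRegression().fit(X[keep], y[keep])
    p = np.zeros(len(y))
    p[keep] = stage2.predict_proba(X[keep])[:, 1]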
Furthermore, I show that a sequential neural network model is more accurate than a standard neural network model. I show that the predictions of logistic regression and neural networks are not significantly different, but that any of these models used sequentially is more accurate than its standard counterpart. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1564/CS-TR-96-1564.pdf %R CS-TR-96-1568 %Z Wed, 10 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Algorithms for computing intersection and union of toleranced polygons with applications %A Cazals, Frederic %A Ramkumar, G. D. S. %D April 1996 %X Since mechanical operations are performed only up to a certain precision, the geometry of parts involved in real life products is never known precisely. Nevertheless, operations on toleranced objects have not been studied extensively. In this paper, we initiate a study of the analysis of the union and intersection of toleranced simple polygons. We provide a practical and efficient algorithm that stores in an implicit data structure the information necessary to answer a request for specific values of the tolerances without performing a computation from scratch. If the polygons are of sizes m and n, and s is the number of intersections between edges occurring for all the combinations of tolerance values, the pre-processed data structure takes O(s) space and the algorithm that computes a union/intersection from it takes O((n+m) log(s) + k' + k log(k)) time where k is the number of vertices of the union/intersection and k <= k' <= s. Although the algorithm is not output sensitive, we show that the expectations of k and k' remain within a constant factor tau, a function of the input geometry. Finally, we list interesting applications of the algorithms related to feasibility of assembly and assembly sequencing of real assemblies. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1568/CS-TR-96-1568.pdf %R CS-TR-96-1569 %Z Thu, 25 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using Automatic Abstraction for Problem-Solving and Learning %A Unruh, Amy %D April 1996 %X Abstraction is a powerful tool for controlling search combinatorics. This research presents a framework for automatic abstraction planning, and a family of associated abstraction methods, called SPATULA. The framework provides a structure within which different parameterized methods for automatic abstraction can be instantiated to generate abstraction planning behavior, and provides an integrated environment for abstract problem-solving and learning. A core idea underlying the abstraction techniques is that abstraction can arise as an obviation response to impasses in planning. Abstraction is performed at problem-solving time with respect to impasses in the current problem context, and thus the planner generates abstractions in response to specific situations. This approach is used to reduce the cost of lookahead evaluation searches, by performing abstract search in problem spaces which are automatically abstracted from the ground spaces during search. New search control rules are learned during abstract search; they constitute an abstract plan used in future situations, and produce an emergent multi-level abstraction behavior. The abstraction method has been implemented and evaluated.
It has been shown to: reduce planning time, while still yielding good solutions; reduce learning time; and increase the effectiveness of learned rules by enabling them to transfer more widely. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1569/CS-TR-96-1569.pdf %R CS-TR-96-1565 %Z Thu, 04 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Formal Model for Bridging Heterogeneous Relational Databases in Clinical Medicine %A Sujansky, Walter %D April 1996 %X This document describes the results of my thesis research, which focused on developing a standard query interface to heterogeneous clinical databases. The high-level goal of this work was to *insulate* the developers of clinical computer applications from the implementation details of clinical databases, thereby facilitating the *sharing* of clinical computer applications across institutions with different database implementations. Most clinical databases store information about patients' diagnoses, laboratory results, medication orders, drug allergies, and demographic background. These data are valuable as the inputs to computer applications that provide real-time decision support, monitor the quality of care, and analyze data for research purposes. Clinical databases at different institutions, however, vary significantly in the way the databases model, represent, and retrieve clinical data. This database heterogeneity makes it impossible for a single computer application to retrieve data from the clinical databases of various institutions because the database queries included in the application must be formulated differently for each institution. Therefore, database heterogeneity makes it difficult to share computer applications across institutions with different database implementations. In my work, I have developed an *abstract* model of clinical data and an *abstract* query language that allow the developers of computer applications to formulate queries independently of the institution-specific features of clinical databases. I have also developed a database mapping language and a formal query-translation method that automatically translate the abstract queries that appear in applications into equivalent institution-specific queries. This framework ostensibly allows copies of a single computer application to be distributed to multiple institutions and to be customized automatically at each of the institutions such that the queries in each copy of the application can retrieve data from the local clinical database. This dissertation formally describes the abstract data model, the abstract query language, the mapping language, and the translation algorithm. It also presents the results of a formal evaluation that I performed to assess the feasibility and utility of this approach for sharing clinical computer applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1565/CS-TR-96-1565.pdf %R CS-TR-96-1566 %Z Tue, 09 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Clocked Transition Systems %A Manna, Zohar %A Pnueli, Amir %D April 1996 %X This paper presents a new computational model for real-time systems, called the clocked transition system model. The model is a development of our previous timed transition model, where some of the changes are inspired by the model of timed automata. The new model leads to a simpler style of temporal specification and verification, requiring no extension of the temporal language.
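In rough outline (a generic rendering of the clocked-transition-system idea, not the paper's formal definitions), a state carries clock values, time passes through explicit tick steps that advance every clock uniformly, and discrete transitions may reset clocks; a safety property is then a predicate checked along all explored runs:

    # Hypothetical two-location example: clocks advance by "tick" steps and
    # are reset by discrete transitions; safety: "busy" never exceeds 4 time
    # units on any explored run.
    def successors(loc, clock):
        if loc == "idle":
            yield ("busy", 0.0)              # start a job, resetting the clock
        if loc == "busy" and clock >= 1.0:
            yield ("idle", clock)            # the job completes
        else:
            yield (loc, clock + 0.5)         # tick: half a time unit passes

    def safe(bound=4.0, depth=12):
        frontier = {("idle", 0.0)}
        for _ in range(depth):
            frontier = {s for st in frontier for s in successors(*st)}
            if any(loc == "busy" and c > bound for loc, c in frontier):
                return False
        return True

    print(safe())   # True: the job is forced to finish once its clock hits 1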
For verifying safety properties, we present a run-preserving reduction from the new real-time model to the untimed model of fair transition systems. This reduction allows the (re)use of safety verification methods and tools, developed for untimed reactive systems, for proving safety properties of real-time systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1566/CS-TR-96-1566.pdf %R CS-TR-96-1570 %Z Wed, 29 May 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Optimization of SQL Queries for Parallel Machines %A Hasan, Waqar %D May 1996 %X Parallel execution offers a method for reducing the response time of queries against large databases. We address the problem of parallel query optimization: Given a declarative SQL query, find a procedural parallel plan that delivers the query result in minimal time. We develop optimization algorithms using models that incorporate both the sources of speedup and the obstacles to it. We address independent, pipelined and partitioned parallelism. We incorporate inherent constraints on available parallelism and the extra cost of parallel execution. Our models are motivated by experiments with NonStop SQL, a commercial parallel DBMS. We adopt a two-phase approach to parallel query optimization: JOQR (join ordering and query rewrite), followed by parallelization. JOQR minimizes total work. Then, parallelization spreads work among processors to minimize response time. For JOQR, we model communication costs and abstract physical characteristics of data as colors. We devise tree coloring and reordering algorithms that are efficient and optimal. We model parallelization as scheduling a tree whose nodes represent operators and edges represent parallel/precedence constraints. Computation/communication costs are represented as node/edge weights. We prove worst-case bounds on the performance ratios of our algorithms and measure average cases using simulation. Our results enable the construction of SQL compilers that effectively exploit parallel machines. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1570/CS-TR-96-1570.pdf %R CS-TR-96-1567 %Z Tue, 30 Apr 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Synthesis of Reactive Programs %A Anuchitanukul, Anuchit %D April 1996 %X We study various problems of synthesizing reactive programs. A reactive program is a program whose behaviors are not merely functional relationships between inputs and outputs, but sequences of actions as well as interactions between the program and its environment. The goal of program synthesis in general is to find an implementation of a program such that the behaviors of the implementation satisfy a given specification. The reactive behaviors that we study are omega-regular infinite sequences and regular finite sequences. The domain of the implementation is (finite) transition systems for closed system synthesis, and transition system modules for open system synthesis. We consider various solutions, e.g. basic, maximal, modular and exact, for particular subclasses of the implementation language and investigate how characteristics of the program, such as fairness, number of processes, and composition operations, affect the synthesis algorithm. In addition to the automata-theoretic algorithms, we give an algorithm that synthesizes a program directly from the linear-time temporal logic ETL. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1567/CS-TR-96-1567.pdf
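For illustration only, here is a minimal sketch, in Python, of the kind of fixpoint computation that automata-theoretic synthesis procedures build on: solve a finite-state safety game and read a controller off the winning set. The encoding (two move relations and a safe set) and all names are assumptions of this sketch, not the report's formalism.

    # Minimal sketch: synthesize a controller that keeps a finite game
    # inside a safe set.  system_moves/env_moves map each state to its
    # successor set; every state belongs to exactly one of the two dicts.
    def synthesize_safety(system_moves, env_moves, safe):
        win = set(safe)
        while True:
            # A system state survives if SOME move stays in `win`;
            # an environment state survives only if ALL its moves do.
            keep = {s for s in win
                    if (s in system_moves and system_moves[s] & win)
                    or (s in env_moves and env_moves[s] <= win)}
            if keep == win:
                break
            win = keep
        # The synthesized "program": one winning move per system state.
        strategy = {s: next(iter(system_moves[s] & win))
                    for s in win if s in system_moves}
        return win, strategy

    # Example: system state 0 may move to 1 (safe) or 2 (unsafe).
    print(synthesize_safety({0: {1, 2}}, {1: {0}}, safe={0, 1}))
    # -> ({0, 1}, {0: 1})

The greatest-fixpoint structure is what makes safety specifications over finite transition systems effectively solvable; richer omega-regular behaviors need correspondingly richer automata constructions.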
%R CS-TR-96-1571 %Z Mon, 10 Jun 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Formal Verification of Performance and Reliability of Real-Time Systems %A DeAlfaro, Luca %D June 1996 %X In this paper we propose a methodology for the specification and verification of performance and reliability properties of real-time systems within the framework of temporal logic. The methodology is based on the system model of stochastic real-time systems (SRTSs), and on branching-time temporal logics that are extensions of the probabilistic logics pCTL and pCTL*. SRTSs are discrete-time transition systems that can model both probabilistic and nondeterministic behavior. The specification language extends the branching-time logics pCTL and pCTL* by introducing an operator to express bounds on the average time between events. We present model-checking algorithms for the algorithmic verification of system specifications, and we discuss their complexity. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1571/CS-TR-96-1571.pdf %R CS-TR-96-1572 %Z Tue, 18 Jun 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Caching and Non-Horn Inference in Model Elimination Theorem Provers %A Geddis, Donald F. %D June 1996 %X Caching in an inference procedure holds the promise of replacing exponential search with constant-time lookup, at a cost of slightly increased overhead for each node expansion. Caching will be useful if subgoals are repeated often enough during proofs. In experiments on solving queries using a backward chainer on Horn theories, caching appears to be very helpful on average. When trying to extend this success to first-order theories, however, intuition suggests that subgoal caches are no longer useful. The cause is that complete first-order backward chaining requires goal-goal resolutions in addition to resolutions with the database, and this introduces a context-sensitivity into the proofs for a subgoal. A cache is only feasible if the solutions are independent of context, so that they may be copied from one part of the space to another. It is shown here that a full exploration of a subgoal in one context actually provides complete information about the solutions to the same subgoal in all other contexts of the proof. In a straightforward way, individual solutions from one context may be copied over directly. More importantly, non-Horn failure caching is also feasible: no additional solutions that might affect the query are possible in the new context, and therefore there is no need to re-explore the space there. Thus most Horn clause caching schemes may be used with minimal changes in a non-Horn setting. In addition, a new Horn clause caching scheme is proposed: postponement caching. This new scheme involves exploring the inference space as a graph instead of as a tree, so that a given literal will only occur once in the proof space. Despite the previous extension of failure caching to non-Horn theories, postponement caching is incomplete in the non-Horn case. A counterexample is presented, and possible enhancements to reclaim completeness are investigated. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1572/CS-TR-96-1572.pdf
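The context sensitivity at issue is easy to see in miniature. Below is a minimal sketch of subgoal caching in a propositional Horn backward chainer (hypothetical names, and a far simpler setting than model elimination): successes are cached unconditionally, while a failure that depended on a goal already on the stack is left uncached, since it might not hold in other contexts.

    # rules: atom -> list of bodies (each body is a list of atoms).
    def prove(goal, rules, cache, stack=frozenset()):
        if goal in cache:                    # constant-time lookup
            return cache[goal]
        if goal in stack:                    # blocked on an ancestor goal
            return None                      # "unknown in this context"
        blocked = False
        for body in rules.get(goal, []):
            results = [prove(g, rules, cache, stack | {goal}) for g in body]
            if all(r is True for r in results):
                cache[goal] = True           # successes are context-free
                return True
            if any(r is None for r in results):
                blocked = True               # failure depended on the stack
        if blocked:
            return None                      # contextual failure: don't cache
        cache[goal] = False                  # context-free failure: cache it
        return False

    # 'a' is provable via 'c'; the loop-blocked subgoal 'x' stays uncached,
    # so a later top-level query for 'x' still gets the right answer.
    rules = {"a": [["x"], ["c"]], "x": [["a"]], "c": [[]]}
    cache = {}
    print(prove("a", rules, cache) is True, cache)  # True {'c': True, 'a': True}
    print(prove("x", rules, cache) is True)         # True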
%R CS-TR-96-1573 %Z Wed, 17 Jul 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Depth Discontinuities by Pixel-To-Pixel Stereo %A Birchfield, Stan %A Tomasi, Carlo %D July 1996 %X This report describes a two-pass binocular stereo algorithm that is specifically geared towards the detection of depth discontinuities. In the first pass, introduced in part I of the report, stereo matching is performed independently on each epipolar pair for maximum efficiency. In the second pass, described in part II, disparity information is propagated between the scanlines. Part I. Our stereo algorithm explicitly matches the pixels in the two images, leaving occluded pixels unpaired. Matching is based upon intensity alone without utilizing windows. Since the algorithm prefers piecewise constant disparity maps, it sacrifices depth accuracy for the sake of crisp boundaries, leading to precise localization of the depth discontinuities. Three features of the algorithm are worth noting: (1) unlike most stereo algorithms, it does not require texture throughout the images, making it useful in unmodified indoor settings, (2) it uses a measure of pixel dissimilarity that is provably insensitive to sampling, and (3) it prunes bad nodes during the search, resulting in a running time that is faster than that of standard dynamic programming. Part II. After the scanlines are processed independently, the disparity map is postprocessed, leading to more accurate disparities and depth discontinuities. Both the algorithm and the postprocessor are fast, producing a dense disparity map in about 1.5 microseconds per pixel per disparity on a workstation. Results on five stereo pairs are given. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1573/CS-TR-96-1573.pdf %R CS-TR-96-1574 %Z Wed, 04 Sep 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Effective Remote Modeling in Large-Scale Distributed Simulation and Visualization Environments %A Singhal, Sandeep K. %D September 1996 %X A Distributed Interactive Simulation provides the illusion of a single, coherent virtual world to a group of users located at different machines connected by a network. Networked virtual environments are used for multiplayer video games, military and industrial training, and collaborative engineering. Network bandwidth, network latency, and host processing power limit the achievable size and detail of future simulations. This thesis describes network protocols and algorithms to support "remote modeling," allowing a host to model and render remote entities in large-scale distributed simulations. These techniques require fewer network resources and support more entity types than previous approaches. The Position History-Based Dead Reckoning (PHBDR) protocol provides accurate remote position modeling and minimizes dependencies on network performance and entity representation. PHBDR is a foundation for three protocols which model entity orientation, entity structural change, and entity groups. This thesis shows that a simple, efficient protocol can provide smooth, accurate remote position modeling and that it can be applied recursively to support entity orientation, structure, and aggregation at multiple levels of detail; these protocols offer performance and costs that are competitive with more complex and application-specific approaches, while providing simpler analyses of behavior by exploiting this recursive structure. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1574/CS-TR-96-1574.pdf
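To make the position-modeling idea concrete, here is a minimal sketch of history-based dead reckoning for a one-dimensional position; the constant-velocity fit, the threshold, and all names are assumptions of the sketch, not the PHBDR protocol itself.

    # history: list of (time, position) updates, oldest first.
    def extrapolate(history, t):
        (t0, p0), (t1, p1) = history[-2], history[-1]
        v = (p1 - p0) / (t1 - t0)       # velocity from the last two updates
        return p1 + v * (t - t1)        # constant-velocity prediction

    # The sender runs the same predictor its receivers run, and transmits
    # a fresh update only when the true position has drifted too far from
    # what every receiver is currently extrapolating.
    def should_transmit(history, true_pos, t, threshold=0.5):
        return abs(extrapolate(history, t) - true_pos) > threshold

    history = [(0.0, 0.0), (1.0, 2.0)]            # moving ~2 units/sec
    print(extrapolate(history, 1.5))              # -> 3.0
    print(should_transmit(history, 5.0, 1.5))     # -> True (drift > 0.5)

Higher-order fits over a longer history, and recursive application to orientation and group aggregates, follow the same transmit-on-divergence pattern.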
%R CS-TR-96-1576 %Z Fri, 06 Dec 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Query Reformulation under Incomplete Mappings %A Huyn, Nam %D December 1996 %X This paper focuses on some of the important new translatability issues that arise in the problem of interoperation between two database schemas when mappings between these schemas are inherently more complex than traditional views or pure Datalog programs can capture. In many cases, sources cannot be redesigned, and mappings among them exhibit some form of incompleteness under which the question of whether a query can be translated across different schemas is not immediately obvious. The notion of query we consider here is the traditional one, in which the answers to a query are required to be definite: answers cannot be disjunctive or conditional and must refer only to domain constants. In this paper, mappings are modeled by Horn programs that allow existential variables, and queries are modeled by pure Datalog programs. We then consider the problem of eliminating functional terms from the answers to a Horn query where function symbols are allowed. We identify a class of Horn queries called "term-bounded" that are equivalent to pure Datalog queries. We present an algorithm that rewrites a term-bounded query into an "equivalent" pure Datalog query. Equivalence is defined here as yielding the same function-free answer. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1576/CS-TR-96-1576.pdf %R CS-TR-96-1575 %Z Wed, 09 Oct 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T Routing and Admission Control in General Topology Networks with Poisson Arrivals %A Kamath, Anil %A Palmon, Omri %A Plotkin, Serge %D October 1996 %X Emerging high-speed networks will carry traffic for services -- such as video-on-demand and video teleconferencing -- that require resource reservation along the path on which the traffic is sent. The high bandwidth-delay product of these networks prevents circuit rerouting: once a circuit is routed on a certain path, the bandwidth taken by this circuit remains unavailable for the duration (holding time) of this circuit. As a result, such networks will need effective routing and admission control strategies. Recently developed online routing and admission control strategies have logarithmic competitive ratios with respect to the admission ratio (the fraction of admitted circuits). Such guarantees on performance are rather weak in the most interesting case, where the rejection ratio of the optimum algorithm is very small or even 0. Unfortunately, these guarantees cannot be improved in the context of the considered models, making it impossible to use these models to identify algorithms that are going to perform well in practice. In this paper we develop routing and admission control strategies for a more realistic model, where the requests for virtual circuits between any two points arrive according to a Poisson process and where the circuit holding times are exponentially distributed. Our model is close to the one that was developed to analyze and tune the (currently used) strategies for managing traffic in long-distance telephone networks. We strengthen this model by assuming that the rates of the Poisson processes (the ``traffic matrix'') are unknown to the algorithm and are chosen by the adversary. Our strategy is competitive with respect to the expected rejection ratio.
More precisely, it achieves an expected rejection ratio of at most R + epsilon, where R is the optimum expected rejection ratio. The expectations are taken over the distribution of the request sequences, and epsilon = sqrt(r log n), where r is the maximum fraction of an edge bandwidth that can be requested by a single circuit. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1575/CS-TR-96-1575.pdf %R CS-TR-96-1577 %Z Tue, 17 Dec 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T A More Aggressive Use Of Views To Extract Information %A Huyn, Nam %D December 1996 %X Much recent work has focused on using views to evaluate queries. More specifically, queries are rewritten to refer to views instead of the base relations over which the queries were originally written. The motivation is that the views represent the only ways in which some information source may be accessed. Another use of views that has been overlooked becomes important especially when no equivalent rewriting of a query in terms of views is possible: even though we cannot use the views to get all the answers to the query, we can still use them to deduce as many answers as possible. In many global information applications, the notion of equivalence used is often too restrictive. We propose a notion of pseudo-equivalence that allows more queries to be rewritten usefully: we show that if a query has an equivalent rewriting, the query also has a pseudo-equivalent rewriting. The converse is not true in general. In particular, when the views are conjunctive, we show that all Datalog queries over the source do have a pseudo-equivalent Datalog query over the views. We reduce the problem of finding pseudo-equivalent queries to that of rewriting Horn queries with Skolem functions as Datalog queries. We present an algorithm for the class of term-bounded Horn queries. We discuss extending the problem to larger classes of Horn queries, other non-Horn queries that result from ``inverting'' Datalog views, and adding functional dependencies. The theory and methods developed in our work have important uses in query mediation between heterogeneous sources, automatic join discovery, and view updates. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1577/CS-TR-96-1577.pdf %R CS-TR-96-1578 %Z Tue, 17 Dec 96 00:00:00 GMT %I Stanford University, Department of Computer Science %T State Reduction Methods for Automatic Formal Verification %A Ip, C. Norris %D December 1996 %X Validation of industrial designs is becoming more challenging as technology advances. One of the most suitable debugging aids is automatic formal verification. This thesis presents several techniques for reducing the state explosion problem, that is, for reducing the number of states that are examined. A major contribution of this thesis is the design of simple extensions to the Murphi description language, which enable us to convert two existing abstraction strategies into two fully automatic algorithms, making these strategies easy to use and safe to apply. These two algorithms rely on two facts about high-level designs: they frequently exhibit structural symmetry, and their behavior is often independent of the exact number of replicated components they contain. Another contribution is the design of a new state reduction algorithm, which relies on reversible rules (transitions that do not lose information) in a system description. This new reduction algorithm can be used simultaneously with the other two algorithms.
These techniques, implemented in the Murphi verification system, have been applied to many applications, such as cache coherence protocols and distributed algorithms. In the cases of two important classes of infinite systems, infinite state graphs can be automatically converted to small finite state graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/96/1578/CS-TR-96-1578.pdf %R CS-TR-97-1580 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T STARTS: Stanford Protocol Proposal for Internet Retrieval and Search %A Gravano, Luis %A Chang, Kevin %A Garcia-Molina, Hector %A Paepcke, Andreas %D January 1997 %X Document databases are available everywhere, both within the internal networks of organizations and on the Internet. The database contents are often "hidden" behind search interfaces. These interfaces vary from database to database. Also, the algorithms with which the associated search engines rank the documents in the query results are usually incompatible across databases. Even individual organizations use search engines from different vendors to index their internal document collections. These organizations could benefit from unified query interfaces to multiple search engines that would, for example, give users the illusion of a single big document database. Building such "metasearchers" is nowadays a hard task because different search engines are largely incompatible and do not allow for interoperability. To improve this situation, the Digital Library project at Stanford has coordinated among search-engine vendors and other key players to reach informal agreements for unifying basic interactions in three key areas. This is the final writeup of our informal "standards" effort. This draft is based on feedback from people at Excite, Fulcrum, GILS, Harvest, Hewlett-Packard Laboratories, Infoseek, Microsoft Network, Netscape, PLS, Verity, and WAIS, among others. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1580/CS-TR-97-1580.pdf %R CS-TR-97-1581 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Towards Interoperability in Digital Libraries: Overview and Selected Highlights of the Stanford Digital Library Project %A Paepcke, Andreas %A Cousins, Steve B. %A Garcia-Molina, Hector %A Hassan, Scott W. %A Ketchpel, Steven K. %A Roscheisen, Martin %A Winograd, Terry %D January 1997 %X We outline the scope of the Stanford Digital Library Project, which covers five areas: user interface work, technologies for locating information and library services, the emerging economic perspective of digital libraries, infrastructure technology, and the use of agent technologies to support all of these aspects. We describe technical details for two specific efforts that have been realized in prototype implementations. First, we describe how we employ distributed object technology to move towards an implementation of our InfoBus vision. The InfoBus consists of translation services and wrappers around existing protocols to cope with the problem of interoperability and the distributed nature of emerging digital library services. We model autonomous, heterogeneous library services as CORBA proxy objects. This allows the construction of unified but extensible method-based interfaces for client programs to interact through. We describe how distributed objects enable the design of communication protocols that leave implementors a large degree of freedom. This is a benefit because the resulting implementations can allow users to choose among multiple performance-profile tradeoffs while staying within the confines of the protocol. The second effort we cover is InterPay, which uses the object approach for an architecture that helps manage heterogeneity in payment mechanisms among autonomous services. The architecture is organized into three layers. The top layer contains elements involved in the task-level interaction with the services. The middle layer is responsible for enforcing user-specified payment policies. The lowest layer manages the mechanics of diverse online payment schemes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1581/CS-TR-97-1581.pdf
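As a toy illustration of the proxy-object idea (hypothetical names; the actual InfoBus wraps services as CORBA objects and is far more elaborate), a wrapper can expose one uniform, method-based search interface while translating calls into each engine's native protocol:

    # Each legacy engine keeps its own query syntax; the proxy gives
    # clients a single uniform method to call.
    class LegacyEngine:
        def native_query(self, expr):             # e.g. expects "w1 AND w2"
            return ["doc1", "doc2"]               # stub result list

    class SearchProxy:
        def __init__(self, engine):
            self.engine = engine
        def search(self, terms):                  # the uniform interface
            expr = " AND ".join(terms)            # translate to native form
            return self.engine.native_query(expr)

    results = SearchProxy(LegacyEngine()).search(["digital", "library"])
    print(results)                                # -> ['doc1', 'doc2']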
%R CS-TR-97-1582 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Shared Web Annotations as a Platform for Third-Party Value-Added Information Providers: Architecture, Protocols, and Usage Examples %A Roscheisen, Martin %A Mogensen, Christian %A Winograd, Terry %D January 1997 %X In this paper, we present an architecture, called "ComMentor", which provides a platform for third-party providers to add lightweight super-structures to material supplied by conventional content providers. It enables people to share structured in-place annotations about arbitrary on-line documents. The system is part of a general "virtual document" architecture ("PCD BRIO") in which--with the help of lightweight distributed meta-information--documents are dynamically synthesized from distributed sources depending on the user context and the meta-information which has been attached to them. The meta-information is managed independently of the documents themselves on separate meta-information servers, both in terms of storage and authority. A wide range of useful scenarios can be readily realized on this platform. We give examples of how a more personalized content presentation can be achieved by leveraging the database storage of the uniform meta-information and generating documents dynamically for a particular user perspective. These include structured discussion about paper drafts, collaborative filtering, seals of approval, tours, shared "hotlists" with section-based visibility control, usage indicators, co-presence, and value-added trails. Our object model and request interface for the prototype implementation are defined in technical detail in the appendix. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1582/CS-TR-97-1582.pdf %R CS-TR-97-1583 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Boolean Query Mapping Across Heterogeneous Information Sources (Extended Version) %A Chang, Kevin Chen-Chuan %A Garcia-Molina, Hector %A Paepcke, Andreas %D January 1997 %X Searching over heterogeneous information sources is difficult because of the non-uniform query languages. Our approach is to allow a user to compose Boolean queries in one rich front-end language. For each user query and target source, we transform the user query into a subsuming query that can be supported by the source but that may return extra documents. The results are then processed by a filter query to yield the correct final result. In this paper we introduce the architecture and associated algorithms for generating the supported subsuming queries and filters. We show that generated subsuming queries return a minimal number of documents; we also discuss how minimal-cost filters can be obtained. We have implemented prototype versions of these algorithms and demonstrated them on heterogeneous Boolean systems. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1583/CS-TR-97-1583.pdf
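A miniature version of the subsume-then-filter pattern (illustrative names and a deliberately tiny query model, not the paper's algorithms): a source that cannot evaluate negation receives the query with its negated terms stripped, which can only enlarge the result set, and the negations are then applied locally as a filter.

    # Query: (positive_terms, negated_terms).  A source that cannot
    # evaluate NOT gets only the positive part -- a subsuming query --
    # and the negations are applied as a local filter afterwards.
    def subsuming_query(query):
        positives, _ = query
        return positives                          # superset of true answers

    def filter_results(docs, query):
        positives, negatives = query
        return [d for d in docs
                if all(t in d["terms"] for t in positives)
                and not any(t in d["terms"] for t in negatives)]

    docs = [{"id": 1, "terms": {"database", "parallel"}},
            {"id": 2, "terms": {"database", "survey"}}]
    query = (["database"], ["survey"])
    print(subsuming_query(query))                 # source sees: ['database']
    print(filter_results(docs, query))            # only doc 1 survives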
%R CS-TR-97-1584 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Grassroots: a System Providing a Uniform Framework for Communicating, Structuring, Sharing Information, and Organizing People %A Kamiya, Kenichi %A Roscheisen, Martin %A Winograd, Terry %D January 1997 %X People keep pieces of information in diverse collections such as folders, hotlists, e-mail inboxes, newsgroups, and mailing lists. These collections mediate various types of collaborations, including communicating, structuring, sharing information, and organizing people. Grassroots is a system that provides a uniform framework to support people's collaborative activities mediated by collections of information. The system seamlessly integrates functionalities currently found in such disparate systems as e-mail, newsgroups, shared hotlists, hierarchical indexes, hypermail, etc. Grassroots co-exists with these systems in that its users benefit from the uniform image provided by Grassroots, but other people can continue using other mechanisms, which Grassroots leverages. The current Grassroots prototype is based on an http-proxy implementation, and can be used with any Web browser. In the context of the design of a next-generation version of the Web, Grassroots demonstrates the utility of a uniform notification infrastructure. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1584/CS-TR-97-1584.pdf %R CS-TR-97-1585 %Z Thu, 09 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Techniques and Tools for Making Sense out of Heterogeneous Search Service Results %A Baldonado, Michelle Q. Wang %A Winograd, Terry %D January 1997 %X We describe a set of techniques that allows users to interact with results at a higher level than the citation level, even when those results come from a variety of heterogeneous on-line search services. We believe that interactive result analysis allows users to "make sense" out of the potentially many results that may match the constraints they have supplied to the search services. The inspiration for this approach comes from reference librarians, who do not respond to patrons' questions with lists of citations, but rather give high-level answers that are tailored to the patrons' needs. We outline here the details of the methods we employ in order to meet our goal of allowing for dynamic, user-directed abstraction over result sets, as well as the prototype tool (SenseMaker) we have built based upon these techniques. We also take a brief look at the more general theory that underlies the tool, and hypothesize that it is applicable to flexible duplicate detection as well. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1585/CS-TR-97-1585.pdf %R CS-TR-97-1579 %Z Wed, 08 Jan 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T From the Valley of Heart's Delight to Silicon Valley: A Study of Stanford University's Role in the Transformation %A Tajnai, Carolyn %D January 1997 %X This study examines the role of Stanford University in the transformation from the Valley of Heart's Delight to Silicon Valley. At the dawn of the Twentieth Century, California's Santa Clara County was an agricultural paradise.
Because of the benign climate and thousands of acres of fruit orchards, the area became known as the Valley of Heart's Delight. In the early 1890s, Leland and Jane Stanford donated land in the valley to build a university in memory of their son. Thus, Leland Stanford, Jr., University was founded. In the early 1930s, there were almost no jobs for young Stanford engineering graduates. This was about to change. Although there was no organized plan to help develop the economic base of the area around Stanford University, the concern about the lack of job opportunities for their graduates motivated Stanford faculty to begin the chain of events that led to the birth of Silicon Valley. Stanford University's role in the transformation of the Valley of Heart's Delight into Silicon Valley is history, but it is enduring history. Stanford continues to affect the local economy by spawning new and creative ideas, dreams, and ambitions. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1579/CS-TR-97-1579.pdf %R CS-TR-97-1586 %Z Mon, 24 Feb 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Construction of a Three-dimensional Geometric Model for Segmentation and Visualization of Cervical Spine Images %A Pichumani, Ramani %D February 1997 %X This report introduces a new technique for automatically extracting vertebral segments from three-dimensional computerized tomography (CT) and magnetic resonance (MR) images of the human cervical spine. An important motivation for this work is to provide accurate information for registration and for fusion of CT and MR images into a composite three-dimensional image. One of the major hurdles in performing image fusion is the difficulty of extracting and matching corresponding anatomical regions in an accurate, robust, and timely manner. The complementary properties of soft and bony tissues revealed in CT and MR imaging modalities make it challenging to extract corresponding regions that can be correlated in an accurate and robust manner. Ambiguities in the images due to noise, distortion, limited resolution, and patient-specific structural variations also create additional challenges. Whereas fusion of CT and MR images of the cranium has already been performed, no one has yet developed an automated technique for fusing multimodality images of the spine. Unlike the head, which is relatively rigid, the spine is a complex, articulating object and is subject to structural deformation throughout the multimodal scanning process. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1586/CS-TR-97-1586.pdf %R CS-TR-97-1587 %Z Mon, 24 Mar 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Ensembles for Supervised Classification Learning %A Matan, Ofer %D March 1997 %X This dissertation studies the use of multiple classifiers (ensembles or committees) in learning tasks. Both theoretical and practical aspects of combining classifiers are studied. First, we analyze the representational ability of voting ensembles. A voting ensemble may perform either better or worse than each of its individual members. We give tight upper and lower bounds on the classification performance of a voting ensemble as a function of the classification performances of its individual members. Boosting is a method of combining multiple "weak" classifiers to form a "strong" classifier. Several issues concerning boosting are studied in this thesis.
We study SBA, a hierarchical boosting algorithm proposed by Schapire, in terms of its representation and its search. We present a rejection boosting algorithm that trades off exploration and exploitation: it requires fewer pattern labels at the expense of lower boosting ability. Ensembles may be useful in gaining information. We study their use to minimize the labeling costs of data and to enable improvements in performance over time. For that purpose, a model for on-site learning is presented. The system learns by querying "hard" patterns while classifying "easy" ones. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1587/CS-TR-97-1587.pdf %R CS-TR-97-1588 %Z Tue, 08 Apr 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Systems of Bilinear Equations %A Cohen, Scott %A Tomasi, Carlo %D April 1997 %X How hard is it to solve a system of bilinear equations? No solutions are presented in this report, but the problem is posed and some preliminary remarks are made. In particular, solving a system of bilinear equations is reduced, by a suitable transformation of its columns, to solving a homogeneous system of bilinear equations. In turn, the latter has a nontrivial solution if and only if there exist two invertible matrices that, when applied to the tensor of the coefficients of the system, zero its first column. Matlab code is given to manipulate three-dimensional tensors, including a procedure that often, but not always, finds one solution to a bilinear system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1588/CS-TR-97-1588.pdf %R CS-TR-97-1589 %Z Thu, 17 Apr 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Learning Action Models for Reactive Autonomous Agents %A Benson, Scott Sherwood %D April 1997 %X To be maximally effective, autonomous agents such as robots must be able both to react appropriately in dynamic environments and to plan new courses of action in novel situations. Reliable planning requires accurate models of the effects of actions---models which are often more appropriately learned through experience than designed. This thesis describes TRAIL (Teleo-Reactive Agent with Inductive Learning), an integrated agent architecture which learns models of actions based on experiences in the environment. These action models are then used to create plans that combine both goal-directed and reactive behaviors. Previous work on action-model learning has focused on domains that contain only deterministic, atomic action models that explicitly describe all changes that can occur in the environment. The thesis extends this previous work to cover domains that contain durative actions, continuous variables, nondeterministic action effects, and actions taken by other agents. Results have been demonstrated in several robot simulation environments and the Silicon Graphics, Inc. flight simulator. The main emphasis in this thesis is on the action-model learning process within TRAIL. First, the agent records experiences in its environment, either by observing a trainer or by executing a plan. Second, the agent identifies instances of action success or failure during these experiences using a new analysis demonstrating nine possible causes of action failure. Finally, a variant of the Inductive Logic Programming algorithm DINUS is used to induce action models based on the action instances. As the action models are learned, they can be used for constructing plans whose execution contributes to additional learning experiences. Diminishing reliance on the teacher signals successful convergence of the learning process. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1589/CS-TR-97-1589.pdf
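A minimal sketch of the reactive half of such an agent, in the spirit of teleo-reactive rule lists (illustrative names only; TRAIL's learned action models and planner are far richer): each control tick fires the first rule whose condition holds, so goal-directed structure and reactivity coexist in one ordered list.

    # Ordered (condition, action) pairs: earlier rules are closer to the
    # goal, later rules make progress toward enabling the earlier ones.
    def tr_step(state, rules):
        for condition, action in rules:
            if condition(state):
                return action
        return None

    # Toy goal: reach position 10 on a line.
    rules = [(lambda s: s == 10, "done"),
             (lambda s: s < 10, "move_right"),
             (lambda s: True, "move_left")]
    print(tr_step(3, rules))   # -> 'move_right'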
%R CS-TR-97-1590 %Z Tue, 24 Jun 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Complexity Measures for Assembly Sequences %A Goldwasser, Michael %D June 1997 %X Our work focuses on various complexity measures for two-handed assembly sequences. For many products, there exists an exponentially large set of valid sequences, and a natural goal is to use automated systems to select wisely from the choices. Although there has been a great deal of algorithmic success in finding feasible assembly sequences, there has been very little success towards optimizing the costs of sequences. We attempt to explain this lack of progress by proving the inherent difficulty of finding optimal, or even near-optimal, assembly sequences. To begin, we define "virtual assembly sequencing", a graph-theoretic problem that is a generalization of assembly sequencing, focusing on the combinatorial aspect of the family of feasible assembly sequences while temporarily separating out the specific geometric assumptions inherent to assembly sequencing. We formally prove the hardness of finding even near-optimal sequences for most cost measures in our generalized framework. As a special case, we prove similar, strong inapproximability results for the problem of scheduling with AND/OR precedence constraints. Finally, we re-introduce the geometry and continue by realizing several of these hardness results in rather simple geometric settings. We are able to show strong inapproximability results, for example, using an assembly consisting solely of unit disks in the plane. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1590/CS-TR-97-1590.pdf %R CS-TR-97-1595 %Z Fri, 19 Sep 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Maintaining data warehouses under limited source access %A Huyn, Nam %D September 1997 %X A data warehouse stores views derived from data that may not reside at the warehouse. Using these materialized views, user queries can be answered quickly because querying the external sources where the base data reside is avoided. However, when the sources change, the views in the warehouse can become inconsistent with the base data and must be maintained. A variety of approaches have been proposed for maintaining these views incrementally. At one end of the spectrum, the required view updates are computed without restricting which base relations can be used. View maintenance with this approach is simple but can be expensive, since it may involve querying the external data sources. At the other end of the spectrum, additional views are stored at the warehouse to make sure that there is enough information to maintain the views without ever having to query the data sources. While this approach saves on external source access, it may require a large amount of information to be stored and maintained at the warehouse. In this thesis, we propose an intermediate approach to warehouse maintenance based on what we call {\em Runtime View Self-Maintenance}, where the views are incrementally maintained without using all the base relations but without requiring additional views to facilitate maintenance. Under limited information, however, maintaining a view unambiguously may not always be possible. Thus, the main questions in runtime view self-maintenance are:
- View self-maintainability: under what conditions (on the given information) can a view be maintained unambiguously with respect to a given update?
- View self-maintenance: if a view can be maintained unambiguously, how do we maintain it using only the given information?
The information we consider using for maintaining a view includes:
- at a minimum, the contents of the view itself and the update instance;
- optionally, the contents of other views in the warehouse, functional dependencies the base relations are known to satisfy, a subset of the base relations, and partial contents of a base relation.
Developing efficient, complete solutions for the runtime self-maintenance of conjunctive-query views is the main focus and the main contribution of this thesis. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1595/CS-TR-97-1595.pdf %R CS-TR-97-1594 %Z Wed, 17 Sep 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Interval and Point-Based Approaches to Hybrid System Verification %A Kapur, Arjun %D September 1997 %X Hybrid systems are real-time systems consisting of both continuous and discrete components. This thesis presents deductive and diagrammatic methodologies for proving point-based and interval-based properties of hybrid systems, where the hybrid system is modeled in either a sampling semantics or a continuous semantics. Under a sampling semantics, the behavior of the system consists of a discrete number of system snapshots, where each snapshot records the state of the system at a particular moment in time. Under a continuous semantics, the system behavior is given by a function mapping each point in time to a system state. Two continuous semantics are studied: a continuous interval semantics, where at any given point in time the system is in a unique state, and a super-dense semantics, where no such requirement is imposed. We use Linear-time Temporal Logic for expressing properties under either a sampling semantics or a super-dense semantics, and we introduce Hybrid Temporal Logic for expressing properties under a continuous interval semantics. Linear-time Temporal Logic is useful for expressing point-based properties, whose validity is dependent on individual states, while Hybrid Temporal Logic is useful for expressing both interval-based properties, whose validity is dependent on intervals of time, and point-based properties. Finally, two different verification methodologies are presented: a diagrammatic approach for verifying properties specified in Linear-time Temporal Logic, and a deductive approach for verifying properties specified in Hybrid Temporal Logic. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1594/CS-TR-97-1594.pdf %R CS-TR-97-1596 %Z Mon, 13 Oct 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Distributed Development of a Logic-Based Controlled Medical Terminology %A Campbell, Keith Eugene %D October 1997 %X A controlled medical terminology (CMT) encodes clinical data: patients' physical signs, symptoms, and diagnoses. Application developers lack a robust CMT and the methodologies needed to coordinate terminology development within and between projects. In this dissertation, I argue that if a formal terminology model is adopted and integrated into a change-management process that supports dynamic CMTs, then CMTs can evolve from being an impediment to application development and data analysis to a valuable resource.
My thesis states that such an evolutionary approach can be supported by using semantics-based methods for managing concurrent terminology development, thereby bypassing the disadvantages of traditional lock-based approaches common in database systems. By allowing developers to work concurrently on the terminology while relying on semantics-based methods to resolve the "collisions" that are inevitable in concurrent work, a scalable approach to terminology development can be supported. This dissertation discusses CMT development in terms of three research topics:
1. Representation of Clinical Data
2. Concurrency Control
3. Configuration Management
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1596/CS-TR-97-1596.pdf %R CS-TR-97-1598 %Z Mon, 08 Dec 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Query Planning and Optimization in Information Integration %A Duschka, Oliver M. %D December 1997 %X Information integration systems provide uniform user interfaces to a variety of different information sources. Our work focuses on query planning in such systems. Query planning is the task of transforming a user query, represented in the user's interface language and vocabulary, into queries that can be executed by the information sources. Every information source might require a different query language and might use a different vocabulary. We show that query plans with a fixed number of database operations are insufficient to extract all information from the sources if functional dependencies or limitations on binding patterns are present. Dependencies complicate query planning because they allow query plans that would otherwise be invalid. We present an algorithm that constructs query plans that are guaranteed to extract all available information in these more general cases. This algorithm is also able to handle datalog user queries. We examine further extensions of the languages allowed for user queries and for describing information sources: disjunction, recursion, and negation in source descriptions, and negation and inequality in user queries. For these more expressive cases, we determine the data complexity required of languages able to represent "best possible" query plans. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1598/CS-TR-97-1598.pdf %R CS-TR-97-1600 %Z Mon, 22 Dec 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T An Implementation of a Combinatorial Approximation Algorithm for Minimum-Cost Multicommodity Flow %A Goldberg, Andrew %A Oldham, Jeffrey D. %A Plotkin, Serge %A Stein, Cliff %D December 1997 %X The minimum-cost multicommodity flow problem involves simultaneously shipping multiple commodities through a single network so that the total flow obeys arc capacity constraints and has minimum cost. Multicommodity flow problems can be expressed as linear programs, and most theoretical and practical algorithms use linear-programming algorithms specialized for the problems' structures. Combinatorial approximation algorithms yield flows with costs slightly larger than the minimum cost and use capacities slightly larger than the given capacities. Theoretically, the running times of these algorithms are much less than those of linear-programming-based algorithms. We combine and modify the theoretical ideas in these approximation algorithms to yield a fast, practical implementation solving the minimum-cost multicommodity flow problem.
Experimentally, the algorithm solved our problem instances (to 1% accuracy) two to three orders of magnitude faster than the linear-programming package CPLEX and the linear-programming-based multicommodity flow program PPRN. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1600/CS-TR-97-1600.pdf %R CS-TR-97-1597 %Z Fri, 07 Nov 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Earth Mover's Distance: Lower Bounds and Invariance under Translation %A Cohen, Scott %A Guibas, Leonidas %D November 1997 %X The Earth Mover's Distance (EMD) between two finite distributions of weight is proportional to the minimum amount of work required to transform one distribution into the other. Current content-based retrieval work in the Stanford Vision Laboratory uses the EMD as a common framework for measuring image similarity with respect to color, texture, and shape content. In this report, we present some fast-to-compute lower bounds on the EMD which may allow a system to avoid exact, more expensive EMD computations during query processing. The effectiveness of the lower bounds is tested in a color-based retrieval system. In addition to the lower bound work, we also show how to compute the EMD under translation. In this problem, the points in one distribution are free to translate, and the goal is to find a translation that minimizes the EMD to the other distribution. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1597/CS-TR-97-1597.pdf %R CS-TR-97-1599 %Z Wed, 17 Dec 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Trial Banks: An Informatics Foundation for Evidence-Based Medicine %A Sim, Ida, MD, PhD %D December 1997 %X Randomized clinical trials constitute one of our main sources of medical knowledge, yet trial reports are difficult to find, read, and apply to clinical care. I propose that authors report trials both as entries into electronic knowledge bases - or trial banks - and as text articles in traditional journals. Trial banks should be interoperable, and we thus require a shared ontology of clinical-trial concepts. My thesis work is the design, implementation, and evaluation of such an ontology. Using a new approach called competency decomposition, I show that my ontology design is reasonable, and that the ontology is competent for three of the four core tasks of clinical-trials interpretation for a broad range of trial types. Using this ontology, I implemented a frame-based trial bank that can be queried dynamically over the World Wide Web. Clinical researchers successfully used this system to critique trials in the trial bank. With the advent of digital publication, we have a window of opportunity to design our publication systems such that they support the transfer of evidence from the research world to the clinic. This dissertation presents foundational work for an interoperating trial-bank system that will help us achieve the day-to-day practice of evidence-based medicine. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1599/CS-TR-97-1599.pdf %R CS-TR-97-1592 %Z Tue, 15 Jul 97 00:00:00 GMT %I Stanford University, Department of Computer Science %T Online Throughput-Competitive Algorithm for Multicast Routing and Admission Control %A Goel, Ashish %A Henzinger, Monika R. %A Plotkin, Serge %D July 1997 %X We present the first polylog-competitive online algorithm for the general multicast problem in the throughput model.
The ratio of the number of requests accepted by the optimum offline algorithm to the expected number of requests accepted by our algorithm is polylogarithmic in M and n, where M is the number of multicast groups and n is the number of nodes in the graph. We show that this is close to optimum by presenting an Omega(log n log M) lower bound on this ratio for any randomized online algorithm against an oblivious adversary. We also show that it is impossible to be competitive against an adaptive online adversary. As in previous online routing algorithms, our algorithm uses edge costs when deciding which path is best. In contrast to the previous competitive algorithms in the throughput model, our cost is not a direct function of the edge load. The new cost definition allows us to decouple the effects of routing and admission decisions of different multicast groups. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/97/1592/CS-TR-97-1592.pdf %R CS-TR-98-1602 %Z Fri, 20 Feb 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Type Systems for Object-Oriented Programming Languages %A Fisher, Kathleen %D February 1998 %X Object-oriented programming languages (OOPLs) provide important support for today's large-scale software projects. Unfortunately, typed OOPLs have suffered from overly restrictive type systems that have forced programmers to use type-casts to achieve flexibility, a notorious source of hard-to-find bugs. One source of this inflexibility is the conflation of subtyping and inheritance, which reduces potential code reuse. Attempts to fix this rigidity have resulted in unsound type systems, most notably Eiffel's. This thesis develops a sound type system for a formal object-oriented language. It gains flexibility by separating subtyping and inheritance and by supporting method specialization, which allows the types of methods to be refined during inheritance. The lack of such a mechanism is a key source of type-casts in languages like C++. Abstraction primitives in this formal language support a class construct similar to the one found in C++ and Java, explaining the link between inheritance and subtyping: object types that include implementation information are a form of abstract type, and the only way to produce a subtype of an abstract type is via inheritance. Formally, the language is presented as an object calculus. The thesis proves type soundness with respect to an operational semantics via a subject reduction theorem. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1602/CS-TR-98-1602.pdf %R CS-TR-98-1603 %Z Thu, 05 Mar 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using Complete Machine Simulation to Understand Computer System Behavior %A Herrod, Stephen Alan %D March 1998 %X This dissertation describes complete machine simulation, a novel approach to understanding the behavior of modern computer systems. Complete machine simulation models all of the hardware found in modern computer systems, allowing it to investigate the behavior of highly configurable machines running commercial operating systems and important workloads such as database and web servers. Complete machine simulation extends the applicability of traditional machine simulation techniques by addressing speed and data organization challenges. To achieve the speed needed to investigate long-running workloads, complete machine simulation allows an investigator to dynamically adjust the characteristics of its hardware simulation.
An investigator can select a high-speed, low-detail simulation setting to quickly pass through uninteresting portions of a workload's execution. Once the workload has reached a more interesting execution state, an investigator can switch to slower, more detailed simulation to obtain behavioral information. To efficiently organize low-level hardware simulation data into more useful information, complete machine simulation provides several mechanisms that incorporate higher-level workload knowledge into the data management process. These mechanisms are efficient and further improve simulation speed by customizing all data collection and reporting to the specific needs of an investigation. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1603/CS-TR-98-1603.pdf %R CS-TR-98-1604 %Z Thu, 19 Mar 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Theory and Applications of Steerable Functions %A Teo, Patrick C. %D March 1998 %X A function is called steerable if transformed versions of the function can be expressed using linear combinations of a fixed set of basis functions. In this dissertation, we propose a framework, based on Lie group theory, for studying and constructing functions steerable under any smooth transformation group. Existing analytical approaches to steerability are consistently explained within the framework. The design of a suitable set of basis functions given any arbitrary steerable function is one of the main problems concerning steerable functions. To this end, we have developed two different algorithms. The first algorithm is a symbolic method that derives the minimal set of basis functions automatically given an arbitrary steerable function. In practice, functions that need to be steered might not be steerable with a finite number of basis functions. Moreover, it is often the case that only a small subset of transformations within the group of transformations needs to be considered. In response to these two concerns, the second algorithm computes the optimal set of k basis functions to steer an arbitrary function under a subset of the group of transformations. Lastly, we demonstrate the usefulness of steerable functions in a variety of applications. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1604/CS-TR-98-1604.pdf %R CS-TR-98-1605 %Z Mon, 06 Apr 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Learning to Surf: Multiagent Systems for Adaptive Web Page Recommendation %A Balabanovic, Marko %D March 1998 %X Imagine a newspaper personalized for your tastes. Instead of a selection of articles chosen for a general audience by a human editor, a software agent picks items just for you, covering your particular topics of interest. Since there are no journalists at its disposal, the agent searches the Web for appropriate articles. Over time, it uses your feedback on recommended articles to build a model of your interests. This thesis investigates the design of "recommender systems" which create such personalized newspapers. Two research issues motivate this work and distinguish it from approaches usually taken by information retrieval or machine learning researchers. First, a recommender system will have many users, with overlapping interests. How can this be exploited? Second, each edition of a personalized newspaper consists of a small set of articles. Techniques for deciding on the relevance of individual articles are well known, but how is the composition of the set determined? 
One of the primary contributions of this research is an implemented architecture linking populations of adaptive software agents. Common interests among its users are used both to increase efficiency and scalability, and to improve the quality of recommendations. A novel interface infers document preferences by monitoring user drag-and-drop actions, and affords control over the composition of sets of recommendations. Results are presented from a variety of experiments: user tests measuring learning performance, simulation studies isolating particular tradeoffs, and usability tests investigating interaction designs. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1605/CS-TR-98-1605.pdf %R CS-TR-98-1607 %Z Mon, 18 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Network-Centric Design for Relationship-Based Rights Management %A Roscheisen, Martin %D May 1998 %X Networked environments such as the Internet provide a new platform for communication and information access. In this thesis, we address the question of how to articulate and enforce boundaries of control on top of this platform, while enabling collaboration and sharing in a peer-to-peer environment. We develop the concepts and technologies for a new Internet service layer, called FIRM, that enables structured rights/relationship management. Using a prototype implementation, RManage, we show how FIRM makes it possible to unify rights/relationship management from a user-centered perspective and to support full end-to-end integration of shared control state in network services and users' client applications. We present a network-centric architecture for managing control information, which generalizes previous, client/server-based models to a peer-to-peer environment. Principles and concepts from contract law are used to identify a generic way of representing the shared structure of different kinds of relationships. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1607/CS-TR-98-1607.pdf %R CS-TR-98-1606 %Z Thu, 14 May 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Associative Caching in Client-Server Databases %A Basu, Julie %D May 1998 %X Client-server configuration is a popular architecture for modern databases. A traditional assumption in such systems is that clients have limited resources, and query processing is always performed by the server. The server is thus a potential performance bottleneck. To improve the system performance and scalability, today's powerful clients can cache data locally. In this dissertation, we study a new scheme, A*Cache, for associative client-side caching. In contrast to navigational data access using object or page identifiers, A*Cache supports content-based associative access for better data reuse. Query results are stored locally along with their description, and predicate-based reasoning is used to examine and maintain the client cache. Clients execute queries locally if the data is cached, and use update notifications generated by the server for cache maintenance. We first describe the architecture of A*Cache and its transaction execution model. We then develop new optimization techniques for improving the performance of A*Cache. Next, A*Cache performance is investigated through detailed simulation of a client-server database under many different workloads, and compared with other types of caching systems. 
The simulation results clearly demonstrate the effectiveness of our associative caching scheme for read-only environments, and also for read-write scenarios with moderately high data update probabilities. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1606/CS-TR-98-1606.pdf %R CS-TR-98-1608 %Z Fri, 12 Jun 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T A New Perspective on Partial Evaluation and Use Analysis %A Katz, Morris J. %D June 1998 %X Partial evaluators are compile time optimizers achieving performance improvements through a program modification technique called specialization. Partial evaluators produce one or more copies, or specializations, of each procedure in a source program in the output program. Specializations are distinguished by being optimized for invocation from call sites with different characteristics, for example, placing certain constraints on argument values. Specializations are created by partially executing procedures, leaving only unexecutable portions as residual code. Symbolic execution can replace variable references by the referenced values, executed primitives by their computed results, and function applications by the bodies of the applied functions, yielding inlining. One core challenge of partial evaluation is selecting what specializations to create. Attempting to produce an infinite number of specializations results in divergence. The termination mechanism of a partial evaluator decides whether or not to symbolically execute a procedure in order to create a new specialization. Creating a termination mechanism that precludes divergence is not difficult. However, crafting a termination mechanism resulting in the production of a sufficient number of appropriate specializations to produce high quality residual code while still terminating all, or most, of the time is quite challenging. This dissertation presents a new type of analysis, called use analysis, forming the basis of a termination mechanism designed to yield a better combination of residual code quality and frequent termination than the current state-of-the-art. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1608/CS-TR-98-1608.pdf %R CS-TR-98-1601 %Z Fri, 12 Jun 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Formal Verification of Probabilistic Systems %A Alfaro, Luca de %D June 1998 %X This dissertation presents methods for the formal modeling and specification of probabilistic systems, and algorithms for the automated verification of these systems. Our system models describe the behavior of a system in terms of probability, nondeterminism, fairness and time. The formal specification languages we consider are based on extensions of branching-time temporal logics, and enable the expression of single-event and long-run average system properties. This latter class of properties, not expressible with previous formal languages, includes most of the performance properties studied in the field of performance evaluation, such as system throughput and average response time. Our choice of system models and specification languages has been guided by the goal of providing efficient verification algorithms. The algorithms rely on the theory of Markov decision processes, and exploit a connection between the graph-theoretical and probabilistic properties of these processes. 
This connection also leads to new results about classical problems, such as an extension to the solvable cases of the stochastic shortest path problem, an improved algorithm for the computation of reachability probabilities, and new results on the average reward problem for semi-Markov decision processes. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1601/CS-TR-98-1601.pdf %R CS-TR-98-1609 %Z Tue, 28 Jul 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Automated creation of clinical-practice guidelines from decision models %A Sanders, Gillian D. %D July 1998 %X I developed an approach that allows clinical-practice guideline (CPG) developers to create, disseminate, and tailor CPGs, using decision models (DMs). I propose that guideline developers can use computer-based DMs that reflect global and site-specific data to generate CPGs. Such CPGs are high quality, can be tailored to specific settings, and can be modified automatically as the DM or evidence evolves. I defined conceptual models for representing CPGs and DMs, and formalized a method for mapping between these two representations. I designed a DM annotation editor that queries the decision analyst for missing knowledge. I implemented the ALCHEMIST system that encompasses the conceptual models, mapping algorithm, and the resulting tailoring abilities. I evaluated the design of both conceptual models, and the accuracy of the mapping algorithm. To show that ALCHEMIST produces high-quality CPGs, I had users rate the quality of produced CPGs using a guideline-rating key, and evaluate ALCHEMIST's tailoring abilities. ALCHEMIST automates the DM-to-CPG process and distributes the CPG over the web to allow local developers to apply, tailor, and maintain a global CPG. I argue that my framework is a method for guideline developers to create and maintain automated CPGs, and it thus promotes high-quality and cost-effective health care. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1609/CS-TR-98-1609.pdf %R CS-TR-98-1611 %Z Tue, 15 Sep 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Approximation Algorithms for Scheduling Problems %A Chekuri, Chandra %D September 1998 %X This thesis describes efficient approximation algorithms for some NP-Hard deterministic machine scheduling and related problems. We study the objective functions of minimizing makespan (the time to complete all jobs) and minimizing average completion time in a variety of settings described below. 1. Minimizing average completion time and its weighted generalization for single and parallel machine problems. We introduce new techniques that either improve earlier results and/or result in simple and efficient approximation algorithms. In addition to improved results for specific problems, we give a general algorithm that converts an x approximate single machine schedule into a (2x + 2) approximate parallel machine schedule. 2. Minimizing makespan on machines with different speeds when jobs have precedence constraints. We obtain an O(log m) approximation (m is the number of machines) in O(n^3) time. 3. We introduce a class of new scheduling problems that arise from query optimization in parallel databases. The novel aspect consists of modeling communication costs in query execution. We devise algorithms for pipelined operator scheduling. We obtain a PTAS and also simpler O(n log n) time algorithms with ratios of 3.56 and 2.58. 4. 
Multi-dimensional generalizations of three well-known problems in combinatorial optimization: multi-processor scheduling, bin packing, and the knapsack problems. We obtain several approximability and inapproximability results. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1611/CS-TR-98-1611.pdf %R CS-TR-98-1613 %Z Fri, 02 Oct 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T On the synchronization of Poisson processes and queueing networks with service and synchronization nodes. %A Prabhakar, Balaji %A Bambos, Nicholas %A Mountford, Tom %D October 1998 %X This paper investigates the dynamics of a synchronization node in isolation, and of networks of service and synchronization nodes. A synchronization node consists of $M$ infinite capacity buffers, where tokens arriving on $M$ distinct random input flows are stored (there is one buffer for each flow). Tokens are held in the buffers until one is available from each flow. When this occurs, a token is drawn from each buffer to form a group-token, which is instantaneously released as a synchronized departure. Under independent Poisson inputs, the output of a synchronization node is shown to converge weakly (and in certain cases strongly) to a Poisson process with rate equal to the minimum rate of the input flows. Hence synchronization preserves the Poisson property, as do superposition, Bernoulli sampling and M/M/1 queueing operations. We then consider networks of synchronization and exponential server nodes with Bernoulli routing and exogenous Poisson arrivals, extending the standard Jackson Network model to include synchronization nodes. It is shown that if the synchronization skeleton of the network is acyclic (i.e. no token visits any synchronization node twice although it may visit a service node repeatedly), then the distribution of the joint queue-length process of only the service nodes is product form (under standard stability conditions) and easily computable. Moreover, the network output flows converge weakly to Poisson processes. Finally, certain results for networks with finite capacity buffers are presented, and the limiting behavior of such networks as the buffer capacities become large is studied. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1613/CS-TR-98-1613.pdf %R CS-TR-98-1614 %Z Fri, 16 Oct 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Decomposing, Transforming and Composing Diagrams: The Joys of Modular Verification %A Alfaro, Luca de %A Manna, Zohar %A Sipma, Henny %D October 1998 %X The paper proposes a modular framework for the verification of temporal logic properties of systems based on the deductive transformation and composition of diagrams. The diagrams represent abstractions of the modules composing the system, together with information about the environment of the modules. The proof of a temporal specification is constructed with the help of diagram transformation and composition rules, which enable the gradual decomposition of the system into manageable modules, the study of the modules, and the final combination of the diagrams into a proof of the specification. We illustrate our methodology with the modular verification of a database demarcation protocol.
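Returning to the synchronization node of CS-TR-98-1613 above, its rate result is easy to make plausible by simulation: with one infinite buffer per flow and instantaneous departures, the k-th group-token leaves at the maximum of the k-th arrival times over all flows, so the empirical output rate approaches the minimum input rate. A rough Python sketch (illustrative; the rates and horizon are arbitrary):

    import random

    def sync_node_output_rate(rates, horizon, seed=0):
        rng = random.Random(seed)
        arrivals = []
        for lam in rates:  # independent Poisson arrival streams
            t, times = 0.0, []
            while t < horizon:
                t += rng.expovariate(lam)
                times.append(t)
            arrivals.append(times)
        # the k-th departure occurs once every buffer holds its k-th token
        n = min(len(a) for a in arrivals)
        departures = [max(a[k] for a in arrivals) for k in range(n)]
        return sum(1 for d in departures if d <= horizon) / horizon

    print(sync_node_output_rate([3.0, 5.0], horizon=10000.0))  # close to 3.0
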
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1614/CS-TR-98-1614.pdf %R CS-TR-98-1612 %Z Thu, 01 Oct 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pleiades Project: Collected Work 1997-1998 %A Cervesato, Iliano, (editor) %A Mitchell, John C., (editor) %D October 1998 %X This report collects the papers that were written by the participants of the Pleiades Project and their collaborators from April 1997 to August 1998. Its intent is to give the reader an overview of our accomplishments during this initial phase of the project. Therefore, rather than including complete publications, we chose to reproduce only the first four pages of each paper. In order to satisfy the legitimate curiosity of readers interested in specific articles, each paper can be integrally retrieved from the World-Wide Web through the provided URL. A list of the current publications of the Pleiades Project is accessible at the URL http://theory.stanford.edu/muri/papers.html. Future articles will be posted there as they become available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1612/CS-TR-98-1612.pdf %R CS-TR-98-1615 %Z Tue, 15 Dec 98 00:00:00 GMT %I Stanford University, Department of Computer Science %T Using Machine Learning to Improve Information Access %A Sahami, Mehran %D December 1998 %X We address the problem of topical information space navigation. Specifically, we combine query tools with methods for automatically creating topic taxonomies in order to organize text collections. Our system, named SONIA (Service for Organizing Networked Information Autonomously), is implemented in the Stanford Digital Libraries testbed. It employs several novel probabilistic Machine Learning methods that enable the automatic creation of dynamic topic hierarchies based on the full-text content of documents. First, to generate such topical hierarchies, we employ a novel clustering scheme that outperforms traditional methods used in both Information Retrieval and Probabilistic Reasoning. Furthermore, we develop methods for classifying new articles into such automatically generated, or existing manually generated, hierarchies. Our method explicitly uses the hierarchical relationships between topics to improve classification accuracy. Much of this improvement is derived from the fact that the classification decisions in such a hierarchy can be made by considering only the presence (or absence) of a small number of features (words) in each document. The choice of relevant words is made using a novel information theoretic algorithm for feature selection. The algorithms used in SONIA are also general enough to have been successfully applied to data mining problems in different domains than text. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/98/1615/CS-TR-98-1615.pdf %R CS-TR-99-1617 %Z Thu, 11 Feb 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Segmentation of Medical Image Volumes Using Intrinsic Shape Information %A Shiffman, Smadar %D February 1999 %X I propose a novel approach to segmentation of image volumes that requires only a small amount of user intervention and that does not rely on prior global shape models. The approach, intrinsic shape for volume segmentation (IVSeg), comprises two methods. T he first method analyzes isolabel-contour maps to identify salient regions that correspond to major objects. 
The method detects transitions from within objects into the background by matching isolabel contours that form along the boundaries of objects as a result of multilevel thresholding with a fine partition of the intensity range. The second method searches in the entire sequence for regions that belong to an object that the user selects from one or a few sections. The method uses local overlap criteria to determine whether regions that overlap in a given direction (coronal, sagittal, or axial) belong to the same object. For extraction of blood vessels, the method derives the criteria dynamically by fitting cylinders to regions in consecutive sections and computing the expected overlap of slices of these cylinders. In a formal evaluation study with CTA data, I showed that IVSeg reduced user editing time by a factor of 5 without affecting the results in any significant way. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1617/CS-TR-99-1617.pdf %R CS-TR-99-1618 %Z Fri, 26 Mar 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Abstraction-based Deductive-Algorithmic Verification of Reactive Systems %A Uribe, Tomas E. %D March 1999 %X This thesis presents a framework that combines deductive and algorithmic methods for verifying temporal properties of reactive systems, to allow more automatic verification of general infinite-state systems and the verification of larger finite-state ones. Underlying these methods is the theory of property-preserving assertion-based abstractions, where a finite-state abstraction of the system is deductively justified and algorithmically model checked. After presenting an abstraction framework that accounts for fairness, we describe a method to automatically generate finite-state abstractions. We then show how a number of other verification methods, including deductive rules, (Generalized) Verification Diagrams, and Deductive Model Checking, can also be understood as constructing finite-state abstractions that are model checked. Our analysis leads to a better classification and understanding of these verification methods. Furthermore, it shows how the different abstractions that they construct can be combined. For this, we present an algorithmic Extended Model Checking procedure, which uses all the information that these methods produce, in a finite-state format that can be easily and incrementally combined. Besides a standard safety component, the combined abstractions include extra bounds on fair transitions, well-founded orders, and constrained transition relations for the generation of counterexamples. Thus, our approach minimizes the need for user interaction and maximizes the impact of the available automated deduction and model checking tools. Once proved, verification conditions are re-used as much as possible, leaving the temporal and combinatorial reasoning to automatic tools. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1618/CS-TR-99-1618.pdf %R CS-TR-99-1620 %Z Fri, 28 May 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finding Color and Shape Patterns in Images %A Cohen, Scott %D May 1999 %X This thesis is devoted to the Earth Mover's Distance (EMD), an edit distance between distributions, and its use within content-based image retrieval (CBIR). The major CBIR problem discussed is the pattern problem: Given an image and a query pattern, determine if the image contains a region which is visually similar to the pattern; if so, find at least one such image region.
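For one-dimensional histograms of equal total mass on unit-spaced bins, the EMD has a closed form that makes the mass-moving intuition concrete: it equals the L1 distance between the two cumulative distributions. A minimal Python sketch (illustrative only; the thesis treats far more general distributions and ground distances):

    import numpy as np

    def emd_1d(p, q):
        # equal-mass 1-D histograms on unit-spaced bins:
        # EMD = sum of |CDF(p) - CDF(q)|
        p, q = np.asarray(p, float), np.asarray(q, float)
        assert np.isclose(p.sum(), q.sum())
        return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

    print(emd_1d([1, 0, 0], [0, 0, 1]))  # 2.0: one unit of mass moved 2 bins
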
An important problem that arises in applying the EMD to CBIR is the EMD under transformation (EMD_G) problem: find a transformation of one distribution which minimizes its EMD to another, where the set of allowable transformations G is given. The problem of estimating the size/scale at which a pattern occurs in an image is phrased and efficiently solved as an EMD_G problem. For a large class of transformation sets, we also present a monotonically convergent iteration to find at least a locally optimal transformation. Our pattern problem solution is the SEDL (Scale Estimation for Directed Location) image retrieval system. Three important contributions of SEDL are (1) a general framework for finding both color and shape patterns, (2) the previously mentioned scale estimation algorithm using the EMD, and (3) a directed (as opposed to exhaustive) search strategy. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1620/CS-TR-99-1620.pdf %R CS-TR-99-1619 %Z Wed, 07 Apr 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Intelligent Alarms: Allocating Attention Among Concurrent Processes %A Huang, Cecil %D April 1999 %X I have developed and evaluated a computable, normative framework for intelligent alarms: automated agents that allocate scarce attention resources to concurrent processes in a globally optimal manner. My approach is decision-theoretic, and relies on Markov decision processes to model time-varying, stochastic systems that respond to externally applied actions. Given a collection of continuing processes and a specified time horizon, my framework computes, for each process: (1) an attention allocation, which reflects how much attention the process is awarded, and (2) an activation price, which reflects the process's priority in receiving the allocated attention amount. I have developed a prototype, Simon, that computes these alarm signals for a simulated ICU. My validity experiments investigate whether sensible input results in sensible output. The results show that Simon produces alarm signals that are consistent with sound clinical judgment. To assess computability, I used Simon to generate alarm signals for an ICU that contained 144 simulated patients; the entire computation took about 2 seconds on a machine with only moderate processing capabilities. I thus conclude that my alarm framework is valid and computable, and therefore is potentially useful in a real-world ICU setting. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1619/CS-TR-99-1619.pdf %R CS-TR-99-1623 %Z Mon, 23 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Efficient Maintenance and Recovery of Data Warehouses %A Labio, Wilburt Juan %D August 1999 %X Data warehouses collect data from multiple remote sources and integrate the information as materialized views in a local database. The materialized views are used to answer queries that analyze the collected data for patterns and trends. This type of query processing is often called on-line analytical processing (OLAP). The warehouse views must be updated when changes are made to the remote information sources. Otherwise, the answers to OLAP queries are based on stale data. Answering OLAP queries based on stale data is clearly a problem, especially if OLAP queries are used to support critical decisions made by the organization that owns the data warehouse.
Because the primary purpose of the data warehouse is to answer OLAP queries, only a limited amount of time and/or resources can be devoted to the warehouse update. Hence, we have developed new techniques to ensure that the warehouse update can be done efficiently. Also, the warehouse update is not devoid of failures. Since only a limited amount of time and/or resources are devoted to the warehouse update, it is most likely infeasible to restart the warehouse update from scratch. Thus, we have developed new techniques for resuming failed warehouse updates. Finally, warehouse updates typically transfer gigabytes of data into the warehouse. Although the price of disk storage is decreasing, there will be a point in the ``lifetime'' of a data warehouse when keeping and administering all of the collected data is unreasonable. Thus, we have investigated techniques for reducing the storage cost of a data warehouse by selectively ``expiring'' information that is not needed. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1623/CS-TR-99-1623.pdf %R CS-TR-99-1622 %Z Mon, 23 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Multicommodity and Generalized Flow Algorithms: Theory and Practice %A Oldham, Jeffrey David %D August 1999 %X We present several simple, practical, and fast algorithms for linear programs, concentrating on network flow problems. Since the late 1980s, researchers developed different combinatorial approximation algorithms for fractional packing problems, obtaining the fastest theoretical running times to solve multicommodity minimum-cost and concurrent flow problems. A direct implementation of these multicommodity flow algorithms was several orders of magnitude slower than solving these problems using a commercial linear programming solver. Through experimentation, we determined which theoretically equivalent constructs are experimentally efficient. Guided by theory, we designed and implemented practical improvements while maintaining the same worst-case complexity bounds. The resulting algorithms solve problems orders of magnitude faster than commercial linear programming solvers and problems an order of magnitude larger. We also present simple, combinatorial algorithms for generalized flow problems. These problems generalize ordinary network flow problems by specifying a flow multiplier \mu(a) for each arc a. Using multipliers permits a flow problem to model transforming one type into another, e.g., currency exchange, and modification of the amount of flow, e.g., water evaporation from canals or accrual of interest in bank accounts. First, we show the generalized shortest paths problem can be solved using existing network flow ideas, i.e., by combining the Bellman-Ford-Moore shortest path framework and Megiddo's parametric search. Second, we combine this algorithm with fractional packing frameworks to yield the first polynomial-time combinatorial approximation algorithms for the generalized versions of the nonnegative-cost minimum-cost flow, concurrent flow, multicommodity maximum flow, and multicommodity nonnegative-cost minimum-cost flow problems. These algorithms show that generalized concurrent flow and multicommodity maximum flow have strongly polynomial approximation algorithms.
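The currency-exchange reading of arc multipliers suggests a small worked instance of the generalized shortest paths problem: the most favorable conversion path maximizes the product of multipliers, which a Bellman-Ford-style relaxation finds after converting each multiplier mu to the cost -log(mu). The Python sketch below assumes no gain-producing cycles and is not the parametric-search algorithm of the report:

    import math

    def best_conversion(arcs, src, dst):
        # arcs: (from, to, multiplier); maximize the product of multipliers
        nodes = {u for u, v, m in arcs} | {v for u, v, m in arcs}
        dist = {n: math.inf for n in nodes}
        dist[src] = 0.0
        for _ in range(len(nodes) - 1):  # Bellman-Ford relaxation rounds
            for u, v, m in arcs:
                c = -math.log(m)
                if dist[u] + c < dist[v]:
                    dist[v] = dist[u] + c
        return math.exp(-dist[dst])

    arcs = [("USD", "EUR", 0.9), ("EUR", "JPY", 130.0), ("USD", "JPY", 110.0)]
    print(best_conversion(arcs, "USD", "JPY"))  # approximately 117.0, via EUR
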
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1622/CS-TR-99-1622.pdf %R CS-TR-99-1625 %Z Thu, 26 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Pleiades Project: Collected Work 1998-1999 %A Cervesato, Iliano (editor) %A Mitchell, John C. (editor) %D August 1999 %X This report collects the papers that were written by the participants of the Pleiades Project and their collaborators from September 1998 to August 1999. Its intent is to give the reader an overview of our accomplishments during this central phase of the project. Therefore, rather than including complete publications, we chose to reproduce only the first four pages of each paper. The papers can be integrally retrieved from the World-Wide Web through the provided URLs. A list of the current publications of the Pleiades Project is accessible at the URL http://theory.stanford.edu/muri/papers.html. Future articles will be posted there as they become available. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1625/CS-TR-99-1625.pdf %R CS-TR-99-1621 %Z Mon, 23 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Perceptual Metrics for Image Database Navigation %A Rubner, Yossi %D August 1999 %X The increasing amount of information available in today's world raises the need to retrieve relevant data efficiently. Unlike text-based retrieval, where keywords are successfully used to index into documents, content-based image retrieval poses up front the fundamental questions of how to extract useful image features and how to use them for intuitive retrieval. We present a novel approach to the problem of navigating through a collection of images for the purpose of image retrieval, which leads to a new paradigm for image database search. We summarize the appearance of images by distributions of color or texture features, and we define a metric between any two such distributions. This metric, which we call the "Earth Mover's Distance" (EMD), represents the least amount of work that is needed to rearrange the mass in one distribution in order to obtain the other. We show that the EMD matches perceptual dissimilarity better than other dissimilarity measures, and argue that it has many desirable properties for image retrieval. Using this metric, we employ Multi-Dimensional Scaling techniques to embed a group of images as points in a two- or three-dimensional Euclidean space so that their distances reflect image dissimilarities as well as possible. Such geometric embeddings exhibit the structure in the image set at hand, allowing the user to understand better the result of a database query and to refine the query in a perceptually intuitive way. By iterating this process, the user can quickly zoom in to the portion of the image space of interest. We also apply these techniques to other modalities such as mug-shot retrieval. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1621/CS-TR-99-1621.pdf %R CS-TR-99-1624 %Z Thu, 26 Aug 99 00:00:00 GMT %I Stanford University, Department of Computer Science %T Non-blocking Synchronization and System Design %A Greenwald, Michael %D August 1999 %X Non-blocking synchronization (NBS) has significant advantages over blocking synchronization in areas of fault-tolerance, system structure, portability, and performance. These advantages gain importance with the increased use of parallelism and multiprocessors, and as delays increase relative to processor speed.
This thesis demonstrates that non-blocking synchronization is practical as the sole co-ordination mechanism in systems by showing that careful OS design eases implementation of efficient NBS, by demonstrating that DCAS (Double-Compare-and-Swap) is the necessary and sufficient primitive for implementing NBS, and by demonstrating that efficient hardware DCAS is practical for RISC processors. This thesis presents high-performance non-blocking implementations of common data-structures sufficient to implement an operating system kernel. I also present more general algorithms: non-blocking implementations of CASn and software transactional memory. Both have overhead proportional to the number of writes, support multi-objects, and use a DCAS-based contention-reduction technique that is fault-tolerant and OS-independent yet performs as well as the best previously published techniques. I demonstrate that proposed OS implementations of DCAS are inefficient, and propose a design for efficient hardware DCAS specific to the R4000 but generalizable to other RISC processors. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/99/1624/CS-TR-99-1624.pdf %R CS-TR-00-1631 %Z Fri, 14 Apr 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Early Vision Using Distributions %A Ruzon, Mark A. %D April 2000 %X For over thirty years researchers in computer vision have been proposing new methods for performing ``early vision'' tasks such as detecting edges and corners. One key element shared by most methods is that they represent local image neighborhoods as constant in color or intensity, with deviations modeled as noise. Due to computational considerations that encourage the use of small neighborhoods where this assumption holds, these methods remain popular. This research models a neighborhood as a distribution of colors. Our goal is to show that the increase in accuracy of this representation translates into higher-quality results for early vision tasks on difficult, natural images, especially as neighborhood size increases. We emphasize large neighborhoods because small ones often do not contain enough information. We emphasize color because it subsumes greyscale as an image range and because it limits the number of valid models we should consider; using only greyscale images allows assumptions that do not hold for color. We discuss distributions in the context of three related image boundary tasks: edge detection, corner detection, and estimating alpha, or the percentage with which two colors from two objects mix to form the color of a pixel at a boundary. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1631/CS-TR-00-1631.pdf %R CS-TR-00-1630 %Z Fri, 14 Apr 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Dynamic Categorization: A Method for Decreasing Information Overload %A Pratt, Wanda %D April 2000 %X When people use computer-based tools to find answers to general questions, they often are faced with a daunting list of search results that are returned by the search engine. Many search tools address this problem by helping users to make their searches more specific. However, when dozens or hundreds of documents are relevant to their question, users need tools that help them to explore and to understand their search results, rather than ones that eliminate a portion of those results.
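The DCAS primitive at the center of CS-TR-99-1624 above atomically updates two independent words only if both still hold expected values; real implementations live in hardware or the OS. The Python sketch below merely simulates those semantics with a lock, to show the optimistic retry-loop style in which DCAS-based non-blocking algorithms are written (illustrative only):

    import threading

    class Ref:
        def __init__(self, value):
            self.value = value

    _atomic = threading.Lock()  # stands in for hardware atomicity

    def dcas(r1, old1, new1, r2, old2, new2):
        # atomically: if both locations match their expected values,
        # write both new values and report success
        with _atomic:
            if r1.value == old1 and r2.value == old2:
                r1.value, r2.value = new1, new2
                return True
            return False

    count, total = Ref(0), Ref(0)

    def add(x):
        while True:  # optimistic retry loop; no data lock is ever held
            c, t = count.value, total.value
            if dcas(count, c, c + 1, total, t, t + x):
                return
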
I have developed a new approach, called dynamic categorization, that addresses this problem by automatically organizing search results into meaningful groups that correspond to the user's query. This approach uses knowledge of important kinds of queries and a model of the domain terminology to generate a hierarchical categorization of search results. I implemented this approach for the domain of medicine, where the amount of information in the primary medical literature alone is overwhelming. Results from my evaluation show that a tool based on this approach helps users to find answers to those important kinds of questions more quickly and easily than when they use a relevance-ranking system or a clustering system. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1630/CS-TR-00-1630.pdf %R CS-TR-00-1632 %Z Fri, 02 Jun 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Finite-State Analysis of Security Protocols %A Shmatikov, Vitaly %D June 2000 %X Security protocols are notoriously difficult to design and debug. Even if the cryptographic primitives underlying a protocol are secure, unexpected interactions between parts of the protocol or several instances of the same protocol can lead to catastrophic security breaches. Since protocol attacks tend to be very subtle, some computer assistance is desirable. The main contribution of this thesis is to demonstrate how fully automatic finite-state techniques can be used to analyze a wide variety of security protocols. We present several case studies in which we model security protocols as finite-state systems, then perform automatic exhaustive state search that either discovers an attack or proves the protocol correct subject to the limitations of the model. In our first study, we analyze SSL 3.0, a widely used Internet security protocol. The second study focuses on contract signing protocols designed to guarantee properties such as fairness and accountability. All analyses were performed using a general-purpose finite-state tool called Murphi. To alleviate the state-space explosion problem, we develop several state reduction techniques that exploit fundamental properties of security protocols. These optimizations make analysis of large protocols feasible, and establish Murphi as a viable protocol analysis tool. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1632/CS-TR-00-1632.pdf %R CS-TR-00-1629 %Z Wed, 26 Jul 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Kinetic vertical decomposition trees %A Comba, Joao Luiz Dihl %D March 2000 %X This thesis presents a new structure called the Kinetic Vertical Decomposition Tree (KVD), used for the dynamic maintenance of visibility information for a set of moving objects in space. The KVD is a single structure that not only (1) allows dynamic maintenance of visibility, but also (2) represents a vertical decomposition of the space, (3) allows collision detection among moving objects, and (4) is kinetically maintained based on the kinetic data structures framework. The KVD is a special type of Binary Space Partition tree (BSP), a hierarchical data structure commonly used in solid modeling and computer graphics for feature classification and visibility determination. In the KVD, additional cuts are introduced from edges and vertices, so that a vertical decomposition is formed.
The bounded complexity of the cells in this decomposition allows the creation of certificates that indicate times when the movement of objects causes a change in the decomposition. These certificates are used within the framework of kinetic data structures to identify when the structure of the KVD changes. The update of the KVD involves local changes in the tree, accomplished by special update algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1629/CS-TR-00-1629.pdf %R CS-TR-00-1633 %Z Wed, 16 Aug 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Roma Personal Metadata Service %A Swierk, Edward %A Kiciman, Emre %A Laviano, Vince %A Baker, Mary %D August 2000 %X People now have available to them a diversity of digital storage devices for their personal files. These devices include palmtops, cell phone address books, laptops, desktop computers and web-based services. Unfortunately, as the number of personal data repositories increases, so does the management problem of ensuring that the most up-to-date version of any document is available to the user on the storage device he is currently using. We introduce the Roma personal metadata service to make it easier to locate current file versions and ensure their availability across different repositories. Roma does this through the use of a centralized, available and usually portable metadata store used by mobility-aware clients. Separating out the metadata store from the repositories eases deployment of the system, since it allows us to use existing repositories without change. In this paper we describe the design requirements, architecture and current prototype implementation of Roma. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1633/CS-TR-00-1633.pdf %R CS-TR-00-1634 %Z Wed, 30 Aug 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Simulation-Based Search for Hybrid System Control and Analysis %A Neller, Todd William %D August 2000 %X This dissertation explores new algorithmic approaches to simulation-based optimization, game-tree search, and tree search for the control and analysis of hybrid systems. Hybrid systems are systems that evolve with both discrete and continuous behaviors. Examples of hybrid systems include diverse mode-switching systems such as those we have used as focus problems: stepper motors, magnetic levitation units, and submarine detection avoidance scenarios. For hybrid systems with complex dynamics, the designer may have little other than simulation as a tool to detect design flaws or inform offline or real-time control. In approaching control and analysis of such systems, we thus limit ourselves to a black-box simulation of the system. Among our algorithmic contributions are: - the first multi-dimensional information-based optimization approach, - a generalization of previous multi-level optimization methods, - information-based alpha-beta game-tree search, - syntheses of cell-mapping and game-tree search techniques, - iterative refinement approaches for dynamic action timing discretization, - a best-first search variant with dynamic time-step refinement, - iterative refinement with an epsilon variant of recursive best-first search, and - a dispersion technique for dynamic action parameter discretization. We also formally define several hybrid system game-tree and tree search problems.
%U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1634/CS-TR-00-1634.pdf %R CS-TR-00-1635 %Z Thu, 31 Aug 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Change Management and Synchronization of Local and Shared Versions of a Controlled Vocabulary %A Oliver, Diane E. %D August 2000 %X To share clinical data and to build interoperating computer systems that permit data entry, data retrieval, and data analysis, users and systems at multiple sites must share a common controlled clinical vocabulary (or ontology). However, local sites that adopt a shared vocabulary have local needs, and local-vocabulary maintainers make changes to the local version of that vocabulary. If the local site is motivated to conform to the shared vocabulary, then the burden lies with the local site to manage its own changes and to incorporate changes from the shared version at periodic intervals. I call this process synchronization. In this dissertation, I present an approach to change management and synchronization of local and shared versions of a controlled vocabulary. I describe the CONCORDIA model, which comprises a structural model, a change model, and a log model to which the shared and local vocabularies conform. I demonstrate use of this model in the implementation of a synchronization-support tool that supports carefully controlled divergence. I evaluated my model and methods by performing synchronization on a small test set of medical concepts in the subdomain of rickettsial diseases. The CONCORDIA model served as an effective approach for representation and communication of vocabulary change. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1635/CS-TR-00-1635.pdf %R CS-TR-00-1636 %Z Thu, 07 Sep 00 00:00:00 GMT %I Stanford University, Department of Computer Science %T Design and Analysis of Fast Low Power SRAMs %A Amrutur, Bharadwaj S. %D September 2000 %X This thesis explores the design and analysis of Static Random Access Memories (SRAMs), focusing on optimizing delay and power. The SRAM access path is split into two portions: from address input to word line rise (the row decoder) and from word line rise to data output (the read data path). Techniques to optimize both of these paths are investigated. We determine the optimal decoder structure for fast low power SRAMs. Optimal decoder implementations result when the decoder, excluding the predecoder, is implemented as a binary tree. We find that skewed circuit techniques with self-resetting gates work the best and evaluate some simple sizing heuristics for low delay and power. We find that the heuristic of using equal fanouts of about 4 per stage works well even with interconnect in the decode path, provided the interconnect delay is reduced by wire sizing. For fast, lower power solutions, the heuristic of reducing the sizes of the input stage in the higher levels of the decode tree allows for good trade-offs between delay and power. The key to low power operation in the SRAM data path is to reduce the signal swings on the high capacitance nodes like the bitlines and the data lines. Clocked voltage sense amplifiers are essential for obtaining low sensing power, and accurate generation of their sense clock is required for high speed operation. We investigate tracking circuits to limit bitline and I/O line swings and aid in the generation of the sense clock to enable clocked sense amplifiers.
The tracking circuits essentially use a replica memory cell and a replica bitline to track the delay of the memory cell over a wide range of process and operating conditions. We present experimental results from two different prototypes. Finally, we look at the scaling trends in the speed and power of SRAMs with size and technology and find that the SRAM delay scales as the logarithm of its size as long as the interconnect delay is negligible. Non-scaling of threshold mismatches with process scaling causes the signal swings in the bitlines and data lines not to scale either, leading to an increase in the relative delay of an SRAM across technology generations. The wire delay starts becoming important for SRAMs beyond the 1Mb generation. Across process shrinks, the wire delay becomes worse, and wire redesign has to be done to keep the wire delay in the same proportion to the gate delay. Hierarchical SRAM structures have enough space over the array for using fat wires, and these can be used to control the wire delay for 4Mb and smaller designs across process shrinks. %U ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/00/1636/CS-TR-00-1636.pdf %R CSL-TN-99-1 %Z Fri, 18 Feb 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Graph-Oriented Model for Articulation of Ontology Interdependencies %A Mitra, Prasenjit %A Wiederhold, Gio %A Kersten, Martin L. %D August 1999 %X Ontologies are knowledge structures to explicate the contents, essential properties, and relationships between terms in a knowledge source. Many sources are now accessible with associated ontologies. Most prior work on use of ontologies relies on the construction of a single global ontology covering all sources. Such an approach is neither scalable nor maintainable, especially when the sources change frequently. We propose a scalable and easily maintainable approach based on the interoperation of ontologies. To handle user queries crossing the boundaries of the underlying information systems, the interoperation between the ontologies should be precisely defined. Our approach is to use rules that cross the semantic gap by creating an articulation or linkage between the systems. The rules are generated using a semi-automatic articulation tool with the help of a domain expert. To make the ontologies amenable for automatic composition based on the accumulated knowledge rules, we represent them using a graph-oriented model extended with a small algebraic operator set. ONION, a user-friendly toolkit, aids the experts in bridging the semantic gap in real-life settings. Our framework provides a sound foundation to simplify the work of domain experts, enables integration with public semantic dictionaries, like Wordnet, and will derive ODMG-compliant mediators automatically. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tn/99/1/CSL-TN-99-1.pdf %R CSL-TR-83-236 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of a high performance VLSI processor %A Hennessy, John L. %A Jouppi, Norman %A Przybylski, Steven %A Rowen, Christopher %A Gross, Thomas %D February 1983 %X Current VLSI fabrication technology makes it possible to design a 32-bit CPU on a single chip. However, to achieve high performance from that processor, the architecture and implementation must be carefully designed and tuned. The MIPS processor incorporates some new architectural ideas into a single-chip, nMOS implementation.
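The fanout-of-4 decoder heuristic reported in CS-TR-00-1636 above matches a textbook logical-effort estimate: splitting a path of overall electrical effort H into n stages gives a normalized delay of roughly n * (H^(1/n) + p). A back-of-the-envelope Python sketch (H and the parasitic term p are assumed values for illustration, not data from the thesis):

    H, p = 256.0, 1.0  # assumed overall effort and per-stage parasitic delay

    for n in range(2, 9):
        fanout = H ** (1.0 / n)     # equal effort per stage
        delay = n * (fanout + p)    # normalized path delay
        print(n, round(fanout, 2), round(delay, 2))
    # the minimum occurs at n = 4, i.e. a fanout of about 4 per stage
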
Processor performance is obtained by the careful integration of the software (e.g., compilers), the architecture, and the hardware implementation. This integrated view also simplifies the design, making it practical to implement the processor at a university. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/236/CSL-TR-83-236.pdf %R CSL-TR-83-240 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T ADAM: an ADA-based language for multiprocessing %A Luckham, David C. %A von Henke, Frederick W. %A Larsen, H. J. %A Stevenson, D. R. %D May 1983 %X Adam is a high level language for parallel processing. It is intended for programming resource scheduling applications, in particular supervisory packages for runtime scheduling of multiprocessing systems. An important design goal was to provide support for implementation of Ada and its runtime environment. Adam has been used to implement Ada task supervision and also as a high level target language for compilation of Ada tasking. Adam provides facilities that match the Ada sequential constructs (including subprograms, packages, exceptions, generics). In addition there are specialized module constructs for implementation of packages that may be shared between parallel processes. Adam omits the Ada real types but includes some new predefined types for scheduling. The parallel processing constructs of Adam are more primitive than Ada tasking. Some restrictions are enforced on the ways in which parallel processes can interact. A compiler for Adam has been implemented in MacLisp on DEC PDP-10 computers. Runtime support packages in Adam for scheduling (on a single CPU) and I/O are also provided. The compiler contains a library manipulation facility for separate compilation. The Adam compiler has been used to build an Ada compiler for most of the July 1980 Ada language design including task types and rendezvous constructs. This was achieved by implementing algorithms translating Ada tasking into Adam parallel processing as a preprocessor to the Adam compiler. This present Ada compiler, which has been operational since December 1980, uses a procedure call implementation of tasking (due to Haberman and Nassi and to Stevenson). It can be easily modified to other implementations. Compilation of Ada tasking into a high level target language such as Adam facilitates studying questions of correctness and efficiency of various compilation algorithms, and code optimizations specific to tasking, e.g. elimination of unnecessary threads of control. This paper gives an overview of Adam and examples of its use. Emphasis is placed on the differences from Ada. Experience using Adam to build the experimental Ada system is evaluated. Design of runtime supervisors in Adam and algorithms for translating Ada tasking to Adam processing are discussed in detail. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/240/CSL-TR-83-240.pdf %R CSL-TR-83-242 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fault simulation using ADLIB-SABLE %A Ghosh, Sumit %A vanCleemput, Willem %D March 1983 %X This technical report presents work in the area of deductive fault simulation. This technique, one of the three fault simulation techniques discussed in the literature, has been implemented in ADLIB-SABLE, a hierarchical multi-level simulator designed and used at Stanford University.
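For contrast with the deductive technique, the most literal approach is serial fault simulation: simulate the good circuit, then one faulty copy per single stuck-at fault, and flag the faults whose primary output differs under a test vector. A tiny Python sketch (illustrative; the netlist and vector are invented, and this one-fault-at-a-time style is precisely what exhibits the quadratic behavior the report improves upon):

    NAND = lambda a, b: 1 - (a & b)

    # each entry: (output net, gate function, input nets), in topological order
    netlist = [("n1", NAND, ("a", "b")),
               ("out", NAND, ("n1", "c"))]

    def evaluate(inputs, fault=None):
        values = dict(inputs)
        for net, fn, ins in netlist:
            v = fn(*(values[i] for i in ins))
            if fault is not None and fault[0] == net:
                v = fault[1]  # force the stuck-at value on this net
            values[net] = v
        return values["out"]

    test = {"a": 1, "b": 1, "c": 1}
    good = evaluate(test)
    for net in ("n1", "out"):
        for stuck in (0, 1):
            detected = evaluate(test, fault=(net, stuck)) != good
            print(net, "stuck-at-%d:" % stuck,
                  "detected" if detected else "missed")
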
Most of the fault models illustrated in this report consider only two fault types: single stuck-at-0 and single stuck-at-Z (high impedance). Gate level fault models have been built for most commonly used gates. The ability to model the fault behavior of functional blocks in ADLIB-SABLE is also demonstrated. The motivation is that for many functional blocks, a gate level description may not be available or that the designer wishes to sacrifice detailed analysis for a higher simulation speed. Functional fault models are built for many commonly used blocks, using a decomposition technique. The ratio of functional fault simulation speed to gate level fault simulation speed has been observed to be of the order of 5 for the typical functional block sizes considered. The ratio, however, is not the upper limit and will be larger for larger-sized functional blocks. It was also proved that the functional fault models are invariant with respect to the internal implementation details. A design discipline for sequential circuits is worked out which allows deductive fault simulation. Extensions to the simple (0,1) deductive techniques are studied and the fault models built in the extended domain are observed to be useful in modelling gates of some technologies. A comparison between deductive and concurrent fault simulation methods is given. Performance of deductive fault simulation, implemented in ADLIB-SABLE, shows that for sequential as well as combinational circuits, the CPU time increases linearly with increasing number of components simulated, an advantage over fault simulators which simulate one fault at a time and display a quadratic behavior. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/242/CSL-TR-83-242.pdf %R CSL-TR-83-244 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High speed image rasterization using a highly parallel smart bulk memory %D June 1983 %X VLSI technology allows the efficient realization of a class of highly parallel architectures consisting of high density semiconductor memory with an on-chip processor which accesses the memory in large sections simultaneously. A processor is described which uses this architecture to rasterize lines, polygons and text quickly, providing the rasterization support required in high performance graphic raster displays and fast page printers. This on-chip processor translates high-level low bandwidth commands into low-level high bandwidth actions on chip, where the high bandwidth can be tolerated. This architecture is capable of achieving performance comparable to the "processor per pixel" approaches while avoiding the tremendous density penalty incurred by such approaches. Consequently, it is practical to build a very high performance high resolution system from a small number of these chips. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/244/CSL-TR-83-244.pdf %R CSL-TR-83-245 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T EDT - a syntax-based program editor reference manual %A Finlayson, Ross S. %D July 1983 %X This report describes an experimental syntax-based editor that has recently been developed at Stanford. Syntax-based editors are unlike conventional text editors in that they use knowledge of the syntactic structure of the item (typically a program) being edited to provide "high level" editing operations. The editor described in this report is currently being used as an editor for programs written in Ada.
Other programming languages could also be handled by replacing the appropriate language definition files by those for another language. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/245/CSL-TR-83-245.pdf %R CSL-TR-83-247 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Maintaining the time in a distributed system %A Marzullo, Keith %A Owicki, Susan %D August 1983 %X To a client, one of the simplest services provided by a distributed system is a time service. A client simply requests the time from any set of servers, and uses any reply. The simplicity in this interaction, however, misrepresents the complexity of implementing such a service. An algorithm is needed that will keep a set of clocks synchronized, reasonably correct and accurate with respect to a standard, and able to withstand errors such as communication failures and inaccurate clocks. This paper presents a partial solution to the problem by describing two algorithms which will keep clocks both correct and synchronized. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/247/CSL-TR-83-247.pdf %R CSL-TR-83-249 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Runtime and description of deadness errors in ADA tasking %A Helmbold, D. %A Luckham, David C. %D November 1983 %X A runtime monitoring system for detecting and describing tasking errors in Ada programs is presented. Basic concepts for classifying tasking errors, called deadness errors, are defined. These concepts indicate which aspects of an Ada computation must be monitored in order to detect deadness errors resulting from attempts to rendezvous or terminate. They also provide a basis for the definition and proof of correct detection. Descriptions of deadness errors are given in terms of the basic concepts. The monitoring system has two parts: (1) a separately compiled runtime monitor that is added to any Ada source to be monitored, and (2) a preprocessor that transforms the Ada source so that necessary descriptive data is communicated to the monitor at runtime. Some basic preprocessing transformations and an abstract monitoring system for a limited class of errors were previously presented. Here an Ada implementation of a monitor and a more extensive set of preprocessing transformations are described. This system provides an experimental automated tool for detecting deadness errors in Ada83 tasking and supplies useful diagnostics. The use of the runtime monitor for debugging and for programming evasive actions to avoid imminent errors is described and examples of experiments are given. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/249/CSL-TR-83-249.pdf %R CSL-TR-83-250 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Data buffers for execution architectures %A Alpert, Donald %D November 1983 %X Directly Executed Language (DEL) architectures are derived from idealized representations of high-level languages. DEL architectures show dramatic reduction in the number of instructions and memory references executed when compared to traditional architectures. This report presents the design considerations for the data buffer in a DEL microprocessor. Simulation techniques were used to evaluate the performance of different sized buffers for a set of Pascal test programs. The results show that a buffer with 256 words typically faults on less than 5% of storage allocations.
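The agreement step behind such a time service (CSL-TR-83-247 above) is often realized by interval intersection: each server reports an interval guaranteed to contain the correct time, and the client seeks a sub-interval consistent with as many reports as possible. A compact sweep-line sketch in Python (an illustrative reconstruction, not necessarily the paper's two algorithms):

    def intersect(intervals):
        # return (count, (lo, hi)): a sub-interval lying within the largest
        # number of the given [lo, hi] time estimates
        events = sorted([(lo, -1) for lo, hi in intervals] +
                        [(hi, +1) for lo, hi in intervals])
        best = cnt = 0
        best_range = None
        for i, (x, kind) in enumerate(events):
            if kind == -1:  # an interval opens
                cnt += 1
                if cnt > best:
                    best = cnt
                    best_range = (x, events[i + 1][0])
            else:           # an interval closes
                cnt -= 1
        return best, best_range

    print(intersect([(8, 12), (11, 13), (14, 15)]))  # (2, (11, 12))
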
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/250/CSL-TR-83-250.pdf %R CSL-TR-83-251 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T GEM: a tool for concurrency specification and verification %A Lansky, Amy %A Owicki, Susan %D November 1983 %X The GEM model of concurrent computation is presented. Each GEM computation consists of a set of partially ordered events, and represents a particular concurrent execution. Language primitives for concurrency, code segments, as well as concurrency problems may be described as logic formulae (restrictions) on the domain of possible GEM computations. An event-oriented method of program verification is also presented. GEM is unique in its ability to easily describe and reason about synchronization properties. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/251/CSL-TR-83-251.pdf %R CSL-TR-83-253 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Evaluation of an interpreted architecture for Pascal on a personal computer %A Mitchell, Chad Leland %D December 1983 %X This report describes the design and implementation of an interpreter on a personal computer. The architecture interpreted was specifically designed for the execution of Pascal and belongs to the class of architectures known as Direct Correspondence Architectures. The evaluation of the interpreter provides information about the suitability of the host for this architecture and identifies features of the architecture which are not adequately supported by the host. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/83/253/CSL-TR-83-253.pdf %R CSL-TR-84-256 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Instruction selection by attributed parsing %A Ganapathi, Mahadevan %A Fischer, Charles N. %D February 1984 %X Affix grammars are used to describe the instruction-set of a target architecture for purposes of compiler code generation. A code generator is obtained automatically for a compiler using attributed parsing techniques. A compiler built on this model can automatically perform most popular machine-dependent optimizations, including peephole optimizations. Implementations of code generators based on this model exist for the VAX-11, iAPX-86, Z-8000, PDP-11 and IBM-370 architectures. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/256/CSL-TR-84-256.pdf %R CSL-TR-84-257 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Reverse synthesis compilation for architectural research %A Ganapathi, Mahadevan %A Hennessy, John %A Sarkar, Vivek %D March 1984 %X This paper discusses the development of compilation strategies for DEL architectures and tools to assist in the evaluation of their efficiency. Compilation is divided into a series of independent simpler problems. To explore optimization of code for DEL compilers, two intermediate representations are employed. One of these representations is at a lower level than target machine instructions. Machine-independent optimization is performed on this intermediate representation. The other intermediate representation has been specifically designed for compiler retargetability. It is at a higher level than the target machine. Target code generation is performed by reverse synthesis followed by attributed parsing. This technique demonstrates the feasibility of using automated table-driven code generation techniques for inflexible architectures.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/257/CSL-TR-84-257.pdf %R CSL-TR-84-258 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A strongly typed language for specifying programs %A Henke, Friedrich W. von %D January 1984 %X A language for specifying and annotating programs is presented. The language is intended to be used in connection with a strongly typed programming language. It provides a framework for the definition of specification concepts and the specification of programs by means of assertions and annotations. The language includes facilities for defining concepts axiomatically and for grouping definitions of related concepts and derived properties (lemmas) into theories. All entities in the language are required to be strongly typed; however, the language provides a very flexible type system that includes polymorphic (or generic) types. The paper presents a type checking algorithm for the language and discusses the relationship between specification language and programming language. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/258/CSL-TR-84-258.pdf %R CSL-TR-84-261 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T ANNA: a language for annotating ADA programs %A Luckham, David C. %A Henke, Friedrich W. von %A Krieg-Brueckner, Bernd %A Owe, Olaf %D July 1984 %X ANNA is a proposed language extension of Ada to include facilities for formally specifying the intended behavior of Ada programs (or portions thereof) at all stages of program development. Anna programs are Ada programs extended by formal comments. Formal comments in ANNA consist of virtual Ada text and annotations. Anna provides annotations for all Ada constructs, including declarative annotations (for variables, subtypes, subprograms, and packages), statement annotations, annotations of generic units, exception annotations, and visibility annotations. (The current Anna design does not include extensions for annotating Ada multi-tasking constructs.) Anna also includes a small number of new predefined attributes, which may appear only in annotations, e.g., the collection attribute of an access type. Since all Anna extensions appear as Ada comments, Anna programs are also legal Ada programs and acceptable to Ada translators. The semantics of annotations are defined in terms of Ada concepts; in particular, many kinds of annotations are generalizations of the Ada constraint concept. This simplifies the training of Ada programmers to use Anna for formal specification of Ada programs. Anna provides a formal framework within which different theories of formal specification may be applied to Ada. This manual also describes a translation of annotations into Ada text for run-time checking of consistency with annotations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/261/CSL-TR-84-261.pdf %R CSL-TR-84-262 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Debugging Ada tasking problems %A Helmbold, David %A Luckham, David %D July 1984 %X A new class of errors, not found in sequential languages, can result when the tasking constructs of Ada are used. These errors are called deadness errors and arise when task communication fails. Since deadness errors often occur intermittently, they are particularly hard to detect and diagnose. Previous papers describe the theory and implementation of runtime monitors to detect deadness errors in tasking programs.
The problems of detection and description of errors are different. Even when a dead state is detected, giving adequate diagnostics that enable the programmer to locate its cause in the Ada text is difficult. This paper discusses the use of simple diagnostic descriptions based on Ada tasking concepts. These diagnostics are implemented in an experimental runtime monitor. Similar facilities could be implemented in task debuggers in forthcoming Ada support environments. Their usefulness and shortcomings are illustrated in an example experiment with the runtime monitor. Possible future directions in task error monitoring and diagnosis based on formal specifications are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/262/CSL-TR-84-262.pdf %R CSL-TR-84-265 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An overview of ANNA - a specification language for ADA %A Luckham, David %A Henke, Friedrich W. von %D September 1984 %X A specification language permits information about various aspects of a program to be expressed in a precise machine-processable form. This information is not normally part of the program itself. Specification languages are viewed as evolving from modern high level programming languages. The first step in this evolution is a cautious extension of the programming language. Some of the features of Anna, a specification language extending Ada, are discussed. The extensions include generalizations of constructs (such as type constraints) that are already in Ada, and new constructs for specifying subprograms, packages, exceptions, and contexts. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/265/CSL-TR-84-265.pdf %R CSL-TR-84-259 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Organization and VLSI implementation of MIPS %A Przybylski, Steven A. %A Gross, Thomas R. %A Hennessy, John L. %A Jouppi, Norman P. %A Rowen, Christopher %D April 1984 %X MIPS is a 32-bit, high-performance processor architecture implemented as an nMOS VLSI chip. The processor uses a low-level, streamlined instruction set coupled with a fast pipeline to achieve an instruction rate of two million instructions per second. Close interaction between the processor design and compilers for the machine yields efficient execution of programs on the chip. Simplifying the instruction set and the requirements placed on the hardware by the architecture facilitates both processor control and interrupt handling in the pipeline. High speed MOS circuit design techniques and a sophisticated timing methodology enable the processor to achieve a 250 ns clock cycle. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/84/259/CSL-TR-84-259.pdf %R CSL-TR-85-270 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Model and Temporal Proof System for Networks of Processes %A Nguyen, Van %A Gries, David %A Owicki, Susan %D February 1985 %X A model and a sound and complete proof system for networks of processes in which component processes communicate exclusively through messages are given. The model, an extension of the trace model, can describe both synchronous and asynchronous networks. The proof system uses temporal-logic assertions on sequences of observations - a generalization of traces. The use of observation traces makes the proof system simple, compositional, and modular, since internal details can be hidden.
The expressive power of temporal logic makes it possible to prove temporal properties (safety, liveness, precedence, etc.) in the system. The proof system is language-independent and works for both synchronous and asynchronous networks. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/85/270/CSL-TR-85-270.pdf %R CSL-TR-86-289 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T MIPS-X instruction set and programmer's manual %A Chow, Paul %D May 1986 %X MIPS-X is a high-performance, second-generation reduced instruction set microprocessor. This document describes the visible architecture of the machine, the basic timing of the instructions, and the instruction set. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/289/CSL-TR-86-289.pdf %R CSL-TR-86-298 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Parallel program behavior - specification and abstraction using BDL %A Yan, Jerry C. %D August 1986 %X This paper describes the syntax, semantics, and usage of BDL - a Behavior Description Language for concurrent programs. BDL program models can be used to describe and abstract the behavior of real programs formulated in various computation paradigms (such as CSP, remote procedures, data-flow, and actors). BDL models are constructed from abstract computing entities known as "players". The models behave as closely as possible to the actual program in terms of message passing, player creation, and cpu usage. Although behavior abstraction using BDL only involves identifying the "redundant parts" of the computation and replacing them with simple "NO-OP" statements, proper application of this technique remains difficult and requires a thorough understanding of how the program is structured. Simulating BDL models is much more economical than instruction-level emulation while program behavior is realistically preserved. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/298/CSL-TR-86-298.pdf %R CSL-TR-86-300 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An overview of the MIPS-X-MP project %A Hennessy, John L. %A Horowitz, Mark A. %D April 1986 %X MIPS-X-MP is a research project whose end goal is to build a small (workstation-sized) multiprocessor with a total throughput of 100-200 mips. The architectural approach uses a small number (tens) of high-performance RISC-based microprocessors (10-20 mips each). The multiprocessor architecture uses software-controlled cache coherency to allow cooperation among processors without sacrificing performance of the processors. Software technology for automatically decomposing problems to allow the entire machine to be concentrated on a single problem is a key component of the research. This report surveys the four key components of the project: high performance VLSI processor architecture and design, multiprocessor architectural studies, multiprocessor programming systems, and optimizing compiler technology. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/300/CSL-TR-86-300.pdf %R CSL-TR-86-301 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The complete transformation methodology for sequential runtime checking of an ANNA subset %A Sankar, Sriram %A Rosenblum, David %D June 1986 %X We present in this report a complete description of a methodology for transformation of Anna (Annotated Ada) programs to executable self-checking Ada programs.
The methodology covers a subset of Anna that allows annotation of scalar types and objects. The allowed annotations include subtype annotations, subprogram annotations, result annotations, object annotations, out annotations, and statement annotations. Except for package state expressions and quantified expressions, the full expression language of Anna is allowed in the subset. The transformation of annotations to executable checking functions is thoroughly illustrated through informal textual description, universal checking function templates, and several transformation examples. We also describe the transformer and related software tools used to transform Anna programs. In conclusion, we describe validation of the transformer and some methods of making the transformation and runtime checking processes more efficient. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/301/CSL-TR-86-301.pdf %R CSL-TR-86-303 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The semantics of timing constructs in hardware description languages %A Luckham, David C. %A Huh, Youm %A Stanculescu, Alec G. %D August 1986 %X Three different approaches to the representation of time in high level hardware design languages are described and compared. The first is the timed assignment statement of ADLIB/SABLE, which anticipates future events. The second is the timed assignment of VHDL, which predicts future events and allows predictions to be preempted by other predictions. The third is a new proposed method of expressing time dependency by qualifying expressions so that their values are required to be constant over a specified time interval. Examples comparing these three approaches are given. It is shown how time-qualified expressions could be introduced into a hardware description language. The possibility of proving correctness of hardware models in this language is illustrated. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/303/CSL-TR-86-303.pdf %R CSL-TR-86-306 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Queueing network models for parallel processing of task systems: an operational approach %A Mak, Victor W.K. %D September 1986 %X Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. A computationally efficient approximate solution method is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. Numerical results for a number of test cases are presented and compared to those of simulations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/306/CSL-TR-86-306.pdf %R CSL-TR-86-307 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A survey of concurrent architectures %A Mak, Victor W.K. %D September 1986 %X A survey of 18 different concurrent architectures is presented in this report. Although the survey is by no means complete, it does cover a wide spectrum of both commercial and research architectures. A scheme is proposed to describe concurrent architectures along five dimensions: models of computation, interconnection network, processing element, memory system, and application areas.
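The queueing-network report above (CSL-TR-86-306) develops an approximate solution method whose details are not reproduced here. As background for the kind of computation such models involve, the C++ sketch below implements the standard exact mean-value analysis (MVA) recursion for a closed, single-class product-form network; the service demands and task population are made-up example values, not data from the report.

    // Standard exact MVA recursion for a closed, single-class product-form
    // queueing network: background only, not the report's approximation.
    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<double> D = {0.10, 0.05, 0.02}; // service demand per center (s)
        const int N = 20;                           // number of circulating tasks
        std::vector<double> Q(D.size(), 0.0);       // mean queue lengths at n = 0

        double X = 0.0;                             // system throughput
        for (int n = 1; n <= N; ++n) {
            double Rtotal = 0.0;
            std::vector<double> R(D.size());
            for (std::size_t k = 0; k < D.size(); ++k) {
                R[k] = D[k] * (1.0 + Q[k]);         // residence time at center k
                Rtotal += R[k];
            }
            X = n / Rtotal;                         // throughput with n tasks
            for (std::size_t k = 0; k < D.size(); ++k)
                Q[k] = X * R[k];                    // Little's law per center
        }
        std::cout << "throughput with N=20: " << X << " tasks/s\n";
    }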
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/307/CSL-TR-86-307.pdf %R CSL-TR-86-309 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of testbed and emulation tools %A Lundstrom, Stephen F. %A Flynn, Michael J. %D September 1986 %X The research summarized in this report was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software-based set of hierarchical tools was chosen to provide maximum flexibility, to ease moving to new computers as technology improves, and to take advantage of the inherent reliability and availability of commercially available computing systems. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/309/CSL-TR-86-309.pdf %R CSL-TR-86-310 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Dynamic resource allocation in a hierarchical multiprocessor system - a preliminary report %D October 1986 %X In this report, an integrated system approach to dynamic resource allocation is proposed. Some of the problems in dynamic resource allocation and the relationship of these problems to system structures are examined. A general dynamic resource allocation scheme is presented. A hierarchical system architecture which dynamically maps between processor structure and programs at multiple levels of instantiation is described. Simulation experiments have been conducted to study dynamic resource allocation on the proposed system. Preliminary evaluation based on simple dynamic resource allocation algorithms indicates that with the proposed system approach, the complexity of dynamic resource management could be significantly reduced while achieving reasonably effective dynamic resource allocation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/86/310/CSL-TR-86-310.pdf %R CSL-TR-87-314 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Post-game analysis -- an initial experiment for heuristic-based resource management in concurrent systems %A Yan, Jerry C %D February 1987 %X In concurrent systems, a major responsibility of the resource management system is to decide how the application program is to be mapped onto the multi-processor. Instead of using abstract program and machine models, a generate-and-test framework known as "post-game analysis," based on data gathered during program execution, is proposed. Each iteration consists of (i) (a simulation of) an execution of the program; (ii) analysis of the data gathered; and (iii) the proposal of a new mapping that would have a smaller execution time. Heuristics are applied to predict execution-time changes in response to small perturbations of the current mapping. An initial experiment was carried out using simple strategies on "pipeline-like" applications. The results obtained from four simple strategies demonstrated that for this kind of application, even simple strategies can produce acceptable speed-up with a small number of iterations.
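The generate-and-test shape of post-game analysis can be suggested with a minimal sketch. The C++ program below evaluates a task-to-processor mapping, applies small perturbations (moving one task at a time), and keeps any change that improves a predicted cost. The cost function used here (maximum per-processor load) and all numbers are stand-in assumptions; the report derives its predictions from data gathered during simulated program executions.

    // Sketch of a generate-and-test mapping loop: evaluate, perturb, keep
    // improvements. The load-balance cost function is a stand-in only.
    #include <algorithm>
    #include <iostream>
    #include <vector>

    // Hypothetical cost: the most heavily loaded processor's total work.
    double cost(const std::vector<int>& map, const std::vector<double>& w, int P) {
        std::vector<double> load(P, 0.0);
        for (std::size_t t = 0; t < map.size(); ++t) load[map[t]] += w[t];
        return *std::max_element(load.begin(), load.end());
    }

    int main() {
        const int P = 3;                             // processors
        std::vector<double> w = {5, 3, 8, 2, 7, 4};  // per-task "execution times"
        std::vector<int> map = {0, 0, 0, 1, 1, 2};   // initial mapping

        for (int iter = 0; iter < 50; ++iter) {
            double best = cost(map, w, P);
            // Perturbation: try moving each task to each other processor.
            for (std::size_t t = 0; t < map.size(); ++t)
                for (int p = 0; p < P; ++p) {
                    int old = map[t];
                    map[t] = p;
                    double c = cost(map, w, P);
                    if (c < best) best = c;          // keep the improvement
                    else map[t] = old;               // undo the move
                }
        }
        std::cout << "final makespan estimate: " << cost(map, w, P) << "\n";
    }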
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/314/CSL-TR-87-314.pdf %R CSL-TR-87-326 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SRT division diagrams and their usage in designing integrated circuits for division %A Williams, Ted E. %A Horowitz, Mark %D November 1986 %X This paper describes the construction and analysis of several diagrams which depict SRT division algorithms. These diagrams yield insight into the operation of the algorithms and the many implementation tradeoffs available in custom circuit design. Examples of simple low radix diagrams are shown, as well as tables for higher radices. The tables were generated by a program which can create and verify the diagrams for different division schemes. Also discussed is a custom CMOS integrated circuit that performs SRT division using self-timed circuit techniques. This chip implements an intermediate approach between a fully combinational array and a fully iterative-in-time method in order to get both speed and small silicon area. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/326/CSL-TR-87-326.pdf %R CSL-TR-87-333 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Managing and measuring two parallel programs on a multiprocessor %A Yan, Jerry C %D June 1987 %X Research is being conducted to determine how distributed computations can be mapped onto multiprocessors so as to minimize execution time. Instead of employing optimization techniques based on some abstract program/machine models, the approach being investigated here (called "post-game analysis") is based on placement heuristics that utilize program execution history. Although initial experiments have demonstrated that "post-game analysis" indeed discovered mappings that exhibit significantly shorter execution times than the worst cases for the programs tested, three important issues remain to be addressed: (i) evaluating the performance of placement heuristics against the "optimal" speed-up attainable; (ii) finding evidence to help explain why these heuristics work; and (iii) developing better heuristics by understanding how and why the basic set performed well. Parallel program execution was simulated using "Axe" -- an integrated environment for computation model description, processor architecture specification, discrete-time simulation, and automated data collection. Five groups of parameters are measured, representing different aspects of the concurrent execution environment: (i) overall measurements, (ii) communication parameters, (iii) cpu utilization, (iv) cpu contention, and (v) dependencies between players. Two programs were simulated -- a "pipe-line" of players and a "divide-and-conquer" program skeleton. The results showed that program execution time indeed correlated well with some of the parameters measured. It was also shown that "post-game" analysis achieved close to 96% of the optimal speed-up for both programs in most cases. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/333/CSL-TR-87-333.pdf %R CSL-TR-87-335 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Allocations of Objects Considered as Nondeterministic Expressions - Towards a More Abstract Axiomatics of Access Types %A Meldal, Sigurd %D September 1987 %X The concept of access ("reference" or "pointer") values is formalized as parametrized abstract data types, using the axiomatic method of Guttag and Horning as extended by Owe.
Two formalizations are given. The first is a formalization of the approach used in the definition of a partial correctness system for Pascal by Hoare and Wirth. Its lack of abstraction is pointed out. This is caused by the annotation language being too expressive. An approach is taken that results in a more abstract system: the expressiveness of the annotation language is reduced and the allocation operator is viewed as a nondeterministic expression. This reinterpretation of the program language results in an appropriate level of abstraction of the proof system. An example is given: the verification of a package defining a set type. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/335/CSL-TR-87-335.pdf %R CSL-TR-87-337 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of testbed and emulation tools %A Flynn, Michael J. %A Lundstrom, Stephen %D October 1987 %X In order to understand how to predict the performance of concurrent computing systems, an experimental environment is needed. The purpose of the research conducted under the grant was to investigate various aspects of this environment. A first performance prediction system was developed and evaluated (by comparison both with simulations and with actual systems). The creation of a second, complementary system is well underway. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/337/CSL-TR-87-337.pdf %R CSL-TR-87-339 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T MIPS-X: the external interface %A Salz, Arturo %A Agarwal, Anant %A Chow, Paul %D November 1987 %X MIPS-X is a 20-MIPS-peak VLSI processor designed at Stanford University. This document describes the external interface of MIPS-X and the organization of the MIPS-X processor system, including the external cache and coprocessors. The external interface has been designed to optimize the paths between the processor, the external cache, and the coprocessors. The signals used by the processor and their timing are documented here. Signal use and timings during exceptions and cache misses are also shown. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/339/CSL-TR-87-339.pdf %R CSL-TR-87-342 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Interprocedural analysis useless for code optimization %A Richardson, S. %A Ganapathi, M. %D November 1987 %X The problem of tracking data flow across procedure boundaries has a long history of theoretical study by people who believed that such information would be useful for code optimization. Building upon previous work, we have implemented an algorithm for interprocedural data flow analysis. The algorithm produces three flow-insensitive summary sets: MOD, USE, and ALIASES. The utility of the resulting information was investigated using an optimizing Pascal compiler. Over a sampling of 27 benchmarks, we found that additional optimizations performed as a result of interprocedural summary information contributed almost nothing to program execution speed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/342/CSL-TR-87-342.pdf %R CSL-TR-87-338 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sparse, distributed memory prototype: principles of operation %A Flynn, Michael J. %A Kanerva, Pentti %A Ahanin, Bahram %A Flaherty, Paul A. %A Hickey, Philip %A Bhadkamkar, Neal A.
%D February 1988 %X Sparse distributed memory is a generalized random-access memory (RAM) for long (e.g., 1,000 bit) binary words. Such words can be written into and read from the memory, and they can also be used to address the memory. The main attribute of the memory is sensitivity to similarity, meaning that a word can be read back not only by giving the original write address but also by giving one close to it as measured by the Hamming distance between addresses. Large memories of this kind are expected to have wide use in speech and scene analysis, in signal detection and verification, and in adaptive control of automated equipment---in general, in dealing with real-world information in real time. The memory can be realized as a simple, massively parallel computer. Digital technology has reached a point where building large memories is becoming practical. This research project is aimed at resolving major design issues that have to be faced in building the memories. This report describes the design of a prototype memory with 256-bit addresses and from 8K to 128K locations for 256-bit words. A key aspect of the design is extensive use of dynamic RAM and other standard components. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/87/338/CSL-TR-87-338.pdf %R CSL-TR-88-347 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Trace compaction using cache filtering with blocking %A Agarwal, Anant %D December 1987 %X Trace-driven simulation is a popular method of estimating the performance of cache memories, translation lookaside buffers, and paging schemes. Because the cost of trace-driven simulation is directly proportional to trace length, reducing the number of references in the trace significantly impacts simulation time. This paper concentrates on trace-driven simulation for cache analysis. A technique called cache filtering with blocking is presented that compresses traces by exploiting both the temporal and spatial locality in the trace. Experimental results show that this scheme can reduce trace length by nearly two orders of magnitude while introducing less than 15% error in cache miss rate estimates. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/347/CSL-TR-88-347.pdf %R CSL-TR-88-348 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Thor user's manual: tutorial and commands %A Alverson, Robert %A Blank, Tom %A Choi, Kiyoung %A Hwang, Sun Young %A Salz, Arturo %A Soule, Larry %A Rokicki, Thomas %D January 1988 %X THOR is a behavioral simulation environment intended for use with digital circuits at either the gate, register transfer, or functional levels. Models are written in the CHDL modeling language (a hardware description language based on the C programming language). Network descriptions are written in the CSL language, which supports hierarchical network descriptions. In interactive mode, batch mode, or both combined, a variety of commands are available to control execution. Simulation output can be viewed in tabular format or in waveforms. A library of components and a toolbox for building simulation models are also provided. Other tools include CSLIM, used to generate Boolean equations directly from THOR models, and an interface to other simulators (e.g., RSIM and a physical chip tester) so that two simulations can be run concurrently, verifying equivalent operation. This technical report is part one of two parts and is formatted similarly to UNIX manuals.
Part one contains the THOR tutorial and all the commands associated with THOR. Part two contains descriptions of the general-purpose functions used in models, the parts library including many TTL components, and the logic analyzer model. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/348/CSL-TR-88-348.pdf %R CSL-TR-88-349 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Thor user's manual: library functions %A Alverson, Robert %A Blank, Tom %A Choi, Kiyoung %A Hwang, Sun Young %A Salz, Arturo %A Soule, Larry %A Rokicki, Thomas %D January 1988 %X THOR is a behavioral simulation environment intended for use with digital circuits at either the gate, register transfer, or functional levels. Models are written in the CHDL modeling language (a hardware description language based on the "C" programming language). Network descriptions are written in the CSL language, which supports hierarchical network descriptions. In interactive mode, batch mode, or both combined, a variety of commands are available to control execution. Simulation output can be viewed in tabular format or in waveforms. A library of components and a toolbox for building simulation models are also provided. Other tools include CSLIM, used to generate Boolean equations directly from THOR models, and an interface to other simulators (e.g., RSIM and a physical chip tester) so that two simulations can be run concurrently, verifying equivalent operation. This technical report is part two of two parts and is formatted similarly to UNIX manuals. Part one contains the THOR tutorial and all the commands associated with THOR. Part two contains descriptions of the general-purpose functions used in models, the parts library including many TTL components, and the logic analyzer model. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/349/CSL-TR-88-349.pdf %R CSL-TR-88-350 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The ILSP behavioral description language and its graph representation for behavioral synthesis %A Odani, Masayasu %A Hwang, Sun Young %A Blank, Tom %A Rokicki, Thomas %D March 1988 %X This report describes the ILSP behavioral description language and its internal representation employed in the Hermod behavioral synthesis system. Using a combined control and data flow graph (C/DFG) as an intermediate representation, the Hermod system generates hardware modules and their interconnections from behavioral descriptions. The Hermod system is included in an integrated environment for hardware simulation and synthesis under development at Stanford University. The functional models written in ILSP can be simulated on the THOR logic/functional/behavioral simulator without translation. After proper verification of its behavior, an ILSP model can be input to the synthesizer for compilation into an RT-level description. This report consists of two parts: the specification of the ILSP language and its graph representation.
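A combined control and data flow graph of the kind the ILSP report mentions can be suggested with a small data-structure sketch. The C++ fragment below is illustrative only; the node kinds and edge sets are assumptions for the example, not the representation defined in the report.

    // Illustrative-only sketch of a combined control/data flow graph
    // (C/DFG) node, the kind of intermediate form a behavioral synthesis
    // system might use; not the ILSP representation itself.
    #include <string>
    #include <vector>

    struct CdfgNode {
        enum Kind { Operation, Branch, Merge, Loop } kind;
        std::string op;                     // e.g. "add", "mul" for Operation
        std::vector<CdfgNode*> dataPreds;   // data-flow edges: value producers
        std::vector<CdfgNode*> ctrlSuccs;   // control-flow edges: what runs next
    };

    int main() {
        // a := b + c, then branch on a
        CdfgNode add{CdfgNode::Operation, "add", {}, {}};
        CdfgNode br{CdfgNode::Branch, "", {&add}, {}};
        add.ctrlSuccs.push_back(&br);       // control passes from add to branch
        return 0;
    }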
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/350/CSL-TR-88-350.pdf %R CSL-TR-88-355 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Introductory user's guide to the architect's workbench tools %A Torrellas, Josep %A Bray, Brian %A Cuderman, Kathy %A Goldschmidt, Stephen %A Kobrin, Alan %A Zimmerman, Andrew %D May 1988 %X The Architect's Workbench is a set of simulation tools to provide insight into how the instruction set and the organization of registers and cache affect processor-memory traffic and, as a result, processor performance. This report is designed to be an introductory guide to the tools for the novice user. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/355/CSL-TR-88-355.pdf %R CSL-TR-88-358 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T InterViews: a C++ graphical interface toolkit %A Linton, Mark A. %A Calder, Paul R. %A Vlissides, John M. %D July 1988 %X We have implemented an object-oriented user interface package, called InterViews, that supports the composition of a graphical user interface from a set of interactive objects. The base class for interactive objects, called an interactor, and the base class for composite objects, called a scene, define a protocol for combining interactive behaviors. Subclasses of scenes define common types of composition: a box tiles its components, a tray allows components to overlap or constrain each other's placement, a deck stacks its components so that only one is visible, a frame adds a border, and a viewport shows part of a component. Predefined components include menus, scrollers, buttons, and text editors. InterViews also includes classes for structured text and graphics. InterViews is written in C++ and runs on top of the X window system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/358/CSL-TR-88-358.pdf %R CSL-TR-88-364 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Applying object-oriented design to structured graphics %A Vlissides, John M. %A Linton, Mark A. %D August 1988 %X Structured graphics are useful for building applications that use a direct manipulation metaphor. Object-oriented languages offer inheritance, encapsulation, and runtime binding of operations to objects. Unfortunately, standard structured graphics packages do not use an object-oriented model, and object-oriented systems do not provide general-purpose structured graphics, relying instead on low-level graphics primitives. An object-oriented approach to structured graphics can give application programmers the benefits of both paradigms. We have implemented a two-dimensional structured graphics library in C++ that presents an object-oriented model to the programmer. The graphic class defines a general graphical object from which all others are derived. The picture subclass supports hierarchical composition of graphics. Programmers can define new graphical objects either statically by subclassing or dynamically by composing instances of existing classes. We have used both this library and an earlier, non-object-oriented library to implement a MacDraw-like drawing editor. We discuss the fundamentals of the object-oriented design and its advantages based on our experiences with both libraries. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/364/CSL-TR-88-364.pdf %R CSL-TR-88-367 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An overview of VAL %A Augustin, Larry M.
%A Gennart, Benoit A. %A Huh, Youm %A Luckham, David C. %A Stanculescu, Alec G. %D October 1988 %X VAL (VHDL Annotation Language) provides a small number of new language constructs to annotate VHDL hardware descriptions. VAL annotations, added to the VHDL entity declaration in the form of formal comments, express intended behavior common to all architectural bodies of the entity. Annotations are expressed as parallel processes that accept streams of input signals and generate constraints on output streams. VAL views signals as streams of values ordered by time. Generalized timing expressions allow the designer to refer to relative points on a stream. No concept of preemptive delayed assignment or inertial delay is needed when referring to different relative points in time on a stream. The VAL abstract state model permits abstract data types to be used in specifying history-dependent device behavior. Annotations placed inside a VHDL architecture define detailed correspondences between the behavior specification and architecture. The result is a simple but expressive language extension of VHDL with possible applications to automatic checking of VHDL simulations, hierarchical design, and automatic verification of hardware designs in VHDL. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/367/CSL-TR-88-367.pdf %R CSL-TR-88-369 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Composing user interfaces with InterViews %A Linton, Mark A. %A Vlissides, John M. %A Calder, Paul R. %D November 1988 %X In this paper we show how to compose user interfaces with InterViews, a user interface toolkit we have developed at Stanford. InterViews provides a library of predefined objects and a set of protocols for composing them. A user interface is created by composing simple primitives in a hierarchical fashion, allowing complex user interfaces to be implemented easily. InterViews supports the composition of interactive objects (such as scroll bars and menus), text objects (such as words and whitespace), and graphics objects (such as circles and polygons). To illustrate how InterViews composition mechanisms facilitate the implementation of user interfaces, we present three simple applications: a dialog box built from interactive objects, a drawing editor using a hierarchy of graphical objects, and a class browser using a hierarchy of text objects. We also describe how InterViews supports consistency across applications as well as end-user customization. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/369/CSL-TR-88-369.pdf %R CSL-TR-88-373 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sparse distributed memory prototype: address module hardware guide %A Flynn, M. J. %A Zeidman, R. %A Lochner, E. %D December 1988 %X This document is a detailed specification of the hardware design of the Address Module for the prototype Sparse Distributed Memory. It contains all of the information needed to build, test, debug, modify and operate the Address Module. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/88/373/CSL-TR-88-373.pdf %R CSL-TR-89-378 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analysis of Parallelism and Deadlocks in Distributed-Time Logic Simulation %A Soule, Larry %A Gupta, Anoop %D May 1989 %X This paper explores the suitability of the Chandy-Misra algorithm for digital logic simulation.
We use four realistic circuits as benchmarks for our analysis, one of which is the vector-unit controller for the Titan supercomputer from Ardent. Our results show that the average number of logic elements available for concurrent execution ranges from 10 to 111 for the four circuits, with an overall average of 68. Although this is twice as much parallelism as that obtained by traditional event-driven algorithms for these circuits, we feel it is still too low. One major factor limiting concurrency is the large number of global synchronization points --- "deadlocks" in the Chandy-Misra terminology --- that occur during execution. Towards the goal of reducing the number of deadlocks, the paper presents a classification of the types of deadlocks that occur during digital logic simulation. Four different types are identified and described intuitively in terms of circuit structure. Using domain-specific knowledge, the paper proposes methods for reducing these deadlock occurrences. For one of the benchmark circuits, the use of the proposed techniques eliminated all deadlocks and increased the average parallelism from 40 to 160. We believe that the use of such domain knowledge will make the Chandy-Misra algorithm significantly more effective than it would be in its generic form. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/378/CSL-TR-89-378.pdf %R CSL-TR-89-379 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Two Dimensional Pinpointing: An Application of Formal Specification to Debugging Packages %A Luckham, David %D April 1989 %X New methods of testing and debugging software utilizing high-level formal specifications are presented. These methods require a new generation of support tools. Such tools must be capable of automatically comparing the runtime behavior of hierarchically structured software with high-level specifications; they must provide information about inconsistencies in terms of abstractions used in specifications. This use of specifications has several advantages over present-day debugging methods: (1) the debugging problem itself is precisely defined by specifications; (2) violations of specifications are detected automatically, thus eliminating the need to search output traces and recognize errors manually; (3) complex tests, such as tests for side-effects on global data, can be made easily; (4) the new methods are independent of any compiler and runtime environment for a programming language; (5) they apply generally to hierarchically structured software --- e.g., packages containing nested units; (6) they also apply to other life-cycle processes such as analysis of prototypes, and the use of prototypes to build formal specifications. In this paper a particular process for locating errors in software packages, called two dimensional pinpointing, is described. Tests consist of sequences of package operations (first dimension). Specifications at the highest (most abstract) level are checked first. If violations occur, then new specifications are added if possible; otherwise, checking of specifications at the next lower level (second dimension) is activated. Violation of a new specification provides more information about the error, which reduces the region of program text under suspicion. All interaction between programmer and toolset is phrased in terms of the concepts used to specify the program. Two dimensional pinpointing is presented using the Anna specification language for Ada programs.
Anna and a toolset for comparing the behavior of Ada programs with Anna specifications are described. Pinpointing techniques are then illustrated by examples. The examples involve debugging of Ada packages, for which Anna provides a rich set of specification constructs. The Anna toolset supports use of the methodology on the full Ada/Anna languages, and is being engineered to commercial standards. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/379/CSL-TR-89-379.pdf %R CSL-TR-89-380 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Unidraw: A Framework for Building Domain-Specific Graphical Editors %A Vlissides, John M. %A Linton, Mark A. %D July 1989 %X Unidraw is a framework for creating object-oriented graphical editors in domains such as technical and artistic drawing, music composition, and CAD. The Unidraw architecture simplifies the construction of these editors by providing programming abstractions that are common across domains. Unidraw defines four basic abstractions: components encapsulate the appearance and semantics of objects in a domain, tools support direct manipulation of components, commands define operations on components and other objects, and external representations define the mapping between components and the file format generated by the editor. Unidraw also supports multiple views, graphical connectivity and confinement, and dataflow between components. This paper describes the Unidraw design, implementation issues, and three prototype domain-specific editors we have developed with Unidraw: a drawing editor, a user interface builder, and a schematic capture system. Experience indicates a substantial reduction in implementation time compared with existing tools. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/380/CSL-TR-89-380.pdf %R CSL-TR-89-387 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Specification and automatic verification of self-timed queues %A Dill, David L. %A Nowick, Steven M. %A Sproull, Robert F. %D August 1989 %X Speed-independent circuit design is of increasing interest because of global timing problems in VLSI. Unfortunately, speed-independent design is very subtle. We propose the use of state-machine verification tools to ameliorate this problem. This paper illustrates issues in the modelling, specification, and verification of speed-independent circuits through consideration of self-timed queues. User-level specifications are given as Petri nets, which are translated into trace structures for automatic processing. Three different implementations of queues are considered: a chain of queue cells, two parallel chains, and a "circular buffer" example using a separate RAM. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/387/CSL-TR-89-387.pdf %R CSL-TR-89-390 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T VAL to VHDL transformer: an implementation guide %A Augustin, Larry M. %A Gennart, Benoit A. %A Huh, Youm %A Luckham, David C. %A Sahai, Bob %A Stanculescu, Alec G. %D September 1989 %X This report presents one implementation of the VAL semantics. It is based on a transformation from VAL-annotated VHDL to self-checking VHDL that is equivalent to the original source from the standpoint of simulation semantics. The transformation is performed as a sequence of tree-to-tree transformations. The report describes the semantics-preserving transformations, as well as the structure of the transformer.
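The "sequence of tree-to-tree transformations" that the transformer report describes has a general shape worth sketching, though the VAL and VHDL syntax trees themselves are not shown here. The C++ fragment below performs a bottom-up rewrite of a toy tree, replacing one hypothetical node kind by a wrapper node; all node names are invented for the illustration.

    // General shape of a recursive tree-to-tree rewriting pass, as used by
    // source-to-source transformers; the node kinds here are hypothetical,
    // not the VAL/VHDL syntax trees of the report.
    #include <memory>
    #include <string>
    #include <vector>

    struct Tree {
        std::string kind;                         // e.g. "Annotation", "Process"
        std::vector<std::unique_ptr<Tree>> kids;
    };

    // Rewrite bottom-up: transform children first, then this node.
    std::unique_ptr<Tree> rewrite(std::unique_ptr<Tree> t) {
        for (auto& k : t->kids) k = rewrite(std::move(k));
        if (t->kind == "Annotation") {
            // Replace an annotation node by a checking node that wraps it.
            auto check = std::make_unique<Tree>();
            check->kind = "CheckingProcess";
            check->kids.push_back(std::move(t));
            return check;
        }
        return t;
    }

    int main() {
        auto root = std::make_unique<Tree>();
        root->kind = "Annotation";
        root = rewrite(std::move(root));          // root is now a CheckingProcess
    }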
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/390/CSL-TR-89-390.pdf %R CSL-TR-89-395 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of Run Time Monitors for Concurrent Programs %A Helmbold, David %A Bryan, Doug %D October 1989 %X We address the problem of correctly monitoring the run time behavior of a concurrent program. We view a program as having three (potentially different) sets of behavior: computations of the original program when monitoring is not performed, computations after the monitor is added to the program, and "observations" produced by the monitor. Using these sets of behaviors, we define four properties of monitor systems: non-interference, safety, accuracy, and correctness. We define both a minimal level and a total level for each of these properties. The non-interference and safety properties address the degree to which the presence of the monitor alters a computation (the differences between the first two sets of computations). Accuracy is a relationship between a monitored computation and the observation of the computation produced by the monitor. Correctness is a relationship between observations and the unmonitored computations. A run time monitor for TSL-1 and Ada has been implemented. This monitor system uses two techniques for constructing the observation. We show that any monitoring system using these two techniques is at least minimally correct, from which the (minimal) correctness of the TSL-1 monitor follows. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/395/CSL-TR-89-395.pdf %R CSL-TR-89-396 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T COOL: a language for parallel programming %A Chandra, Rohit %A Gupta, Anoop %A Hennessy, John L. %D October 1989 %X We present COOL, an object-oriented parallel language derived from C++ by adding constructs to specify concurrent execution. We describe the language design and the facilities for creating parallelism, performing synchronization, and communicating. The basic parallel construct is the parallel function, which executes asynchronously. Synchronization support includes mutex functions and future types. A shared-memory model is assumed for parallel execution, and all communication is through shared memory. The parallel programming model of COOL has proved useful in several small programs that we have written. We present some examples and discuss the primary implementation issues. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/396/CSL-TR-89-396.pdf %R CSL-TR-89-398 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The relative effects of optimization on instruction architecture performance %A Cuderman, K. J. %A Flynn, M. J. %D October 1989 %X The Stanford Architect's Workbench is a simulation platform used to evaluate the impact of optimization on the relative performance of instruction set architectures. The total impact optimization makes on an application is the combined interaction of the optimizer, the architecture, and the cache configuration. The relative performance of seven architectures is compared using a suite of six application programs. Optimization reduces the number of executed instructions, but its effectiveness varies with architecture. Register architectures capitalize on temporaries introduced by optimization without incurring penalties for moving data.
Short instructions for register operations reduce the instruction bandwidth in addition to reducing the number of instructions. Reducing the number of executed instructions does not yield a reduction in memory traffic. Optimization only slightly alters the program working set size. An instruction cache quickly masks the effect of optimization. The result is that the instruction memory traffic remains almost constant for an application. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/398/CSL-TR-89-398.pdf %R CSL-TR-89-400 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sparse distributed memory prototype: principles and operation %A Flynn, Michael J. %A Kanerva, Pentti %A Bhadkamkar, Neil %D December 1989 %X Sparse distributed memory is a generalized random-access memory (RAM) for long (e.g., 1,000 bit) binary words. Such words can be written into and read from the memory, and they can also be used to address the memory. The main attribute of the memory is sensitivity to similarity, meaning that a word can be read back not only by giving the original write address but also by giving one close to it as measured by the Hamming distance between addresses. Large memories of this kind are expected to have wide use in speech recognition and scene analysis, in signal detection and verification, and in adaptive control of automated equipment---in general, in dealing with real-world information in real time. The memory can be realized as a simple, massively parallel computer. Digital technology has reached a point where building large memories is becoming practical. This research project is aimed at resolving major design issues that have to be faced in building the memories. This report describes the design of a prototype memory with 256-bit addresses and from 8K to 128K locations for 256-bit words. A key aspect of the design is extensive use of dynamic RAM and other standard components. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/400/CSL-TR-89-400.pdf %R CSL-TR-89-383 %Z Thu, 29 Apr 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Super-Scalar Processor Design %A Johnson, William M. %D June 1989 %X A super-scalar processor is one that is capable of sustaining an instruction-execution rate of more than one instruction per clock cycle. Maintaining this execution rate is primarily a problem of scheduling processor resources (such as functional units) for high utilization. A number of scheduling algorithms have been published, with wide-ranging claims of performance over the single-instruction issue of a scalar processor. However, a number of these claims are based on idealizations or on special-purpose applications. This study uses trace-driven simulation to evaluate many different super-scalar hardware organizations. Super-scalar performance is limited primarily by instruction-fetch inefficiencies caused by both branch delays and instruction misalignment. Because of this instruction-fetch limitation, it is not worthwhile to explore highly concurrent execution hardware. Rather, it is more appropriate to explore economical execution hardware that more closely matches the instruction throughput provided by the instruction fetcher. This study examines techniques for reducing the instruction-fetch inefficiencies and explores the resulting hardware organizations.
This study concludes that a super-scalar processor can have nearly twice the performance of a scalar processor, but that this requires four major hardware features: out-of-order execution, register renaming, branch prediction, and a four-instruction decoder. These features are interdependent, and removing any single feature reduces average performance by 18% or more. However, there are many hardware simplifications that cause only a small performance reduction. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/383/CSL-TR-89-383.pdf %R CSL-TR-89-397 %Z Wed, 05 May 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design and Clocking of VLSI Multipliers %A Santoro, Mark Ronald %D October 1989 %X This thesis presents a versatile new multiplier architecture, which can provide better performance than conventional linear array multipliers at a fraction of the silicon area. The high performance is obtained by using a new binary tree structure, the 4-2 tree. The 4-2 tree is symmetric and far more regular than other multiplier trees while offering comparable performance, making it better suited for VLSI implementations. To reduce area, a partial, pipelined 4-2 tree is used with a 4-2 carry-save accumulator placed at its outputs to iteratively sum the partial products as they are generated. Maximum performance is obtained by accurately matching the iterative clock to the pipeline rate of the 4-2 tree, using a stoppable on-chip clock generator. To prove the new architecture, a test chip called SPIM was fabricated in a 1.6 µm CMOS process. SPIM contains 41,000 transistors with an array size of 2.9 × 5.3 mm. Running at an internal clock frequency of 85 MHz, SPIM performs the 64-bit mantissa portion of a double extended precision floating-point multiply in under 120 ns. To make the new architecture commercially interesting, several high-performance rounding algorithms compatible with IEEE standard 754 for binary floating-point arithmetic have also been developed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/89/397/CSL-TR-89-397.pdf %R CSL-TR-90-410 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Tango introduction and tutorial %A Goldschmidt, Stephen R. %A Davis, Helen %D January 1990 %X Tango is a software-based multiprocessor simulator that can generate traces of synchronization events and data references. The system runs on a uniprocessor and provides a simulated multiprocessor environment. The user code is augmented during compilation to produce a compiled simulation system with optional logging. Tango offers flexible and accurate tracing by allowing the user to incorporate various memory and synchronization models. Tango achieves high efficiency by running compiled user code, by focusing on information that is of specific interest to multiprocessing studies, and by allowing the user to select the most efficient memory simulation that is appropriate for a set of experiments. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/410/CSL-TR-90-410.pdf %R CSL-TR-90-411 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Branch strategies: modeling and optimization %A Dubey, Pradeep K. %A Flynn, Michael J. %D February 1990 %X Instruction dependency introduced by conditional branch instructions, which is resolved only at run-time, can have a severe performance impact on pipelined machines. A variety of strategies are in wide use to minimize this impact.
Additional instruction traffic generated by these branch strategies can also have an adverse effect on the system performance. Therefore, in addition to the likely reduction a branch prediction strategy offers in average branch delay, the resulting excess i-traffic can be an important parameter in evaluating its overall effectiveness. The objective of this paper is twofold: to develop a model for different approaches to the branch problem and to help select an optimal strategy after taking into account the additional i-traffic generated by i-buffering. The model presented provides a flexible tool for comparing different branch strategies in terms of the reduction each offers in average branch delay and in terms of the associated cost of wasted instruction fetches. This additional criterion turns out to be a valuable consideration in choosing between two almost equally performing strategies. More importantly, it provides a better insight into the expected overall system performance. Simple, low-implementation-cost strategies based on compiler support can be very effective under certain conditions. An active branch prediction scheme based on a loop buffer can be as competitive as a branch-target-buffer-based strategy. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/411/CSL-TR-90-411.pdf %R CSL-TR-90-413 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An area-utility model for on-chip memories and its application %A Mulder, Johannes M. %A Quach, Nhon T. %A Flynn, Michael J. %D February 1990 %X Utility can be defined as quality per unit of cost. The utility of a particular function in a microprocessor can be defined as its contribution to the overall processor performance per unit of implementation cost. In the case of on-chip data memory (e.g., registers, caches), the performance contribution can be reduced to its effectiveness in reducing memory traffic or in reducing the average time to fetch operands. An important cost measure for on-chip memory is occupied area. On-chip memory performance, however, is expressed much more easily as a function of size (the storage capacity) than as a function of area. Simple models have been proposed for mapping memory size to occupied area. These models, however, are of unproven validity and only apply when comparing relatively large buffers (≥ 128 words for caches, ≥ 32 words for register sets) of the same structure (e.g., cache versus cache). In this paper we present an area model for on-chip memories. The area model considers the supplied bandwidth of the individual memory cells and includes such overhead as control logic, driver logic, and tag storage, thereby permitting comparison of data buffers of different organizations and of arbitrary sizes. The model gave less than 10% error when verified against real caches and register files. Using this area-utility measure F(Performance,Area), we first investigated the performance of various cache organizations and then compared the performance of register buffers (e.g., register sets, multiple overlapping sets) and on-chip caches. Comparing cache performance as a function of area, rather than size, leads to a significantly different set of organizational tradeoffs. Caches occupy more area per bit than register buffers for sizes of 128 words or less. For data caches, line size is a primary determinant of performance for small sizes, while write policy becomes the primary factor for larger caches.
For the same area, multiple register sets have poorer performance than a single register set with a cache, except when the memory access time is very fast (under 3 processor cycles). %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/413/CSL-TR-90-413.pdf %R CSL-TR-90-415 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High-speed addition in CMOS %A Quach, Nhon T. %A Flynn, Michael J. %D February 1990 %X This paper describes a fully static Complementary Metal-Oxide Semiconductor (CMOS) implementation of a Ling type adder. The implementation described herein saves up to one gate delay and always reduces the number of serial transistors in the worst-case (critical) path over the conventional carry look-ahead (CLA) approach with a negligible increase in hardware. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/415/CSL-TR-90-415.pdf %R CSL-TR-90-418 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Runtime Access to Type Information in C++ %A Interrante, John A. %A Linton, Mark A. %D March 1990 %X The C++ language currently does not provide a mechanism for an object to determine its type at runtime. We propose the Dossier class as a standard interface for accessing type information from within a C++ program. We have implemented a tool called mkdossier that automatically generates type information in a form that can be compiled and linked with an application. In the prototype implementation, a class must have a virtual function to access an object's dossier given the object. We propose this access be provided implicitly by the language through a predefined member in all classes. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/418/CSL-TR-90-418.pdf %R CSL-TR-90-419 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T HardwareC -- A Language for Hardware Design (Version 2.0) %A Ku, David %A DeMicheli, Giovanni %D April 1990 %X High-level synthesis is the transformation from a behavioral level specification of hardware, through a series of optimizations and translations, to an implementation in terms of logic gates and registers. The success of a high-level synthesis system is heavily dependent on how effectively the high-level language captures the ideas of the designer in a simple and understandable way. Furthermore, as system-level issues such as communication protocols and design partitioning dominate the design process, the ability to specify constraints on the timing requirements and resource utilization of a design is necessary to ensure that the design can integrate with the rest of the system. In this paper, a hardware description language called HardwareC is presented. HardwareC supports both declarative and procedural semantics, has a C-like syntax, and is extended with the notion of concurrent processes, message passing, timing constraints via tagging, resource constraints, explicit instantiation of models, and template models. The language is used as the input to the Hercules High-level Synthesis System. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/419/CSL-TR-90-419.pdf %R CSL-TR-90-423 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Implementing a Directory-Based Cache Consistency Protocol %A Simoni, Richard %D March 1990 %X Directory-based cache consistency protocols have the potential to allow shared-memory multiprocessors to scale to a large number of processors.
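As background for the kind of bookkeeping such a protocol implies, here is a minimal C sketch of a full-bit-vector directory entry handling read and write misses under invalidation; the field names, the 32-cache limit, and the printf standing in for a network message are assumptions of the sketch, not details of the design developed in the report.

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch of per-block directory state for an invalidation protocol. */
    typedef struct {
        uint32_t sharers;   /* bit i set => cache i holds a copy */
        int      dirty;     /* one cache holds the only, modified copy */
    } dir_entry_t;

    /* A read miss from cache `c`: record the new sharer.  A real protocol
       would first fetch a dirty block back from its owner; the sketch
       only updates the bookkeeping. */
    static void dir_read_miss(dir_entry_t *e, int c) {
        e->dirty = 0;
        e->sharers |= 1u << c;
    }

    /* A write miss from cache `c`: invalidate all other copies. */
    static void dir_write_miss(dir_entry_t *e, int c) {
        for (int i = 0; i < 32; i++)
            if (((e->sharers >> i) & 1) && i != c)
                printf("send invalidate to cache %d\n", i);
        e->sharers = 1u << c;
        e->dirty = 1;
    }

    int main(void) {
        dir_entry_t e = {0, 0};
        dir_read_miss(&e, 0);
        dir_read_miss(&e, 3);
        dir_write_miss(&e, 5);   /* invalidates caches 0 and 3 */
        return 0;
    }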
While many variations of these coherence schemes exist in the literature, they have typically been described at a rather high level, making adequate evaluation difficult. This paper explores the implementation issues of directory-based coherency strategies by developing a design at the level of detail needed to write a memory system functional simulator with an accurate timing model. The paper presents the design of both an invalidation coherency protocol and the associated directory/memory hardware. Support is added to prevent deadlock, handle subtle consistency situations, and implement a proper programming model of multiprocess execution. Extensions are delineated for realizing a multiple-threaded directory that can continue to process commands while waiting for a reply from a cache. The final hardware design is evaluated in the context of the number of parts required for implementation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/423/CSL-TR-90-423.pdf %R CSL-TR-90-425 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Concurrent runtime monitoring of formally specified programs %A Mandal, Manas %A Sankar, Sriram %D April 1990 %X This paper describes an application of formal specifications after an executable program has been constructed. We describe how high level specifications can be utilized to monitor critical aspects of the behavior of a program continuously while it is executing. This methodology provides a capability to distribute the monitoring of specifications on multi-processor hardware platforms to meet practical time constraints. Typically, runtime checking of formal specifications involves a significant time penalty which makes it impractical during normal production operation of a program. In previous research, runtime checking has been applied during testing and debugging of software, but not on a permanent basis. Crucial to our current methodology is the use of multi-processor machines - hence runtime monitoring can be performed concurrently on different processors. We describe techniques for distributing checks onto different processors. To control the degree of concurrency, we introduce checkpoints - points in the program beyond which execution cannot proceed until the specified checks have been completed. Error reporting and recovery in a multi-processor environment are complicated, and there are various techniques for handling them. We describe a few of these techniques in this paper. An implementation of this methodology for the Anna specification language for Ada programs is described. Results of experiments conducted on this implementation using a 12 processor Sequent Symmetry demonstrate that permanent concurrent monitoring of programs based on formal specifications is indeed feasible. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/425/CSL-TR-90-425.pdf %R CSL-TR-90-426 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A VLSI architecture for the FCHC isometric lattice gas model %A Lee, Fung F. %A Flynn, Michael J. %A Morf, Martin %D April 1990 %X Lattice gas models are cellular automata used for the simulation of fluid dynamics. This paper addresses the design issues of a lattice gas collision rule processor for the four-dimensional FCHC isometric lattice gas model. A novel VLSI architecture based on an optimized version of Henon's isometric algorithm is proposed.
One of the key concepts behind this architecture is the permutation group representation of the isometry group of the lattice. In contrast to the straightforward table lookup approach, which would take 4.5 billion bits to implement this set of collision rules, the size of our processor is only about 5000 gates. With a reasonable number of pipeline stages, the processor can deliver one result per cycle with a cycle time comparable to or less than that of a common commercial DRAM. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/426/CSL-TR-90-426.pdf %R CSL-TR-90-428 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sub-nanosecond arithmetic %A Flynn, Michael J. %A DeMicheli, Giovanni %A Dutton, Robert %A Wooley, Bruce %A Pease, R. Fabian %D May 1990 %X The SNAP (Stanford Nanosecond Arithmetic Processor) project is targeted at realizing an arithmetic processor with performance approximately an order of magnitude faster than currently available technology. The realization of SNAP is predicated on an interdisciplinary approach and effort spanning research in algorithms, data representation, CAD, circuits and devices, and packaging. SNAP is visualized as an arithmetic coprocessor implemented on an active substrate containing several chips, each of which realizes a particular arithmetic function. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/428/CSL-TR-90-428.pdf %R CSL-TR-90-431 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Latency and throughput tradeoffs in self-timed speed-independent pipelines and rings %A Williams, Ted %D August 1990 %X Asynchronous pipelines control the flow of tokens through a sequence of logical stages based on the status of local completion detectors. As in a synchronously clocked circuit, the design of self-timed pipelines can trade off between achieving low latency and high throughput. However, there are more degrees of freedom because of the variances in specific latch and function block styles, and the possibility of varying both the number of latches between function blocks and their connections to the completion detectors. This report demonstrates the utility of a graph-based methodology for analyzing the timing dependencies and uses it to make comparisons of different configurations. It is shown that the extremes for high throughput and low latency differ significantly, that the placement of the completion detectors influences timing as much as adding an additional latch, and that the choice as to whether precharged or static logic is best depends on the cost in complexity of the completion detectors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/431/CSL-TR-90-431.pdf %R CSL-TR-90-436 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Application of formal specification to software maintenance %A Madhav, Neel %A Sankar, Sriram %D August 1990 %X This paper describes the use of formal specifications and associated tools in addressing various aspects of software maintenance --- corrective, perfective, and adaptive. It also addresses the refinement of the software development process to build programs that are easily maintainable. The task of software maintenance in our case includes the task of maintaining the specification as well as maintaining the program. We focus on the use of Anna, a specification language for formally specifying Ada programs, to aid us in maintaining Ada programs.
These techniques are applicable to most other specification language and programming language environments. The tools of interest are: (1) the Anna Specification Analyzer which allows us to analyze the specification for correctness with respect to our informal understanding of program behavior; and (2) the Anna Consistency Checking System which monitors the Ada program at runtime based on the Anna specification. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/436/CSL-TR-90-436.pdf %R CSL-TR-90-438 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Methodology for Formal Specification and Implementation of Ada Packages Using Anna %A Madhav, Neel %A Mann, Walter %D August 1990 %X This paper presents a methodology for formal specification and prototype implementation of Ada packages using the Anna specification language. Specifications play an important role in the software development cycle. The methodology allows specifiers of Ada packages to follow a sequence of simple steps to formally specify packages. Given the formal specification of a package resulting from the methodology for package specifications, the methodology allows implementors of packages to follow a few simple steps to implement the package. The implementation is meant to be a prototype. This methodology for specification and implementation is applicable to most Ada packages. Limitations of this approach are pointed out at various points in the paper. We present software tools which help the process of specification and implementation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/438/CSL-TR-90-438.pdf %R CSL-TR-90-439 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Tango: A Multiprocessor Simulation and Tracing System %A Davis, Helen %A Goldschmidt, Stephen R. %D July 1990 %X Tango is a software simulation and tracing system used to obtain data for evaluating parallel programs and multiprocessor systems. The system provides a simulated multiprocessor environment by multiplexing application processes onto a single processor. Tango achieves high efficiency by running compiled user code, and by focusing on the information of greatest interest to multiprocessing studies. The system is being applied to a wide range of investigations, including algorithm studies and a variety of hardware evaluations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/439/CSL-TR-90-439.pdf %R CSL-TR-90-441 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Computing Types During Program Specialization %A Weise, Daniel %A Ruf, Erik %D October 1990 %X We have developed techniques for obtaining and using type information during program specialization (partial evaluation). Computed along with every residual expression and every specialized program is type information that bounds the possible values that the specialized program will compute at run time. The three keystones of this research are symbolic values that represent both a value and the code for creating the value, generalization of symbolic values, and the use of online fixed-point iterations for computing the type of values returned by specialized recursive functions. The specializer exploits type information to increase the efficiency of specialized functions. This research has two benefits, one anticipated and one unanticipated. 
The anticipated benefit is that programs that are to be specialized can now be written in a more natural style without losing accuracy during specialization. The unanticipated benefit is the creation of what we term concrete abstract interpretation. This is a method of performing abstract interpretation with concrete values where possible. The specializer abstracts values as needed, instead of requiring that all values be abstracted prior to abstract interpretation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/441/CSL-TR-90-441.pdf %R CSL-TR-90-442 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An improved algorithm for high-speed floating-point addition %A Quach, Nhon T. %A Flynn, Michael J. %D August 1990 %X This paper describes an improved, IEEE-conforming floating-point addition algorithm. This algorithm has only one addition step involving the significand in the worst-case path, hence offering a considerable speed advantage over the existing algorithms, which typically require two to three addition steps. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/442/CSL-TR-90-442.pdf %R CSL-TR-90-443 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A queuing analysis for disk array systems %A Ogata, Mikito %A Flynn, Michael J. %D August 1990 %X Using a queuing model of disk arrays, we study the performance and tradeoffs in disk array sub-systems and develop guidelines for designing these sub-systems in various CPU environments. Finally, we compare our model with some earlier simulation results. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/443/CSL-TR-90-443.pdf %R CSL-TR-90-453 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Event patterns: A language construct for hierarchical designs of concurrent systems %A Luckham, David D. %A Gennart, Benoit A. %D November 1990 %X Event patterns are a language construct for expressing relationships between specifications at different levels of a hierarchical design of a concurrent system. They provide a facility missing from current hardware design languages such as VHDL, or programming languages with parallel constructs such as Ada. This paper explains the use of event patterns in (1) defining mappings between different levels of a design hierarchy, and (2) automating the comparison of the behavior of different design levels during simulation. It describes the language constructs for defining event patterns and mappings, and shows their use in a design example, a 16-bit CPU. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/453/CSL-TR-90-453.pdf %R CSL-TR-90-454 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Page allocation to reduce access time of physical caches %A Bray, Brian K. %A Lynch, William L. %A Flynn, Michael J. %D November 1990 %X A simple modification to an operating system's page allocation algorithm can give physically addressed caches the speed of virtually addressed caches. Colored page allocation reduces the number of bits that need to be translated before cache access, allowing large low-associativity caches to be indexed before address translation, which reduces the latency to the processor. The colored allocation also has other benefits: caches miss less (in general) and more uniformly, and the inclusion principle holds for second level caches with less associativity.
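A minimal sketch of how a colored allocator can work, assuming per-color free lists and a color derived from the virtual page number (standard choices for this technique, not details confirmed by the report):

    #include <stdio.h>

    /* Sketch of colored page allocation: a free page is taken from the
       free list whose color matches the virtual page number, so the
       cache-index bits above the page offset agree for the virtual and
       physical address.  In a real system NUM_COLORS would be derived
       from the cache and page sizes; 4 is an arbitrary value here. */
    #define NUM_COLORS 4

    static int free_list[NUM_COLORS][8];   /* physical page numbers, by color */
    static int free_count[NUM_COLORS];

    static int alloc_page(unsigned long vpn) {
        int color = vpn % NUM_COLORS;       /* the page's required color */
        if (free_count[color] == 0)
            return -1;                      /* caller must reclaim or wait */
        return free_list[color][--free_count[color]];
    }

    int main(void) {
        /* seed each color's list with physical pages of that color */
        for (int ppn = 0; ppn < 32; ppn++) {
            int c = ppn % NUM_COLORS;
            free_list[c][free_count[c]++] = ppn;
        }
        printf("vpn 5 -> ppn %d\n", alloc_page(5));   /* a page with ppn % 4 == 1 */
        return 0;
    }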
However, the colored allocation requires main memory partitioning, and more common bits for shared virtual addresses. Simulation results show high non-uniformity of cache miss rates for normal allocation. Analysis demonstrates the extent of second-level cache inclusion, and the reduction in effective main memory due to partitioning. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/90/454/CSL-TR-90-454.pdf %R CSL-TR-91-459 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On fast IEEE rounding %A Quach, Nhon %A Takagi, Naofumi %A Flynn, Michael J. %D January 1991 %X A systematic general rounding procedure is proposed. This procedure consists of two steps: constructing a rounding table and selecting a prediction scheme. Optimization guidelines are given in each step to minimize the hardware used. This procedure-based rounding method has the additional advantage that verification and generalization are trivial. Two rounding hardware models are described. The first is shown to be identical to that reported by Santoro, et al. The second is more powerful, providing solutions where the first fails. Applying this approach to the IEEE rounding modes for high-speed conventional binary multipliers reveals that round to infinity is more difficult to implement than the round to nearest mode; more adders are potentially needed. Round to zero requires the least amount of hardware. A generalization of this procedure to redundant binary multipliers reveals two major advantages over conventional binary multipliers. First, the computation of the sticky bit consumes considerably less hardware. Second, implementing the round to plus and minus infinity modes does not require the examination of the sticky bit, removing a possible worst-case path. A generalization of this approach to addition produces a similar solution to that reported by Quach and Flynn. Although generalizable to other kinds of rounding as well as other arithmetic operations, we only treat the case of IEEE rounding for addition and multiplication; IEEE rounding because it is the current standard on rounding, addition and multiplication because they are the most frequently used arithmetic operations in a typical scientific computation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/459/CSL-TR-91-459.pdf %R CSL-TR-91-463 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Leading One Detection --- Implementation, Generalization, and Application %A Quach, Nhon %A Flynn, Michael J. %D March 1991 %X This paper presents the concept of leading-one prediction (LOP) in greater detail and describes two existing implementations. The first one is similar to that used in the IBM RS/6000 processor. The second is a distributed version of the first, consuming less hardware when multiple patterns need to be detected. We show how to modify these circuits for sign-magnitude numbers as dictated by the IEEE standard. We then point out that (1) LOP and carry lookahead in parallel addition belong to the same class of problem, that of bit pattern detection. This recognition allows techniques developed for parallel addition to be borrowed for bit pattern detection. And (2) LOP can be applied to compute the sticky bit needed for binary multipliers to perform IEEE rounding.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/463/CSL-TR-91-463.pdf %R CSL-TR-91-468 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Efficient moment-based timing analysis for variable accuracy switch level simulation %A Kao, Russell %A Horowitz, Mark %D April 1991 %X We describe a timing analysis algorithm which can achieve the efficiency of RC tree analysis while retaining much of the generality of Asymptotic Waveform Evaluation. RC tree analysis from switch level simulation is generalized to handle piecewise linear transistor models, non-tree topologies, floating capacitors, and feedback. For simple switch level models the complexity is O(n). The algorithm allows the user to trade off efficiency versus accuracy through the selection of transistor models of varying complexity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/468/CSL-TR-91-468.pdf %R CSL-TR-91-469 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SPLASH: Stanford parallel applications for shared-memory %A Singh, Jaswinder Pal %A Weber, Wolf-Dietrich %A Gupta, Anoop %D April 1991 %X This report was replaced and updated by CSL-TR-92-526 %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/469/CSL-TR-91-469.pdf %R CSL-TR-91-470 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Write caches as an alternative to write buffers %A Bray, Brian K. %A Flynn, Michael J. %D April 1991 %X Write buffers help unbind one level of a memory hierarchy from the next; thus, write buffers are used to reduce write stalls. Write buffers are used in write-through systems so that writes can occur at the rate the cache can handle them, but they neither reduce the number of writes nor cluster writes for block transfers. A write cache is a cache that uses an allocate on write miss, write-back, no allocate on read miss strategy. A write cache tries to reduce the total number of writes (write traffic) to the next level by taking advantage of the temporal locality of writes. A write cache also groups writes for block transfers by taking advantage of the spatial locality of writes. We have found that small write caches can significantly reduce the write traffic to the first write-back level after the processor's register set. Systems that would benefit from reduced write traffic to the first write-back level would benefit from using a write cache instead of a write buffer. The temporal and spatial locality of writes is very important in determining what organization the write cache should have. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/470/CSL-TR-91-470.pdf %R CSL-TR-91-475 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Making effective use of shared-memory multiprocessors: the process control approach %A Gupta, Anoop %A Tucker, Andrew %A Stevens, Luis %D May 1991 %X We present the design, implementation, and performance of a novel approach for effectively utilizing shared-memory multiprocessors in the presence of multiprogramming. Our approach offers high performance by combining the techniques of process control and processor partitioning. The process control technique is based on the principle that to maximize performance, a parallel application must dynamically match the number of runnable processes associated with it to the effective number of processors available to it.
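The matching step at the heart of process control can be sketched in a few lines of C; processors_available(), suspend_worker(), and resume_worker() are hypothetical stand-ins for whatever kernel or server interface actually provides this information, not a real API:

    #include <stdio.h>

    #define MAX_WORKERS 16

    static int running = MAX_WORKERS;       /* workers currently runnable */

    static int processors_available(void) { return 4; }  /* stand-in value */

    static void suspend_worker(void) { running--; }      /* stand-ins for the */
    static void resume_worker(void)  { running++; }      /* real suspend/resume */

    /* Periodically match runnable workers to available processors. */
    static void process_control_step(void) {
        int avail = processors_available();
        if (avail > MAX_WORKERS) avail = MAX_WORKERS;
        while (running > avail) suspend_worker();   /* avoid oblivious preemption */
        while (running < avail) resume_worker();    /* use newly freed processors */
    }

    int main(void) {
        process_control_step();
        printf("runnable workers: %d\n", running);  /* now matches the 4 CPUs */
        return 0;
    }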
This avoids the problems arising from oblivious preemption of processes and it allows an application to work at a better operating point on its speedup versus processors curve. Processor partitioning is necessary for dealing with realistic multiprogramming environments, where both process controlled and non-controlled applications may be present. It also helps improve the cache performance of applications and removes the bottleneck associated with a single centralized scheduler. Preliminary results from an implementation of the process control approach, with a user-level server, were presented in a previous paper. In this paper, we extend the process control approach to work with processor partitioning and fully integrate the approach with the operating system kernel. This also allows us to address a limitation in our earlier implementation wherein a close correspondence between runnable processes and the available processors was not maintained in the presence of I/O. The paper presents the design decisions and the rationale for the current implementation, along with extensive results from executions on a high-performance Silicon Graphics 4D/340. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/475/CSL-TR-91-475.pdf %R CSL-TR-91-480 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Strategies for branch target buffers %A Bray, Brian K. %A Flynn, M. J. %D June 1991 %X Achieving high instruction issue rates depends on the ability to dynamically predict branches. We compare two schemes for dynamic branch prediction: a separate branch target buffer and an instruction cache based branch target buffer. For instruction caches of 4KB and greater, instruction cache based branch prediction performance is a strong function of line size, and a weak function of instruction cache size. An instruction cache based branch target buffer with a line size of 8 (or 4) instructions performs about as well as a separate branch target buffer structure which has 64 (or 256, respectively) entries. Software can rearrange basic blocks in a procedure to reduce the number of taken branches, thus reducing the amount of branch prediction hardware needed. With software assistance, predicting all branches as not branching performs as well as a 4 entry branch target buffer without assistance, and a 4 entry branch target buffer with assistance performs as well as a 32 entry branch target buffer without assistance. The instruction cache based branch target buffer also benefits from the software, but only for line sizes of more than 4 instructions. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/480/CSL-TR-91-480.pdf %R CSL-TR-91-481 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Subnanosecond arithmetic (Second Report) %A Flynn, Michael J. %A DeMicheli, Giovanni %A Dutton, Robert %A Pease, R. Fabian %A Wooley, Bruce %D June 1991 %X The Stanford Nanosecond Arithmetic Project is targeted at realizing an arithmetic processor with performance approximately an order of magnitude faster than currently available technology. The realization of SNAP is predicated on an interdisciplinary approach and effort spanning research in algorithms, data representation, CAD, circuits and devices, and packaging. SNAP is visualized as an arithmetic coprocessor implemented on an active substrate containing several chips, each of which realizes a particular arithmetic function. This year's report highlights recent results in the area of wave pipelining.
We have fabricated a number of prototype dies implementing a multiplier slice. Cycle times below 5 ns were realized. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/481/CSL-TR-91-481.pdf %R CSL-TR-91-483 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Suggestions for implementing a fast IEEE multiply-add-fused instruction %A Quach, Nhon %A Flynn, Michael %D July 1991 %X We studied three possible strategies to overlap the operations in a floating-point add (FPA) and a floating-point multiply (FPM) for implementing an IEEE multiply-add-fused (MAF) instruction. The operations in FPM and FPA are: (a) non-overlapped, (b) fully-overlapped, and (c) partially-overlapped. The first strategy corresponds to the multiply-add-chained (MAC) approach widely used in vector processors. The second (Greedy) strategy uses a greedy algorithm, yielding an implementation similar to the IBM RS/6000 one. The third and final (SNAP) strategy uses a less aggressive starting configuration and corresponds to the SNAP implementation. An IEEE MAF delivers the same result as that obtained via a separate IEEE FPM and FPA. Two observations have prompted this study. First, in the IBM RS/6000 implementation, the design tradeoffs have been made for high internal data precision, which facilitates the execution of elementary functions. These tradeoff decisions, however, may not be valid for an IEEE MAF. Second, the RS/6000 implementation assumed a different critical path for FPA and FPM, which does not reflect the current state-of-the-art in FP technology. Using latency and hardware costs as the performance metrics, we show that: (1) MAC has the lowest FPA latency and consumes the least hardware. But its MAF latency is the highest. (2) Greedy has a medium MAF latency but the highest FPA latency. And (3) SNAP has the lowest MAF latency and a slightly higher FPA latency than that of MAC, consuming an area that is comparable to that of Greedy. Both Greedy and SNAP have higher design complexity arising from rounding for the IEEE standard. SNAP has an additional wire complexity, which Greedy does not have because of its simpler datapath. If rounding for the IEEE standard is not an issue, the Greedy strategy --- and therefore the RS/6000 --- seems reasonable for applications with a high MAF to FPA ratio. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/483/CSL-TR-91-483.pdf %R CSL-TR-91-485 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Unidraw-Based User Interface Builder %A Vlissides, John M. %A Tang, Steven %D August 1991 %X Ibuild is a user interface builder that lets a user manipulate simulations of toolkit objects rather than actual toolkit objects. Ibuild is built with Unidraw, a framework for building graphical editors that is part of the InterViews toolkit. Unidraw makes the simulation-based approach attractive. Simulating toolkit objects in Unidraw makes it easier to support editing facilities that are common in other kinds of graphical editors, and it keeps the builder insulated from a particular toolkit implementation. Ibuild supports direct manipulation analogs of InterViews' composition mechanisms, which simplify the specification of an interface's layout and resize semantics. Ibuild also leverages the C++ inheritance mechanism to decouple builder-generated code from the rest of the application.
And while current user interface builders stop at the widget level, ibuild incorporates Unidraw abstractions to simplify the implementation of graphical editors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/485/CSL-TR-91-485.pdf %R CSL-TR-91-488 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Stanford Ada style checker: an application of the ANNA tools and methodology %A Walicki, Michal %A Skakkebaek, Jens Ulrik %A Sankar, Sriram %D August 1991 %X This report describes the Ada style checker, which was designed and constructed in Winter and Spring 1989-90. The style checker is based on the Stanford Anna Tools and has been annotated using Anna. The style checker examines Ada programs for "correct style'' which is defined in a style specification language (SSL). A style checker generator is used to automatically generate a style checker based on a set of style specifications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/488/CSL-TR-91-488.pdf %R CSL-TR-91-492 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Paging Performance with Page Coloring. %A Lynch, William L. %A Flynn, Michael J. %D October 1991 %X Constraining the mapping of virtual to physical addresses (page coloring) can speed and/or simplify caches in the presence of virtual memory. For the mapping to hold, physical memory must be partitioned into distinct colors, and virtual pages must be allocated to physical pages of the color determined by the mapping. This paper uses an analytical model and simulation to compare the paging effects of colored versus uncolored (conventional) page allocation, and concludes that these effects are small. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/492/CSL-TR-91-492.pdf %R CSL-TR-91-496 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T ANNA package specification: case studies %A Kenney, John %A Mann, Walter %D October 1991 %X We present techniques of software specification of Ada* software based on the Anna specification language and examples of Ada packages formally specified in Anna. A package specification for an abstract set type is used to illustrate the techniques and pitfalls involved in the process of software specification and development. This specification not only exemplifies good Anna style and specification approach, but has a secondary goal of teaching the reader how to use Anna and the associated set of Anna tools developed at Stanford University over the past six years. The technical report thus aims to give readers a new way of looking at the software design and development process, synthesizing fifteen years of research in the process. *Ada is a registered trademark of the U.S. Government (Ada Joint Program Office) %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/496/CSL-TR-91-496.pdf %R CSL-TR-91-498 %Z Wed, 30 Mar 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory. %T Spectral Techniques for Technology Mapping %A Yang, Jerry Chih-Yuan %A DeMicheli, Giovanni %D March 1994 %X Technology mapping is the crucial step in logic synthesis where technology dependent optimizations take place. The matching phase of a technology mapping algorithm is generally considered the most computationally intensive task, because it is called on repeatedly. In this work, we investigate applications of spectral techniques in doing matching. In particular, we present an algorithm that will detect NPN-equivalent Boolean functions.
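For orientation, two functions are NPN-equivalent when one can be obtained from the other by permuting inputs, complementing inputs, and possibly complementing the output. The brute-force canonical form below, written for 3-input functions with 8-bit truth tables, is the naive search that signature-based pruning is meant to avoid; it is an illustrative sketch, not the Specter algorithm:

    #include <stdint.h>
    #include <stdio.h>

    static const int perms[6][3] = {
        {0,1,2},{0,2,1},{1,0,2},{1,2,0},{2,0,1},{2,1,0}
    };

    /* Apply an input permutation and input-complement mask to a truth table. */
    static uint8_t transform(uint8_t tt, const int *p, int negmask) {
        uint8_t out = 0;
        for (int v = 0; v < 8; v++) {           /* each input assignment */
            int w = 0;
            for (int j = 0; j < 3; j++)         /* permute the input bits */
                if ((v >> j) & 1) w |= 1 << p[j];
            w ^= negmask;                       /* complement selected inputs */
            if ((tt >> w) & 1) out |= 1 << v;
        }
        return out;
    }

    /* Canonical form: minimum truth table over the whole NPN orbit. */
    static uint8_t npn_canonical(uint8_t tt) {
        uint8_t best = 0xFF;
        for (int p = 0; p < 6; p++)
            for (int nm = 0; nm < 8; nm++) {
                uint8_t t = transform(tt, perms[p], nm);
                if (t < best) best = t;
                if ((uint8_t)~t < best) best = (uint8_t)~t;  /* output negation */
            }
        return best;
    }

    int main(void) {
        /* AND(a,b,c) = 0x80 and NOR(a,b,c) = 0x01 are NPN-equivalent */
        printf("%02x %02x\n", npn_canonical(0x80), npn_canonical(0x01));
        return 0;
    }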
We show that while generating the spectra for Boolean functions may be expensive, this algorithm offers significant pruning of the search space and is simple to implement. The algorithm is implemented as part of the Specter technology mapper, and results are compared to other Boolean matching techniques. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/498/CSL-TR-91-498.pdf %R CSL-TR-91-484 %Z Mon, 26 Jan 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Self-Consistency and Transitivity in Self-Calibration Procedures %A Raugh, Michael %D July 1991 %X Self-calibration refers to the use of an uncalibrated measuring instrument and an uncalibrated object called an artifact, such as a rigid marked plate, to simultaneously measure the artifact and calibrate the instrument. Typically, the artifact is measured in more than one position, and the required information is derived from comparisons of the various measurements. The problems of self-calibration are surprisingly subtle. This paper develops concepts and vocabulary for dealing with such problems in one and two dimensions and uses simple (non-optimal) measurement procedures to reveal the underlying principles. The approach in two dimensions is mathematically constructive: procedures are described for measuring an uncalibrated artifact in several stages, involving progressive transformations of the instrument's uncalibrated coordinate system, until correct coordinates for the artifact are obtained and calibration of the instrument is achieved. Self-consistency and transitivity, as defined within, emerge as key concepts. It is shown that self-consistency and transitivity are necessary conditions for self-calibration. Consequently, in general, it is impossible to calibrate a two-dimensional measuring instrument by simply rotating and measuring a calibration plate about a fixed center. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/484/CSL-TR-91-484.pdf %R CSL-TR-91-465 %Z Mon, 28 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analysis of Power Supply Networks in VLSI Circuits %A Stark, Don %D March 1991 %X Although the trend toward finer geometries and larger chips has produced faster systems, it has also created larger voltage drops and higher current densities in chip power supply networks. Excessive voltage drops in the power supply lines cause incorrect circuit operation, and high current densities lead to circuit failure via electromigration. Analyzing this power supply noise by hand for large circuits is difficult and error prone; automatic checking tools are needed to make the analysis easier. This thesis describes Ariel, a CAD tool that helps VLSI designers analyze power supply noise. The system consists of three main components, a resistance extractor, a current estimator, and a linear solver, that are used together to determine the voltage drops and current density along the supply lines. The resistance extractor includes two parts: a fast extractor that calculates resistances quickly using simple heuristics, and a slower, more accurate finite element extractor. Despite its simplicity, the fast extractor obtained nearly the same results as the finite element one and is two orders of magnitude faster. The system also contains two current estimators, one for CMOS designs and one for ECL.
The CMOS current estimator is based on the switch level simulator Rsim, and produces a time-varying current distribution that includes the effects of charge sharing, image currents, and the slope of the gate inputs. The ECL estimator does a static analysis of the design, calculating each gate's tail current and tracing through the network to find where it enters the power supplies. Extensions to the estimator allow it to handle more complex circuits, such as shared current lines and diode decoders. Finally, the linear solver applies this current pattern to the resistance network, and efficiently calculates voltages and current densities by taking advantage of topological characteristics peculiar to power supply networks. It removes trees, simple loops, and series sections for separate analysis. These techniques substantially reduce the time required for solution. This report also includes the results of running the system on several large designs, and points out flaws that Ariel uncovered in their power networks. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/91/465/CSL-TR-91-465.pdf %R CSL-TR-92-510 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Rapide-0.2 Examples %A Hsieh, Alexander %D February 1992 %X Rapide-0.2 is an executable language for prototyping distributed, time-sensitive systems. We present in this report a series of simple, working example programs in the language. In each example we present one or more new concepts or constructs of the Rapide-0.2 language with later examples drawing on previously presented material. The examples are written for both those who wish to use the Rapide-0.2 language to do serious prototyping and for those who just wish to be familiar with it. The examples were not written for someone who wishes to learn prototyping in general. CSL-TN-92-387 is an informal reference manual, describing the Rapide-0.2 language and tools, which might be helpful to have in conjunction with CSL-TR-92-510 (p. 191). %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/510/CSL-TR-92-510.pdf %R CSL-TR-92-515 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Partial orderings of event sets and their application to prototyping concurrent timed systems %A Luckham, David C. %A Vera, James %A Bryan, Doug %A Augustin, Larry %A Belz, Frank %D April 1992 %X Rapide is a concurrent object-oriented language specifically designed for prototyping large concurrent systems. One of the principal design goals has been to adopt a computation model in which the synchronization, concurrency, dataflow, and timing aspects of a prototype are explicitly represented and easily accessible both to the prototype itself and to the prototyper. This paper describes the partially ordered event set (poset) computation model, and the features of Rapide for using posets in reactive prototypes and for automatically checking posets. Some critical issues in the implementation of Rapide are described and our experience with them is summarized. An example prototyping scenario illustrates uses of the poset computation model.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/515/CSL-TR-92-515.pdf %R CSL-TR-92-516 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Opportunities for Online Partial Evaluation %A Ruf, Erik %A Weise, Daniel %D April 1992 %X Partial evaluators can be separated into two classes: offline specializers, which make all of their reduce/residualize decisions before specialization, and online specializers, which make such decisions during specialization. The choice of which method to use is driven by a tradeoff between the efficiency of the specializer and the quality of the residual programs that it produces. Existing research describes some of the inefficiencies of online specializers, and how these are avoided using offline methods, but fails to address the price paid in specialization quality. This paper motivates research in online specialization by describing two fundamental limitations of the offline approach, and explains why the online approach does not encounter the same difficulties. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/516/CSL-TR-92-516.pdf %R CSL-TR-92-517 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Preserving Information during Online Partial Evaluation %A Ruf, Erik %A Weise, Daniel %D April 1992 %X The degree to which a partial evaluator can specialize a source program depends on how accurately the partial evaluator can represent and maintain information about runtime values. Partial evaluators always lose some accuracy due to their use of finite type systems; however, existing partial evaluation techniques lose information about runtime values even when their type systems are capable of representing such information. This paper describes two sources of such loss in existing specializers, solutions for both cases, and the implementation of these solutions in our partial evaluation system, FUSE. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/517/CSL-TR-92-517.pdf %R CSL-TR-92-518 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Avoiding Redundant Specialization during Partial Evaluation %A Ruf, Erik %A Weise, Daniel %D April 1992 %X Existing partial evaluators use a strategy called polyvariant specialization, which involves specializing program points on the known portions of their arguments, and re-using such specializations only when these known portions match exactly. We show that this re-use criterion is overly restrictive, and misses opportunities for sharing in residual programs, thus producing large residual programs containing redundant specializations. We develop a criterion for re-use based on computing the domains of specializations, describe an approximate implementation of this criterion based on types, and show its implementation in our partial evaluation system FUSE. In addition, we describe several extensions to our mechanism to make it compatible with more powerful specialization strategies and to increase its efficiency. After evaluating our algorithm's usefulness, we relate it to existing work in partial evaluation and machine learning. 
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/518/CSL-TR-92-518.pdf %R CSL-TR-92-520 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Empirical Study of an Abstract Interpretation of Scheme Programs %A Kanamori, Atty %A Weise, Daniel %D April 1992 %X Abstract Interpretation, a powerful and general framework for performing global program analysis, is being applied to problems whose difficulty far surpasses the traditional "bit-vector" dataflow problems for which many of the high-speed abstract interpretation algorithms worked so well. Our experience has been that current methods of large scale abstract interpretation are unacceptably expensive. We studied a typical large-scale abstract interpretation problem: computing the control flow of a higher order program. Researchers have proposed various solutions that are designed primarily to improve the accuracy of the analysis. The cost of the analyses, and its relationship to accuracy, is addressed only cursorily in the literature. Somewhat paradoxically, one can view these strategies as attempts to simultaneously improve the accuracy and reduce the cost. The less accurate strategies explore many spurious control paths because many flowgraph paths represent illegal execution paths. For example, the less accurate strategies violate the LIFO constraints on procedure call and return. More accurate analyses investigate fewer control paths, and therefore may be more efficient despite their increased overhead. We empirically studied this accuracy versus efficiency tradeoff. We implemented two fixpoint algorithms and four semantics (baseline, baseline + stack reasoning, baseline + contour reasoning, baseline + stack reasoning + contour reasoning) for a total of eight control flow analyzers. Our benchmarks test various programming constructs in isolation --- hence, if a certain algorithm exhibits poor performance, the experiment also yields insight into what kind of program behavior results in that poor performance. The results suggest that strategies that increase accuracy in order to eliminate spurious paths often generate unacceptable overhead in the parts of the analysis that do not benefit from the increased accuracy. Furthermore, we found little evidence that the extra effort significantly improves the accuracy of the final result. This suggests that increasing the accuracy of the analysis globally is not a good idea, and that future research should investigate adaptive algorithms that use different amounts of precision on different parts of the problem. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/520/CSL-TR-92-520.pdf %R CSL-TR-92-523 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Architectural and implementation tradeoffs in the design of multiple-context processors %A Laudon, James %A Gupta, Anoop %A Horowitz, Mark %D May 1992 %X Multiple-context processors have been proposed as an architectural technique to mitigate the effects of large memory latency in multiprocessors. We examine two schemes for implementing multiple-context processors. The first scheme switches between contexts only on a cache miss, while the other interleaves the contexts on a cycle-by-cycle basis. Both schemes provide the capability for a single context to fully utilize the pipeline. We show that cycle-by-cycle interleaving of contexts provides a performance advantage over switching contexts only at a cache miss.
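The trade-off can be illustrated with the standard run-length efficiency model for multiple-context processors; the parameter values in this sketch are assumptions chosen for illustration, not measurements from the paper:

    #include <stdio.h>

    /* Run-length model: R = cycles of work between misses, L = miss
       latency, C = context-switch cost, N = number of contexts.  Below
       saturation, efficiency is N*R/(R+L+C); once the latency is fully
       hidden, it is R/(R+C).  Cycle-by-cycle interleaving corresponds
       to C = 0. */
    static double efficiency(double N, double R, double L, double C) {
        double unsat = N * R / (R + L + C);
        double sat   = R / (R + C);
        return unsat < sat ? unsat : sat;
    }

    int main(void) {
        double R = 10, L = 40;
        for (int N = 1; N <= 8; N *= 2)
            printf("N=%d  switch-on-miss %.2f   interleaved %.2f\n",
                   N, efficiency(N, R, L, 4), efficiency(N, R, L, 0));
        return 0;
    }

With these numbers, the switch-on-miss machine saturates at R/(R+C) = 0.71, while the interleaved machine reaches full utilization, which is the flavor of advantage the report quantifies.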
This advantage results from the context interleaving hiding pipeline dependencies and reducing the context switch cost. In addition, we show that while the implementation of the interleaved scheme is more complex, the complexity is not overwhelming. As pipelines get deeper and operate at lower percentages of peak performance, the performance advantage of the interleaved scheme is likely to justify its additional complexity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/523/CSL-TR-92-523.pdf %R CSL-TR-92-526 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SPLASH: Stanford parallel applications for shared-memory* %A Singh, Jaswinder Pal %A Weber, Wolf-Dietrich %A Gupta, Anoop %D June 1992 %X We present the Stanford Parallel Applications for Shared-Memory (SPLASH), a set of parallel applications for use in the design and evaluation of shared-memory multiprocessing systems. Our goal is to provide a suite of realistic applications that will serve as a well-documented and consistent basis for evaluation studies. We describe the applications currently in the suite in detail, discuss and compare some of their important characteristics, such as data locality, granularity, synchronization, etc., and explore their behavior by running them on a real multiprocessor as well as on a simulator of an idealized parallel architecture. We expect the current set of applications to act as a nucleus for a suite that will grow with time. This report replaces and updates CSL-TR-91-469, April 1991. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/526/CSL-TR-92-526.pdf %R CSL-TR-92-528 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Binary Multiplication Using Partially Redundant Multiples %A Bewick, Gary %A Flynn, Michael J. %D June 1992 %X This report presents an extension to Booth's algorithm for binary multiplication. Most implementations that utilize Booth's algorithm use the 2-bit version, which reduces the number of partial products required to half that required by a simple add and shift method. Further reduction in the number of partial products can be obtained by using higher order versions of Booth's algorithm, but it is necessary to generate multiples of one of the operands (such as 3 times an operand) by the use of a carry propagate adder. This carry propagate addition introduces significant delay and additional hardware. The algorithm described in this report produces such difficult multiples in a partially redundant form, using a series of small length adders. These adders operate in parallel with no carries propagating between them. As a result, the delay introduced by multiple generation is minimized and the hardware needed for the multiple generation is also reduced, due to the elimination of expensive carry lookahead logic. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/528/CSL-TR-92-528.pdf %R CSL-TR-92-534 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On the specialization of online program specializers %A Ruf, Erik %A Weise, Daniel %D July 1992 %X Program specializers improve the speed of programs by performing some of the programs' reductions at specialization time rather than at runtime.
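The stock textbook illustration of this idea (not drawn from the report) is specializing a power function on a statically known exponent:

    #include <stdio.h>

    /* General program: both arguments dynamic. */
    static int power(int x, int n) {
        int r = 1;
        while (n-- > 0) r *= x;     /* n iterations decided at runtime */
        return r;
    }

    /* Residual program after specializing on the static value n = 3:
       the loop and the counter have been reduced away at
       specialization time. */
    static int power3(int x) {
        return x * x * x;
    }

    int main(void) {
        printf("%d %d\n", power(5, 3), power3(5));   /* both print 125 */
        return 0;
    }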
This specialization process can be time-consuming; one common technique for improving the speed of the specialization of a particular program is to specialize the specializer itself on that program, creating a custom specializer, or program generator, for that particular program. Much research has been devoted to the problem of generating efficient program generators, which do not perform reductions at program generation time that could instead have been performed when the program generator was constructed. The conventional wisdom holds that only offline program specializers, which use binding time annotations, can be specialized into such efficient program generators. This paper argues that this is not the case, and demonstrates that the specialization of a nontrivial online program specializer similar to the original "naive MIX" can indeed yield an efficient program generator. The key to our argument is that, while the use of binding time information at program generator generation time is necessary for the construction of an efficient custom specializer, the use of explicit binding time approximation techniques is not. This allows us to distinguish the problem at hand (i.e., the use of binding time information during program generator generation) from particular solutions to that problem (i.e., offline specialization). We show that, given a careful choice of specializer data structures, and sufficiently powerful specialization techniques, binding time information can be inferred and utilized without the use of explicit binding time approximation techniques. This allows the construction of efficient, optimizing program generators from online program specializers. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/534/CSL-TR-92-534.pdf %R CSL-TR-92-546 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The accuracy of trace-driven simulations of multiprocessors %A Goldschmidt, Stephen R. %A Hennessy, John L. %D September 1992 %X In trace-driven simulation, traces generated for one set of machine characteristics are used to simulate a machine with different characteristics. However, the execution path of a multiprocessor workload may depend on the ordering of events on different processors, which in turn depends on machine characteristics such as memory system timings. Trace-driven simulations of multiprocessor workloads are inaccurate unless the timing dependencies are eliminated from the traces. We measure such inaccuracies by comparing trace-driven simulations to direct simulations of the same workloads. The results were identical only for workloads whose timing dependencies were eliminated from the traces. The remaining workloads used either first-come first-served scheduling or non-deterministic algorithms; these characteristics resulted in timing dependencies that could not be eliminated from the traces. Workloads which used task-queue scheduling had particularly large discrepancies because task-queue operations, unlike other synchronization operations, were not abstracted. Two types of simulation results had especially large discrepancies: those related to synchronization latency and those derived from relatively small numbers of events. Studies that rely on such results should use timing-independent traces or direct simulation.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/546/CSL-TR-92-546.pdf %R CSL-TR-92-553 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Branch prediction using large self history %A Johnson, John D. %D December 1992 %X Branch prediction is the main method of providing speculative opportunities for new high performance processors; therefore, the accuracy of branch prediction is becoming very important. Motivated by this desire to achieve high levels of branch prediction accuracy, this study examines methods of using up to 24 bits of branch direction history to determine the probable outcome of the next execution of a conditional branch. Using profiling to train a prediction logic function achieves an average branch prediction accuracy of up to 96.9% for the six benchmarks used in this study. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/553/CSL-TR-92-553.pdf %R CSL-TR-92-548 %Z Tue, 05 Dec 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T System synthesis via hardware-software co-design %A Gupta, Rajesh K. %A DeMicheli, Giovanni %D October 1992 %X Synthesis of circuits containing application-specific as well as re-programmable components such as off-the-shelf microprocessors provides a promising approach to the realization of complex systems using a minimal amount of application-specific hardware while still meeting the required performance constraints. We formulate the synthesis problem of complex behavioral descriptions with performance constraints as a hardware-software co-design problem. The target system architecture consists of a software component as a program running on a re-programmable processor assisted by application-specific hardware components. System synthesis is performed by first partitioning the input system description into hardware and software portions and then by implementing each of them separately. We consider the problem of identifying potential hardware and software components of a system described in a high-level modeling language. Partitioning approaches are presented based on decoupling of data and control flow, and based on communication/synchronization requirements of the resulting system design. Synchronization between various elements of a mixed system design is one of the key issues that any synthesis system must address. We present software and interface synchronization schemes that facilitate communication between system components. We explore the relationship between the non-determinism in the system models and the associated synchronization schemes needed in system implementations. The synthesis of dedicated hardware is achieved by hardware synthesis tools, while the software component is generated using software compiling techniques. We present tools to perform synthesis of a system description into hardware and software components. The resulting software component is assumed to be implemented for the DLX machine, a load/store microprocessor. We present the design of an Ethernet-based network coprocessor to demonstrate the feasibility of mixed system synthesis. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/548/CSL-TR-92-548.pdf %R CSL-TR-92-550 %Z Wed, 23 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Cache Coherence Directories for Scalable Multiprocessors %A Simoni, Richard %D October 1992 %X Directory-based protocols have been proposed as an efficient means of implementing cache coherence in large-scale shared-memory multiprocessors.
This thesis explores the trade-offs in the design of cache coherence directories by examining the organization of the directory information, the options in the design of the coherency protocol, and the implementation of the directory and protocol. The traditional directory organization that maintains a full valid bit vector per directory entry is unsuitable for large-scale machines due to high storage overhead. This thesis proposes several alternate organizations. Limited pointer directories replace the bit vector with several pointers that indicate those caches containing the data. Although this scheme performs well across a wide range of workloads, its performance does not improve as the read/write ratio becomes very large. To address this drawback, a dynamic pointer allocation directory is proposed. This directory allocates pointers from a pool to particular memory blocks as they are needed. Since the pointers may be allocated to any block on the memory module, the probability of running short of pointers is very small. Among the set of possible organizations, dynamic pointer allocation lies at an attractive cost/performance point. Measuring the performance impact of three coherency protocol features makes the virtues of simplicity clear. Adding a clean/exclusive state to reduce the time required to write a clean block results in only modest performance improvement. Using request forwarding to transfer a dirty block directly to another cache that has requested it yields similar results. For small cache block sizes, write hits to clean blocks can be simply treated as write misses without incurring significant extra network traffic. Protocol features designed to improve performance must be examined carefully, for they often complicate the protocol without offering substantial benefit. Implementing directory-based coherency presents several challenges. Methods are described for preventing deadlock, maintaining a model of parallel execution, handling subtle situations caused by temporary inconsistencies between cache and directory state, and tolerating out-of-order message delivery. Using these techniques, cache coherence can be added to large-scale multiprocessors in an inexpensive yet effective manner. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/550/CSL-TR-92-550.pdf %R CSL-TR-92-532 %Z Mon, 28 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Piecewise Linear Models for Switch-Level Simulation %A Kao, Russell %D June 1992 %X Rsim is an efficient logic plus timing simulator that employs the switched resistor transistor model and RC tree analysis to efficiently simulate MOS digital circuits at the transistor level. We investigate the incorporation of piecewise linear transistor models and generalized moments matching into this simulation framework. General piecewise linear models allow more accurate MOS models to be used to simulate circuits that are hard for Rsim. Additionally, they enable the simulator to handle circuits containing bipolar transistors such as ECL and BiCMOS. Nonetheless, the switched resistor model has proved to be efficient and accurate for a large class of MOS digital circuits. Therefore, it is retained as just one particular model available for use in this framework. The use of piecewise linear models requires the generalization of RC tree analysis. Unlike switched resistors, more general models may incorporate gain and floating capacitance. Additionally, we extend the analysis to handle non-tree topologies and feedback.
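For reference, and not taken from the report: the first-moment (Elmore) delay that classical RC-tree analysis computes for node i of a tree driven by a step input is

$$ T_{D_i} \;=\; \sum_{k} R_{ki}\, C_k , $$

where C_k is the capacitance at node k and R_{ki} is the resistance of the portion of the path from the source to node i that is shared with the path to node k. Gain, floating capacitance, non-tree topologies, and feedback all fall outside this formula, which is what the generalization described above must address.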
Despite the increased generality, for many common MOS and ECL circuits the complexity remains linear. Thus, this timing analysis can be used to simulate, efficiently, those portions of the circuit that are well described by traditional switch level models, while simultaneously simulating, more accurately, those portions that are not. We present preliminary results from a prototype simulator, Mom. We demonstrate its use on a number of MOS, ECL, and BiCMOS circuits. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/92/532/CSL-TR-92-532.pdf %R CSL-TR-93-564 %Z Thu, 17 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Case Study in Prototyping With Rapide: Shared Memory Multiprocessor System %A Santoro, Alexandre %D March 1993 %X Rapide is a concurrent object-oriented language designed for prototyping distributed systems. This paper describes the creation of such a prototype, more specifically a shared memory multiprocessor system. The design is presented in an evolutionary manner, starting with a simple CPU + memory model. The paper also presents some simulation results and shows how the partially ordered event sets that Rapide produces can be used both for performance analysis and for an in-depth understanding of the model's behavior. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/564/CSL-TR-93-564.pdf %R CSL-TR-93-580 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Technology Mapping for Generalized Fundamental-Mode Asynchronous Designs %A Siegel, Polly %A DeMicheli, Giovanni %A Dill, David %D June 1993 %X The generalized fundamental-mode asynchronous design style is one in which the combinational portions of the circuit design are separated from the storage elements, as with synchronous design styles. Synchronous technology mapping techniques can be adapted to work for this asynchronous design style if hazards are taken into account. First, we examine each step of algorithmic technology mapping for its influence on the hazard behavior of the modified network. We then present modifications to an existing synchronous technology mapper to work for this asynchronous design style. We present efficient algorithms for hazard analysis that are used during the mapping process. These algorithms have been implemented and incorporated into the program CERES to produce a technology mapper suitable for asynchronous designs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/580/CSL-TR-93-580.pdf %R CSL-TR-93-584 %Z Mon, 07 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimization of Combinational Logic Circuits Based on Compatible Gates %A Damiani, Maurizio %A Yang, Jerry Chih-Yuan %A DeMicheli, Giovanni %D June 1993 %X This paper presents a set of new techniques for the optimization of multiple-level combinational Boolean networks. We first describe a technique based upon the selection of appropriate "multiple-output" subnetworks (consisting of so-called "compatible gates") whose local functions can be optimized simultaneously. We then generalize the method to larger and more arbitrary subsets of gates. Because simultaneous optimization of local functions can take place, our methods are more powerful and general than Boolean optimization methods using "don't cares", where only single-gate optimization can be performed.
In addition, our methods represent a more efficient alternative to optimization procedures based on Boolean relations because the problem can be modeled by a "unate" covering problem instead of the more difficult "binate" covering problem. The method is implemented in the program ACHILLES and compares favorably to SIS. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/584/CSL-TR-93-584.pdf %R CSL-TR-93-585 %Z Tue, 15 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Rapide-1.0 Definition of the ADAGE Avionics System %A Mann, Walter %A Belz, Frank C. %A Cornell, Paul %D September 1993 %X We have used the Rapide prototyping languages, developed by Stanford and TRW under the ARPA ProtoTech Program, in a series of exercises to model an early version of IBM's ADAGE software architecture for helicopter avionics systems. These exercises, conducted under the ARPA Domain Specific Software Architectures (DSSA) Program, also assisted the evolution of the Rapide languages. The resulting Rapide-1.0 model of the ADAGE architecture in this paper is substantially more succinct and illuminating than the original models, developed in Rapide-0.2 and Preliminary Rapide-1.0. All Rapide versions include these key features: interfaces, by which types of components and their possible interactions with other components are defined; actions, by which the events that can be observed or generated by such components are defined; and pattern-based constraints, which define properties of the computation of interacting components in terms of partially ordered sets of events. Key features of Rapide-1.0 include services, which abstract whole communication patterns between components; behavior rules, which provide a state-transition oriented specification of component behavior and from which computational component instances can be synthesized; and architectures, which describe implementations of components with a particular interface, by showing a composition of subordinate components and their interconnections. The Rapide-1.0 model is illustrated with corresponding diagrammatic representations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/585/CSL-TR-93-585.pdf %R CSL-TR-93-588 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Update-Based Cache Coherence Protocols for Scalable Shared-Memory Multiprocessors %A Glasco, David B. %A Delagi, Bruce A. %A Flynn, Michael J. %D November 1993 %X In this paper, two hardware-controlled update-based cache coherence protocols are presented. The paper discusses the two major disadvantages of the update protocols: inefficiency of updates and the mismatch between the granularity of synchronization and the data transfer. The paper presents two enhancements to the update-based protocols, a write combining scheme and a finer grain synchronization, to overcome these disadvantages. The results demonstrate the effectiveness of these enhancements, which, when used together, allow the update-based protocols to significantly improve the execution time of a set of scientific applications when compared to three invalidate-based protocols. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/588/CSL-TR-93-588.pdf %R CSL-TR-93-593 %Z Mon, 12 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Performance Advantages of Integrating Message Passing in Cache-Coherent Multiprocessors %A Woo, Steven Cameron %A Singh, Jaswinder Pal %A Hennessy, John L.
%D November 1993 %X We examine the performance benefits of integrating a mechanism for block data transfer (message passing) in a cache-coherent shared address space multiprocessor. We do this through a detailed study of five important computations that appear to be likely candidates for block transfer. We find that while the benefits on a realistic architecture are significant in some cases, they are not as substantial as one might initially expect. The main reasons for this are (i) the relatively modest fraction of time that applications spend in communication that is amenable to block transfer, (ii) the difficulty of finding enough independent computation to overlap with the communication latency that remains even after block transfer, and (iii) the fact that long cache lines often capture many of the benefits of block transfer. Of the three primary advantages of block transfer, fast pipelined data transfer appears to be the most successful, followed by the ability to overlap computation and communication at a coarse granularity, and finally the benefits of replicating communicated data in main memory. We also examine the impact of varying important network parameters and processor speed on the relative effectiveness of block transfer, and comment on useful features that a block transfer engine should support for real applications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/593/CSL-TR-93-593.pdf %R CSL-TR-93-590 %Z Thu, 17 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Effect of Fault Dropping on Fault Simulation Time %A Pan, Rong %A Touba, Nur A. %A McCluskey, Edward J. %D November 1993 %X The effect of fault dropping on fault simulation time is studied in this paper. An experiment was performed in which fault simulation times, with and without fault dropping, were measured for three different simulators. A speedup of approximately 8 to 50 for random test sets and 1.5 to 9 for deterministic test sets was observed. The results give some indication about how much fault dropping speeds up fault simulation. These results also show the overhead of an application requiring a complete fault dictionary. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/590/CSL-TR-93-590.pdf %R CSL-TR-93-591 %Z Thu, 17 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Logic Synthesis for Concurrent Error Detection %A Touba, Nur A. %A McCluskey, Edward J. %D November 1993 %X The structure of a circuit determines how the effects of a fault can propagate and hence affects the cost of concurrent error detection. By considering circuit structure during logic optimization, the overall cost of a concurrently checked circuit can be minimized. This report presents a new technique called structure-constrained logic optimization (SCLO) that optimizes a circuit under the constraint that faults in the resulting circuit can produce only a prescribed set of errors. Using SCLO, circuits can be optimized for various concurrent error detection schemes allowing the overall cost for each scheme to be compared. A technique for quickly estimating the size of a circuit under different structural constraints is described. This technique enables rapid exploration of the design space for concurrently checked circuits. A new method for the automated synthesis of self-checking circuit implementations for arbitrary combinational circuits is also presented.
It consists of an algorithm that determines the best parity-check code for encoding the output of a given circuit, and then uses SCLO to produce the functional circuit, which is augmented with a checker to form a self-checking circuit. This synthesis method provides fully automated design, explores a larger design space than other methods, and uses simple checkers. It has been implemented by making modifications to SIS (an updated version of MIS [Brayton 87a]), and results for several MCNC combinational benchmark circuits are given. In most cases, a substantial reduction in overhead compared to a duplicate-and-compare implementation is achieved. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/591/CSL-TR-93-591.pdf %R CSL-TR-93-596 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Models of Communication Latency in Shared Memory Multiprocessors %A Byrd, Gregory T. %D December 1993 %X We evaluate various mechanisms for data communication in large-scale shared memory multiprocessors. Data communication involves both data transmission and synchronization, resulting in the transfer of data between computational threads. We use simple analytical models to evaluate the communication latency for each of the mechanisms. The models show that efficient and opportunistic synchronization is the most important determinant of latency, followed by efficient transmission. Producer-initiated mechanisms, in which data is sent by its producer as it is produced, generally achieve lower latencies than consumer-initiated mechanisms, in which data is retrieved as and when it is needed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/596/CSL-TR-93-596.pdf %R CSL-TR-93-554 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Using a Floating-Point Multiplier's Internals for High-Radix Division and Square Root %A Schwarz, Eric M. %A Flynn, Michael J. %D January 1993 %X A method for obtaining high-precision approximations of high-order arithmetic operations at low cost is presented in this study. Specifically, high-precision approximations of the reciprocal (12 bits worst case) and square root (16 bits) operations are obtained using the internal hardware of a floating-point multiplier without the use of look-up tables. The additional combinatorial logic necessary is very small due to the reuse of existing hardware. These low-cost high-precision approximations are used by iterative algorithms to perform the operations of division and square root. The method presented also applies to several other high-order arithmetic operations. Thus, high-radix algorithms for high-order arithmetic operations such as division and square root are possible at low cost. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/554/CSL-TR-93-554.pdf %R CSL-TR-93-560 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Cramer Rao Bound for Discrete-Time Edge Position %A Gatherer, Alan %D February 1993 %X The problem of estimating the position of an edge from a series of samples often occurs in the fields of machine vision and signal processing. It is therefore of interest to assess the accuracy of any estimation algorithm. Previous work in this area has produced bounds for the continuous time estimator. In this paper we derive a closed form for the minimum variance bound (or Cramer Rao bound) for estimating the position of an arbitrarily shaped edge in white Gaussian noise for the discrete samples case.
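For orientation, the standard textbook form of the bound, recalled here rather than quoted from the report: for samples y_k = s_k(theta) + w_k, with w_k white Gaussian noise of variance sigma^2 and theta the edge position, any unbiased estimator satisfies

$$ \operatorname{var}(\hat{\theta}) \;\ge\; \frac{\sigma^{2}}{\sum_{k} \left( \dfrac{\partial s_k(\theta)}{\partial \theta} \right)^{2}} . $$

The report's contribution is evaluating this bound in closed form for arbitrarily shaped edges under discrete sampling.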
We quantify the effects of the sampling rate, the bandwidth of the edge, the shape of the edge and the size of the observation window on the variance of the estimator. We describe a maximum likelihood estimator and show that in practice this estimator requires fewer computations than standard correlation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/560/CSL-TR-93-560.pdf %R CSL-TR-93-561 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fetch Caches %A Bray, Brian K. %A Flynn, Michael J. %D February 1993 %X For high performance, data caches must have a low miss rate and provide high bandwidth, while maintaining low latency. Larger and more complex set associative caches provide lower miss rates but at the cost of increased latency. Interleaved data caches can improve the available bandwidth, but the improvement is limited by bank conflicts and increased latency due to the switching networks required to distribute cache addresses and to route the data. We propose using a small buffer to reduce the data read latency or improve the read bandwidth of an on-chip data cache. We call the small read-only buffer a fetch cache. The fetch cache attempts to capture the immediate spatial locality of the data read reference stream by utilizing the large number of bits that can be fetched in a single access of an on-chip cache. There are two ways a processor can issue multiple instructions per cache access: the cache access can require multiple cycles (i.e. superpipelined), or multiple instructions are issued per cycle (i.e. superscalar). In the first section, we show the use of fetch caches with multi-cycle per access data caches. When there is a read hit in the fetch cache, the read request can be serviced in one cycle; otherwise, the latency is that of the primary data cache. For a four line, 16 byte wide fetch cache, the hit rate ranged from 40 to 60 percent depending on the application. In the second part, we show the use of fetch caches when multiple accesses per cycle are requested. When there is a read hit in the fetch cache, a read can be satisfied by the fetch cache, while the primary cache performs another read or write request. For a four line, 16 byte wide fetch cache, the cache bandwidth increased by 20 to 30 percent depending on the application. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/561/CSL-TR-93-561.pdf %R CSL-TR-93-562 %Z Thu, 26 Oct 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Efficient Top-Down Parsing Algorithm for General Context-Free Grammars %A Sankar, Sriram %D February 1993 %X This report describes a new algorithm for top-down parsing of general context-free grammars. The algorithm does not require any changes to be made to the grammar, and can parse with respect to any grammar non-terminal as the start symbol. It is possible to generate all possible parse trees of the input string in the presence of ambiguous grammars. The algorithm reduces to recursive descent parsing on LL grammars. This algorithm is ideal for use in software development environments which include tools such as syntax-directed editors and incremental parsers, where the language syntax is an integral part of the user-interface. General context-free grammars can describe the language syntax more intuitively than, for example, LALR(1) grammars.
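A minimal sketch, not the report's algorithm: counting every parse of an input under the ambiguous grammar E -> E '+' E | 'n' by top-down recursion over input spans. Recursing on spans is one simple way to avoid looping on the left-recursive rule; the algorithm described above needs no such workaround, and a practical version would also memoize.

    #include <stdio.h>
    #include <string.h>

    static const char *s;  /* input string */

    /* Number of parse trees deriving s[i..j) from E. */
    static int count(int i, int j)
    {
        int n = 0;
        if (j - i == 1 && s[i] == 'n')      /* E -> 'n'     */
            n++;
        for (int k = i + 1; k < j - 1; k++) /* E -> E '+' E */
            if (s[k] == '+')
                n += count(i, k) * count(k + 1, j);
        return n;
    }

    int main(void)
    {
        s = "n+n+n";
        /* Prints 2: the trees (n+n)+n and n+(n+n). */
        printf("%d parse trees\n", count(0, (int)strlen(s)));
        return 0;
    }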
This algorithm is also applicable to batch-oriented language processors, especially during the development of new languages, where frequent changes are made to the language syntax and new prototype parsers need to be developed quickly. A prototype implementation of a parser generator that generates parsers based on this algorithm has been built. Parsing speeds of around 1000 lines per second have been achieved on a Sun SparcStation 2. This demonstrated performance is more than adequate for syntax-directed editors and incremental parsers, and in most cases, is perfectly acceptable for batch-oriented language processors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/562/CSL-TR-93-562.pdf %R CSL-TR-93-566 %Z Thu, 26 Oct 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Software Testing Using Algebraic Specification Based Test Oracles %A Sankar, Sriram %A Goyal, Anoop %A Sikchi, Prakash %D April 1993 %X In TAV4, the first author presented a paper describing an algorithm to perform run-time consistency checking of abstract data types specified using algebraic specifications. This algorithm has subsequently been incorporated into a run-time consistency checking tool for the Anna specification language for Ada, and works on a subset of all possible algebraic specifications. The algorithm implementation can be considered a test oracle for algebraic specifications that performs its activities while the formally specified program is running. This paper presents empirical results on the use of this test oracle on a real-life symbol table implementation. Various issues that arise due to the use of algebraic specifications and the test oracle are discussed. Fifty different errors were introduced into the symbol table implementation. On testing using the oracle, 60% of the errors were detected by the oracle, 35% of the errors caused Ada exceptions to be raised, and the remaining 5% went undetected. These results are remarkable, especially since the test input was simply one sequence of symbol table operations performed by a typical client. The cases that went undetected contained errors that required very specific boundary conditions to be met --- an indication that white-box test-data generation techniques may be required to detect them. Hence, white-box test-data generation combined with a specification-based test oracle may be extremely versatile in detecting errors. This paper does not address test-data generation; rather, it illustrates the usefulness of algebraic specification based test oracles during run-time consistency checking. Run-time consistency checking should be considered a complementary approach to unit testing using generated test-data. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/566/CSL-TR-93-566.pdf %R CSL-TR-93-570 %Z Mon, 28 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Frequency Domain Volume Rendering %A Totsuka, Takashi %A Levoy, Marc %D April 1993 %X The Fourier projection-slice theorem allows projections of volume data to be generated in O(n^2 log n) time for a volume of size n^3. The method operates by extracting and inverse Fourier transforming 2D slices from a 3D frequency domain representation of the volume. Unfortunately, these projections do not exhibit the occlusion that is characteristic of conventional volume renderings.
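For reference, the standard statement of the theorem underlying this approach (paraphrased, not quoted from the report): projecting a volume f(x,y,z) along z and taking the 2D Fourier transform of the projection yields a slice of the 3D transform,

$$ \mathcal{F}_{2D}\!\left[ \int f(x,y,z)\, dz \right](u,v) \;=\; \mathcal{F}_{3D}[f](u,v,0) , $$

so once the 3D transform is precomputed, each view costs only an n x n slice extraction plus an inverse 2D FFT, which is where the O(n^2 log n) figure comes from.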
We present a new frequency domain volume rendering algorithm that restores much of the missing depth and shape cues by performing shading calculations in the frequency domain during slice extraction. In particular, we demonstrate frequency domain methods for computing linear or nonlinear depth cueing and directional diffuse reflection. The resulting images can be generated an order of magnitude faster than conventional volume renderings and may be more useful for many applications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/570/CSL-TR-93-570.pdf %R CSL-TR-93-573 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance of a Three-Stage Banyan-Based Architecture with Input and Output Buffers for Large Fast Packet Switches %A Chiussi, Fabio M. %A Tobagi, Fouad A. %D June 1993 %X Fast packet switching, also referred to as Asynchronous Transfer Mode (ATM), has emerged as the most appropriate switching technique to handle the high data rates and the wide diversity of traffic requirements envisioned in Broadband Integrated Services Digital Networks (B-ISDN). ATM switches capable of meeting the challenges posed by a successful deployment of B-ISDN must be designed and implemented. Such switches should be nonblocking and capable of handling the highly-bursty traffic conditions that anticipated future applications will generate; they should be scalable to the large sizes expected when B-ISDN becomes widely deployed; accordingly, their complexity should be as low as possible; they should be simple to operate; namely, their architecture should facilitate the determination of whether or not a call can be accepted, and the assignment of a route to a call. In this paper, we describe an architecture, referred to as the Memory/Space/Memory switching fabric, which meets these challenges. It combines input and output shared-memory buffer components with space-division banyan networks, making it possible to build a switch with several hundred I/O ports. The MSM achieves output buffering, thus performing very well under a wide variety of traffic conditions, and is self-routing, thus adapting easily to different traffic mixes. Under bursty traffic, by implementing a backpressure mechanism to control the packet flow from input to output queues, and by properly managing the buffers, we can increase the average buffer occupancy; in this way, we can achieve important reductions in total buffer requirements with respect to output-buffer switches (e.g., up to 70% reduction with bursts of average length equal to 100 packets), use input and output buffers of equal sizes, and achieve sublinear increase of the buffer requirements with the burst length. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/573/CSL-TR-93-573.pdf %R CSL-TR-93-577 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Implementation of a Three-Stage Banyan-Based Architecture with Input and Output Buffers for Large Fast Packet Switches %A Chiussi, Fabio M. %A Tobagi, Fouad A. %D June 1993 %X Fast packet switching, also referred to as Asynchronous Transfer Mode (ATM), has emerged as the most appropriate switching technique for future Broadband Integrated Services Digital Networks (B-ISDN). A three-stage banyan-based switch architecture with input and output buffers has been recently described [Chi93].
This architecture, also referred to as the Memory/Space/Memory (MSM) switching fabric, is capable of meeting the challenges posed by a successful deployment of B-ISDN; namely, it is made nonblocking with low complexity, and is scalable to large sizes (>1000 input/output ports); it supports a wide diversity of traffic patterns, including highly-bursty traffic; it maintains packet sequence, is self-routing, and is simple to operate. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/577/CSL-TR-93-577.pdf %R CSL-TR-93-579 %Z Wed, 02 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Comparative Studies of Pipelined Circuits %A Klass, Fabian %A Flynn, Michael J. %D July 1993 %X Wave pipelining is an attractive technique used in high-speed computer systems to increase the pipeline rate without partitioning a system into pipeline stages. Although recent implementations have reported very high-speed operation rates, a real evaluation of the advantages and disadvantages of wave pipelining requires a comparative study with other techniques; in particular, understanding the trade-offs between conventional and wave pipelining is very important. This study is an attempt to provide approximate models which can be used as first-order tools for comparative study or sensitivity analysis of conventional and wave pipelined systems with different overheads. The models presented here are for subsystem-level pipelines. The product Latency x Cycle-Time is used as a measure of performance and is evaluated as a function of all the parameters of a design, such as the propagation delay of the combinational logic, the data skew resulting from the difference between maximum and minimum propagation delays through various logic paths, rise and fall time, the setup time, hold time, and propagation delay through registers, and the uncontrollable clock skew. In this way, an analytical basis is provided for a comparison between different approaches and for optimizations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/579/CSL-TR-93-579.pdf %R CSL-TR-93-556 %Z Wed, 23 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Support for Speculative Execution in High-Performance Processors %A Smith, Michael David %D November 1992 %X Superscalar and superpipelining techniques increase the overlap between the instructions in a pipelined processor, and thus these techniques have the potential to improve processor performance by decreasing the average number of cycles between the execution of adjacent instructions. Yet, to obtain this potential performance benefit, an instruction scheduler for this high-performance processor must find the independent instructions within the instruction stream of an application to execute in parallel. For non-numerical applications, there is an insufficient number of independent instructions within a basic block, and consequently the instruction scheduler must search across the basic block boundaries for the extra instruction-level parallelism required by the superscalar and superpipelining techniques. To exploit instruction-level parallelism across a conditional branch, the instruction scheduler must support the movement of instructions above a conditional branch, and the processor must support the speculative execution of these instructions.
We define boosting, an architectural mechanism for speculative execution, that allows us to uncover the instruction-level parallelism across conditional branches without adversely affecting the instruction count of the application or the cycle time of the processor. Under boosting, the compiler is responsible for analyzing and scheduling instructions, while the hardware is responsible for ensuring that the effects of a speculatively-executed instruction do not corrupt the program state when the compiler is incorrect in its speculation. To experiment with boosting, we built a global instruction scheduler, which is specifically tailored for the non-numerical environment, and a simulator, which determines the cycle-count performance of our globally-scheduled programs. We also analyzed the hardware requirements for boosting in a typical load/store architecture. Through the cycle-count simulations and an understanding of the cycle-time impact of the hardware support for boosting, we found that only a small amount of hardware support for speculative execution is necessary to achieve good performance in a small-issue, superscalar processor. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/93/556/CSL-TR-93-556.pdf %R CSL-TR-94-599 %Z Wed, 28 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T The Design and Implementation of a High-Performance Floating-Point Divider %A Oberman, Stuart %A Quach, Nhon %A Flynn, Michael J. %D January 1994 %X The increasing computation requirements of modern computer applications have stimulated a large interest in developing extremely high-performance floating-point dividers. A variety of division algorithms are available, with SRT being utilized in many computer systems. A careful analysis of SRT divider topologies has demonstrated that a relatively simple divider designed in an aggressive circuit style can achieve extremely high performance. Further, an aggressive circuit implementation can minimize many of the performance advantages of more complex divider algorithms. This paper presents the tradeoffs of the different divider topologies, the design of the divider, and performance results. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/599/CSL-TR-94-599.pdf %R CSL-TR-94-600 %Z Mon, 07 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Environmental Limits on the Performance of CMOS Wave-Pipelined Circuits %A Nowka, Kevin J. %A Flynn, Michael J. %D January 1994 %X Wave-pipelining is a circuit design technique which allows digital synchronous systems to be clocked at rates higher than can be achieved with conventional pipelining techniques. Wave-pipelining has been successfully applied to the design of SSI processor functional units, a Bipolar Population Counter, a CMOS adder, CMOS multipliers, and several simple CMOS circuits. For controlled operating environments, speed-ups of 2 to 10 have been reported for these designs. This report details the effects of temperature variation, supply voltage variation, and process variation on wave-pipelined static CMOS designs, derives limits for the performance of wave-pipelined circuits due to these variations, and compares the performance effects with those of traditional pipelined circuits. This study finds that wave-pipelined circuits designed for commercial operating environments are limited to 2 to 3 waves per pipeline stage when clocked from a fixed frequency source.
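As background, and only as a rough first-order sketch rather than the report's derivation: with maximum and minimum combinational path delays D_max and D_min, register and sampling overhead t_ovh, and clock skew t_skew, wave-pipelined operation requires approximately

$$ T_{clk} \;\gtrsim\; (D_{\max} - D_{\min}) + t_{ovh} + t_{skew}, \qquad N_{waves} \;\approx\; \frac{D_{\max}}{T_{clk}} , $$

so environmental variation, which widens D_max - D_min, directly caps the achievable number of waves; this is consistent with the 2-to-3-wave limit quoted above.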
Variable-rate internal clocking can approach the theoretical limit on the number of waves at a cost of interface complexity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/600/CSL-TR-94-600.pdf %R CSL-TR-94-601 %Z Mon, 21 Mar 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory. %T Efficient Scheduling on Multiprogrammed Shared-Memory Multiprocessors %A Tucker, Andrew %D March 1994 %X Shared-memory multiprocessors are often used as compute servers, with multiple users running applications in a multiprogrammed style. On such systems, naive time-sharing scheduling policies can result in poor performance for parallel applications. Most parallel applications are written with the model of a stable computing environment, where applications are running uninterrupted on a fixed number of processors. On a time-sharing system, processes are interrupted periodically and the number of processors running an application continually varies. The result is a decrease in performance for a number of reasons, including processes being obliviously preempted inside critical sections and cached data being replaced by intervening processes. This thesis explores using more sophisticated scheduling systems to avoid these problems. Robust implementations of previously proposed approaches involving cache affinity scheduling and gang scheduling are developed and evaluated. The thesis then presents the design, implementation, and performance of process control, a novel scheduling approach using explicit cooperation between the application and kernel to minimize context switching. Performance results from a suite of workloads containing both serial and parallel applications, run on a 4-processor Silicon Graphics workstation, confirm the effectiveness of the process control approach. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/601/CSL-TR-94-601.pdf %R CSL-TR-94-604 %Z Thu, 22 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Integrating multiple communication paradigms in high performance multiprocessors %A Heinlein, John %A Gharachorloo, Kourosh %A Gupta, Anoop %D February 1994 %X In the design of FLASH, the successor to the Stanford DASH multiprocessor, we are exploring architectural mechanisms for efficiently supporting both the shared memory and message passing communication models in a single system. The unique feature in the FLASH (FLexible Architecture for SHared memory) system is the use of a programmable controller at each node that replaces the functionality of hardwired cache coherence state machines in systems like DASH. The base coherence protocol is supported by executing appropriate software handlers on the programmable controller to service memory and coherence operations. The same programmable controller is also used to support message passing. This approach is attractive because of the flexibility software provides for implementing different coherence and message passing protocols, and because of the simplification in system design and debugging that arises from the shift of complexity from hardware to software. This paper focuses on the use of the programmable controller to support message passing. Our goal is to provide message passing performance that is comparable to an aggressive hardware implementation dedicated to this task. In FLASH, message data is transferred as a sequence of cache line sized units, thus exploiting the datapath support already present for cache coherence.
In addition, we avoid costly interrupts to the main processor by having the programmable engine handle the control for message transfers. Furthermore, in contrast to most earlier work, we provide an integrated solution that handles the interaction of message data with virtual memory, protected multiprogramming, and cache coherence. Our preliminary performance studies indicate that this system can sustain message transfers at a rate of several hundred megabytes per second, efficiently utilizing the available network bandwidth. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/604/CSL-TR-94-604.pdf %R CSL-TR-94-613 %Z Wed, 26 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Design and Validation of Update-Based Cache Coherence Protocols %A Glasco, David B. %A Delagi, Bruce A. %A Flynn, Michael J. %D March 1994 %X In this paper, we present the details of the two update-based cache coherence protocols for scalable shared-memory multiprocessors that were studied in our previous work. First, the directory structures required for the protocols are briefly reviewed. Next, the state diagrams and some examples of the two update-based protocols are presented; one of the protocols is based on a centralized directory, and the other is based on a singly-linked distributed directory. Protocol deadlock and the additional requirements placed on the protocols to avoid such deadlock are also examined. Finally, protocol validation using an exhaustive validation tool known as Murphi is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/613/CSL-TR-94-613.pdf %R CSL-TR-94-614 %Z Mon, 21 Mar 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory. %T Co-Synthesis of Hardware and Software for Digital Embedded Systems %A Gupta, Rajesh Kumar %D December 1993 %X As the complexity of systems subject to computer-aided synthesis and optimization techniques increases, so does the need to find ways to incorporate predesigned components into the final system implementation. In this context, a general-purpose microprocessor provides a sophisticated low-cost component that can be tailored to realize most system functions through appropriate software. This approach is particularly useful in the design of embedded systems that have a relatively simple target architecture, when compared to general-purpose computing systems such as workstations. In embedded systems the processor is used as a resource dedicated to implementing specific functions. However, the design issues in embedded systems are complicated since most of these systems operate in a time-constrained environment. Recent advances in chip-level synthesis have made it possible to synthesize application-specific circuits under strict timing constraints. This dissertation formulates the problem of computer-aided design of embedded systems using both application-specific as well as general-purpose reprogrammable components under timing constraints. Given a specification of system functionality and constraints in a hardware description language, we model the system as a set of bilogic flow graphs, and formulate the co-synthesis problem as a partitioning problem under constraints. Timing constraints are used to determine the parts of the system functionality that are delegated to application-specific hardware and the software that runs on the processor. The software component of such a 'mixed' system poses an interesting problem due to its interaction with concurrently operating hardware.
We address this problem by generating software as a set of concurrent fixed-latency serialized operations called threads. The satisfaction of the imposed performance constraints is then ensured by exploiting concurrency between program threads, achieved by an interleaved execution on a single processor system. This co-synthesis of hardware and software from behavioral specifications makes it possible to build time-constrained embedded systems by using off-the-shelf parts and application-specific circuitry. Because less application-specific hardware is needed than in an all-hardware solution, the hardware component can be easily mapped to semicustom VLSI such as gate arrays, thus shortening the design time. In addition, the ability to perform a detailed analysis of timing performance provides an opportunity to improve the system definition by creating better prototypes. The algorithms and techniques described have been implemented in a framework called Vulcan, which is integrated with the Stanford Olympus Synthesis System and provides a path from chip-level synthesis to system-level synthesis. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/614/CSL-TR-94-614.pdf %R CSL-TR-94-618 %Z Wed, 05 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimum Routing of Multicast Audio and Video Streams in Communications Networks %A Noronha, Ciro A., Jr. %A Tobagi, Fouad A. %D April 1994 %X In this report, we consider the problem of routing multicast audio and video streams in a communications network. After describing the previous work in the area and identifying its shortcomings, we show that the problem of optimally routing multicast streams can be formulated as an integer programming problem. We propose an efficient solution technique, composed of two parts: (i) an extension to the decomposition principle, to speed up the linear relaxation of the problem, and (ii) enhanced value-fixing rules, to prune the search space for the integer problem. We characterize the reduction in run time gained using these techniques. Finally, we compare the run times for the optimum multicast routing algorithm and for existing heuristic algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/618/CSL-TR-94-618.pdf %R CSL-TR-94-619 %Z Wed, 05 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Evaluation of Multicast Routing Algorithms for Multimedia Streams %A Noronha, Ciro A., Jr. %A Tobagi, Fouad A. %D April 1994 %X Multimedia applications place new requirements on networks as compared to traditional data applications: (i) they require relatively high bandwidths on a continuous basis for long periods of time; (ii) involve multipoint communications and thus are expected to make heavy use of multicasting; and (iii) tend to be interactive and thus require low latency. These requirements must be taken into account when routing multimedia traffic in a network. This report presents a performance evaluation of routing algorithms in the multimedia environment, where the requirements of multipoint communications, bandwidth and latency must be satisfied. We present an exact solution to the optimum multicast routing problem, based on integer programming, and use this solution as a benchmark to evaluate existing heuristic algorithms, considering both performance and cost of implementation (as measured by the average run time), under realistic network and traffic scenarios.
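For orientation, a generic cut-based formulation of the kind alluded to above (a Steiner-style sketch; the reports' exact models may differ): with binary variables x_e selecting the edges of the multicast tree, edge costs c_e, source s, destination set D, and delta(S) the set of edges leaving a node set S, one minimizes

$$ \min \sum_{e} c_e x_e \quad \text{subject to} \quad \sum_{e \in \delta(S)} x_e \ge 1 \;\; \text{for all } S \text{ with } s \in S \text{ and } D \setminus S \ne \emptyset, \qquad x_e \in \{0,1\} . $$

The exponentially many cut constraints are precisely why speeding up the linear relaxation and pruning the integer search, as described above, matter in practice.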
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/619/CSL-TR-94-619.pdf %R CSL-TR-94-620 %Z Thu, 10 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The SUIF Compiler System: a Parallelizing and Optimizing Research Compiler %A Wilson, Robert %A French, Robert %A Wilson, Christopher %A Amarasinghe, Saman %A Anderson, Jennifer %A Tjiang, Steve %A Liao, Shih-Wei %A Tseng, Chau-Wen %A Hall, Mary %A Lam, Monica %A Hennessy, John %D May 1994 %X Compiler infrastructures that support experimental research are crucial to the advancement of high-performance computing. New compiler technology must be implemented and evaluated in the context of a complete compiler, but developing such an infrastructure requires a huge investment in time and resources. We have spent a number of years building the SUIF compiler into a powerful, flexible system, and we would now like to share the results of our efforts. SUIF consists of a small, clearly documented kernel and a toolkit of compiler passes built on top of the kernel. The kernel defines the intermediate representation, provides functions to access and manipulate the intermediate representation, and structures the interface between compiler passes. The toolkit currently includes C and Fortran front ends, a loop-level parallelism and locality optimizer, an optimizing MIPS back end, a set of compiler development tools, and support for instructional use. Although we do not expect SUIF to be suitable for everyone, we think it may be useful for many other researchers. We thus invite you to use SUIF and welcome your contributions to this infrastructure. The SUIF software is freely available via anonymous ftp from suif.Stanford.EDU. Additional information about SUIF can be found on the World-Wide Web at http://suif.Stanford.EDU. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/620/CSL-TR-94-620.pdf %R CSL-TR-94-621 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Synthesis for Scan Dependence in Built-In Self-Testable Designs %A Avra, LaNae J. %A McCluskey, Edward J. %D May 1994 %X This report introduces new design and synthesis techniques that reduce the area and improve the performance of embedded built-in self-test (BIST) architectures such as circular BIST and parallel BIST. Our goal is to arrange the system bistables into scan paths so that some of the BIST and scan logic is shared with the system logic. Logic sharing is possible when scan dependence is introduced in the design. Other BIST design techniques attempt to avoid all types of scan dependence because it can reduce the fault coverage of embedded, multiple input signature registers (MISRs). We show that introducing certain types of scan dependence in embedded MISRs can result in reduced overhead and improved fault coverage, and we describe synthesis techniques that maximize the amount of this beneficial scan dependence. Finally, we present fault simulation, layout area, and delay results for circular BIST versions of benchmark circuits that have been synthesized with our techniques. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/621/CSL-TR-94-621.pdf %R CSL-TR-94-622 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Synthesis-for-Test Design System %A Avra, LaNae J. %A Gerbaux, Laurent %A Giomi, Jean-Charles %A Martinolle, Francoise %A McCluskey, Edward J. 
%D May 1994 %X Hardware synthesis techniques automatically generate a structural hardware implementation given an abstract (e.g., functional, behavioral, register transfer) description of the behavior of the design. Existing hardware synthesis systems typically use cost and performance as the main criteria for selecting the best hardware implementation, and seldom even consider test issues during the synthesis process. We have developed and implemented a computer-aided design tool whose primary objective is to generate the lowest-cost, highest-performance hardware implementation that also meets specified testability requirements. By considering testability during the synthesis process, the tool is able to generate designs that are optimized for specific test techniques. The input to the tool is a behavioral VHDL specification that consists of high-level software language constructs such as conditional statements, assignment statements, and loops, and the output is a structural VHDL description of the design. Implemented synthesis procedures include compiler optimizations, inter-process analysis, high-level synthesis operations (scheduling, allocation, and binding) and control logic generation. The purpose of our design tool is to serve as a platform for experimentation with existing and future synthesis-for-test techniques, and it can currently generate designs optimized for both parallel and circular built-in self-test architectures. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/622/CSL-TR-94-622.pdf %R CSL-TR-94-623 %Z Wed, 12 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Communication Mechanisms in Shared Memory Multiprocessors %A Byrd, Gregory T. %A Delagi, Bruce A. %A Flynn, Michael J. %D May 1994 %X Shared memory systems generally support consumer-initiated communication; when a process needs data, it is retrieved from the global memory. Systems that were designed around the message passing model, on the other hand, support producer-initiated communication mechanisms; the producer of data sends it directly to the other processes that require it. Parallel applications require both kinds of communication. In this paper, we examine the performance of five shared-memory communication mechanisms -- invalidate-based cache coherence, prefetch, locks, deliver, and StreamLine -- to determine the effectiveness of architectural support for efficient producer-initiated communication. We find that StreamLine, a cache-based message passing mechanism, offers the best performance on our simulated benchmarks. In addition, StreamLine is much less sensitive to system parameters such as cache line size and network performance. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/623/CSL-TR-94-623.pdf %R CSL-TR-94-626 %Z Wed, 12 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Synthesis and Optimization of Synchronous Logic Circuits %A Damiani, Maurizio %D June 1994 %X The design automation of complex digital circuits offers important benefits. It allows the designer to reduce design time and errors, to explore more thoroughly the design space, and to cope effectively with an ever-increasing project complexity. This dissertation presents new algorithms for the logic optimization of combinational and synchronous digital circuits. These algorithms rely on a common paradigm. Namely, global optimization is achieved by the iterative local optimization of small subcircuits. The dissertation first explores the combinational case.
Chapter 2 presents algorithms for the optimization of subnetworks consisting of a single-output subcircuit. The design space for this subcircuit is described implicitly by a Boolean function, a so-called "don't care" function. Efficient methods for extracting this function are presented. Chapter 3 is devoted to a novel method for the optimization of multiple-output subcircuits. There, we introduce the notion of compatible gates. Compatible gates represent subsets of gates whose optimization is particularly simple. The other three chapters are devoted to the optimization of synchronous circuits. Following the lines of the combinational case, we attempt the optimization of the gate-level (rather than the state-diagram-level) representation. In Chapter 4 we focus on extending combinational techniques to the sequential case. In particular, we present algorithms for finding a synchronous function that can be used in the optimization process. Unlike the combinational case, however, this approach is exact only for pipeline-like circuits. Exact approaches for general, acyclic circuits are presented in Chapter 5. There, we introduce the notion of a synchronous recurrence equation. Finally, Chapter 6 presents methods for handling feedback interconnection. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/626/CSL-TR-94-626.pdf %R CSL-TR-94-627 %Z Wed, 12 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Efficient Shared Memory Layer for Distributed Memory Machines. %A Scales, Daniel J. %A Lam, Monica S. %D July 1994 %X This paper describes a system called SAM that simplifies the task of programming machines with distributed address spaces by providing a shared name space and dynamic caching of remotely accessed data. SAM makes it possible to utilize the computational power available in networks of workstations and distributed memory machines, while getting the ease of programming associated with a single address space model. The global name space and caching are especially important for complex scientific applications with irregular communication and parallelism. SAM is based on the principle of tying synchronization with data accesses. Precedence constraints are expressed by accesses to single-assignment values, and mutual exclusion constraints are represented by access to data items called accumulators. Programmers easily express the communication and synchronization between processes using these operations; they can also use alternate paradigms that are built with the SAM primitives. Operations for prefetching data and explicitly sending data to another processor integrate cleanly with SAM's shared memory model and allow the user to obtain the efficiency of message passing when necessary. We have built implementations of SAM for the CM-5, the Intel iPSC/860, the Intel Paragon, the IBM SP1, and heterogeneous networks of Sun, SGI, and DEC workstations (using PVM). In this report, we describe the basic functionality provided by SAM, discuss our experience in using it to program a variety of scientific applications and distributed data structures, and provide performance results for these complex applications on a range of machines. Our experience indicates that SAM significantly simplifies the programming of these parallel systems, supports the necessary functionality for developing efficient implementations of sophisticated applications, and provides portability across a range of distributed memory environments.
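An illustrative sketch only, not SAM's actual API (the names and the pthread-based implementation are assumptions): a write-once cell in which reads block until the single write arrives captures the principle of tying synchronization to data access that underlies SAM's single-assignment values.

    #include <pthread.h>

    typedef struct {
        pthread_mutex_t mu;
        pthread_cond_t  filled;
        int             full;   /* set once, never cleared */
        double          value;
    } sa_cell;

    void sa_init(sa_cell *c)
    {
        pthread_mutex_init(&c->mu, NULL);
        pthread_cond_init(&c->filled, NULL);
        c->full = 0;
    }

    void sa_write(sa_cell *c, double v)  /* called exactly once */
    {
        pthread_mutex_lock(&c->mu);
        c->value = v;
        c->full = 1;
        pthread_cond_broadcast(&c->filled);
        pthread_mutex_unlock(&c->mu);
    }

    double sa_read(sa_cell *c)           /* blocks until written */
    {
        pthread_mutex_lock(&c->mu);
        while (!c->full)
            pthread_cond_wait(&c->filled, &c->mu);
        double v = c->value;
        pthread_mutex_unlock(&c->mu);
        return v;
    }

A consumer simply calls sa_read and implicitly waits for the producer, so a precedence constraint needs no separate flag or barrier.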
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/627/CSL-TR-94-627.pdf %R CSL-TR-94-628 %Z Mon, 09 Jan 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Tolerating Latency Through Software-Controlled Data Prefetching %A Mowry, Todd C. %D June 1994 %X The large latency of memory accesses in modern computer systems is a key obstacle to achieving high processor utilization. Furthermore, technology trends indicate that this gap between processor and memory speeds is likely to increase in the future. While increased latency affects all computer systems, the problem is magnified in large-scale shared-memory multiprocessors, where physical dimensions cause latency to be an inherent problem. To cope with the memory latency problem, the basic solution that nearly all computer systems rely on is their cache hierarchy. While caches are useful, they are not a panacea. Software-controlled prefetching is a technique for tolerating memory latency by explicitly executing prefetch instructions to move data close to the processor before it is actually needed. This technique is attractive because it can hide both read and write latency within a single thread of execution while requiring relatively little hardware support. Software-controlled prefetching, however, presents two major challenges. First, some sophistication is required on the part of either the programmer, runtime system, or (preferably) the compiler to insert prefetches into the code. Second, care must be taken that the overheads of prefetching, which include additional instructions and increased memory queueing delays, do not outweigh the benefits. This dissertation proposes and evaluates a new compiler algorithm for inserting prefetches into code. The proposed algorithm attempts to minimize overheads by only issuing prefetches for references that are predicted to suffer cache misses. The algorithm can prefetch both dense-matrix and sparse-matrix codes, thus covering a large fraction of scientific applications. It also works for both uniprocessor and large-scale shared-memory multiprocessor architectures. We have implemented our algorithm in the SUIF (Stanford University Intermediate Form) optimizing compiler. The results of our detailed architectural simulations demonstrate that the speed of some applications can be improved by as much as a factor of two, both on uniprocessor and multiprocessor systems. This dissertation also compares software-controlled prefetching with other latency-hiding techniques (e.g., locality optimizations, relaxed consistency models, and multithreading), and investigates the architectural support necessary to make prefetching effective. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/628/CSL-TR-94-628.pdf %R CSL-TR-94-629 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Precise Delay Generation Using Coupled Oscillators %A Maneatis, John George %D June 1994 %X This thesis describes a new class of delay generation structures which can produce precise delays with sub-gate-delay resolution. These structures are based on coupled ring oscillators which oscillate at the same frequency. One such structure, called an array oscillator, consists of a linear array of ring oscillators. A unique coupling arrangement forces the outputs of the ring oscillators to be uniformly offset in phase by a precise fraction of a buffer delay.
%R CSL-TR-94-629 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Precise Delay Generation Using Coupled Oscillators %A Maneatis, John George %D June 1994 %X This thesis describes a new class of delay generation structures which can produce precise delays with sub-gate delay resolution. These structures are based on coupled ring oscillators which oscillate at the same frequency. One such structure, called an array oscillator, consists of a linear array of ring oscillators. A unique coupling arrangement forces the outputs of the ring oscillators to be uniformly offset in phase by a precise fraction of a buffer delay. This arrangement enables the array oscillator to achieve a delay resolution equal to a buffer delay divided by the number of rings. Another structure, called a delay line oscillator, consists of a series of delay stages, each based on a single coupled ring oscillator. These delay stages uniformly span the delay interval to which they are phase locked. Each delay stage is capable of generating a phase shift that varies over a positive and negative range. These characteristics allow the structure to precisely subdivide delays into arbitrarily small intervals. The buffer stages used in the ring oscillators must have high supply noise rejection to avoid losing precision to output jitter. This thesis presents several types of buffer stage designs for achieving high supply noise rejection and low supply voltage operation. These include a differential buffer stage design based on a source coupled pair using load elements with symmetric I-V characteristics and a single-ended buffer stage design based on a diode clamped common source device. The thesis also discusses techniques for achieving low jitter phase-locked loop performance, which is important to achieving high precision. Based on the concepts developed in this thesis, an experimental differential array oscillator delay generator was designed and fabricated in a 1.2-um N-well CMOS technology. The delay generator achieved a delay resolution of 43ps while operating at 331MHz with a peak delay error of 47ps. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/629/CSL-TR-94-629.pdf %R CSL-TR-94-630 %Z Thu, 27 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Expansion Caches For Superscalar Processors %A Johnson, John D. %D June 1994 %X Superscalar implementations present increased demands on instruction caches as well as instruction decoding and issuing mechanisms, leading to very complex hardware requirements. This work proposes utilizing an expanded instruction cache to reduce and simplify the hardware required to implement a superscalar machine. Trace-driven simulation is used for evaluating the presented Expanded Parallel Instruction Cache (EPIC) machine, and its performance is found to be comparable to a dynamically scheduled superscalar model. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/630/CSL-TR-94-630.pdf %R CSL-TR-94-632 %Z Mon, 07 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Benefits of Clustering in Shared Address Space Multiprocessors: An Applications-Driven Investigation %A Erlichson, Andrew %A Nayfeh, Basem A. %A Singh, Jaswinder Pal %A Olukotun, Kunle %D October 1994 %X Clustering processors together at a level of the memory hierarchy in shared address space multiprocessors appears to be an attractive technique from several standpoints: Resources are shared, packaging technologies are exploited, and processors within a cluster can share data more effectively. We investigate the performance benefits that can be obtained by clustering on a range of important scientific and engineering applications. We find that in general clustering is not very effective in reducing the inherent communication to computation ratios. Clustering is more useful in reducing working set requirements in unstructured applications, and can improve performance substantially when small first-level caches are clustered in these cases. This suggests that clustering at the first-level cache might be useful in highly integrated, relatively fine-grained environments.
For less integrated machines such as current distributed shared memory multiprocessors, our results suggest that clustering is not very useful in improving application performance, and the decision about whether or not to cluster should be made on the basis of engineering and packaging constraints. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/632/CSL-TR-94-632.pdf %R CSL-TR-94-633 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Synthesis Techniques for Built-In Self-Testable Designs %A Avra, LaNae Joy %D July 1994 %X This technical report contains the text of LaNae Joy Avra's thesis "Synthesis Techniques for Built-In Self-Testable Designs." %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/633/CSL-TR-94-633.pdf %R CSL-TR-94-634 %Z Wed, 05 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Architectural and Implementation Tradeoffs for Multiple-Context Processors %A Laudon, James P. %D September 1994 %X Tolerating memory latency is essential to achieving high performance in scalable shared-memory multiprocessors. In addition, tolerating instruction (pipeline dependency) latency is essential to maximize the performance of individual processors. Multiple-context processors have been proposed as a universal mechanism to mitigate the negative effects of latency. These processors tolerate latency by switching to a concurrent thread of execution whenever one of the threads blocks due to a high-latency operation. Multiple-context processors built so far, however, either have a high context-switch cost, which prevents them from tolerating short latencies (e.g., due to pipeline dependencies), or alternatively they require excessive concurrency from the software. We propose a multiple-context architecture that combines full single-thread support with cycle-by-cycle context interleaving to provide lower switch costs and the ability to tolerate short latencies. We compare the performance of our proposal with that of earlier approaches, showing that our approach offers substantially better performance for parallel applications. We also explore using our approach for uniprocessor workstations --- an important environment for commodity microprocessors. We show that our approach also offers much better performance for multiprogrammed uniprocessor workloads. Finally, we explore the implementation issues for both our proposed and existing multiple-context architectures. One of the larger costs for a multiple-context processor arises in providing a cache capable of handling multiple outstanding requests, and we propose a lockup-free cache which provides high performance at a reasonable cost. We also show that the amount of processor state that needs to be replicated to support multiple contexts is modest and that the extra complexity required to control the multiple contexts under both our proposed and existing approaches is manageable. The performance benefits and reasonable implementation cost of our approach make it a promising candidate for addition to future microprocessors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/634/CSL-TR-94-634.pdf
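A back-of-the-envelope model suggests why switching among contexts tolerates latency at all. The figures below are assumptions chosen for illustration (20 cycles of useful work between misses, a 100-cycle miss, a 1-cycle switch), not measurements or formulas from the report:

    # Idealized utilization of a processor with n contexts: a context that
    # misses stalls, but the processor runs the other contexts meanwhile;
    # only the miss latency not covered by their execution remains exposed.
    def utilization(contexts, run_cycles=20, miss_latency=100, switch_cost=1):
        busy = contexts * run_cycles     # useful work per round of contexts
        exposed = max(0, miss_latency
                         - (contexts - 1) * (run_cycles + switch_cost))
        total = busy + contexts * switch_cost + exposed
        return busy / total

    for n in (1, 2, 4, 8):
        print(n, round(utilization(n), 2))   # 0.17, 0.33, 0.66, 0.95

With one context most cycles are exposed miss latency; with eight, nearly all of the latency is overlapped, at the cost of replicated processor state.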
%R CSL-TR-94-635 %Z Thu, 01 Sep 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T A Performance/Area Workbench for Cache Memory Design %A Okuzawa, Osamu %A Flynn, Michael J. %D August 1994 %X For high performance processor design, cache memory size is an important parameter that directly affects both performance and chip area. Modeling performance and area is therefore required for evaluating cache memory design tradeoffs. This paper describes a tool that calculates cache memory performance and area. A designer can try a variety of cache parameters to complete the specification of a cache memory. Data examples calculated using this tool are shown. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/635/CSL-TR-94-635.pdf %R CSL-TR-94-636 %Z Thu, 27 Oct 94 00:00:00 GMT %I Stanford University, Department of Computer Science %T Mable: A Technique for Efficient Machine Simulation %A Davies, Peter %A Lacroute, Philippe %A Heinlein, John %A Horowitz, Mark %D October 1994 %X We present a framework for an efficient instruction-level machine simulator which can be used with existing software tools to develop and analyze programs for a proposed processor architecture. The simulator exploits similarities between the instruction sets of the emulated machine and the host machine to provide fast simulation. Furthermore, existing program development tools on the host machine such as debuggers and profilers can be used without modification on the emulated program running under the simulator. The simulator can therefore be used to debug and tune application code for the new processor without building a whole new set of program development tools. The technique has applicability to a diverse set of simulation problems. We show how the framework has been used to build simulators for a shared-memory multiprocessor, a superscalar processor with support for speculative execution, and a dual-issue embedded processor. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/636/CSL-TR-94-636.pdf %R CSL-TR-94-637 %Z Mon, 28 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Testing Digital Circuits for Timing Failures by Output Waveform Analysis %A Franco, Piero %D September 1994 %X Delay testing is done to ensure that a digital circuit functions at the designed speed. Delay testing is complicated by test invalidation and fault detection size. Furthermore, we show that simple delay models are not sufficient to provoke the longest delay through a circuit. Even if all paths are robustly tested, path delay testing cannot guarantee that the circuit functions at the desired speed. Output Waveform Analysis is a new approach for detecting timing failures in digital circuits. Unlike conventional testing where the circuit outputs are sampled, the waveform between samples is analyzed. The motivation is that delay changes affect the shape of the output waveform, and information can be extracted from the waveform to detect timing failures. This is especially useful as a Design-for-Testability technique for Built-In Self-Test or pseudo-random testing environments, where delay tests are difficult to apply and test invalidation is a problem. Stability Checking is a simple form of Output Waveform Analysis. In a fault-free circuit, the outputs are expected to have reached the desired logic values by the time they are sampled, so delay faults can be detected by observing the outputs for any changes after the sampling time. Apart from traditional delay testing, Stability Checking is also useful for on-line or concurrent testing under certain timing restrictions. A padding algorithm was implemented to show that circuits can be efficiently modified to meet the required timing constraints. By analyzing the output waveform before the sampling time, circuits with timing flaws can be detected even before the circuit fails.
This is useful in high-reliability applications as a screening technique that does not stress the circuit, and for wear-out prediction. A symbolic waveform simulator has been implemented to show the benefits of the proposed Output Waveform Analysis techniques. Practical test architectures have been designed, and various waveform analyzers have been manufactured and tested. These include circuits implemented using the Stanford BiCMOS process, and a design implemented in a 25k gate Test Evaluation Chip Experiment. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/637/CSL-TR-94-637.pdf %R CSL-TR-94-638 %Z Wed, 12 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Design, Implementation and Evaluation of Jade: A Portable, Implicitly Parallel Programming Language %A Rinard, Martin C. %D August 1994 %X Over the last decade, research in parallel computer architecture has led to the development of many new parallel machines. These machines have the potential to dramatically increase the resources available for solving important computational problems. The widespread use of these machines, however, has been limited by the difficulty of developing useful parallel software. This thesis presents the design, implementation and evaluation of Jade, a new programming language for parallel computations that exploit task-level concurrency. Jade is structured as a set of constructs that programmers use to specify how a program written in a standard sequential, imperative language accesses data. The implementation dynamically analyzes these specifications to automatically extract the concurrency and map the computation onto the parallel machine. The resulting parallel execution preserves the semantics of the original serial program. We have implemented Jade on a wide variety of parallel computing platforms: shared-memory multiprocessors such as the Stanford DASH machine, homogeneous message-passing machines such as the Intel iPSC/860, and heterogeneous networks of workstations. Jade programs port without modification between all of these platforms. We evaluate the design and implementation of Jade by using it to parallelize several complete scientific and engineering applications and executing them on several computational platforms. We analyze how well Jade supports the process of developing these applications and present results that characterize how well they perform. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/638/CSL-TR-94-638.pdf %R CSL-TR-94-639 %Z Wed, 05 Oct 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Two Case Studies in Latency Tolerant Architectures %A Bennett, James E. %A Flynn, Michael J. %D October 1994 %X Researchers have proposed a variety of techniques for dealing with memory latency, such as dynamic scheduling, hardware prefetching, software prefetching, and multiple contexts. This paper presents the results of two case studies on the usefulness of some simple techniques for latency tolerance. These techniques are nonblocking caches, reordering of loads and stores, and basic block scheduling for the expected latency of loads. The effectiveness of these techniques was found to vary according to the type of application. While nonblocking caches and load/store reordering consistently improved performance, scheduling based on expected latency was found to decrease performance in most cases.
This result shows that the assumption of a uniform miss rate used by the scheduler is incorrect, and suggests that techniques for estimating the miss rates of individual loads are needed. These results were obtained using a new simulation environment, MXS, currently under development. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/639/CSL-TR-94-639.pdf %R CSL-TR-94-640 %Z Mon, 28 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Transformed Pseudo-Random Patterns for BIST %A Touba, Nur A. %A McCluskey, Edward J. %D October 1994 %X This paper presents a new approach for on-chip test pattern generation. The set of test patterns generated by a pseudo-random pattern generator (e.g., an LFSR) is transformed into a new set of patterns that provides the desired fault coverage. The transformation is performed by a small amount of mapping logic that decodes sets of patterns that don't detect any new faults and maps them into patterns that detect the hard-to-detect faults. The mapping logic is purely combinational and is placed between the pseudo-random pattern generator and the circuit under test (CUT). A procedure for designing the mapping logic so that it satisfies test length and fault coverage requirements is described. Results are shown for benchmark circuits which indicate that an LFSR plus a small amount of mapping logic reduces the test length required for a particular fault coverage by orders of magnitude compared with using an LFSR alone. These results are compared with previously published results for other methods, and it is shown that the proposed method requires much less overhead to achieve the same fault coverage for the same test length. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/640/CSL-TR-94-640.pdf %R CSL-TR-94-642 %Z Wed, 09 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Apparatus for Pseudo-Deterministic Testing %A Mukund, Shridhar K. %A McCluskey, Edward J. %A Rao, T.R.N. %D October 1994 %X Pseudo-random testing is popularly used, particularly in Built-In Self Test (BIST) applications. To achieve a desired fault coverage, pseudo-random patterns are often supplemented with a few deterministic patterns. When positions of deterministic patterns in the pseudo-random sequence are known a priori, pseudo-random sub-sequences can be chosen such that they also cover these deterministic patterns. We call this method of test application pseudo-deterministic testing. The theory of discrete logarithm has been applied to determine positions of bit-patterns in the pseudo-random sequence generated by a modular form or internal-XOR Linear Feedback Shift Register (LFSR) [5,7]. However, the scheme requires that all the inputs of the combinational logic block (CLB), under test, come from the same LFSR source. This constraint in circuit configuration severely limits its application. In this paper, we propose a practical and cost-effective technique for pseudo-deterministic testing. For the most part, the problem of circuit configuration has been simplified to one of scan path insertion, by employing LFSR/SR (an arbitrary-length shift register driven by a standard-form or external-XOR LFSR). To enable the usage of LFSR/SR as a pseudo-deterministic pattern source, we propose a method to determine positions of bit-patterns, at arbitrarily chosen tap configurations, in the LFSR/SR sequence. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/642/CSL-TR-94-642.pdf
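The question underlying pseudo-deterministic testing, at what position in an LFSR's sequence a needed deterministic pattern appears, can be illustrated by brute force; the report's contribution is computing such positions analytically, via discrete logarithms, for LFSR/SR structures. A small Python sketch with a 4-bit example:

    # Brute-force position search in an external-XOR (Fibonacci) LFSR sequence.
    def lfsr_positions(taps, state, width, targets, limit=1 << 16):
        found = {}
        for pos in range(limit):
            if state in targets and state not in found:
                found[state] = pos
            fb = bin(state & taps).count("1") & 1        # XOR of tapped bits
            state = ((state << 1) | fb) & ((1 << width) - 1)
        return found

    # x^4 + x^3 + 1 (taps 0b1100) is maximal length: period 15 over 4-bit states
    print(lfsr_positions(taps=0b1100, state=0b0001, width=4,
                         targets={0b1011, 0b0111}))      # {11: 9, 7: 10}

Knowing these positions, one can start a pseudo-random burst a few cycles before position 9 and let the free-running sequence deliver both deterministic patterns.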
%R CSL-TR-94-643 %Z Tue, 20 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design-for-Current-Testability (DFCT) for Dynamic CMOS Logic %A Ma, Siyad C. %A McCluskey, Edward J. %D November 1994 %X The applicability of quiescent current monitoring (IDDQ testing) to dynamic logic is discussed here. IDDQ testing is very useful in detecting some defects that can escape functional and delay tests; however, we show that some defects in domino logic cannot be detected by either voltage or current measurements. A design-for-current-testability (DFCT) modification for dynamic logic is presented and shown to enable detection of these defects. The DFCT circuitry is designed with a negligible performance impact during normal operation. This is particularly important since the main reason for using dynamic logic is its speed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/643/CSL-TR-94-643.pdf %R CSL-TR-94-644 %Z Wed, 21 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Synthesis of Asynchronous Controllers for Heterogeneous Systems %A Yun, Kenneth Yi %D August 1994 %X There are two synchronization mechanisms used in digital systems: synchronous and asynchronous. Synchronous or asynchronous refers to whether or not system events occur in lock-step with a clock. Today's system components typically employ the synchronous paradigm primarily because of the availability of the rich set of design tools and algorithms and, perhaps, because of the designers' perception of ``ease of design'' and the lack of alternatives. Even so, the interfaces among the system components do not strictly adhere to the synchronous paradigm because of the cost benefit of mixing modules operating at different clock rates and modules with asynchronous interfaces. This thesis addresses the problem of how to synthesize controllers operating in heterogeneous systems - systems with components employing different synchronization mechanisms. We introduce a new design style called extended-burst-mode. The extended-burst-mode design style covers a wide spectrum of sequential circuits ranging from delay-insensitive to synchronous. We can synthesize multiple-input change asynchronous finite state machines, and many circuits that fall in the gray area between synchronous and asynchronous that are difficult or impossible to synthesize automatically using existing methods. Our implementation of extended-burst-mode machines uses standard combinational logic, generates low-latency outputs and guarantees freedom from hazards at the gate level. We present a complete set of automated sequential synthesis algorithms: hazard-free state assignment, hazard-free state minimization, and critical-race-free state encoding. We also describe two radically different hazard-free combinational synthesis methods: two-level sum-of-products implementation and multiplexer-tree implementation. Existing theories for hazard-free combinational synthesis are extended to handle non-monotonic input changes. A set of requirements for freedom from logic hazards is presented for each combinational synthesis method. Experimental data from a large set of examples are presented and compared to competing methods, whenever possible. To demonstrate the effectiveness of the design style and the synthesis tool, the design of a commercial-scale SCSI controller data path is presented.
This design is functionally compatible with an existing high performance commercial chip and meets the ANSI SCSI-2 standard. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/644/CSL-TR-94-644.pdf %R CSL-TR-94-646 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Technology Mapping for VLSI Circuits Exploiting Boolean Properties and Operations %A Mailhot, Frederic %D December 1994 %X Automatic synthesis of digital circuits has gained increasing importance. The synthesis process consists of transforming an abstract representation of a system into an implementation in a target technology. The set of transformations has traditionally been broken into three steps: high-level synthesis, logic synthesis and physical design. This dissertation is concerned with logic synthesis. More specifically, we study technology mapping, which is the link between logic synthesis and physical design. The object of technology mapping is to transform a technology-independent logic description into an implementation in a target technology. One of the key operations during technology mapping is to recognize logic equivalence between a portion of the initial logic description and an element of the target technology. We introduce new methods for establishing logic equivalence between two logic functions. The techniques, based on Boolean comparisons, use Binary Decision Diagrams (BDDs). An algorithm for dealing with completely specified functions is first presented. Then we introduce a second algorithm, which is applicable to incompletely specified functions. We also present an ensemble of techniques for optimizing delay, which rely on an iterative approach. All these methods have proven to be efficient in both run-time and quality of results, when compared to other existing technology mapping systems. The algorithms presented have been implemented in a technology mapping program, Ceres. Results are shown that highlight the application of the different algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/646/CSL-TR-94-646.pdf %R CSL-TR-94-647 %Z Tue, 06 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design Issues in Floating-Point Division %A Oberman, Stuart F. %A Flynn, Michael J. %D December 1994 %X Floating-point division is generally regarded as a low-frequency, high-latency operation in typical floating-point applications. However, the increasing emphasis on high performance graphics and the industry-wide usage of performance benchmarks, such as SPECmarks, forces processor designers to pay close attention to all aspects of floating-point computation. This paper presents the algorithms often utilized for floating-point division, and it also presents implementation alternatives available for designers. Using a system level study as a basis, it is shown how typical floating-point applications can guide the designer in making implementation decisions and trade-offs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/647/CSL-TR-94-647.pdf
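As an illustration of one classic algorithm family such a study weighs, here is a minimal Newton-Raphson functional-iteration sketch in Python. The initial-estimate coefficients are the standard 48/17 - 32/17 * b approximation for a significand normalized to [0.5, 1); the paper itself surveys several alternatives and is not limited to this method:

    # Newton-Raphson division: x_{i+1} = x_i * (2 - b * x_i) converges
    # quadratically to 1/b, so a/b costs a few multiplies per iteration.
    def nr_divide(a, b, iterations=4):
        assert 0.5 <= b < 1.0, "assume significand normalized to [0.5, 1)"
        x = 48.0 / 17.0 - (32.0 / 17.0) * b   # rough initial estimate of 1/b
        for _ in range(iterations):
            x = x * (2.0 - b * x)             # each step roughly doubles the
                                              # number of correct bits
        return a * x

    print(nr_divide(1.0, 0.75), 1.0 / 0.75)

Quadratic convergence is why multiply-based division competes with digit-recurrence (SRT) schemes when a fast multiplier is already present in the datapath.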
%R CSL-TR-94-648 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Synthesis of Gate-Level Speed-Independent Circuits %A Beerel, Peter A. %A Myers, Chris J. %A Meng, Teresa H.-Y. %D December 1994 %X This paper presents a CAD tool for the synthesis of robust asynchronous control circuits using limited-fanin basic gates such as AND gates, OR gates, and C-elements. The synthesized circuits are speed-independent; that is, they work correctly regardless of individual gate delays. Included in our synthesis procedure is an efficient procedure for logic optimizations using {\em observability don't cares} and {\em incremental verification}. We apply the procedure to a variety of specifications taken from industry and previously published examples and compare our speed-independent implementations to those generated using a non-speed-independent synthesis procedure included in Berkeley's SIS. Our implementations are not only more robust to delay variations than those produced by SIS, which rely on bounded delay lines to avoid circuit hazards, but are also on average 13 percent faster, with an area penalty of only 14 percent. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/648/CSL-TR-94-648.pdf %R CSL-TR-94-649 %Z Thu, 08 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Routing of Streams in WDM Reconfigurable Networks %A Noronha, Ciro A., Jr. %A Tobagi, Fouad A. %D December 1994 %X Due to its low attenuation, fiber has become the medium of choice for point-to-point links. Using Wavelength-Division Multiplexing (WDM), many channels can be created in the same fiber. A network node equipped with a tunable optical transmitter can select any of these channels for sending data. An optical interconnection combines the signals from the various transmitters in the network and makes them available to the optical receivers, which may also be tunable. By properly tuning transmitters and/or receivers, point-to-point links can be dynamically created and destroyed. Therefore, in a WDM network, the routing algorithm has an additional degree of freedom compared to traditional networks: it can modify the network topology to create the routes. In this report, we consider the problem of routing audio/video streams in WDM networks. We present a general linear integer programming formulation for the problem. However, since this exact formulation is expensive to solve, we propose simpler heuristic algorithms, both for the unicast case and for the multicast case. The performance of these heuristics is evaluated in a number of scenarios, with a realistic traffic model, and from the evaluation we derive guidelines for usage of the heuristic algorithms. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/649/CSL-TR-94-649.pdf %R CSL-TR-94-653 %Z Wed, 21 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Routing of Video/Audio Streams In Packet-Switched Networks %A Noronha, Ciro A., Jr. %D December 1994 %X The transport of multimedia streams in computer communication networks raises issues at all layers of the OSI model. This thesis considers some of the issues related to supporting multimedia streams at the network layer; in particular, the issue of appropriate routing algorithms. New routing algorithms, capable of efficiently meeting multimedia requirements, are needed. We formulate the optimum multipoint stream routing problem as a linear integer programming problem and propose an efficient solution technique. We show that the proposed solution technique significantly decreases the time to compute the solution, when compared to traditional methods. We use the optimum multicast stream routing problem as a benchmark to characterize the performance of existing heuristic algorithms under realistic network and traffic scenarios, and derive guidelines for their usage and for upgrading the network capacity.
We also consider the problem of routing multimedia streams in a Wavelength-Division Multiplexing (WDM) optical network, which has an additional degree of freedom over traditional networks - its topology can be changed by the routing algorithm to create routes as needed, by tuning optical transmitters and/or receivers. We show that the optimum reconfiguration and routing problem can be formulated as a linear integer programming problem. Since this formulation is expensive to solve, we also propose a set of heuristic algorithms, both for unicast and multicast routing. We evaluate the performance of the proposed heuristics and derive guidelines for their usage. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/653/CSL-TR-94-653.pdf %R CSL-TR-94-654 %Z Mon, 19 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Multipliers and Datapaths %A Al-Twaijry, Hesham %A Flynn, Michael J. %D December 1994 %X People traditionally have considered the number of counters in the critical path as the metric for the performance of a multiplier. This report presents the view that tree topologies which have the least number of levels do not always give the fastest possible multiplier when constrained to be part of a microprocessor. It proposes two new topologies, hybrid structures and higher-order arrays, which are faster than conventional tree topologies for typical datapaths. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/654/CSL-TR-94-654.pdf %R CSL-TR-94-655 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T I/O Characterization and Attribute Caches for Improved I/O System Performance %A Richardson, Kathy J. %D December 1994 %X Workloads generate a variety of disk I/O requests to access file information, execute programs, and perform computation. I/O caches capture most of these requests, reducing execution time, providing high I/O rates, and decreasing the disk bandwidth needed by each workload. A cache has difficulty capturing the full range of I/O behavior, however, when it treats the requests as a single stream of uniform tasks. The single stream contains I/O requests for data with vastly different reuse rates and access patterns. Disk files can be classified as accesses to inodes, directories, datafiles or executables. The combined cache behavior of all four taken together provides few clues for improving performance of the I/O cache. But individually, the cache behavior of each reveals the distinct components that make up aggregate I/O behavior. Inodes and directories turn out to be small, highly reused files. Datafiles and executable files have more diverse characteristics. The smaller ones exhibit moderate reuse and have little sequential access, while the larger files tend to be accessed sequentially and not reused. Properly used, file type and file size information improves cache performance. The dissertation introduces attribute caches to improve I/O cache performance. Attribute caches use file attributes to selectively cache I/O data with a cache scheme tailored to the expected behavior of the file type. Inodes and directories are cached in very small blocks, capitalizing on their high reuse rates and small space requirements. Large files are cached in large cache blocks, capitalizing on their sequential access patterns. Small and medium-sized files are cached in average 4-kbyte blocks, which minimizes the memory required to service the bulk of requests. The portion of cache dedicated to each group varies with total cache size. This allows the important features of the workload to be captured at the appropriate cache size, and increases the total cache utilization. For a set of 11 measured workloads an attribute cache scheme reduced the miss ratio 25--60\% depending on cache size, and required only about 1/8 as much memory as a typical I/O cache implementation achieving the same miss ratio. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/655/CSL-TR-94-655.pdf
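A minimal sketch of the attribute-cache policy just described, with hypothetical block sizes and thresholds standing in for the dissertation's tuned values:

    # Choose the caching granularity from file attributes, so that small,
    # highly reused metadata and large sequential files each get a block
    # size matched to their expected behavior.
    def block_size(file_type, file_size):
        if file_type in ("inode", "directory"):
            return 512                  # tiny blocks: high reuse, small objects
        if file_size >= 64 * 1024:
            return 64 * 1024            # large blocks: sequential, rarely reused
        return 4 * 1024                 # default serving the bulk of requests

    class AttributeCache:
        def __init__(self):
            self.blocks = {}            # (name, block number) -> cached data
        def read(self, name, ftype, fsize, offset):
            bs = block_size(ftype, fsize)
            key = (name, offset // bs)
            hit = key in self.blocks
            self.blocks.setdefault(key, f"<{bs}-byte block>")
            return hit

The point of the per-type granularity is memory efficiency: metadata reuse is captured without dedicating large blocks to it, while sequential large-file traffic amortizes one miss over a long run of accesses.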
%R CSL-TR-94-656 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T I/O Characterization and Attribute Cache Data for Eleven Measured Workloads %A Richardson, Kathy J. %D December 1994 %X Workloads generate a variety of disk I/O requests to access file information, execute programs, and perform computation. Workload characterization is crucial to optimizing I/O system performance. This report contains detailed workload characterization data for eleven measured workloads. It includes numerous tables and cache-behavior plots for each workload. The workload I/O traces, from which the characterization is derived, include both file system information and I/O system information, where previous traces only included one or the other. The additional information allows I/O characterization at the system level, and greatly increases the body of knowledge about the make-up and type of disk I/O requested. The new information shows that the I/O request stream contains statistically diverse components that can be separated. This allows the important features of the workload to be captured at the appropriate cache size, and increases the total cache utilization. Note: This technical report is a companion report to the dissertation "I/O Characterization and Attribute Caches for Improved I/O System Performance" (CSL-TR-94-655). While the dissertation is self-contained, this report is not; it presents data that is analyzed and discussed only in the dissertation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/656/CSL-TR-94-656.pdf %R CSL-TR-94-624 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T WSIM: A Symbolic Waveform Simulator %A Franco, Piero %A McCluskey, Edward J. %D June 1994 %X A symbolic waveform simulator is proposed in this report. The delay of a faulty element is treated as a variable in the generation of the output waveform. Therefore, many timing simulations with different delay values do not have to be done to analyze the behavior of the circuit-under-test with the timing fault. The motivation for this work was to investigate delay testing by Output Waveform Analysis, where an accurate representation of the actual waveforms is required, although the simulator can be used for other applications as well (such as power analysis). Output Waveform Analysis will be briefly reviewed, followed by a description of both a simplified and a complete implementation of the waveform simulator, and simulation results. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/624/CSL-TR-94-624.pdf %R CSL-TR-94-625 %Z Tue, 08 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Experimental Chip to Evaluate Test Techniques Part 1: Description of Experiment %A Franco, Piero %A Stokes, Robert L. %A Farwell, William D. %A McCluskey, Edward J. %D June 1994 %X A Test Chip has been designed and manufactured to evaluate different testing techniques for combinational or full-scan circuits. The Test Chip is a 25k gate CMOS gate-array using LSI Logic's LFT150K technology, and includes support (design for testability) circuitry and five types of circuits-under-test (CUT). Over 5,000 die have been manufactured. The five circuits-under-test include both datapath and synthesized control logic. The tests include design verification (simulation), exhaustive, pseudo-random, and deterministic vectors for various fault models (stuck-at, transition, delay faults, and IDDQ Testing). The chip will also be tested using the CrossCheck methodology, as well as other new techniques, including Stability Checking and Very-Low-Voltage Testing. The experiment includes an investigation of both serial and parallel signature analysis. This report describes the Test Evaluation Chip Experiment, including the design of the Test Chip and the tests applied. A future report will cover the experimental results and data analysis. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/625/CSL-TR-94-625.pdf %R CSL-TR-94-631 %Z Mon, 05 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SimOS: A Fast Operating System Simulation Environment %A Rosenblum, Mendel %A Varadarajan, Mani %D July 1994 %X In this paper we describe techniques for building a software development environment for operating system software. These techniques allow an operating system to be run at user-level on a general-purpose operating system such as System V R4 Unix. The approach used in this work is to simulate a machine's hardware using services provided by the underlying operating system. We describe how to simulate the CPU using the operating system's process abstraction, the memory management unit using file mapping operations, and the I/O devices using separate processes. The techniques we present allow the simulator to run with sufficient speed and detail that workloads that exercise bugs on the real machine can be transferred and run in near real-time on the simulated machine. The speed of the simulation depends on the quantity and the cost of the simulated operations. Real programs usually run in the simulated environment at between 50% and 100% of the speed of the underlying machine. The simulation detail we provide allows an operating system running in the simulated environment to be nearly indistinguishable from the real machine from a user perspective. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/631/CSL-TR-94-631.pdf %R CSL-TR-94-617 %Z Thu, 05 Jan 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fast Multiplication: Algorithms and Implementations %A Bewick, Gary W. %D April 1994 %X This thesis investigates methods of implementing binary multiplication with the smallest possible latency. The principal area of concentration is on multipliers with lengths of 53 bits, which makes the results suitable for IEEE-754 double precision multiplication. Low latency demands high-performance circuitry and small physical size to limit propagation delays. VLSI implementations are the only available means for meeting these two requirements, but efficient algorithms are also crucial. An extension to Booth's algorithm for multiplication (redundant Booth) has been developed, which represents partial products in a partially redundant form. This redundant representation can reduce or eliminate the time required to produce "hard" multiples (multiples that require a carry propagate addition) required by the traditional higher order Booth algorithms.
This extension reduces the area and power requirements of fully parallel implementations, while remaining as fast as any multiplication method yet reported. In order to evaluate various multiplication algorithms, a software tool has been developed which automates the layout and optimization of parallel multiplier trees. The tool takes into consideration wire and asymmetric input delays, as well as gate delays, as the tree is built. The tool is used to design multipliers based upon various algorithms, using Booth-encoded, non-Booth-encoded, and the new extended Booth algorithms. The designs are then compared on the basis of delay, power, and area. For maximum speed, the designs are based upon a 0.6-um BiCMOS process using emitter coupled logic (ECL). The algorithms developed in this thesis make possible 53x53 multipliers with a latency of less than 2.6 nanoseconds @ 10.5 Watts and a layout area of 13 mm^2. Smaller and lower power designs are also possible, as illustrated by an example with a latency of 3.6 nanoseconds @ 5.8 W, and an area of 8.9 mm^2. The conclusions based upon ECL designs are extended where possible to other technologies (CMOS). Crucial to the performance of multipliers are high speed carry propagate adders. A number of high speed adder designs have been developed, and the algorithms and design of these adders are discussed. The implementations developed for this study indicate that traditional Booth encoded multipliers are superior in layout area, power, and delay to non-Booth encoded multipliers. Redundant Booth encoding further reduces the area and power requirements. Finally, only half of the total multiplier delay was found to be due to the summation of the partial products. The remaining delay was due to wires and carry propagate adder delays. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/617/CSL-TR-94-617.pdf
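For readers unfamiliar with the baseline that redundant Booth extends, here is a small Python sketch of standard radix-4 Booth recoding (the thesis's partially redundant partial-product representation itself is not shown). Each overlapping 3-bit group of the multiplier selects a multiple in {-2, -1, 0, +1, +2}, halving the number of partial products:

    # digit = -2*b[i+1] + b[i] + b[i-1] for each overlapping 3-bit group
    BOOTH4 = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
              0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}

    def booth4_multiply(a, b, bits=16):
        """Multiply signed a by unsigned b (width `bits`) via Booth digits."""
        total = 0
        prev = 0                                  # implicit bit to the right
        for i in range(0, bits, 2):
            group = ((b >> i) & 0b11) << 1 | prev
            total += BOOTH4[group] * a << i       # add a shifted multiple of a
            prev = (b >> (i + 1)) & 1
        return total

    assert booth4_multiply(53, 53) == 53 * 53     # sanity check

All the multiples here are "easy" (shifts and negations of a); the higher-order Booth algorithms the thesis targets need hard multiples such as 3a, which is exactly the carry-propagate cost that the redundant representation attacks.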
%R CSL-TR-94-650 %Z Mon, 12 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Uniform Approach to the Synthesis of Synchronous and Asynchronous Circuits %A Myers, Chris J. %A Meng, Teresa H.-Y. %D December 1994 %X In this paper we illustrate the application of a synthesis procedure used for timed asynchronous circuits to the design of synchronous circuits. In addition to providing a uniform synthesis approach, our procedure results in circuits that are significantly smaller and faster than those designed using the synchronous design tool SIS. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/650/CSL-TR-94-650.pdf %R CSL-TR-94-651 %Z Wed, 21 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Hazard-Free Decomposition of High-Fanin Gates in Asynchronous Circuit Synthesis %A Myers, Chris J. %A Meng, Teresa H.-Y. %D December 1994 %X In this paper we present an automated procedure to decompose high-fanin gates generated by asynchronous circuit synthesis procedures for technology mapping to practical gate libraries. Our procedure begins with a specification in the form of an event-rule system, a circuit implementation in the form of a production rule set, and a given gate library. For each gate in the implementation that has a fanin larger than the maximum in the library, a new signal is added to the specification. Each valid decomposition of the high-fanin gates using these new signals is examined by resynthesis until all gates have been successfully decomposed, or it has been determined that a solution does not exist. The procedure has been automated and used to decompose high-fanin gates from several examples generated by the synthesis tools ATACS and SYN. Our resulting implementations using ATACS, when compared with SIS, which uses synchronous technology mapping and adds delay elements to remove hazards, are up to 50 percent smaller and have less than half the latency using library delays generated by HSPICE. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/651/CSL-TR-94-651.pdf %R CSL-TR-94-605 %Z Thu, 16 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance and Area Analysis of Processor Configurations with Scaling of Technology %A Fu, Steve %A Flynn, Michael J. %D March 1994 %X The increasing density of transistors on integrated circuits and the increasing sensitivity toward costs have stimulated interest in developing techniques for relating transistor count to performance. This paper maps different processor configurations to transistor-level area models and proposes an optimum evolution path of processor design as the minimum feature size of the technology is scaled. A parameter for measuring incremental performance improvement with respect to increasing transistor count is proposed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/605/CSL-TR-94-605.pdf %R CSL-TR-94-657 %Z Mon, 13 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Instruction Level Parallel Processors---A New Architectural Model for Simulation and Analysis %A Rudd, Kevin W. %D December 1994 %X Trends in high-performance computer architecture have led to the development of increased clock-rate and dynamic multiple-instruction issue processor designs. There have been problems combining these techniques due to the pressure that the complex scheduling and issue logic puts on the cycle time. This problem has limited the performance of multiple-instruction issue architectures. The alternative approach of static multiple-operation issue avoids the clock-rate problem by allowing the hardware to concurrently issue only those operations that the compiler scheduled to be issued concurrently. Since there is no hardware support required to achieve multiple-operation issue (there are multiple operations in a single instruction and the hardware issues a single instruction at a time), these designs can be effectively scaled to high clock rates. However, these designs have the problem that the scheduling of operations into instructions is rigid, and to increase the performance of the system, the entire system must be scaled uniformly so that the static schedule is not compromised. This report describes an architectural model that allows a range of hybrid architectures to be studied. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/657/CSL-TR-94-657.pdf %R CSL-TR-94-652 %Z Wed, 29 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Synthesis and Verification of Gate-Level Timed Circuits %A Myers, Chris J. %A Rokicki, Tomas G. %A Meng, Teresa H.-Y. %D December 1994 %X This paper presents a CAD system for the automatic synthesis and verification of gate-level timed circuits. Timed circuits are a class of asynchronous circuits which incorporate explicit timing information in the specification which is used throughout the synthesis procedure to optimize the design. This system accepts a textual specification capable of specifying general circuit behavior and timing requirements.
This specification is systematically transformed to a graphical representation that can be analyzed using an exact and efficient timing analysis algorithm to find the reachable state space. From this state space, our synthesis procedure derives a timed circuit that is hazard-free using only basic gates to facilitate the mapping to semi-custom components, such as standard-cells and gate-arrays. The resulting gate-level timed circuit implementations are up to 40 percent smaller and 50 percent faster than those produced using other asynchronous design methodologies. We also demonstrate that our timed designs can be smaller and faster than their synchronous counterparts. To address verification, we have applied our timing analysis algorithm to verify efficiently not only our synthesized circuits but also a wide collection of reasonably sized, highly concurrent timed circuits that could not previously be verified using traditional techniques. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/652/CSL-TR-94-652.pdf %R CSL-TR-94-645 %Z Thu, 09 Feb 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Rationale, Design and Performance of the Hydra Multiprocessor %A Olukotun, Kunle %A Bergmann, Jules %A Chang, Kun-Yung %A Nayfeh, Basem A. %D November 1994 %X In Hydra, four high-performance processors communicate via a shared secondary cache. The shared cache is implemented using multichip module (MCM) packaging technology. The Hydra multiprocessor is designed to efficiently support automatically parallelized programs that have high degrees of fine-grained sharing. This paper motivates the Hydra multiprocessor design by reviewing current trends in architecture and developments in parallelizing compiler and implementation technology. The design of the Hydra multiprocessor is described and explained. Initial estimates of the interprocessor communication latencies show them to be much lower than those of current bus-based multiprocessors. These lower latencies result in higher performance on applications with fine-grained parallelism. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/645/CSL-TR-94-645.pdf %R CSL-TR-94-602 %Z Mon, 24 Apr 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analyzing and Tuning Memory Performance in Sequential and Parallel Programs %A Martonosi, Margaret Rose %D January 1994 %X Recent architecture and technology trends have led to a significant gap between processor and main memory speeds. When cache misses are common, memory stalls can significantly degrade execution time. To help identify and fix such memory bottlenecks, this work presents techniques to efficiently collect detailed information about program memory performance and effectively organize the data collected. These techniques help guide programmers or compilers to memory bottlenecks. They apply to both sequential and parallel applications and are embodied in the MemSpy performance monitoring system. This thesis contends that the natural interrelationship between program memory bottlenecks and program data structures mandates the use of data-oriented statistics, a novel approach that associates program performance information with application data structures. Data-oriented statistics, viewed alone or paired with traditional code-oriented statistics, offer a powerful, new dimension for performance analysis. I develop techniques for aggregating statistics on similarly-used data structures and for extracting intuitive source-code names for statistics.
The thesis also argues that MemSpy's detailed statistics on the frequency and causes of cache misses are crucial in understanding memory bottlenecks. Common memory performance bugs are often most easily distinguished by noting the causes of their resulting cache misses. Since collecting such detailed information seems, at first glance, to require large execution time slowdowns, this dissertation also evaluates techniques to improve the performance of MemSpy's simulation-based monitoring. The first optimization, hit bypassing, improves simulation performance by specializing processing of cache hits. The second optimization, reference trace sampling, improves performance by simulating only sampled portions of the full reference trace. Together, these optimizations reduce simulation time by nearly an order of magnitude. Overall, our experience using MemSpy to tune several applications demonstrates that it generates effective memory performance profiles at speeds competitive with previous, less detailed approaches. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/602/CSL-TR-94-602.pdf %R CSL-TR-94-607 %Z Mon, 28 Nov 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Spreadsheets for Images %A Levoy, Marc %D February 1994 %X We describe a data visualization system based on spreadsheets. Cells in our spreadsheet contain graphical objects such as images, volumes, or movies. Cells may also contain graphical widgets such as buttons, sliders, or movie viewers. Objects are displayed in miniature inside each cell. Formulas for cells are written in a programming language that includes operators for array manipulation, image processing, and rendering. Formulas may also contain control structures, procedure calls, and assignment operators with side effects. Compared to flow chart visualization systems, spreadsheets are more expressive, more scalable, and easier to program. Compared to numerical spreadsheets, spreadsheets for images pose several unique design problems: larger formulas, longer computation times, and more complicated intercell dependencies. We describe an implementation based on the Tcl programming language and the Tk widget set, and we discuss our solutions to these design problems. We also point out some unexpected uses for our spreadsheets: as a visual database browser, as a graphical user interface builder, as a smart clipboard for the desktop, and as a presentation tool. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/607/CSL-TR-94-607.pdf
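The cell/formula model in the spreadsheet abstract above can be sketched in a few lines of Python (a toy stand-in; the actual system is built on Tcl/Tk and tracks intercell dependencies far more precisely than the coarse invalidation used here):

    # Cells hold graphical objects; formulas are real code over other cells.
    class Sheet:
        def __init__(self):
            self.formulas, self.cache = {}, {}
        def set(self, name, formula):
            """formula: a function of the sheet, e.g. lambda s: blur(s['A1'])"""
            self.formulas[name] = formula
            self.cache.clear()               # coarse invalidation of dependents
        def __getitem__(self, name):
            if name not in self.cache:       # recompute on demand, then memoize
                self.cache[name] = self.formulas[name](self)
            return self.cache[name]

    s = Sheet()
    s.set("A1", lambda s: [[0, 1], [2, 3]])          # a tiny "image"
    s.set("B1", lambda s: [[2 * p for p in row] for row in s["A1"]])
    print(s["B1"])                                    # [[0, 2], [4, 6]]

Because formulas are ordinary code rather than flow-chart wiring, loops, conditionals, and procedure calls come for free, which is the expressiveness argument the abstract makes.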
%R CSL-TR-94-616 %Z Wed, 03 May 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Reuse of High Precision Arithmetic Hardware to Perform Multiple Low Precision Calculations %A Zucker, Daniel %A Lee, Ruby %D April 1994 %X Many increasingly important applications, such as video compression, graphics, or multimedia, require only low-precision arithmetic. However, because the widespread adoption of the IEEE floating point standard has led to the ubiquity of IEEE double precision hardware, this double precision hardware is frequently used to do the low precision calculations. Naturally, it seems an inefficient use of resources to use 54 bits of hardware to perform an 8- or 12-bit calculation. This paper presents a method for packing operands to perform multiple low precision arithmetic operations using regular high precision hardware. Using only source-level software modification, a speedup of 15% is illustrated for the Discrete Cosine Transform. Since no machine-specific optimizations are required, this method will work on any machine that supports IEEE arithmetic. Finally, an analysis of speedup and suggestions for future work are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/616/CSL-TR-94-616.pdf
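The packing idea admits a tiny runnable illustration. The field spacing below is an assumption for exposition, not the paper's exact operand layout; the point is that both packed results stay within the 53-bit double-precision significand, so a single hardware multiply performs two exact low-precision multiplies:

    # Two small non-negative integers ride in one number, separated by enough
    # zero bits that multiplying by a small constant scales both without the
    # fields overlapping; each result is then extracted exactly.
    SHIFT = 1 << 24                       # assumed spacing between fields

    def pack(x, y):
        return x * SHIFT + y              # fits easily in a 53-bit significand

    def unpack(p):
        x = int(p // SHIFT)
        return x, int(p - x * SHIFT)

    a = pack(200, 117)                    # two 8-bit operands
    b = 3 * a                             # one multiply does two multiplies
    print(unpack(b))                      # (600, 351)

The same arithmetic carried out in IEEE doubles remains exact as long as every intermediate value stays below 2^53, which is what makes the trick portable to any machine with IEEE arithmetic.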
%R CSL-TR-94-641 %Z Thu, 07 Dec 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Using Checking Experiments to Test Two-State Latches %A Makar, Samy R. %A McCluskey, Edward J. %D November 1995 %X Necessary and sufficient conditions for an exhaustive functional test (checking experiment) of various latches are derived. These conditions are used to derive minimum-length checking experiments. The checking experiment for the D-latch is simulated using an HSpice implementation of the transmission gate latch. All detectable stuck-at, stuck-open, stuck-on, and bridging faults are detected. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/94/641/CSL-TR-94-641.pdf %R CSL-TR-80-182 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design for autonomous test %A McCluskey, Edward J. %A Bozorgui-Nesbat, Saied %D June 1981 %X A technique for modifying networks so that they are capable of self-test is presented. The major innovation is partitioning the network into subnetworks with sufficiently few inputs that exhaustive testing of the subnetworks is possible. Procedures for reconfiguring the existing registers into modified linear feedback shift registers (LFSRs) which apply the exhaustive (not pseudo-random) test patterns or convert the responses into signatures are described. No fault models or test pattern generation programs are required. A method to modify CMOS circuits so that exhaustive testing can be used even when stuck-open faults must be detected is described. A detailed example using the 74181 ALU is presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/80/182/CSL-TR-80-182.pdf %R CSL-TR-80-184 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design automation at Stanford II %A vanCleemput, Willem M. %D February 1980 %X This report contains a copy of the visual aids used by the authors during the presentation of their work at the Second Workshop on Design Automation at Stanford, held on Feb. 19, 1980. The topics covered range from circuit level simulation and integrated circuit process modelling to high level languages and design techniques. The presentations are a survey of the activities in design automation at Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/80/184/CSL-TR-80-184.pdf %R CSL-TR-80-189 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Center-based broadcasting %A Wall, David W. %A Owicki, Susan S. %D June 1980 %X We consider the problem of routing broadcast messages in a loosely-coupled store-and-forward network like the ARPANET. Dalal discussed a solution to this problem that minimizes the cost of a broadcast; in contrast, we are interested in performing broadcast with small delay. Existing algorithms can minimize the delay but seem unsuitable for use in a distributed environment because they involve a high degree of overhead in the form of redundant messages or data-structure space. We propose the schemes of center-based forwarding: the routing of all broadcasts via the shortest-path tree for some selected node called the center. These algorithms have small delay and also are easy to implement in a distributed system. To evaluate center-based forwarding, we define four measures of the delay associated with a given broadcast mechanism, and then propose three ways of selecting a center node. For each of the three forms of center-based forwarding we compare the delay to the minimum delay for any broadcasting scheme and also to the minimum delay for any single tree. In most cases, a given measure of the delay on the centered tree is bounded by a small constant factor relative to either of these two minimum delays. When it is possible, we give a tight bound on the ratio between the center-based delay and the minimum delay; otherwise we demonstrate that no bound is possible. These results give corollary bounds on how bad the three centered trees can be with respect to each other; most of these bounds are immediately tight, and the rest are replaced by better bounds that are also shown to be tight. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/80/189/CSL-TR-80-189.pdf
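Computationally, center-based forwarding reduces to one shortest-path tree shared by all broadcasts. A compact Python sketch, using Dijkstra's algorithm for the tree construction (a standard choice; the report itself does not prescribe a particular tree algorithm):

    # Build the shortest-path tree rooted at the chosen center; every
    # broadcast, from any source, is then forwarded along these tree edges.
    import heapq

    def shortest_path_tree(graph, center):
        """graph: {node: {neighbor: weight}} -> {node: parent} tree edges."""
        dist, parent, heap = {center: 0}, {center: None}, [(0, center)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue                        # stale queue entry
            for v, w in graph[u].items():
                if d + w < dist.get(v, float("inf")):
                    dist[v], parent[v] = d + w, u
                    heapq.heappush(heap, (d + w, v))
        return parent

    g = {"a": {"b": 1, "c": 4}, "b": {"a": 1, "c": 1}, "c": {"a": 4, "b": 1}}
    print(shortest_path_tree(g, "a"))           # {'a': None, 'b': 'a', 'c': 'b'}

Since every node needs to store only its tree neighbors, the per-node state and message overhead stay small, which is the distributed-implementation advantage the abstract claims.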
Progress on these research problems is discussed, and a research program for the next two years is proposed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/201/CSL-TR-81-201.pdf %R CSL-TR-81-209 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Dynamic detection of concurrency in do-loops using ordering matrices %A Wedig, Robert G. %D May 1981 %X This paper describes the data structures and techniques used in dynamically detecting concurrency in Directly Executed Language (DEL) instruction streams. By dynamic detection, it is meant that these techniques are designed to be used at run time with no special source manipulation or preprocessing required to perform the detection. An abstract model of a concurrency detection structure called an ordering matrix is presented. This structure is used, with two other execution vectors, to represent the dependencies between instructions and indicate where potential concurrency exists. An algorithm is developed which utilizes the ordering matrix to detect concurrency within determinate DO-loops. It is then generalized to detect concurrency in arbitrary DEL instruction streams. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/209/CSL-TR-81-209.pdf %R CSL-TR-81-214 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An exponential failure/load relationship: results of a multi-computer statistical study %A Iyer, Ravishankar K. %A Butner, Steven E. %A McCluskey, Edward J. %D July 1981 %X In this paper we present an exponential statistical model which relates computer failure rates to the level of system activity. Our analysis reveals a strong statistical dependency of both hardware and software component failure rates on several common measures of utilization (specifically CPU utilization, I/O initiation, paging, and job-step initiation rates). We establish that this effect is not dominated by a specific component type, but exists across the board in the two systems studied. Our data covers three years of normal operation (including significant upgrades and reconfigurations) for two large Stanford University computer complexes. The complexes, which are composed of IBM mainframe equipment of differing models and vintage, run similar operating systems and provide the same interface and capability to their users. The empirical data comes from identically-structured and maintained failure logs at the two sites along with IBM OS/VS2 operating system performance/load records. The statistically strong relationship between failures and load is evident for many equipment types, including electronic, mechanical, and software components. This is in opposition to the commonly-held belief that systems which are primarily electronic in nature exhibit no such effect to any significant degree. The exponential character of our statistical model is significant not only in its simplicity, but also in its compatibility with classical reliability techniques. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/214/CSL-TR-81-214.pdf %R CSL-TR-81-219 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Consistency in interprocessor communications for fault-tolerant multiprocessors %A Fu, Peter Lincoln %D September 1981 %X Consistency among processors is vital for fault-tolerant multiprocessors.
This report describes modular interprocessor communication interface units which implement distributed consistency schemes such that failures within a single processor module cannot affect the consistency of data transferred among the remaining processors. Furthermore, one scheme provides concurrent and consistent self-diagnosis data on the integrity of the units themselves. Another scheme is tolerant to almost all failures within two processor modules. The theory of the schemes is explained and their implementations in LSI circuits are described in detail. The interprocessor communication structure defined by any of these schemes serves well as a critical element in highly reliable multiprocessor systems. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/219/CSL-TR-81-219.pdf %R CSL-TR-81-221 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Parametric curves, surfaces and volumes in computer graphics and computer-aided geometric design %A Clark, James H. %D November 1981 %X This document has four purposes. It is a tutorial in parametric curve and surface representations, it describes a number of algorithms for generating both shaded and line-drawn pictures of bivariate surfaces and trivariate volumes, it explicitly gives transformations between all of the widely used curve and surface representations, and it proposes a solution to the problem of displaying the results of three-dimensional flow-field calculations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/221/CSL-TR-81-221.pdf %R CSL-TR-81-223 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T MIPS: a VLSI processor architecture %A Hennessy, John L. %A Jouppi, Norman %A Baskett, Forest %A Gill, John %D November 1981 %X MIPS is a new single chip VLSI processor architecture. It attempts to achieve high performance with the use of a simplified instruction set, similar to those found in microengines. The processor is a fast pipelined engine without pipeline interlocks. Software solutions to several traditional hardware problems, such as providing pipeline interlocks, are used. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/223/CSL-TR-81-223.pdf %R CSL-TR-81-224 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Code generation and reorganization in the presence of pipeline constraints %A Hennessy, John L. %A Gross, Thomas %D November 1981 %X Pipeline interlocks are used in a pipelined architecture to prevent the execution of a machine instruction before its operands are available. An alternative to this complex piece of hardware is to rearrange the instructions at compile-time to avoid pipeline interlocks. This problem, called code reorganization, is studied. The basic problem of reorganization of machine level instructions at compile-time is shown to be NP-complete. A heuristic algorithm is proposed and its properties and effectiveness are explored. The impact of code reorganization techniques on the rest of a compiler system is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/224/CSL-TR-81-224.pdf %R CSL-TR-81-225 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic compiler code generation %A Ganapathi, Mahadevan %A Fischer, Charles N. %A Hennessy, John L. %D November 1981 %X A classification of automatic code generation techniques and a survey of the work on these techniques are presented.
Automatic code-generation research is classified into three categories: formal treatments, interpretive approaches and descriptive approaches. An analysis of these approaches and a critique of automatic code-generation algorithms are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/225/CSL-TR-81-225.pdf %R CSL-TR-81-226 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SILT: a VLSI design language %A Davis, Tom %A Clark, James %D October 1982 %X SILT is an efficient, medium-level language to describe VLSI layout. Layout features are described in terms of a coordinate system based on the concept of relative geometry. SILT provides hierarchical cell description, a library format for parameterized cells with defaults for the parameters, constraint checking (but not enforcement), and some name control. It is designed to be used with a graphical interface, but can be used by itself. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/226/CSL-TR-81-226.pdf %R CSL-TR-81-228 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hardware/software tradeoffs for increased performance %A Hennessy, John L. %A Jouppi, Norman %A Baskett, Forest %A Gross, Thomas %A Gill, John %D February 1983 %X Most new computer architectures are concerned with maximizing performance by providing suitable instruction sets for compiled code, and support for systems functions. We argue that the most effective design methodology must make simultaneous tradeoffs across all three areas: hardware, software support, and systems support. Recent trends lean toward extensive hardware support for both the compiler and operating systems software. However, consideration of all possible design tradeoffs may often lead to less hardware support. Several examples of this approach are presented, including: omission of condition codes, word-addressed machines, and imposing pipeline interlocks in software. The specifics and performance of these approaches are examined with respect to the MIPS processor. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/81/228/CSL-TR-81-228.pdf %R CSL-TR-82-229 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The SUN workstation architecture %A Bechtolsheim, Andrew %D March 1982 %X The Sun workstation is a personal computer system that combines graphics and networking capabilities with powerful local processing. The workstation has been developed for research in VLSI design automation, text processing, distributed operating systems and programming environments. Clusters of Sun workstations are connected via a local network sharing a network-based file system. The Sun workstation is based on a Motorola 68000 processor, has a 1024 by 800 pixel bitmap display, and uses Ethernet as its local network. The hardware supports virtual memory management, a "RasterOP" mechanism for high-speed display updates, and data-link-control for the Ethernet. The entire workstation electronics consists of 260 chips mounted on three 6.75 by 12 inch PC boards compatible with the IEEE 796 Bus (Intel Multibus). In addition to implementing a workstation, the boards have been configured to serve as network nodes for file servers, printer servers, network gateways, and terminal concentrators.
The report discusses the architecture and implementation of the Sun workstation, gives the background and goals of the project, contemplates future developments, and describes in detail its three main components: the processor, graphics, and Ethernet boards. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/82/229/CSL-TR-82-229.pdf %R CSL-TR-82-230 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Packet-voice communication on an Ethernet local computer network: an experimental study %A Gonsalves, Timothy A. %D February 1982 %X Local computer networks have been used successfully for data applications such as file transfers for several years. Recently, there have been several proposals for using these networks for voice applications. This paper describes a simple voice protocol for use on a packet-switching local network. This protocol is used in an experimental study of the feasibility of using a 3 Mbps experimental Ethernet network for packet-voice communications. This study shows that with appropriately chosen parameters the experimental Ethernet is capable of supporting about 40 simultaneous 64 Kbps voice conversations with acceptable quality. This corresponds to a utilization of 95% of the network capacity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/82/230/CSL-TR-82-230.pdf %R CSL-TR-82-231 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Dynamic detection of concurrency in DEL instruction streams %A Wedig, Robert G. %D February 1982 %X Detection of concurrency in Directly Executed Languages (DEL) is investigated. It is theorized that if DELs provide a minimal time-space execution of serial programs, then concurrency detection of such instruction streams approaches the minimum execution time possible for a single task without resorting to algorithm restructuring or source manipulation. It is shown how DEL encodings facilitate the detection of concurrency by allowing early decoding and explicit detection of dependency information. The decoding and dependency algorithms as applied to DELs are developed in detail. Concurrency structures are presented which facilitate the detection process. Since all concurrency is capable of exploitation as soon as it is known that the code is to be executed, i.e., the result of the branch is known, it is proven that all explicit parallelism can be detected and exploited using the techniques developed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/82/231/CSL-TR-82-231.pdf %R CSL-TR-82-232 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Studies in microprocessor design %A Alpert, Donald %D June 1982 %X Microprocessor design practice is briefly surveyed. Examples are given for high-level and low-level tradeoffs in specific designs with emphasis on integrated memory functions. Some relations between architectural complexity and design are discussed, and a simple model is presented for implementing a RISC-like architecture. A direction for microprocessor architecture is proposed to allow flexibility for designing with varying processing technologies, cost goals, and performance goals.
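The dependence test underlying the ordering matrices of the two concurrency-detection reports above can be sketched in a few lines of Python; the instruction encoding below (plain read/write register sets) is a hypothetical simplification for illustration, not the reports' DEL encodings or execution vectors.

    # Sketch: instruction j must follow instruction i (i < j) when they
    # conflict through a register -- read-after-write, write-after-read, or
    # write-after-write.  Unordered pairs may execute concurrently.

    def depends(earlier, later):
        """earlier/later: (reads, writes) as sets of register names."""
        r1, w1 = earlier
        r2, w2 = later
        return bool((w1 & r2) | (r1 & w2) | (w1 & w2))

    def ordering_matrix(instrs):
        """M[i][j] is True iff instruction j must wait for instruction i."""
        n = len(instrs)
        return [[i < j and depends(instrs[i], instrs[j]) for j in range(n)]
                for i in range(n)]

    # i0 writes r1, which i1 reads (ordered); i2 touches only r3/r4 and may
    # run concurrently with both.
    instrs = [({"r0"}, {"r1"}), ({"r1"}, {"r2"}), ({"r3"}, {"r4"})]
    M = ordering_matrix(instrs)
    assert M[0][1] and not M[0][2] and not M[1][2]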
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/82/232/CSL-TR-82-232.pdf %R CSL-TR-82-233 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Yale user's guide: a SILT-based layout editor %A Davis, Tom %A Clark, James %D October 1982 %X YALE is a layout editor which runs on SUN workstations, and deals with cells expressed in the SILT language. It provides graphical hooks into many features describable in SILT. YALE runs under the V kernel, and makes use of a window manager that provides a multiple viewpoint capability. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/82/233/CSL-TR-82-233.pdf %R CSL-TR-68-1 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On the computational complexity of finite functions %A Spira, Philip M. %D May 1968 %X One of the most rapidly expanding fields of applied mathematics and engineering is automata theory. Although the term "automaton" is derived from "self-moving thing," the prime concern of automata theory is the study of information-processing devices. A specific example of information processing is computation, and thus the mathematical properties of devices which perform computations are of interest to automata theorists. In this thesis we investigate the computation by logic circuits of a certain class of functions having finite domain. To a given function $f$ a number of so-called complexity criteria can be assigned relative to that class, e.g., the minimum computation time of, or the minimum number of elements contained in, any circuit of the class which is capable of computing $f$. Our prime criterion of interest will be computation time. The type of circuit investigated in this thesis is called a $(d,r)$ circuit. A $(d,r)$ circuit is composed of logical elements each having at most $r$ inputs and one output. Each input value and output value is an element from the set $Z_d = \{0, 1, \ldots, d-1\}$, and each element has unit delay in computing its output. Thus a given element computes a function from $Z_d^k$ to $Z_d$, for some $k \leq r$, in unit time. The output of one element can be connected to inputs of any number of elements (including itself) and can also comprise one of the outputs of the circuit; an element receives a given one of its inputs either from the output of some element or from the inputs to the circuit. When individual elements are interconnected to form a $(d,r)$ circuit, we can associate a computation time with the entire circuit. Specifically, let $f : X_1 \times \cdots \times X_n \to Y$ be any function on finite sets $X_1, \ldots, X_n$. Let $C$ be a $(d,r)$ circuit whose input lines are partitioned into $n$ sets. Let $I_{C,j}$ be the set of configurations of values from $Z_d$ on the $j$th set $(j = 1, 2, \ldots, n)$, and let $O_C$ be the set of output configurations of the circuit. Then $C$ is said to compute $f$ in time $t$ if there are maps $g_j : X_j \to I_{C,j}$ $(j = 1, 2, \ldots, n)$ and a one-to-one function $h : Y \to O_C$ such that, if the input from time $0$ through time $t-1$ is $[g_1(x_1), \ldots, g_n(x_n)]$, then the output of $C$ at time $t$ will be $h(f(x_1, \ldots, x_n))$. Winograd has done pioneering work on the time of computation of finite functions by $(d,r)$ circuits. He has derived lower bounds on computation time and has constructed near optimal circuits for many classes of finite functions. A principal contribution of this thesis is a complete determination of the time necessary to compute multiplication in a finite group with a $(d,r)$ circuit.
A new group theoretic quantity $d(G)$ is defined whose reciprocal is the proper generalization of Winograd's $a(G)$ to nonabelian groups. Then a novel method of circuit synthesis for group multiplication is given. In contrast to previous procedures, it is valid for any finite group--abelian or not. It is completely algebraic in character and is based upon our result that any finite group has a family of subgroups having a trivial intersection and minimum order $d(G)$. The computation time achieved is, in all cases, at most one unit greater than our lower bound. In particular, if $G$ is abelian our computation time is never greater--and often considerably less--than Winograd's. We then generalize the group multiplication procedure to a method to compute any finite function. For given sets $X_1$, $X_2$ and $Y$ and any family of subsets of $Y$ having a certain property called completeness, a corresponding hierarchy of functions having domain $X_1 \times X_2$ and range $Y$ is established -- the position of a function depending upon its computation time with our method. For reasons which we explain in the text, this appears to be a very natural classification criterion. At the bottom of the hierarchy are invertible functions such as numerical addition and multiplication, and the position of a function in the hierarchy depends essentially upon how far it is from being invertible. For large $|X_1|$ and $|X_2|$ almost all functions are near the top, corresponding to the fact that nearly all $f : X_1 \times X_2 \to Y$ require computation time equal to the maximum required for any such function. The new method is then applied to the case of finite semigroup multiplication. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/68/1/CSL-TR-68-1.pdf %R CSL-TR-70-11 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The SPOOF: a new technique for analyzing the effects of faults on logic networks %A Clegg, Frederick W. %D August 1970 %X In general, one cannot predict the effects of possible failures on the functional characteristics of a logic network without knowledge of the structure of that network. The SPOOF or structure- and parity-observing output function described in this report provides a new and convenient means of characterizing both network structure and output function in a single algebraic expression. A straightforward method for the determination of a SPOOF for any logic network is demonstrated. Similarities between SPOOF's and other means of characterizing network structure are discussed. Examples are presented which illustrate the ease with which the effects of any "stuck-at" fault - single or multiple - on the functional characteristics of a logic network are determined using SPOOF's. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/70/11/CSL-TR-70-11.pdf %R CSL-TR-71-15 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fault equivalence in sequential machines %A Boute, Raymond %A McCluskey, Edward J. %D June 1971 %X This paper is concerned with the relationships among faults as they affect sequential machine behavior. Of particular interest are equivalence and dominance relations. It is shown that for output faults (i.e., faults that do not affect state behavior), fault equivalence is related to the existence of an automorphism of the state table. For the same class of faults, the relation between dominance and equivalence is considered and some properties are pointed out.
Another class of possible faults is also considered, namely, memory faults (i.e., faults in the logic feedback lines). These clearly affect the state behavior of the machine, and their influence on machine properties, such as being strongly connected, is discussed. It is proven that there exist classes of machines for which this property of being strongly connected is destroyed by every possible single fault. Further results on both memory and output faults are also presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/71/15/CSL-TR-71-15.pdf %R CSL-TR-71-24 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An improved reliability model for NMR %A Siewiorek, Daniel P. %D December 1971 %X The classical reliability model for N-modular redundancy (NMR) assumes the network to be failed when a majority of modules which drive the same voter fail. It has long been known that this model is pessimistic since there are instances, termed compensating module failures, where a majority of the modules fail but the network is nonfailed. A different module reliability model based on lead reliability is proposed which has the classical NMR reliability model as a special case. It is shown that the standard procedure for altering the classical model to take compensating module failures into account may predict a network reliability which is too low in some cases and too high in others. It is also demonstrated that the improved model can increase the predicted mission time (the time the system is to operate at or above a given reliability) by 50% over the classical model prediction for a simple network. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/71/24/CSL-TR-71-24.pdf %R CSL-TR-72-30 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Adaptive design methods for checking sequences %A Boute, Raymond T. %D July 1972 %X The length of checking sequences for sequential machines can be considerably reduced if, instead of preset distinguishing sequences, one uses so-called distinguishing sets of sequences, which serve the same purpose, but are generally shorter. The design of such a set turns out to be equivalent to the design of an adaptive distinguishing experiment, though a checking sequence, using a distinguishing set, remains essentially preset. This property also explains the title. All machines having preset distinguishing sequences also have distinguishing sets. In case no preset distinguishing sequences exist, most of the earlier methods call for the use of locating sequences, which result in long checking experiments. However, in many of these cases, a distinguishing set can be found, thus resulting in even more savings in length. Finally, the characterizing sequences used in locating sequences can also be adaptively designed, and thus the basic idea presented below is advantageous even when no distinguishing sets exist. By "experiment" we mean the application of sequence(s) to the machine while observing the output. In some instances, the words "experiment" and "sequence" can be used interchangeably. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/72/30/CSL-TR-72-30.pdf %R CSL-TR-72-35 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Separate non-homomorphic checking codes for binary addition %A Kolupaev, Stephen G. %D July 1972 %X In this paper, necessary and sufficient conditions for successful detection of errors in a binary adder by any separate code are developed.
We demonstrate the existence of separate checking codes for addition modulo $2^n$ ($n \geq 4$) and modulo $2^n - 1$ ($n > 5$, $n$ even), which are not homomorphic images of the addition being checked. A non-homomorphic code is constructed in a regular fashion from a single check symbol with special properties. Finding all such initial check symbols requires an exhaustive search of a large tree, and results indicate that the number of distinct codes for a particular modulus grows rapidly with $n$. In an appendix, we examine a modulo $2^n$ adder where the carry out of the high position is also presented to a checker. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/72/35/CSL-TR-72-35.pdf %R CSL-TR-72-36 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of a parallel encoder/decoder for the Hamming code, using ROM %A Mitarai, H. %A McCluskey, E. J. %D June 1972 %X ROM implementation of logic circuits which have a large number of inputs is generally considered unwise. However, in the design of an encoder/decoder for the Hamming code, ROM implementation is found to yield many advantages over SSI and MSI implementation. There is a one-to-one correspondence between the partition of the H matrix into submatrices and the partition of the set of the inputs to the encoder into subsets of the inputs to the ROM modules. Hence, several methods of partitioning the H matrix for the Hamming code are devised. The resulting ROM implementation is shown to reduce the package count compared with other implementations. However, at the present state of technology, there is a trade-off between speed and package count. In the applications where speed is of the utmost importance, the SSI implementation using ECL logic is the most attractive. The disadvantage of ROM in speed should diminish in the near future when semiconductor memory technology will progress to the point where the slow DTL/TTL gates in the input buffer, the address decoder, and the output buffer of ROM, can be replaced by faster gates. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/72/36/CSL-TR-72-36.pdf %R CSL-TR-73-49 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Self-testing residue trees %A Kolupaev, Stephen G. %D August 1973 %X Error detection and correction in binary adders often require computing the residue modulo A of a binary number. We present here a totally self-checking network which extracts the residue of a binary input number of arbitrary width, with respect to any odd modulus A. This network has the tree structure commonly used for residue extraction: a binary tree of circuit blocks, where each block outputs the residue of its inputs. The network we describe differs from previous designs in that the signals between blocks of the tree are not binary-coded. Instead, the 1-out-of-A code is used, where A is the modulus desired. Use of this code permits the network to be free of inverters, giving it an advantage in speed. The network output is also coded 1-out-of-A, and with respect to this code, the residue tree is totally self-checking in the sense of Anderson. The residue tree described here requires logic gates with A inputs, when the modulus desired is A. This makes the basic design somewhat impractical for a large modulus, because gates with large fan-in are undesirable. To extend the usefulness of this network, we present a technique which uses several residue trees of this design, each for a different modulus.
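A software analogue of the tree computation is easy to state; the Python sketch below illustrates only the arithmetic recurrence each tree block implements, not the report's 1-out-of-A signal coding or its self-checking properties, and the multiple-residue recombination shown is a Chinese-remainder stand-in for the report's translator network.

    # Sketch: each tree block combines the residues of its two halves, using
    # value = hi * 2^k + lo  ==>  value mod A = (hi_res * (2^k mod A) + lo_res) mod A.

    def residue_tree(bits, A):
        """Residue mod A of the big-endian bit list `bits`, via a binary tree."""
        n = len(bits)
        if n == 1:
            return bits[0] % A
        k = n // 2                       # the low half holds k bits
        hi = residue_tree(bits[:-k], A)
        lo = residue_tree(bits[-k:], A)
        return (hi * pow(2, k, A) + lo) % A

    # Multiple-residue variant: keep each tree's modulus small and recombine
    # (for pairwise coprime moduli, e.g. 15 = 3 * 5, the pair of small
    # residues determines the residue mod 15 by the Chinese remainder theorem).
    value = 733
    bits = [int(b) for b in f"{value:b}"]
    assert residue_tree(bits, 15) == value % 15
    assert (residue_tree(bits, 3), residue_tree(bits, 5)) == (value % 3, value % 5)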
The outputs of these residue trees are combined by a totally self-checking translator from the code of multiple residues to the 1-out-of-A code. Using this multiple residue scheme, the modulus of each residue tree can be made much smaller than the desired modulus A. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/73/49/CSL-TR-73-49.pdf %R CSL-TR-73-52 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hazards in asynchronous systems %A Chewning, D. R. %A Bredt, Thomas H. %D September 1972 %X Necessary and sufficient conditions are given for the existence of static and dynamic hazards in combinational circuits that undergo multiple input changes. These theorems are applied in the analysis of modules, such as the wye module, that have been proposed for asynchronous systems. We show that unless internal module delays are strictly less than delays between modules, incorrect operation can occur due to hazards in module implementations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/73/52/CSL-TR-73-52.pdf %R CSL-TR-73-56 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Reliability modeling of NMR networks %A Abraham, Jacob %A Siewiorek, Daniel P. %D June 1973 %X A survey of the literature in the area of redundant system reliability modeling is presented with special emphasis on Triple Modular Redundancy (TMR). Areas where the classical method of TMR reliability prediction may prove inadequate are identified, such as the interdependence of fault patterns at points of network fan-in and fan-out. This is especially true if the assumption of highly reliable subsystems, which is frequently made by the modeling techniques, is dropped. It is also not clear if the methods give an upper or a lower bound to the reliability. As a solution, a method of partitioning an arbitrary network into cells so that the faults in a cell are independent of faults in other cells is proposed. An algorithm is then given to calculate a tight lower bound on the reliability of any such cell, by considering only the structure of the interconnections within the cells. The value of reliability found is exact if TMR is assumed to be a coherent system. An approximation to the algorithm is also described; this can be used to find a lower bound to the reliability without extensive calculation. Modifications to the algorithm to improve it and to take care of special cases are given. Finally, the algorithm is extended to N-Modular Redundant (NMR) networks. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/73/56/CSL-TR-73-56.pdf %R CSL-TR-73-62 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A highly efficient redundancy scheme: self-purging redundancy %A Losq, Jacques %D July 1975 %X The goals of this paper are to present an efficient redundancy scheme for highly reliable systems, to give a method to compute the exact reliability of such schemes and to compare this scheme with other redundancy schemes. This redundancy scheme is self-purging redundancy: a scheme that uses a threshold voter and that purges the failed modules. Switches for self-purging systems are extremely simple: there is no replacement of failed modules and module purging is quite simply implemented. Because of switch simplicity, exact reliability calculations are possible. The effects of switch reliability are quantitatively examined.
For short mission times, switch reliability is the most important factor: self-purging systems have a probability of failure several times larger than the figure obtained when switches are assumed to be perfect. The influence of the relative frequency of the diverse types of failures (permanent, intermittent, stuck-at, ...) is also investigated. Reliability functions, mission time improvements and switch efficiency are displayed. Self-purging systems are compared with other redundant systems, such as hybrid or NMR, for their relative merits in reliability gain, simplicity, cost and confidence in the reliability estimation. The high confidence in the reliability evaluation of self-purging systems makes them a standard for the validation of several models that have been proposed to take into account switch reliability. The accuracy of models using coverage factors can be evaluated that way. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/73/62/CSL-TR-73-62.pdf %R CSL-TR-74-66 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Computer system performance measurement: instruction set processor level and microcode level %A Svobodova, Liba %D June 1974 %X Techniques based on hardware monitoring were developed to measure computer system performance on the instruction set processor level and the microcode level. Knowledge of system behavior and system utilization at these two levels is extremely valuable for design of new processors. The reasons why such information is needed are discussed and applicable measurement techniques for obtaining necessary data are reviewed. A hardware monitor is a preferable measurement tool since it can trace most of the significant events attributed to these two levels without introducing any artifact. The described hardware monitoring techniques were implemented on the S/370 Model 145 at Stanford University. Measurements performed on the instruction set processor level were concerned with determining execution frequencies of individual instructions under normal system workload. The microcode-level measurements determined the number and type of S/370 Model 145 microwords executed in the process of interpretation of an individual S/370 instruction and the average execution time of each such instruction. Implementation of each technique is described and results based on the measurements performed are presented. Finally, effectiveness and ease of use of the discussed techniques are considered. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/74/66/CSL-TR-74-66.pdf %R CSL-TR-74-75 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Influence of fault-detection and switching mechanisms on the reliability of stand-by systems %A Losq, Jacques %D July 1975 %X This paper concerns the reliability of stand-by systems when switch reliability is taken into account. It is assumed that failures obey a Poisson distribution for modules and switches. A very detailed method is given to model stand-by systems. Several cases are investigated: ideal systems, real systems with fault-detection mechanisms that can detect any module error and systems for which the fault-detection mechanisms detect only some of the module errors. The reliability versus time curves are determined for each value of the number of spares. It is shown that the best number of spares increases as the length of the mission increases.
Systems with extremely short mission time have the best reliability when they have only one spare. The limit when the number of spares increases is the reliability obtained with simplex systems. Whatever the number of spares is, the reliability of stand-by systems goes to zero as time goes to infinity. For a given mission time, it is possible to determine the best number of spares and the best possible reliability. For a given reliability, it is possible to compute the number of spares that gives the longest mission time. These models can be used to determine whether or not there exists a stand-by system that meets the requirements of a given reliability and a given mission time. If such a stand-by system exists, its characteristics (minimum number of spares and reliability) can be derived. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/74/75/CSL-TR-74-75.pdf %R CSL-TR-74-77 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Parallel solution methods for triangular linear systems of equations %A Orcutt, Samuel E. %D June 1974 %X In this paper we consider developing parallel solution methods for triangular linear systems of equations. For a system of $N$ equations in $N$ unknowns the serial method requires $O(N^2)$ steps, and the straightforward parallel method requires $O(N)$ steps and $O(N)$ processors. In this paper we develop methods that require $O(\log N)$ time when used with $O(N^3)$ processors and $O(\sqrt{N} \log N)$ time when used with $O(N^2)$ processors. We also consider solutions to band triangular systems and develop a method that requires $O((\log N)(\log m))$ time and $O(Nm^2)$ processors, where $m$ is the bandwidth of the system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/74/77/CSL-TR-74-77.pdf %R CSL-TR-74-85 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The solution of large multi-dimensional Poisson problems %A Stone, Harold S. %D May 1974 %X The Buneman algorithm for solving Poisson problems can be adapted to solve large Poisson problems on computers with a rotating drum memory so that the computation is done with very little time lost due to rotational latency of the drum. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/74/85/CSL-TR-74-85.pdf %R CSL-TR-75-92 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The complexity of control structures and program validation %A Davison, Joseph W. %D May 1975 %X A preliminary examination of the influence of control structures on the complexity of proofs of correctness of computer programs is presented. A block structured proof technique is defined and studied. Two parameters affecting the complexity of the proof are defined: the number of exits from a block, and the cycle rank of a block, a measure of loop complexity. Proof complexity classes of flowcharts are defined, with maximum values for these parameters. The question investigated is: How does restricting the complexity affect the class of functions realizable, assuming a given set of primitive actions and predicates? It is found that loop complexity may be traded for exits, and that for a given number of exits there are functions requiring any specific loop complexity. Further, it is shown that blocks with two exits are considerably more powerful than those with only one.
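For orientation on the triangular-systems entry above, a baseline sketch only (the report's $O(\log N)$-time constructions are not reproduced): the serial method is forward substitution, and the straightforward parallel method can be read as the column-sweep reordering, whose per-step right-hand-side update is fully parallel.

    # Sketch: O(N^2) serial forward substitution vs. the column-sweep
    # reordering, whose inner update is one fully parallel vector operation
    # per step (O(N) steps on O(N) processors in the idealized count).
    import numpy as np

    def forward_subst(L, b):
        """Serial: solve one unknown at a time."""
        n = len(b)
        x = np.zeros(n)
        for i in range(n):
            x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
        return x

    def column_sweep(L, b):
        """N steps; each right-hand-side update is parallel across rows."""
        n = len(b)
        x, r = np.zeros(n), b.astype(float).copy()
        for j in range(n):
            x[j] = r[j] / L[j, j]
            r[j + 1:] -= L[j + 1:, j] * x[j]   # one parallel axpy update
        return x

    rng = np.random.default_rng(0)
    L = np.tril(rng.random((6, 6))) + np.eye(6)
    b = rng.random(6)
    assert np.allclose(forward_subst(L, b), np.linalg.solve(L, b))
    assert np.allclose(column_sweep(L, b), np.linalg.solve(L, b))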
In fact, for a given maximal loop complexity, there are functions that cannot be realized with one-exit blocks, but can be realized with two-exit blocks, even if the loop complexity is restricted to essentially one internal loop per block. Looking at it the other way around, the addition of a second exit to a block allows construction of flowcharts with any specified loop complexity. This result appears to be extendable to blocks with more exits, but this has not been completed. The work is primarily of a graph theoretical nature, and may also be interpreted as an examination of sequential control structures from the point of view of feedback loop complexity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/92/CSL-TR-75-92.pdf %R CSL-TR-75-93 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sequential circuit output probabilities from regular expressions %A Parker, Kenneth P. %A McCluskey, Edward J. %D June 1975 %X This paper presents a number of methods for finding sequential circuit output probabilities using regular expressions. Various classes of regular expressions, based on their form, are defined and it is shown how to easily find multistep transition probabilities directly from the regular expressions. A new procedure for finding steady state probabilities is given which proceeds either from a regular expression or a state diagram description. This procedure is based on the concept of synchronization of the related machine, and is useful for those problems where synchronization sequences exist. In the cases where these techniques can be utilized, substantial savings in computation can be realized. Further, application to other areas such as multinomial Markov processes is immediate. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/93/CSL-TR-75-93.pdf %R CSL-TR-75-95 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The stack working set: a characterization of spatial locality %A Rau, B. Ramakrishna %D July 1975 %X Multilevel memory hierarchies are attractive from the point of view of cost-performance. However, they present far greater problems than two-level hierarchies when it comes to analytic performance evaluation. This may be attributed to two factors: firstly, the page size (or the unit of information transfer between two levels) varies with the level in the hierarchy; secondly, the request streams that the lower (slower) levels see are the fault streams out of the immediately higher levels. Therefore, the request stream seen by each level is not necessarily the same as the one generated by the processor. Since the performance depends directly upon the properties of the request stream, this poses a problem. A model for program behavior, which explicitly characterizes the spatial locality of the program, is proposed and validated. It is shown that the spatial locality of a program is an invariant of the hierarchy when characterized in this manner. This invariance is used to solve the first problem stated - that of the varying page sizes. An approximate technique is advanced for the characterization of the fault stream as a function of the request stream and the capacity of the level. A procedure is then outlined for evaluating the performance of a multilevel hierarchy analytically. 
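The stack idea behind such locality characterizations can be illustrated with the classic LRU stack-distance computation; this is a generic sketch, and the report's stack working set additionally parameterizes page size, which is not modelled here.

    # Sketch: the depth at which each reference finds its page in the LRU
    # stack.  A level of capacity c hits exactly when distance <= c, so one
    # pass yields miss ratios for every capacity at once.

    def stack_distances(refs):
        """Distance per reference (None on a first reference); small
        distances indicate strong locality."""
        stack, out = [], []
        for page in refs:
            if page in stack:
                d = stack.index(page) + 1   # 1 = re-reference of the MRU page
                stack.remove(page)
            else:
                d = None                    # cold miss
            stack.insert(0, page)
            out.append(d)
        return out

    print(stack_distances(list("ABCABDABE")))
    # -> [None, None, None, 3, 3, None, 3, 3, None]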
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/95/CSL-TR-75-95.pdf %R CSL-TR-75-96 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A rollback interval for networks with an imperfect self-checking property %A Shedletsky, John J. %D December 1975 %X Dynamic self-checking is a technique used in computers to detect a fault quickly before extensive data contamination caused by the fault can occur. When the self-checking properties of the computer circuits are not perfect, as is the case with self-testing-only and partially self-checking circuits, the recovery procedure may be required to roll back program execution to a point prior to the first undetected data error caused by the detected fault. This paper presents a method by which the rollback distance required to achieve a given probability of successful data restoration may be calculated. To facilitate this method, operational interpretations are given to familiar network properties such as the self-testing, secureness, and self-checking properties. An arithmetic and logic unit with imperfect self-checking capability is analyzed to determine the minimum required rollback distance for the recovery procedure. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/96/CSL-TR-75-96.pdf %R CSL-TR-75-97 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Deterministic sequential networks under random control %A Varszegi, Sandor %D September 1975 %X This paper presents a network-oriented approach for the treatment of deterministic sequential networks under random control. Considered are the cases of multinomial, stationary Markov and arbitrary input processes. Probabilities of the state and output processes are directly derived from the primary information of the network and the source. Coded networks are treated using logic circuits or Boolean functions. The isomorphism between Boolean and event algebras is made use of, and the probabilities of the response processes are obtained in the form of algebraic probability expressions interpreted over the determining (i.e., input and initial state) minterm or signal joint probabilities. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/97/CSL-TR-75-97.pdf %R CSL-TR-75-98 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A tale of three emulators %A Hoevel, Lee W. %A Wallach, Walter A. Jr. %D November 1975 %X This is a preliminary report on the development of emulator code for the Stanford EMMY. Emulation is introduced as an interpretive computing technique. Various classes of emulation and their correlation to the image machine are presented. Functional and structural overviews of three emulators for the Stanford EMMY are presented. These are IBM System/360; CRIL; and DELtran. Performance estimates are included for each of these systems. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/98/CSL-TR-75-98.pdf %R CSL-TR-75-101 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A new philosophy for wire routing %A Rau, B. Ramakrishna %D November 1975 %X A number of interconnection algorithms exist and have been used quite successfully. However, most of them, though differing in detail, appear to subscribe to the same underlying philosophy which has developed from that for single layer boards. Arguments are advanced which question the validity of this philosophy in the environment of multilayer board technology.
A new philosophy is developed in this report, which, it is hoped, will be more suited for use with multilayer boards. Based on this philosophy, an interconnection algorithm is then developed in a step-by-step fashion. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/101/CSL-TR-75-101.pdf %R CSL-TR-75-102 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High performance emulation %A Wallach, Walter A. Jr. %D November 1975 %X The Stanford EMMY is examined as an emulation engine. Using the 360 emulator and the DELtran interpreter as examples, the performance of the current EMMY architecture is examined as a high performance emulation vehicle. The problems of using a sequential, vertically organized processor for high speed emulation are developed and discussed. A flexible control structure for high speed emulation studies is derived from an existing high performance processor. This structure issues a stream of microinstructions to a central command bus, allowing user-defined execution resources to execute them in overlapped fashion. These execution resources may be added or deleted with little or no processor rewiring. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/75/102/CSL-TR-75-102.pdf %R CSL-TR-76-106 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Mathematical models for the circuit layout problem %A vanCleemput, Willem M. %D February 1976 %X In the first part of this paper the basic differences between the classical (placement, routing) and the topological approach to solving the circuit layout problem are outlined. After a brief survey of some existing mathematical models for the problem, an improved model is suggested. This model is based on the concept of a partially oriented graph and contains more topological information than earlier models. This reduces the need for special constraints on the graph embedding algorithm. The models also allow pin and gate assignment as a function of the layout, under certain conditions. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/106/CSL-TR-76-106.pdf %R CSL-TR-76-108 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Cascade structure in totally self-checking networks %A Kolupaev, Stephen %D April 1976 %X In the well-known totally self-checking (TSC) network, a failure must not change one output codeword into another. Called the fault-secure property, this permits a receiver of the net's output to assume that any codeword it receives is correct. Further, the self-testing property requires that each possible failure in the net must produce at least one non-code output. Thus a receiver can monitor the health of the network by watching for non-code outputs. In this paper we propose modifications of these two properties. The self-testing property is made more stringent. Each possible failure in the net is required to produce an output which is in a distinguished subset of the non-code outputs. The fault-secure requirement is modified to permit a fault to interchange certain output codewords. In particular, all outputs not in the distinguished subset are partitioned into equivalence classes, and a fault is permitted to change the output from one codeword to another codeword in the same class. However, a fault is not permitted to change the output from a codeword to any member of a different equivalence class (one not containing the correct output). These modified properties define a generalization of the TSC network.
A network which meets the modified properties is called a generalized self-checking (GSC) network. Self-checking and self-testing (Morphic) networks and TSC networks are special cases of the GSC network. Examining TSC networks, we find a further connection with the GSC network. It has been known for some time that not every subnetwork of a TSC network need be TSC. We show that every subnetwork of a TSC network is GSC, and every TSC network is a cascade of GSC networks. This establishes the GSC network as the basic building block from which every TSC network is constructed. We explore a brute-force method for constructing a desired TSC network by cascading GSC subnetworks. The method resorts to enumeration at many points of decision and thus is not a practical design tool. However, it does yield a very nice alternate realization of the Morphic OR, and suggests specializations which merit further study. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/108/CSL-TR-76-108.pdf %R CSL-TR-76-111 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A distributed algorithm for constructing minimal spanning trees in computer-communication networks %A Dalal, Yogen K. %D June 1976 %X This paper presents a distributed algorithm for constructing minimal spanning trees in computer-communication networks. The algorithm can be executed concurrently and asynchronously by the different computers of the network. This algorithm is also suitable for constructing minimal spanning trees using a multiprocessor computer system. There are many reasons for constructing minimal spanning trees in computer-communication networks since minimal spanning tree routing is useful in distributed operating systems for performing broadcasts, in adaptive routing algorithms for transmitting delay estimates, and in other networks like the Packet Radio Network. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/111/CSL-TR-76-111.pdf %R CSL-TR-76-113 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Error correction by alternate-data retry %A Shedletsky, John J. %D May 1976 %X A new technique for low-cost error correction in computers is the alternate-data retry (ADR). An ADR is initiated by the detection of an error in the initial execution of an operation. The ADR is a re-execution of the operation, but with an alternate representation of the initial data. The choice of the alternate representation and the design of the processing circuits combine to insure that even an error due to a permanent fault is not repeated during retry. Error-correction is provided at a hardware cost comparable to that of a conventional retry capability. Sufficient conditions are given for the design of circuits with an ADR capability. The application of an ADR capability to memory and to the data paths of a processor is illustrated. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/113/CSL-TR-76-113.pdf %R CSL-TR-76-114 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T EMMY/360 functional characteristics %A Wallach, Walter A. Jr. %D June 1976 %X An emulation of the IBM System/360 architecture is presented - the EMMY/360. Problem state code which executes correctly on an IBM 360 will also execute correctly on the EMMY/360. Code producing execution exceptions will, in most cases, produce the same results on the two systems.
Certain exceptions occurring on IBM 360 cannot occur on the EMMY/360, such as address specification exceptions for main store operands, and certain precise interrupts on IBM 360 will be imprecise on the EMMY/360, such as address exceptions. The EMMY/360 supports the Standard 360 instruction set with single precision floating point. The 360 input/output structure is not supported; I/O on the EMMY system is done by a Function Call instruction, rather than channel program and Start-Test I/O. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/114/CSL-TR-76-114.pdf %R CSL-TR-76-115 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Principles of self-checking processor design and an example %A Wakerly, John F. %D December 1975 %X A self-checking processor has redundant hardware to insure that no likely failure can cause undetected errors and all likely failures are detected in normal operation. We show how error-detecting codes and self-checking circuits can be used to achieve these properties in a microprogrammed processor. The choice of error-detecting codes and the placement of checkers to monitor coded data paths are discussed. The use of codes to detect errors in arithmetic and logic operations and microprogram control units is described. An example processor design is given and some observations on the diagnosis and repair of such a processor are made. From the example design it appears that somewhat less than 50% overall redundancy is required to guarantee the detection of all failures that affect a single medium- or large-scale integration circuit package. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/115/CSL-TR-76-115.pdf %R CSL-TR-76-116 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Asynchronous serial interface for connecting a PDP-11 to the ARPANET (BBN 1822) %A Crane, Ronald C. %D July 1976 %X This report describes an interface to permit the connection of any PDP-11 to either the Packet Radio Network or the ARPAnet. The interface connects on one side to an IMP, meeting the specifications published in BBN report number 1822, and on the other side to a 16-bit parallel interface (DRV-11 or DR11-C) as described in the DEC peripherals and interfacing handbook. The interface card itself is a double height board (5.2" x 8.5") which can be plugged into any peripheral slot in a PDP-11 backplane. The interface card is connected to the parallel interface card via two cables with Berg 40 pin connectors (DEC H-856) and to the IMP via an Amphenol bayonet connector (48-10R-18-315). All 3 cables and connectors are supplied with the I/O interface card. The parallel interface card (DEC DR11-C or DRV-11) together with the special I/O interface card described in this report comprise the 1822 interface. The report includes descriptions of the operation of circuits, programming, and diagnostics for the 1822 interface. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/116/CSL-TR-76-116.pdf %R CSL-TR-76-117 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An "almost-exact" solution to the N-processor, M-memory bandwidth problem %A Rau, B. Ramakrishna %D June 1976 %X A closed-form expression is derived for the memory bandwidth obtained when $N$ processors are permitted to generate requests to $M$ memory modules. Use of generating functions is made, in a rather unusual fashion, to obtain this expression.
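For context, the textbook first-order estimate for this problem (not the report's asymptotically exact expression) is the expected number of distinct modules hit when the $N$ requests are independent and uniform over the $M$ modules; a quick simulation confirms it.

    # Sketch: baseline estimate  B(N, M) = M * (1 - (1 - 1/M)**N),
    # i.e. M times the probability that a given module gets >= 1 request.
    import random

    def bandwidth_estimate(N, M):
        return M * (1 - (1 - 1 / M) ** N)

    def bandwidth_simulated(N, M, cycles=100_000, seed=1):
        rng = random.Random(seed)
        busy = sum(len({rng.randrange(M) for _ in range(N)})
                   for _ in range(cycles))
        return busy / cycles

    for N, M in [(4, 4), (8, 16)]:
        print(N, M, round(bandwidth_estimate(N, M), 3),
              round(bandwidth_simulated(N, M), 3))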
The one approximation involved is shown to result in only a very small error -- and that, too, only for small values of $M$ and $N$. This expression, which is asymptotically exact, is shown to be more accurate than existing closed form approximations. Lastly, a family of asymptotically exact solutions is presented; these solutions are easier to evaluate than the first one. Although these expressions are less accurate than the previously derived closed-form solution, they are, nevertheless, better than existing solutions. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/117/CSL-TR-76-117.pdf %R CSL-TR-76-118 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Stanford emulation laboratory %A Flynn, Michael J. %A Hoevel, Lee W. %A Neuhauser, Charles J. %D June 1976 %X The Stanford Emulation Laboratory is designed to support general research in the area of emulation. Central to the laboratory is a universal host machine, the EMMY, which has been designed specifically to be an unbiased, yet efficient host for a wide range of target machine architectures. Microstore in the EMMY is dynamically microprogrammable and thus is used as the primary data storage resource of the emulator. Other laboratory equipment includes a reconfigurable main memory system and an independent control processor to monitor emulation experiments. Laboratory software, including two microassemblers, is briefly described. Three laboratory applications are described: (1) a conventional target machine emulation (a System/360), (2) 'microscopic' examination of emulated target machine I-streams, and (3) direct execution of a high level language (Fortran II). %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/118/CSL-TR-76-118.pdf %R CSL-TR-76-119 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A simulator for the evaluation of digital system reliability %A Thompson, Peter Alan %D August 1977 %X This report describes a simulation package designed to evaluate the reliability of digital systems. The simulator can be used to model many different types of systems, at varying levels of detail. The user is given much freedom to use the elements of the model in the way best suited to simulating the operation of a system in the presence of faults. The simulation package then generates random faults in the model, and uses a Monte Carlo analysis to obtain curves of reliability. Three examples are given of simulations of digital systems which have redundancy. The difference between this type of simulation and other simulation techniques is discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/119/CSL-TR-76-119.pdf %R CSL-TR-76-120 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Detection of intermittent faults in sequential circuits %A Savir, Jacob %D March 1978 %X Testing for intermittent faults in digital circuits has been given significant attention in the past few years. However, very little theoretical work has been done regarding their detection in sequential circuits. This paper shows that the testing properties of intermittent faults in sequential circuits can be studied by means of a probabilistic automaton. The evaluation and derivation of optimal intermittent fault detection experiments in sequential circuits are done by creating a product state table from the faulty and fault-free versions of the circuit under test. Both deterministic and random test procedures are discussed.
The underlying optimality criterion maximizes the probability of fault detection. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/120/CSL-TR-76-120.pdf %R CSL-TR-76-123 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Research in the Digital Systems Laboratory %A Faculty, The %D June 1976 %X This report summarizes the research carried out in the Digital Systems Laboratory* at Stanford University during the period August 1975 through July 1976. Research investigations were concentrated into the following major areas: Computer Performance; Computer Reliability Studies, including fault-tolerant computing, evaluation of dual-computer configurations, and implementation of reliable software systems; Computer Architecture, including organization of computer systems, feasibility of real-time emulation, and directly executed languages; Design Automation of Digital Systems; Computer Networks, including network interconnection protocols, the 2000 terminal computing system, and packet-switched network technology/cost studies; LSI Multiprocessors; Compiler Implementation; and Parallel Computer Systems. *Renamed Computer Systems Laboratory in 1978. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/123/CSL-TR-76-123.pdf %R CSL-TR-76-125 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance bounds for parallel processors %A Lee, Ruby Bei-Loh %D November 1976 %X A general model of computation on a p-parallel processor is proposed, distinguishing clearly between the logical parallelism (p* processes) inherent in a computation, and the physical parallelism (p processors) available in the computer organization. This shows the dependence of performance bounds on both the computation being executed and the computer architecture. We formally derive necessary and sufficient conditions for the maximum attainable speedup of a p-parallel processor over a uniprocessor to be Sp >= min(p/ln p, p*/ln p*), where ln p approximates Hp, the pth harmonic number. We also verify that empirically-derived speedups are O(p*/ln p*). Finally, we discuss related performance measures of minimum execution time, maximum efficiency and minimum space-time product. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/125/CSL-TR-76-125.pdf %R CSL-TR-76-126 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The optimal placement of dynamic recovery checkpoints in recoverable computer systems %A Warren-Angelucci, Wayne %D December 1976 %X Reliability is an important concern of any computer system. No matter how carefully designed and constructed, computer systems fail. The rapid and systematic restoration of service after an error or malfunction is always a major design and operational goal. In order to overcome the effects of a failure, recovery must be performed to go from the failed state to an operational state. This thesis describes a recovery method which guarantees that a computer system, its associated data bases and communication transactions will be restored to an operational and consistent state within a given time and cost bound after the occurrence of a system failure. This thesis considers the optimization of a specific software strategy - the rollback and recovery strategy, within the framework of a graph model of program flow which encompasses communication interfaces and data base transactions. Algorithms are developed which optimize the placement of dynamic recovery checkpoints.
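A classical first-order result for the simpler, fixed-interval version of this problem (Young's approximation; not the graph-model algorithms of the thesis) illustrates the trade-off being optimized: with checkpointing cost C and mean time between failures M, the checkpoint interval minimizing expected lost work plus checkpoint overhead is approximately

    T_{opt} \approx \sqrt{2 C M},

so cheaper checkpoints or less reliable hardware both argue for checkpointing more often.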
Presented is a method for statically pre-computing a set of optimal decision parameters for the associated program model, and a run-time technique for dynamically determining the optimal placement of program recovery checkpoints. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/76/126/CSL-TR-76-126.pdf %R CSL-TR-77-129 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On accuracy improvement and applicability conditions of diffusion approximation with applications to modelling of computer systems %A Yu, Philip S. %D January 1977 %X Starting with single server queueing systems, we find a different way to estimate the diffusion parameters. The boundary condition is handled using Feller's elementary return process. Extensive comparisons by asymptotic, simulation and numerical techniques have been conducted to establish the superiority of the proposed method compared with conventional methods. The limitation of the diffusion approximation is also investigated. When the coefficient of variation of interarrival time is larger than one, the mean queue length may vary over a wide range even if the mean and variance of interarrival time are kept unchanged. The diffusion approximation is applicable under a regularity condition on the interarrival time distribution; the investigation of high variation of interarrival time is conducted on 2-stage hyperexponential distributions. A similar anomaly is observed in two server closed queueing networks when the service time of any server has a large coefficient of variation. Again, a similar regularity condition on the service time distribution is required in order for the diffusion approximation to be applicable. For general queueing networks, the problems become more complicated. A simple way to estimate the coefficient of variation of interarrival time (when the network is decomposable) is proposed. Besides the anomalies cited before, networks under certain topologies, such as networks with feedback loops, especially self loops, cannot be decomposed into separate single servers when the coefficient of variation of service time distributions becomes large, even if the large variations are due to a large number of short service times. Nevertheless, the decomposability of a network can be improved by replacing each server with a self loop by an equivalent server without a self loop. Finally, we consider the service center with a queue dependent service rate or arrival rate. Generalization to two server closed queueing networks where each server may have a self loop is also considered. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/129/CSL-TR-77-129.pdf %R CSL-TR-77-130 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The structure of directly executed languages: a new theory of interpretive system design %A Hoevel, Lee W. %A Flynn, Michael J. %D March 1977 %X This paper concerns two important issues in the design of optimal languages for direct execution in an interpretive system: binding the operand identifiers in an executable instruction unit to the arguments of the routine implementing the operator defined by that instruction; and binding operand identifiers to execution variables. These issues are central to the performance of a system both in space and time. Historically, some form of "machine language" is used as the directly executable medium for a computing system.
These languages traditionally are constrained to a single "n-address" instruction format; this leads to an excessive number of "overhead" instructions, which do nothing but move values from one storage resource to another, being imbedded in the executable instruction stream. We propose to reduce this overhead by increasing the number of instruction formats available at the directly executed language level. Machine languages are also constricted with respect to the manner in which operands can be "addressed" within an instruction. Usually, some form of indexed base-register scheme is available, along with a direct addressing mechanism for a few, "special" storage cells (i.e., registers and perhaps the zeroth page of main store). We propose a different identification mechanism--based on the Contour Model of Johnston. Using our scheme, only N bits are needed to encode any identifier in a scope containing fewer than 2**N distinct identifiers. Together, these two results lead to directly executed language designs which are optimal in the sense that: (1) k executable instructions are required to implement a source statement containing k functional operators; (2) the space required to represent the executable form of a source statement containing k distinct functional operators and v distinct variables approaches F*k + N*v -- where there are fewer than 2**F distinct functional operators in the scope of definition for the source statement, and fewer than 2**N distinct variables in this scope. (3) the time needed to execute the representation of a source statement containing k functional operators, d distinct variables in its domain, and r distinct variables in its range approaches d + r + k, where time is measured in memory references. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/130/CSL-TR-77-130.pdf %R CSL-TR-77-131 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Sequential prefetch strategies for instructions and data %A Rau, B. Ramakrishna %D January 1977 %X An investigation of sequential prefetch as a means of reducing the average access time is conducted. The use of a target instruction buffer is shown to enhance the performance of instruction prefetch. The concept of generalized sequentiality is developed to enable the study of sequentiality in data streams. Generalized sequentiality is shown to be present to a significant degree in data streams from measurements on representative programs. This result is utilized to develop a data prefetch mechanism which is found to be capable of anticipating, on the average, about 75% of all data requests. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/131/CSL-TR-77-131.pdf %R CSL-TR-77-132 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Manual for a general purpose simulator used to evaluate reliability of digital systems %A Thompson, Peter A. %D August 1977 %X A simulation technique has been developed for the reliability evaluation of arbitrarily defined computer systems. The main simulation program is written in FORTRAN IV, and requires no changes to simulate many different systems. The user defines a model for a particular system by supplying a set of short FORTRAN subroutines, and a specially formatted block of numerical parameters. The subroutines specify the functional behavior of various subsystems comprising the model, while the numerical parameters describe how the subsystems are interconnected, their time delays, what faults occur in each one, etc.
The main simulation program uses this model to perform a Monte-Carlo type evaluation of the system's reliability. This report supplements a basic description of the technique by supplying all the details necessary for writing subroutines, specifying numerical parameters, and using the main simulation program. The simulation is event-driven, and automatically generates pseudo-random faults and time delays according to parameters given by the user. Some problems typical of event simulators, such as ambiguities arising from random time-delay generation, can be solved by taking advantage of special facilities built into the simulation package. A complete source listing of the main program is included for reference. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/132/CSL-TR-77-132.pdf %R CSL-TR-77-134 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of two-level fault-tolerant networks from threshold elements %A Butakov, Evguenij A. %A Posherstnick, Marat S. %D March 1977 %X Only a small part of all Boolean functions of n variables can be realized by one threshold element (T.E.). For all other functions the net must be built with at least two T.E.'s. The problem of constructing a fault-tolerant two-level network from T.E. is investigated. The notion of limiting function is introduced. It is shown that the use of these limiting functions induces a reduction in the number of possible candidates during the process of finding a realization of an arbitrary function by threshold functions. The method is based on the two-asummability property of threshold functions and therefore is applicable to completely specified Boolean functions with fewer than nine variables. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/134/CSL-TR-77-134.pdf %R CSL-TR-77-135 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Passage time distributions for a class of queueing networks: closed, open, or mixed, with different classes of customers with applications to computer system modeling %A Yu, Philip S. %D March 1977 %X Networks of queues are important models of multiprogrammed time-shared computer systems and computer communication networks. Although equilibrium state probabilities of a broad class of network models have been derived in the past, analytic or approximate solutions for response time distributions or more general passage time distributions are still open problems. In this paper we formulate the passage time problem as a "hitting time" or "first passage time" problem in a Markov system and derive the analytic solution to passage time distributions of closed queueing networks. Efficient numerical approximation is also proposed. The result for closed queueing networks is further extended to obtain approximate passage time distributions for open queueing networks. Finally, we employ the techniques derived in this paper to study interfault time and response time distribution and density functions; the effects of the degree of multiprogramming, size of main memory, service time of paging devices, and rate of file I/O requests on the shape of the distribution and density functions have been examined. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/135/CSL-TR-77-135.pdf %R CSL-TR-77-136 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A structural design language for computer aided design of digital systems %A vanCleemput, Willem M.
%D April 1977 %X In this report a language (SDL) for describing structural properties of digital systems is presented. SDL can be used at all levels of the design process, i.e. from the system level down to the circuit level. The language is intended as a complement to existing computer hardware description languages, which emphasize behavioral description. The language was motivated partly by the nature of the design process. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/136/CSL-TR-77-136.pdf %R CSL-TR-77-137 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance analysis of computer communication networks via random access channels %A Yu, Philip S. %D April 1977 %X The field of computer communication networks has grown very rapidly in the past few years. One way to communicate is via multiple access broadcast channels. A new class of random access schemes referred to as the Mp-persistent CSMA scheme is proposed. It incorporates the nonpersistent CSMA scheme and the 1-persistent CSMA scheme, both slotted and unslotted versions, as its special cases with p=0 and 1, respectively. The performance of the Mp-persistent CSMA scheme under packet switching is analyzed and compared with other random access schemes. By dynamically adjusting p, the unslotted version can achieve better performance in both throughput and delay than the currently available unslotted CSMA schemes under packet switching. Furthermore, the performance of various random access schemes under message switching is analyzed and compared with that under packet switching. In both slotted and unslotted versions of the M0-persistent CSMA scheme, the performance under message switching is superior to that under packet switching in the sense that not only is the channel capacity larger but the average number of retransmissions per successful message under message switching is also smaller than that per successful packet under packet switching. In dynamic reservation schemes, message switching leads to larger channel capacity. However, in both slotted and unslotted versions of the ALOHA scheme, the channel capacity is reduced when message switching is used instead of packet switching. This phenomenon may also happen in the Mp-persistent CSMA scheme as p deviates from 0 to 1 for certain distributions of message length. Hence, the performance under message switching may be superior or inferior to that under packet switching, depending upon the random access scheme being used and the distribution of message length (usually a large coefficient of variation of message length implies a large degradation of channel capacity in this case). Nevertheless, for radio channels, message switching can achieve larger channel capacity if appropriate CSMA schemes are used. A mixed strategy which is a combination of message switching and packet switching is proposed to improve the performance of a point to point computer communication network when its terminal access networks communicate via highly utilized radio channels. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/137/CSL-TR-77-137.pdf %R CSL-TR-77-138 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Program behavior and the performance of interleaved memories %A Rau, B. Ramakrishna %D May 1977 %X One of the major factors influencing the performance of an interleaved memory system is the behavior of the request sequence, but this is normally ignored.
This report examines this issue. Using trace driven simulations it is shown that the commonly used assumption, that all requests are equally likely to be to any module, is not valid. The duality of memory interference with paging is noted and this suggests the use of the Least-Recently-Used Stack Model to model program behavior. Simulation shows that this model is quite successful. An accurate expression for the bandwidth is derived based upon this model. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/138/CSL-TR-77-138.pdf %R CSL-TR-77-139 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Properties and applications of the least-recently-used stack model %A Rau, B. Ramakrishna %D May 1977 %X The Least-Recently-Used Stack Model (LRUSM) is known to be a good model of temporal locality. Yet, little analysis of this model has been performed and documented. Certain properties of the LRUSM are developed here. In particular, the concept of the Stack Working Set is introduced and expressions are derived for the forward recurrence time to the next reference to a page, for the time that a page spends in a cache of a given size and for the time from last reference to the page being replaced. The fault stream out of a cache memory is modelled and it is shown how this can be used to partially analyze a multilevel memory hierarchy. In addition, the Set Associative Buffer is analyzed and a necessary and sufficient condition for the optimality of the LRU replacement algorithm is advanced. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/139/CSL-TR-77-139.pdf %R CSL-TR-77-142 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimal layout of CMOS functional arrays %A Uehara, T. %A vanCleemput, Willem M. %D March 1978 %X Designers of MOS LSI circuits can take advantage of complex functional cells in order to achieve better performance. This paper discusses the implementation of a random logic function on an array of CMOS transistors. A graph-theoretical algorithm which minimizes the size of an array is presented. This method is useful for the design of cells used in conventional design automation systems. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/142/CSL-TR-77-142.pdf %R CSL-TR-77-143 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SPRINT - an interactive system for printed circuit board design user's guide %A vanCleemput, Willem M. %A Bennett, T. C. %A Hupp, J. A. %A Stevens, K. R. %D June 1977 %X The SPRINT system for the design of printed circuit boards is a collection of programs that allows designers to interactively design two-sided boards using a Tektronix 4013 graphics terminal. The major parts of the system are: a compiler for SDL, the Structural Design Language, an interactive component placement program, an interactive manual conductor routing program, an automatic batch router, a via elimination program and a set of artwork generation programs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/143/CSL-TR-77-143.pdf %R CSL-TR-77-147 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Verifying concurrent programs with shared data classes %A Owicki, Susan S. %D August 1977 %X Monitors are a valuable tool for organizing operations on shared data in concurrent programs. In some cases, however, the mutually exclusive procedure calls provided by monitors are overly restrictive.
Such applications can be programmed using shared classes, which do not enforce mutual exclusion. This paper presents a method of verifying parallel programs containing shared classes. One first proves that each class procedure performs correctly when executed by itself, then shows that simultaneous execution of other class procedures cannot interfere with its correct operation. Once a class has been verified, calls to its procedures may be treated as uninterruptable actions; this simplifies the proof of higher-level program components. Proof rules for classes and procedure calls are given in Hoare's axiomatic style. Several examples are verified, including two versions of the readers and writers problem and a dynamic resource allocator. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/147/CSL-TR-77-147.pdf %R CSL-TR-77-149 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Interpretive machines %A Iliffe, John K. %D June 1977 %X These lectures survey attempts to apply computers directly to high level languages using microprogrammed interpreters. The motivation for such work is to achieve language implementations that are more effective in some measure of translation, execution or response to the user than would otherwise be obtained. The implied comparison is with the established technique of compiling into a fixed general-purpose machine code prior to execution. It is argued that while substantial benefits can be expected from microprogramming, it does not represent the best approach to design when the contributing factors are analyzed in a general system context, that is to say when a wide performance range, multiple source languages, and stringent security requirements have to be satisfied. An alternative is suggested, using a combination of interpretation and a primitive instruction set and providing security at the microprogram level. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/149/CSL-TR-77-149.pdf %R CSL-TR-77-150 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Research in the Digital Systems Laboratory: August 1976-July 1977 %A Staff %D July 1977 %X This report summarizes the research carried out in the Digital Systems Laboratory* at Stanford University during the period from August 1976 through July 1977.
Research investigations were concentrated in the following areas: Computer Reliability and Testing, including detection of intermittent failures, testing for sequential circuits, self-checking linear feedback shift registers, simulation analysis of high-reliability systems, effects of failures on gracefully degradable systems, fault diagnosis in digital systems, and software reliability; Critical Fault-Pattern Determination; Computer Architecture, including trace facility, memory interleaving, and monitors for signal activity; Organization of Computer Systems, including an emulation research laboratory, emulators, and memory performance; Feasibility of Real-Time Emulation, including directly executable languages; Distributed Data Processing for Ballistic Missile Defense; Description Languages and Design for General-Purpose Computer Architectures, including evaluation of existing hardware description languages, development of a structural description language, applications of the structural design language, bounds for maximal parallelism, and parallel information processing in biological systems; Computer Networks, including broadcast protocols in packet-switched computer networks and the optimal placement of dynamic-recovery checkpoints in recoverable computer systems; Design and Verification of Reliable Software, including specifications and proofs for abstract data types in concurrent programs, specification and verification of monitors, and operating system design; Design Automation, including a language for describing the structure of digital systems, the SPRINT printed-circuit design system, computer-aided layout of large-scale integrated circuits, and an interactive system for design capture; Database, including studies in distributed processing and problem solving, a database maintenance system, and the implementation of databases in medicine; and Digital Incremental Computers. *Renamed Computer Systems Laboratory in 1978. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/77/150/CSL-TR-77-150.pdf %R CSL-TR-78-154 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Notes on modelling of computer systems and networks %A Yu, Philip S. %D April 1978 %X Formulation of given computer system or network problems into abstract stochastic models is considered. Generally speaking, model formulation is an art. While analytic results are clearly not powerful enough to provide a "cookbook" approach to modelling, general methodology and difficulties of model formulation are discussed through examination of various computer system and network models. These models are presented in a systematic way based on the hierarchical approach. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/154/CSL-TR-78-154.pdf %R CSL-TR-78-155 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The formal definition of a real-time language %A Hennessy, John L. %A Kieburtz, Richard B. %D July 1978 %X This paper presents the formal definition of TOMAL (Task-Oriented Microprocessor Application Language), a programming language intended for real-time systems running on small processors. The formal definition addresses all aspects of the language. Because some modes of semantic definition seem particularly well-suited to certain aspects of a language, and not as suitable for others, the formal definition employs several complementary modes of definition.
The primary definition is axiomatic in the notation of Hoare; it is employed to define most of the transformations of data control states affected by statements of the language. Simple, denotational (but not lattice-theoretic) semantics complement the axiomatic semantics to define type-related features, such as the binding of names to types, data type coercions, and the evaluation of expressions. Together, the axiomatic and denotational semantics define all the features of the sequential language. An operational definition, not included in the paper, is used to define real-time execution, and to extend the axiomatic definition to account for all aspects of concurrent execution. Semantic constraints, sufficient to guarantee conformity of a program with the axiomatic definition, can be checked by analysis of a TOMAL program at compilation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/155/CSL-TR-78-155.pdf %R CSL-TR-78-156 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimal program control structures based on the concept of decision entropy %A Lee, Ruby Bei-Loh %D July 1978 %X The ability to make decisions dynamically during program execution is a very powerful and valuable tool. Unfortunately, it also causes severe performance degradations in high-speed computer organizations which use parallel, pipelined or lookahead techniques to speed up program execution. An optimal control structure is one where the average number of decisions to be made during program execution is minimal among all control structures for the program. Since decisions are usually represented by conditional branch instructions, finding an optimal control structure is equivalent to minimizing the expected number of conditional branch instructions to be encountered per program execution. By decision entropy, we mean a quantitative characterization of the uncertainty in the instruction stream due to dynamic decisions imbedded in the program. We define this concept of decision entropy in the Shannon information-theoretic sense. We show that a program's intrinsic decision entropy is an absolute lower bound on the expected number of decisions, or conditional branch instructions, per program execution. We show that this lower bound is achieved if each decision has maximum uncertainty. We also indicate how optimal control structures may be constructed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/156/CSL-TR-78-156.pdf %R CSL-TR-78-157 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Syndrome-testable design of combinational circuits %A Savir, Jacob %D October 1978 %X Classical testing of combinational circuits requires a list of the fault-free responses of the circuit to the test set. For most practical circuits implemented today, the large storage requirement for such a list makes such a test procedure very expensive. In this paper we describe a method of designing combinational circuits in such a way that their test procedure will require the knowledge of only one characteristic of the fault-free circuit, called the syndrome. This solves the storage problem associated with the test procedure. It is shown that the syndrome-testable design is inexpensive and can be easily implemented by the logic designer.
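A minimal sketch of the syndrome idea, under the usual definition in the syndrome-testing literature: the syndrome of an n-input combinational function is the fraction of its 2^n input vectors that produce a 1 at the output, so the tester need store only this single number for the fault-free circuit and compare it against the syndrome measured from the circuit under test. The example function below is hypothetical.

    from itertools import product

    def syndrome(f, n):
        """Fraction of the 2**n input vectors for which f evaluates to 1."""
        ones = sum(f(*bits) for bits in product((0, 1), repeat=n))
        return ones / 2 ** n

    # f = (a AND b) OR c: 5 of the 8 input vectors yield 1, so S = 5/8.
    print(syndrome(lambda a, b, c: (a & b) | c, 3))  # 0.625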
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/157/CSL-TR-78-157.pdf %R CSL-TR-78-158 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance characterization of parallel computations %A Lee, Ruby Bei-Loh %D September 1978 %X This paper defines and interprets quantitative measures by which we may characterize the absolute and relative performance of a parallel computation, compared with an equivalent serial computation. The absolute performance measures are the Parallelism Index, PI(P), the Utilization, U(P), and the maximum Quality, Q(P). The corresponding relative performance measures are the Speedup, S(P,1), the Efficiency, E(P,1), and the Quality, Q(P,1). We show how the corresponding absolute and relative performance measures are related via the Redundancy measure, R(P,1). We also examine the range of permissible values for each performance measure. Ideally, we would like to compare an optimal parallel computation with an optimal equivalent serial computation, in order to determine the performance improvements due solely to parallel versus serial processing. Toward this end, we define optimal parallel and serial computations, and show such optimality may be approximated in practice. In order to facilitate the calculation of the above performance measures, we show how the complexity of modelling an arbitrary parallel computation may be reduced substantially to two simple canonical forms, which we denote the computation's Parallelism Profile and TOP-form. Finally we show how all the canonical forms and performance measures may be generalized from one computation to a set of computations, to arrive at aggregate canonical and performance descriptions. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/158/CSL-TR-78-158.pdf %R CSL-TR-78-159 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Specification and verification of network mail system %A Owicki, Susan S. %D November 1978 %X Techniques for describing and verifying modular systems are illustrated using a simple network mail problem. The design is presented in a top-down style. At each level of refinement, the specifications of the higher level are verified from the specifications of lower level components. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/159/CSL-TR-78-159.pdf %R CSL-TR-78-163 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An introduction to the DDL-P language %A Cory, Wendell E. %A Duley, J. R. %A vanCleemput, Willem M. %D March 1979 %X This report describes the Pascal-based implementation of DDL (Digital Design Language) and its simulator. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/163/CSL-TR-78-163.pdf %R CSL-TR-78-164 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T DDL-P command language manual %A Cory, Wendell E. %A Duley, J. R. %A vanCleemput, Willem M. %D March 1979 %X This report describes the command language for the simulator, associated with DDL (Digital Design Language). %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/164/CSL-TR-78-164.pdf %R CSL-TR-78-165 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Partitioning of digital systems %A Mei, K. %A vanCleemput, Willem M. %A Blount, M. %A Hanson, D. L. %A Payne, Thomas S. %A Savir, Jacob %A Scheffer, L. K. 
%D April 1979 %X The aim of this study is to develop concepts and tools for understanding the influence of partitioning on the life-cycle cost of a system. Throughout this study three types of boards will be considered as examples to illustrate the concepts being developed. These three board types are being used by the U.S. Navy for various types of equipment. The types considered are: Type 1A: A small PC card with space for up to 8 IC's and a single 40-pin connector. Type 2A: A PC card with space for up to 18 IC's and a single 100-pin connector. Type 5X: A PC card with space for up to 55 IC's and two connectors: a 100-pin connector for the back plane connection and a 30-pin test point connector to be used for diagnostic purposes only. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/78/165/CSL-TR-78-165.pdf %R CSL-TR-79-168 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T UFORT: a Fortran-to-Universal PCODE Translator (FIXFOR-2) %A Chow, Frederick %A Nye, Peter %A Wiederhold, Gio %D January 1980 %X The Fortran compiler described in this document, UFORT, was written specifically to serve in a Pascal environment using the Universal P-Code as an intermediate pseudomachine. The need for implementation of Fortran these days is due to the great volume of existing Fortran programs, rather than to a desire to have this language available to develop new programs. We have hence implemented the full but traditional Fortran standard, rather than the recently adopted augmented Fortran standard. All aspects of Fortran which are commonly used in large scientific programs are available, including such features as SUBROUTINES, labelled COMMON, and COMPLEX arithmetic. In addition, a few common extensions, such as integers of different lengths and assignment of strings to variables, have been added. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/168/CSL-TR-79-168.pdf %R CSL-TR-79-170 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Interpretive architectures: a theory of ideal language machines %A Flynn, Michael J. %A Hoevel, Lee %D February 1979 %X This paper is a study in ideal computer architectures or program representations. An ideal architecture can be defined with respect to the representation that was used to originally describe a program, i.e. the higher level language. Traditional machine architectures name operations and objects which are presumed to be present in the host machine: a memory space of certain size, ALU operations, etc. An ideal machine framed about a specific higher level language assumes operations present in that language and uses these operations to describe relationships between objects described in the source representation. The notion of ideal is carefully constrained. The object program representation must be easily decompilable (i.e., the source is readily reconstructable). It is simply assumed that the source itself is a good representation for the original problem, thus any nonassignment operation present in the source program statement will appear as a single instruction (operation) in the ideal representation. All named objects are defined with respect to the natural scope of definition of the source program. For simplicity of discussion, statistical behavior of the program or the language is assumed to be unknown; that is, Huffman codes are not used. From the above, a canonic interpretive form (CIF) or measure of a higher level language program is developed.
CIF measures both static space to represent the program and dynamic time measurements of the number of instructions to be interpreted and the number of memory references these instructions will require. The CIF or ideal program representation is then compared, using the Whetstone benchmark, in its characteristics to several contemporary architectural approaches: IBM 370, Honeywell Level 66, Burroughs S-Language Fortran, and DELtran, a quasi-ideal Fortran architecture based on CIF principles. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/170/CSL-TR-79-170.pdf %R CSL-TR-79-171 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A theory of interpretive architectures: some notes on DEL design and a Fortran case study %A Hoevel, Lee %A Flynn, Michael J. %D February 1979 %X An interpretive architecture is a program representation that peculiarly suits a particular high level language or class of languages. The architecture is a program representation which we call a directly executed language (DEL). In a companion paper we have explored the theory involved in the creation of ideal DEL forms and have analyzed how some traditional instruction sets compare to this measure. This paper is an attempt to develop a reasonably comprehensive theory of DEL synthesis. By assuming a flexible interpretation oriented host machine, synthesis involves three particular areas: (1) sequencing, both between image machine instructions and within the host interpreter, (2) action rules including both format for transformation and operation invoked, and finally, (3) the name space which includes both name structure and name environment. A complete implementation of a simple version of FORTRAN is described in the appendix of the paper. This DEL for FORTRAN, called DELtran, comes close to achieving the ideal program measures. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/171/CSL-TR-79-171.pdf %R CSL-TR-79-174 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Pascal*: a Pascal based systems programming language %A Hennessy, John L. %D June 1980 %X Pascal* (Pascal-star) is a new programming language which is upward compatible with standard Pascal and suitable for systems programming. Although there are several additions to the language, simplicity remains a major design goal. The major additions reflect trends evident in newer languages such as Euclid, Mesa, and Ada, including: modules, simple parametric types, structured constants and values, several minor extensions to the control structures of the language, random access files, arbitrary return types for functions, and an exception handling mechanism. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/174/CSL-TR-79-174.pdf %R CSL-TR-79-175 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Symbolic debugging of optimized code %A Hennessy, John L. %D July 1979 %X The long-standing conflict between the optimization of code and the ability to symbolically debug the code is examined. The effects of local and global optimizations on the variables of a program are categorized and models for representing the effect of optimizations are given. These models are used by algorithms which determine the subset of variables whose values do not correspond to those in the original program. Algorithms for restoring these variables to their correct values are also developed.
Empirical results from the application of these algorithms to local optimization are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/175/CSL-TR-79-175.pdf %R CSL-TR-79-176 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SYNDIA user's guide %A Cory, Wendell E. %D August 1979 %X This report describes how to use the Syndia/Syngra system available at SU-SCORE. This system accepts BNF-like grammar specifications and automatically generates syntax diagrams on a Tektronix graphics terminal. Syndia is the major component of this system; Syngra acts as an interface between Syndia and the SUDS2 graphics editor. Syndia performs no ambiguity or consistency checks on the BNF input. This report assumes that the reader is familiar with BNF and syntax diagram representations of grammars. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/176/CSL-TR-79-176.pdf %R CSL-TR-79-177 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T ADLIB user's manual %A Hill, Dwight D. %D August 1979 %X ADLIB (A Design Language for Indicating Behavior) is a new computer design language recently developed at Stanford. ADLIB is a superset of PASCAL with special facilities for concurrency and interprocess communication. It is normally used under the SABLE simulation system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/177/CSL-TR-79-177.pdf %R CSL-TR-79-178 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design automation at Stanford %A vanCleemput, Willem M. %D July 1979 %X This report contains a copy of the visual aids used by the authors during the presentation of their work at the First Workshop on Design Automation at Stanford, held July 3-4, 1979. The topics covered range from circuit level simulation and integrated circuit process modelling to high level languages and design techniques. The presentations are a survey of the activities in design automation at Stanford University. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/178/CSL-TR-79-178.pdf %R CSL-TR-79-179 %Z Thu, 01 Dec 94 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Testability considerations in microprocessor-based design %A Hayes, John P. %A McCluskey, Edward J. %D November 1979 %X This report contains a survey of testability considerations in microprocessor-based design. General issues of testability, testing methods, and fault modeling are presented. Specific techniques of testing and designing for testable microprocessor-based systems are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/79/179/CSL-TR-79-179.pdf %R CSL-TR-95-661 %Z Thu, 02 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance Factors for Superscalar Processors %A Bennett, James E. %A Flynn, Michael J. %D February 1995 %X This paper introduces three performance factors for dynamically scheduled superscalar processors. These factors, availability, efficiency, and utility, are then used to explain the variations in performance that occur with different processor and memory system features. The processor features that are investigated are branch prediction depth and following multiple branch paths. The memory system features that are investigated are cache size, associativity, miss penalty, and memory bus bandwidth.
Dynamic scheduling with appropriate levels of bus bandwidth and branch prediction is shown to be remarkably effective at achieving good performance over a range of differing application types and over a range of cache miss rates. These results were obtained using a new simulation environment, MXS, which directly executes the benchmarks. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/661/CSL-TR-95-661.pdf %R CSL-TR-95-662 %Z Thu, 02 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Limits of Scaling MOSFETs %A McFarland, Grant %A Flynn, Michael J. %D January 1995 %X The fundamental electrical limits of MOSFETs are discussed and modeled to predict the scaling limits of digital bulk CMOS circuits. Limits discussed include subthreshold currents, time dependent dielectric breakdown (TDDB), hot electron effects, and drain induced barrier lowering (DIBL). This paper predicts the scaling of bulk CMOS MOSFETs to reach its limits at drawn dimensions of approximately 0.1um. These electrical limits are used to find scaling factors for SPICE Level 3 model parameters, and a scalable Level 3 device model is presented. Current trends in scaling interconnects are also discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/662/CSL-TR-95-662.pdf %R CSL-TR-95-659 %Z Tue, 28 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High-Speed BiCMOS Memories %A Wingard, Drew Eric %D December 1994 %X Existing BiCMOS static memories do not simultaneously combine the speed of bipolar memories with the low power and density of CMOS memories. Beginning with fundamentally fast low-swing bipolar circuits and zero-power CMOS storage latches, we introduce CMOS devices into the bipolar circuits to reduce the power dissipation without compromising speed and insert bipolar transistors into CMOS storage arrays to improve the speed without power or density penalties. Replacing passive load resistors with switched PMOS transistors reduces the amount of power required to keep bipolar decoder outputs low. The access delay need not increase because the load resistance is quickly reduced via a low-swing signal when the decoder could switch. For ECL NOR decoders, we apply a variable BiCMOS current source that is simplified by carefully regulating the negative supply. We also develop techniques that improve the reading and writing characteristics of the CMOS-storage, emitter-access memory cell. The 16K-word 4-bit asynchronous CSEA memory was fabricated in a 0.8-micron BiCMOS technology and accesses in 3.7ns while using 1.75 W. An improved 64Kx4 design is simulated to run at 3.4ns and 2.3W. Finally, a synchronous 4Kx64 CSEA memory is estimated to operate at 2.5ns and 2.4W in the same process technology. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/659/CSL-TR-95-659.pdf %R CSL-TR-95-658 %Z Wed, 29 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T RYO: a Versatile Instruction Instrumentation Tool for PA-RISC %A Zucker, Daniel F. %A Karp, Alan H. %D January 1995 %X RYO (Roll Your Own) is actually a family of novel instrumentation tools for the PA-RISC family of processors. Relatively simple awk scripts, these tools instrument PA-RISC assembly instruction sequences by replacing individual machine instructions with calls to user written routines. Examples are presented showing how to generate address traces by replacing memory instructions, and how to analyze floating point arithmetic by replacing floating point instructions.
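A toy rendition of the substitution idea just described (not RYO itself, which is a set of awk scripts): scan an assembly listing and replace each memory instruction with a call to a user-written trace routine. The mnemonics, operand syntax, and call sequence below are simplified placeholders rather than faithful PA-RISC conventions.

    import re

    # Match simplified load/store mnemonics at the start of a line.
    MEM_OPS = re.compile(r"^\s*(ldw|stw)\b", re.IGNORECASE)

    def instrument(asm_lines):
        out = []
        for line in asm_lines:
            if MEM_OPS.match(line):
                # Keep the replaced instruction visible as a comment, then
                # substitute a call to a hypothetical user trace routine.
                out.append("; replaced: " + line.strip())
                out.append("    bl trace_mem,%r2    ; hypothetical trace call")
            else:
                out.append(line)
        return out

    print("\n".join(instrument(["    ldw 0(%r26),%r28",
                                "    add %r1,%r2,%r3"])))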
This paper introduces the overall structure and design of RYO, as well as giving detailed instructions on its use. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/658/CSL-TR-95-658.pdf %R CSL-TR-95-660 %Z Tue, 28 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Effects of Latency, Occupancy, and Bandwidth in Distributed Shared Memory Multiprocessors %A Holt, Chris %A Heinrich, Mark %A Singh, Jaswinder Pal %A Rothberg, Edward %A Hennessy, John %D January 1995 %X Distributed shared memory (DSM) machines can be characterized by four parameters, based on a slightly modified version of the logP model. The l (latency) and o (occupancy of the communication controller) parameters are the keys to performance in these machines, and are largely determined by major architectural decisions about the aggressiveness and customization of the node and network. For recent and upcoming machines, the g (gap) parameter that measures node-to-network bandwidth does not appear to be a bottleneck. Conventional wisdom is that latency is the dominant factor in determining the performance of a DSM machine. We show, however, that controller occupancy--which causes contention even in highly optimized applications--plays a major role, especially at low latencies. When latency hiding is used, occupancy becomes more critical, even in machines with high latency networks. Scaling the problem size is often used as a technique to overcome limitations in communication latency and bandwidth. We show that in many structured computations occupancy-induced contention is not alleviated by increasing problem size, and that there are important classes of applications for which the performance lost by using higher latency networks or higher occupancy controllers cannot be regained easily, if at all, by scaling the problem size. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/660/CSL-TR-95-660.pdf %R CSL-TR-95-663 %Z Tue, 28 Mar 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Technology Mapping for Asynchronous Designs %A Siegel, Polly Sara Kay %D March 1995 %X Asynchronous design styles have been increasing in popularity as device sizes shrink and concurrency is exploited to increase system performance. However, asynchronous designs are difficult to implement correctly because the presence of hazards, which are of little consequence to most parts of synchronous systems, can cause improper circuit operation. Many asynchronous design styles, together with accompanying automated synthesis algorithms, address the issues of design complexity and correctness. Typically, these synthesis systems take a high-level description of an asynchronous system and produce a logic-level description of the resultant design that is hazard-free for transitions of interest. The designer then must manually translate this logic-level description into a technology-specific implementation composed of an interconnection of elements from a semi-custom cell library. At this stage, the designer must be careful not to introduce new hazards into the design. The size of designs is limited in part by the inability to safely (and reliably) map the technology-independent description into an implementation. In this thesis, we address the problem of technology mapping for two different asynchronous design styles. We first address the problem for burst-mode designs.
We developed theorems and algorithms for hazard-free mapping of burst-mode designs, and implemented these algorithms on top of an existing synchronous technology mapper. We incorporated this mapper into a toolkit for asynchronous design, and used the toolkit to implement a low-power infrared communications chip. We then extended this work to apply to the problem of hazard-free technology mapping of speed-independent designs. The difficulty in this design style is in the decomposition phase of the mapping algorithm, and we developed theory and algorithms for correct hazard-free decomposition of this design style. We also developed an exact covering algorithm which takes advantage of logic sharing within the design. These algorithms were then applied to benchmark circuits. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/663/CSL-TR-95-663.pdf %R CSL-TR-95-664 %Z Wed, 19 Apr 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Nondeterministic Operators in Algebraic Frameworks %A Meldal, Sigurd %A Walicki, Michal Antonin %D March 1995 %X A major motivating force behind research into abstract data types is the realization that software should be described in an abstract manner - on the one hand leaving open decisions regarding further refinement and on the other allowing for substitutivity of modules as long as they satisfy a particular specification. The use of nondeterministic operators is a useful abstraction tool: nondeterminism represents a natural abstraction whenever there is a hidden state or other components of a system description which are, methodologically, conceptually or technically, inaccessible at a particular level of specification granularity. In this report we explore the various approaches to dealing with nondeterminism within the framework of algebraic specifications. The basic concepts involved in the study of nondeterminism are introduced. The main alternatives for the interpretation of nondeterministic operations, homomorphisms between nondeterministic structures and equivalence of nondeterministic terms are sketched, and we discuss various proposals for initial and terminal semantics. We offer some comments on the continuous semantics of nondeterminism and the problem of solving recursive equations over signatures with binary nondeterministic choice. We also present the attempts at reducing reasoning about nondeterminism to reasoning in first order logic, and present a calculus dealing directly with nondeterministic terms. Finally, rewriting with nondeterminism is discussed: primarily as a means of reasoning, but also as a means of assigning operational semantics to nondeterministic specifications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/664/CSL-TR-95-664.pdf %R CSL-TR-95-666 %Z Thu, 13 Apr 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On Division and Reciprocal Caches %A Oberman, Stuart F. %A Flynn, Michael J. %D April 1995 %X Floating-point division is generally regarded as a high latency operation in typical floating-point applications. Many techniques exist for increasing division performance, often at the cost of increasing either chip area, cycle time, or both. This paper presents two methods for decreasing the latency of division. Using applications from the SPECfp92 and NAS benchmark suites, these methods are evaluated to determine their effects on overall system performance.
The notion of recurring computation is presented, and it is shown how recurring division can be exploited using an additional, dedicated division cache. Additionally, for multiplication-based division algorithms, reciprocal caches can be utilized to store recurring reciprocals. Due to the similarity between the algorithms typically used to compute division and square root, the performance of square root caches is also investigated. Results show that reciprocal caches can achieve nearly a 2X reduction in effective division latency for reasonable cache sizes. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/666/CSL-TR-95-666.pdf %R CSL-TR-95-667 %Z Mon, 05 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Better Optical Triangulation through Spacetime Analysis %A Curless, Brian %A Levoy, Marc %D April 1995 %X The standard methods for extracting range data from optical triangulation scanners are accurate only for planar objects of uniform reflectance illuminated by an incoherent source. Using these methods, curved surfaces, discontinuous surfaces, and surfaces of varying reflectance cause systematic distortions of the range data. Coherent light sources such as lasers introduce speckle artifacts that further degrade the data. We present a new ranging method based on analyzing the time evolution of the structured light reflections. Using our spacetime analysis, we can correct for each of these artifacts, thereby attaining significantly higher accuracy using existing technology. We present results that demonstrate the validity of our method using a commercial laser stripe triangulation scanner. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/667/CSL-TR-95-667.pdf %R CSL-TR-95-668 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Architecture Evaluator's Work Bench and its Application to Microprocessor Floating Point Units %A Fu, Steve %A Quach, Nhon %A Flynn, Michael %D June 1995 %X This paper introduces the Architecture Evaluator's Workbench (AEWB), a high level design space exploration methodology, and its application to floating point units (FPUs). In applying AEWB to FPUs, a metric for optimizing and comparing FPU implementations is developed. The metric, FUPA, incorporates four aspects of AEWB: latency, cost, technology and profiles of target applications. FUPA models latency in terms of delay, cost in terms of area, and profile in terms of percentage of different floating point operations. We utilize sub-micron device models, interconnect models, and actual microprocessor scaling data to develop models used to normalize both latency and area, enabling technology-independent comparison of implementations. This report also surveys most of the state-of-the-art microprocessors, and compares them utilizing FUPA. Finally, we correlate the FUPA results to reported SPECfp92 results, and demonstrate the effect of circuit density on FUPA implementations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/668/CSL-TR-95-668.pdf %R CSL-TR-95-669 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Testing BiCMOS and Dynamic CMOS Logic %A Ma, Siyad %D June 1995 %X In a normal integrated circuit (IC) production cycle, manufactured ICs are tested to remove defective parts. The purpose of this research is to study the effects of real defects in BiCMOS and Dynamic CMOS circuits, and propose better test solutions to detect these defects.
BiCMOS and Dynamic CMOS circuits are used in many new high performance VLSI ICs. Fault models for BiCMOS and Dynamic CMOS circuits are discussed first. Shorted and open transistor terminals, the most common failure modes in MOS and bipolar transistors, are simulated for BiCMOS and Dynamic CMOS logic gates. Simulations show that a faulty behavior similar to data retention faults in memory cells can occur in BiCMOS and Dynamic CMOS logic gates. We explain here why it is important to test for these faults, and present test techniques that can detect these faults. Simulation results also show that shorts and opens in Dynamic CMOS and BiCMOS circuits are harder to test than their counterparts in Static CMOS circuits. They also show that the testability of opens in BiCMOS gates can be predicted without time-consuming transistor-level simulations. We present a prediction method based on an extended switch-level model for BiCMOS gates. To improve the testability of dynamic CMOS circuits, design-for-testability circuitry is proposed. Scan cell designs add scan capabilities to dynamic latches and flip-flops with negligible performance overhead, while design-for-current-testability circuitry allows quiescent supply current (IDDQ) measurements for dynamic CMOS circuits. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/669/CSL-TR-95-669.pdf %R CSL-TR-95-670 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design and Analysis of Update-Based Cache Coherence Protocols for Scalable Shared-Memory Multiprocessors %A Glasco, David Brian %D June 1995 %X This dissertation examines the performance difference between invalidate-based and update-based cache coherence protocols for scalable shared-memory multiprocessors. The first portion of the dissertation reviews cache coherence. First, chapter 1 describes the cache coherence problem and identifies the two classes of cache coherence protocols, invalidate-based and update-based. The chapter also reviews bus-based protocols and reviews the additional requirements placed on the protocols to extend them to scalable systems. Next, chapter 2 reviews two latency tolerating techniques, relaxed memory consistency models and software-controlled data prefetch, and examines their impact on the cache coherence protocols. Finally, chapter 3 reviews the details of three invalidate-based protocols defined in the literature and defines two new update-based protocols. The second portion of this dissertation examines the performance differences between invalidate-based and update-based protocols. First, chapter 4 presents the methodology used to examine the performance of the protocols. This presentation includes a discussion of the simulation environment, the simulated architecture and the scientific applications. Next, chapter 5 describes and analyzes the performance of two enhancements to the update-based cache coherence protocols. The first enhancement, a fine-grain or word-based synchronization scheme, combines data synchronization with the data. This allows the system to take advantage of the fine-grain data updates which result from the update-based protocols. The second enhancement, a write grouping scheme, is necessary to reduce the network traffic generated by the update-based protocols.
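(Illustrative aside: a toy C analogue of the write-grouping idea just described; the line size, buffer layout, and send_update stub are assumptions standing in for the hardware mechanism studied in the dissertation.)

    #include <stdio.h>

    #define LINE_WORDS 8                     /* assumed words per coherence line */

    typedef struct { long tag; unsigned mask; double w[LINE_WORDS]; } UpdateBuf;

    /* Stand-in for one update message on the network. */
    static void send_update(long tag, unsigned mask, const double *w)
    {
        (void)w;
        printf("update line %ld, word mask 0x%x\n", tag, mask);
    }

    /* Group writes to the same line into a single update message rather
       than sending one message per written word. */
    void grouped_write(UpdateBuf *b, long addr, double v)
    {
        long tag = addr / LINE_WORDS;
        if (b->mask && b->tag != tag) {      /* switching lines: flush the group */
            send_update(b->tag, b->mask, b->w);
            b->mask = 0;
        }
        b->tag = tag;
        b->w[addr % LINE_WORDS] = v;
        b->mask |= 1u << (addr % LINE_WORDS);
    }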
Next, chapter 6 presents and discusses the simulated results that demonstrate that update-based protocols, with the two enhancements, can significantly improve the performance of the fine-grain scientific applications examined, relative to invalidate-based protocols. Chapter 7 examines the sensitivity of the protocols to changes in the architectural parameters and to migratory data. Finally, chapter 8 discusses how the choice of protocol affects the correctness, cost, and efficiency of the cache coherence mechanism. Overall, this work demonstrates that update-based protocols can be used not only as a coherence mechanism, but also as a latency reducing and tolerating technique to improve the performance of a set of fine-grain scientific applications. But as with other latency reducing techniques, such as data prefetch, the technique must be used with an understanding of its consequences. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/670/CSL-TR-95-670.pdf %R CSL-TR-95-671 %Z Wed, 21 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Characterization and reduction of metastability errors in CMOS interface circuits %A Portmann, Clemenz Lenard %D June 1995 %X In synchronous digital logic systems, asynchronous external signals must be referenced to the system clock or synchronized. Synchronization of asynchronous signals, however, inevitably leads to metastability errors. Metastability error rates can increase by orders of magnitude as clock frequencies increase in high performance designs, and supply voltages decrease in low-power designs. This research focuses on the characterization of metastability parameters and error reduction with no penalty in circuit performance. Two applications, high-speed flash analog-to-digital conversion and synchronization of asynchronous binary signals in application-specific integrated circuits, have been investigated. Applications such as telecommunications and instrumentation for time-domain analysis require analog-to-digital converters with metastability error probabilities on the order of 10^-10 errors/cycle, achievable in high performance designs only through the use of dedicated circuitry for error reduction. A power and area efficient externally pipelined metastability error reduction technique for flash converters has been developed. Unresolved comparator outputs are held valid, causing the encode logic to fail benignly in the presence of metastability. In an n bit converter, errors are passed as a single unsettled bit to the converter output and are reduced with an external pipeline of only n latches per stage rather than an internal pipeline of 2^n-1 latches per stage. An 80-MHz, externally pipelined, 7-bit flash analog-to-digital converter was fabricated in 1.2-um CMOS. Measured error rates were less than 10^-12 errors/cycle. Using internal pipelining with two levels of 127 latches to achieve equivalent performance would require 3.48 times more power for the error reduction circuitry with a Nyquist frequency input. This corresponds to a reduction in the total power for the implemented converter of 1.24 times compared with the internally pipelined converter. In synchronizers and arbiters, general purpose applications require mean time between failures on the order of a year or tens of years. Comparison of previous designs has been difficult due to varying technologies, test setups, and test conditions. To address this problem, a test circuit for synchronizers was implemented in 2-um and 1.2-um CMOS technologies.
The test setup makes possible the evaluation and comparison of synchronizer performance in varying environments and technologies. The effects of loading, output buffering, supply scaling, supply noise, and technology scaling on synchronizer performance are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/671/CSL-TR-95-671.pdf %R CSL-TR-95-672 %Z Wed, 28 Jun 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Delay Models for CMOS Circuits %A McFarland, Grant %A Flynn, Michael %D June 1995 %X Four different CMOS inverter delay models are derived and compared. It is shown that inverter delay can be estimated with fair accuracy over a wide range of input rise times and loads as the sum of two terms, one proportional to the input rise time, and one proportional to the capacitive load. Methods for estimating device capacitance from HSPICE parameters are presented, as well as means of including added delay due to wire resistance and the use of series transistors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/672/CSL-TR-95-672.pdf %R CSL-TR-95-665 %Z Wed, 02 Aug 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Interprocedural Parallelization Analysis: Preliminary Results %A Hall, Mary W. %A Amarasinghe, Saman P. %A Murphy, Brian R. %A Liao, Shih-Wei %A Lam, Monica S. %D March 1995 %X This paper describes a fully interprocedural automatic parallelization system for Fortran programs, and presents the results of extensive experiments obtained using this system. The system incorporates a comprehensive and integrated collection of analyses including dependence, privatization and reduction recognition for both array and scalar variables, and scalar symbolic analysis to support these. All the analyses have been implemented in the SUIF (Stanford University Intermediate Format) compiler system, with the aid of an interprocedural analysis construction tool known as FIAT. Our interprocedural analysis is uniquely designed to provide the same quality of information as if the program were analyzed as a single procedure, while managing the complexity of the analysis. We have implemented a robust system that has parallelized, completely automatically, loops containing over a thousand lines of code. This work makes possible the first comprehensive empirical evaluation of state-of-the-art automatic parallelization technology. This paper reports evaluation numbers on programs from standard benchmark suites. The results demonstrate that all the interprocedural analyses taken together can substantially advance the capability of current automatic parallelization technology. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/665/CSL-TR-95-665.pdf %R CSL-TR-95-673 %Z Thu, 27 Jul 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Informing Loads: Enabling Software to Observe and React to Memory Behavior %A Horowitz, Mark %A Martonosi, Margaret %A Mowry, Todd C. %A Smith, Michael D. %D July 1995 %X Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism to observe memory behavior directly.
To fill this need, we propose a new set of memory operations called informing memory operations, and in particular, we describe the design and functionality of an informing load instruction. This instruction serves as a primitive that allows the software to observe cache misses and to act upon this information inexpensively (i.e., under the miss, when the processor would typically be idle) within the current software context. Informing loads enable new solutions to several important software problems. We demonstrate this through examples that show their usefulness in (i) the collection of fine-grained memory profiles with high precision and low overhead and (ii) the automatic improvement of memory system performance through compiler techniques that take advantage of cache-miss information. Overall, we find that the apparent benefit of an informing load instruction is quite high, while the hardware cost of this functionality is quite modest. In fact, the bulk of the required hardware support is already present in today's high-performance processors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/673/CSL-TR-95-673.pdf %R CSL-TR-95-674 %Z Wed, 26 Jul 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Three Concepts of System Architecture %A Luckham, David C. %A Vera, James %A Meldal, Sigurd %D July 1995 %X An architecture is a specification of the components of a system and the communication between them. Systems are constrained to conform to an architecture. An architecture should guarantee certain behavioral properties of a conforming system, i.e., one whose components are configured according to the architecture. An architecture should also be useful in various ways during the process of building a system. This paper presents three alternative concepts of architecture: object connection architecture, interface connection architecture, and plug and socket architecture. We describe different concepts of interface and connection that are needed for each of the three kinds of architecture, and different conformance requirements of each kind. Simple examples are used to compare the usefulness of each kind of architecture in guaranteeing properties of conforming systems, and in correctly modifying a conforming system. In comparing the three architecture concepts, the principle of communication integrity becomes central, and two new architecture concepts, duality of sub-interfaces (services) and connections of dual services (service connection), are introduced to define plug and socket architecture. We describe how these concepts reduce the complexity of architecture definitions, and can in many cases help guarantee that the components of a conforming system communicate correctly. The paper is presented independently of any particular formalism, since the concepts can be represented in widely differing architecture definition formalisms, varying from graphical languages to event-based simulation languages. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/674/CSL-TR-95-674.pdf %R CSL-TR-95-675 %Z Thu, 27 Jul 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Analysis of Division Algorithms and Implementations %A Oberman, Stuart F. %A Flynn, Michael J. %D July 1995 %X Floating-point division is generally regarded as a low frequency, high latency operation in typical floating-point applications.
However, the increasing emphasis on high performance graphics and the industry-wide usage of performance benchmarks forces processor designers to pay close attention to all aspects of floating-point computation. Many algorithms are suitable for implementing division in hardware. This paper presents four major classes of algorithms in a unified framework, namely digit recurrence, functional iteration, very high radix, and variable latency. Digit recurrence algorithms, the most common of which is SRT, use subtraction as the fundamental operator, and they converge to a quotient linearly. Division by functional iteration converges to a quotient quadratically using multiplication. Very high radix division algorithms are similar to digit recurrence algorithms, but they incorporate multiplication to reduce the latency. Variable latency division algorithms reduce the average latency required to form the quotient. These algorithms are explained and compared in this work. It is found that for low-cost implementations where chip area must be minimized, digit recurrence algorithms are suitable. An implementation of division by functional iteration can provide the lowest latency for typical multiplier latencies. Variable latency algorithms show promise for simultaneously minimizing average latency while also minimizing area. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/675/CSL-TR-95-675.pdf %R CSL-TR-95-676 %Z Thu, 31 Aug 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The COOL Parallel Programming Language: Design, Implementation, and Performance %A Chandra, Rohit %D January 1995 %X Effective utilization of multiprocessors requires that a program be partitioned for parallel execution, and that it execute with good data locality and load balance. Although automatic compiler-based techniques to address these concerns are attractive, they are often limited by insufficient information about the application. Explicit programmer participation is therefore necessary for programs that exploit unstructured task-level parallelism. However, support for such intervention must address the tradeoff between ease of use and providing a sufficient degree of control to the programmer. In this thesis we present the programming language COOL, which extends C++ with simple and efficient constructs for writing parallel programs. COOL is targeted towards programming shared-memory multiprocessors. Our approach emphasizes the integration of concurrency and synchronization with data abstraction. Concurrent execution is expressed through parallel functions that execute asynchronously when invoked. Synchronization for shared objects is expressed through monitors, and event synchronization is expressed through condition variables. This approach provides several benefits. First, integrating concurrency with data abstraction allows construction of concurrent data structures that have most of the complex details suitably encapsulated. Second, monitors and condition variables integrated with objects offer a flexible set of building blocks that can be used to build more complex synchronization abstractions. Synchronization operations are clearly identified through attributes and can be optimized by the compiler to reduce synchronization overhead. Finally, the object framework supports abstractions to improve the load distribution and data locality of the program.
Besides these mechanisms for exploiting parallelism, COOL also provides support for the programmer to address performance issues, in the form of abstractions that can be used to supply hints about the objects referenced by parallel tasks. These hints are used by the runtime system to schedule tasks close to the objects they reference, and thereby improve data locality. The hints are easily supplied by the programmer in terms of the objects in the program, while the details of task creation and scheduling are managed transparently within the runtime system. Furthermore, the hints do not affect the semantics of the program and allow the programmer to easily experiment with different optimizations. COOL has been implemented on several shared-memory machines, including the Stanford DASH multiprocessor. We have programmed a variety of applications in COOL, including many from the SPLASH parallel benchmark suite. Our experience has been promising: the applications are easily expressed in COOL, and perform as well as hand-tuned codes using lower-level primitives. Furthermore, supplying hints has proven to be an easy and effective way of improving program performance. This thesis therefore demonstrates that (a) the simple but powerful constructs in COOL can effectively exploit task-level parallelism across a variety of application programs, (b) an object-based approach improves both the expressiveness and the performance of parallel programs, and (c) improving data locality can be made simple through a combination of programmer abstractions and smart scheduling mechanisms. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/676/CSL-TR-95-676.pdf %R CSL-TR-95-677 %Z Mon, 11 Sep 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SPARC-V9 Architecture Specification with Rapide %A Santoro, Alexandre %A Park, Woosang %A Luckham, David %D September 1995 %X This report presents an approach to creating an executable standard for the SPARC-V9 instruction set architecture using Rapide-1.0, a language for modeling and prototyping distributed systems. It describes the desired characteristics of a formal specification of the architecture and shows how Rapide can be used to build a model with these characteristics. This is followed by the description of a simple prototype of the proposed model, and a discussion of the issues involved in building and testing the complete specification (with emphasis on some Rapide-specific features such as constraints, causality and mapping). The report concludes with a brief evaluation of the proposed model and suggestions on future areas of research. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/677/CSL-TR-95-677.pdf %R CSL-TR-95-679 %Z Thu, 30 Nov 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Measuring the Complexity of SRT Tables %A Oberman, Stuart F. %A Flynn, Michael J. %D November 1995 %X This paper presents an analysis of the complexity of quotient-digit selection tables in SRT division implementations. SRT dividers use a fixed number of partial remainder and divisor bits to consult a table to select the next quotient-digit in each iteration. The complexity of these tables is a function of the radix, the redundancy, and the number of bits in the estimates of the divisor and partial remainder.
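(Illustrative aside: a schematic radix-2 SRT loop in C; at radix 2 the selection "table" degenerates to two comparisons on the shifted partial remainder, whereas the higher-radix tables measured in this paper are indexed by truncated estimates of both the partial remainder and the divisor. Operand ranges and iteration count are assumptions.)

    /* Schematic radix-2 SRT: x in [0,1), d in [0.5,1); returns approx x/d.
       Each iteration selects a quotient digit in {-1,0,1} from the shifted
       partial remainder -- the degenerate form of a selection table. */
    double srt2_divide(double x, double d, int bits)
    {
        double p = x / 2.0;                  /* scale so |p| <= d holds */
        double q = 0.0, w = 0.5;             /* quotient, current digit weight */
        for (int i = 0; i < bits; i++) {
            p *= 2.0;                        /* shift the partial remainder */
            int qd = (p >= 0.5) ? 1 : (p < -0.5 ? -1 : 0);
            p -= qd * d;                     /* subtract qd*d, keeping |p| <= d */
            q += qd * w;
            w /= 2.0;
        }
        return 2.0 * q;                      /* undo the initial scaling */
    }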
This analysis derives the allowable divisor and partial remainder truncations for radix 2 through radix 32, and it quantifies the relationship between table parameters and the number of product terms in the logic equations defining the tables. By mapping the tables to a library of standard-cells, delay and area values were measured and are presented for table configurations through radix 32. The results show that: 1) Gray-coding of the quotient-digits allows for the automatic minimization of the quotient-digit selection logic equations. 2) Using a short carry-assimilating adder with a few more input bits than output bits can reduce table complexity. 3) Reducing the number of bits in the partial remainder estimate and increasing the length of the divisor estimate increases the size and delay of the table, offsetting any performance gain due to the shorter external adder. 4) While delay increases nearly linearly with radix, area increases quadratically, limiting practical table implementations to radix 2 and radix 4. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/679/CSL-TR-95-679.pdf %R CSL-TR-95-681 %Z Wed, 24 Jan 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Netlist Processing for Custom VLSI via Pattern Matching %A Chanak, Thomas Stephen %D November 1995 %X A vast array of CAD tools is available to support the design of integrated circuits. Unfortunately, tool development lags advances in technology and design methodology - the newest, most aggressive custom chips confront design issues that were not anticipated by the currently available set of tools. When existing tools cannot fill a custom design's needs, a new tool must be developed, often in a hurry. This situation arises fairly often, and many of the tools created use, or imply, some method of netlist pattern recognition. If the pattern-oriented facet of these tools could be isolated and unified among a variety of tools, custom tool writers would have a useful building block to start with when confronted with the urgent need for a new tool. Starting with the UNIX pattern-matching, text-processing tool AWK as a model, a pattern-action processing environment was built to test the concept of writing CAD tools by specifying patterns and actions. After implementing a wide variety of netlist processing applications, the refined pattern-action system proved to be a useful and fast way to implement new tools. Previous work in this area had reached the same conclusion, demonstrating the usefulness of pattern recognition for electrical rules checking, simulation, database conversion, and more. Our experiments identified a software building block, the "pattern object", that can construct the operators proposed in other works while maintaining flexibility in the face of changing requirements through the decoupling of global control from a pattern-matching engine. The implicit computation of subgraph isomorphism common to pattern-matching systems was thought to be a potential runtime performance issue. Our experience contradicts this concern. VLSI netlists tend to be sparse enough that runtimes do not grow unreasonably when a sensible amount of care is taken. Difficulties with the verification of pattern-based tools, not performance, present the greatest obstacle to pattern-matching tools. Pattern objects that modify netlists raise the prospect of order dependencies and subtle interactions among patterns, and this interaction is what causes the most difficult verification problems.
To combat this problem, a technique that considers an application's entire set of pattern objects and a specific target netlist together can perform analyses that expose otherwise subtle errors. This technique, along with debugging tools built specifically for pattern objects and netlists, allows the construction of trustworthy applications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/681/CSL-TR-95-681.pdf %R CSL-TR-95-682 %Z Wed, 07 Feb 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High Performance Cache Architectures to Support Dynamic Superscalar Microprocessors %A Wilson, Kenneth M. %A Olukotun, Kunle %D June 1995 %X Simple cache structures are not sufficient to provide the memory bandwidth needed by a dynamic superscalar computer, so more sophisticated memory hierarchies such as non-blocking and pipelined caches are required. To provide direction for the designers of modern high performance microprocessors, we investigate the performance tradeoffs of the combinations of cache size, blocking and non-blocking caches, and pipeline depth of caches within the memory subsystem of a dynamic superscalar processor for integer applications. The results show that the dynamic superscalar processor can hide about two-thirds of the additional latency of two and three pipelined caches, and that a non-blocking cache is always beneficial. A pipelined cache will only outperform a non-pipelined cache if the miss penalty and miss rates are large. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/682/CSL-TR-95-682.pdf %R CSL-TR-95-683 %Z Mon, 22 Jan 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Comparison of Hardware Prefetching Techniques For Multimedia Benchmarks %A Zucker, Daniel F. %A Flynn, Michael J. %A Lee, Ruby B. %D December 1995 %X Data prefetching is a well known technique for improving cache performance. While several studies have examined prefetch strategies for scientific and commercial applications, no published work has studied the special memory requirements of multimedia applications. This paper presents data for three types of hardware prefetching schemes: stream buffers, stride prediction tables, and a hybrid combination of the two, the stream cache. Use of the stride prediction table is shown to eliminate up to 90% of the misses that would otherwise be incurred in a moderate or large sized cache with no prefetching hardware. The stream cache, proposed for the first time in this paper, has the potential to cut execution times by more than half through the addition of a relatively small amount of additional hardware. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/683/CSL-TR-95-683.pdf %R CSL-TR-95-684 %Z Mon, 22 Jan 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance/Area Tradeoffs in Booth Multipliers %A Al-Twaijry, Hesham %A Flynn, Michael J. %D November 1995 %X Booth encoding is a method of reducing the number of summands required to produce the multiplication result. This paper compares the performance/area tradeoffs for the different Booth algorithms when trees are used as the summation network. This paper shows that the simple non-Booth algorithm is not a viable design, and that currently Booth 2 is the best design. It also points out that in the future Booth 3 may offer the best performance/area ratio.
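(Illustrative aside: a minimal C sketch of the radix-4 "Booth 2" recoding compared above; it scans overlapping 3-bit groups of a two's-complement multiplier and emits digits in {-2,...,2}, halving the number of summands. The fixed 32-bit width is an assumption.)

    #include <stdint.h>

    /* Recode a 32-bit two's-complement multiplier into 16 radix-4 Booth
       digits; digit k has weight 4^k and selects a summand of 0, +/-d,
       or +/-2d in the summation network. */
    void booth2_recode(int32_t m, int digit[16])
    {
        static const int dig[8] = { 0, 1, 1, 2, -2, -1, -1, 0 };
        uint64_t x = ((uint64_t)(uint32_t)m) << 1;  /* implicit 0 below bit 0 */
        for (int i = 0; i < 16; i++)
            digit[i] = dig[(x >> (2 * i)) & 0x7];   /* overlapping 3-bit group */
    }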
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/684/CSL-TR-95-684.pdf %R CSL-TR-95-686 %Z Wed, 14 Feb 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Synthesis of Burst-Mode Asynchronous Controllers %A Nowick, Steven Mark %D December 1995 %X Asynchronous design has enjoyed a revival of interest recently, as designers seek to eliminate penalties of traditional synchronous design. In principle, asynchronous methods promise to avoid overhead due to clock skew, worst-case design assumptions and resynchronization of asynchronous external inputs. In practice, however, many asynchronous design methods suffer from a number of problems: unsound algorithms (implementations may have hazards), harsh restrictions on the range of designs that can be handled (single-input changes only), incompatibility with existing design styles, and inefficiency in the resulting circuits. This thesis presents a new locally-clocked design method for the synthesis of asynchronous controllers. The method has been automated, is proven correct and produces high-performance implementations which are hazard-free at the gate level. Implementations allow multiple-input changes and handle a relatively unconstrained class of behaviors (called "burst-mode" specifications). The method produces state-machine implementations with a minimal or near-minimal number of states. Implementations can be easily built in such common VLSI design styles as gate-array, standard cell and full-custom. Realizations typically have the latency of their combinational logic. A complete set of state and logic minimization algorithms has been developed and automated for the synthesis method. The logic minimization algorithm differs from existing algorithms since it generates two-level minimized logic which is also hazard-free. The synthesis program is used to produce competitive implementations for several published designs. In addition, a large real-world controller is designed as a case study: an asynchronous second-level cache controller for a new RISC processor. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/686/CSL-TR-95-686.pdf %R CSL-TR-95-678 %Z Mon, 11 Sep 95 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fast Volume Rendering Using a Shear-Warp Factorization of the Viewing Transformation %A Lacroute, Philippe %D September 1995 %X Volume rendering is a technique for visualizing 3D arrays of sampled data. It has applications in areas such as medical imaging and scientific visualization, but its use has been limited by its high computational expense. Early implementations of volume rendering used brute-force techniques that require on the order of 100 seconds to render typical data sets on a workstation. Algorithms with optimizations that exploit coherence in the data have reduced rendering times to the range of ten seconds but are still not fast enough for interactive visualization applications. In this thesis we present a family of volume rendering algorithms that reduces rendering times to one second. First, we present a scanline-order volume rendering algorithm that exploits coherence in both the volume data and the image. We show that scanline-order algorithms are fundamentally more efficient than commonly-used ray casting algorithms because the latter must perform analytic geometry calculations (e.g. intersecting rays with axis-aligned boxes). The new scanline-order algorithm simply streams through the volume and the image in storage order.
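(Illustrative aside: a toy C loop conveying the storage-order traversal just described; real shear-warp rendering adds the shear, color compositing, warp, and coherence data structures of the thesis. The array layout is an assumption.)

    /* Composite axis-aligned voxel slices front-to-back into an
       intermediate image, walking volume and image in memory order;
       no per-ray analytic geometry is computed. */
    void composite_slices(const float *vox,  /* nz*ny*nx opacities in [0,1] */
                          int nx, int ny, int nz,
                          float *img)        /* ny*nx accumulated opacity, zeroed */
    {
        for (int z = 0; z < nz; z++)
            for (int y = 0; y < ny; y++)
                for (int x = 0; x < nx; x++) {
                    float a = vox[(z * ny + y) * nx + x];
                    float *p = &img[y * nx + x];
                    *p += (1.0f - *p) * a;   /* "over" operator on opacity */
                }
    }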
We describe variants of the algorithm for both parallel and perspective projections and a multiprocessor implementation that achieves frame rates of over 10 Hz. Second, we present a solution to a limitation of existing volume rendering algorithms that use coherence accelerations: they require an expensive preprocessing step every time the volume is classified (i.e., when opacities are assigned to the samples), thereby limiting the usefulness of the algorithms for interactive applications. We introduce a data structure for encoding spatial coherence in unclassified volumes. When combined with our rendering algorithm, this data structure allows us to build a fully-interactive volume visualization system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/678/CSL-TR-95-678.pdf %R CSL-TR-95-680 %Z Tue, 25 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Designing a Multicast Switch Scheduler %A Prabhakar, Balaji %A McKeown, Nick W. %D November 1995 %X This paper presents the design of the scheduler for an M x N input-queued switch. It is assumed that each input maintains a single queue for arriving multicast cells and that only the cell at the head of line (HOL) can be observed and scheduled at one time. The scheduler is required to be work-conserving, which means that no output port may be idle as long as there is an input cell destined to it. Furthermore, the scheduler is required to be fair, which means that no input cell may be held at HOL for more than M cell times (M is the number of input ports). The aim is to find a work-conserving, fair policy that delivers maximum throughput and minimizes input queue latency. When a scheduling policy decides which cells to schedule, contention may require that it leave a residue of cells to be scheduled in the next cell time. The selection of where to place the residue uniquely defines the scheduling policy. It is demonstrated that a policy which concentrates the residue, subject to our fairness constraint, always outperforms all other policies. We present one such policy, called TATRA, and analyze it geometrically. We also present a heuristic round-robin policy called mRRM that is simple to implement in hardware, is fair, and performs quite well when compared to a concentrating algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/680/CSL-TR-95-680.pdf %R CSL-TR-95-685 %Z Tue, 25 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Memory Consistency Models for Shared-Memory Multiprocessors %A Gharachorloo, Kourosh %D December 1995 %X The memory consistency model for a shared-memory multiprocessor specifies the behavior of memory with respect to read and write operations from multiple processors. As such, the memory model influences many aspects of system design, including the design of programming languages, compilers, and the underlying hardware. Relaxed models that impose fewer memory ordering constraints offer the potential for higher performance by allowing hardware and software to overlap and reorder memory operations. However, fewer ordering guarantees can compromise programmability and portability. Many of the previously proposed models either fail to provide reasonable programming semantics or are biased toward programming ease at the cost of sacrificing performance. Furthermore, the lack of consensus on an acceptable model hinders software portability across different systems.
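(Illustrative aside: the classic flag-synchronization example, in C11, of the programmability issue just raised; under a relaxed model the code is correct only because the synchronization accesses are identified and ordered, which is the kind of programmer-supplied information the dissertation builds on. The example is ours, not the dissertation's.)

    #include <stdatomic.h>

    int payload;                             /* ordinary data */
    _Atomic int ready = 0;                   /* identified synchronization flag */

    void producer(void)
    {
        payload = 42;                        /* data write ordered before... */
        atomic_store_explicit(&ready, 1, memory_order_release); /* ...the flag */
    }

    int consumer(void)
    {
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                                /* acquire orders the flag read first */
        return payload;                      /* guaranteed to observe 42 */
    }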
This dissertation focuses on providing a balanced solution that directly addresses the trade-off between programming ease and performance. To address programmability, we propose an alternative method for specifying memory behavior that presents a higher level abstraction to the programmer. We show that with only a few types of information supplied by the programmer, an implementation can exploit the full range of optimizations enabled by previous models. Furthermore, the same information enables automatic and efficient portability across a wide range of implementations. To expose the optimizations enabled by a model, we have developed a formal framework for specifying the low-level ordering constraints that must be enforced by an implementation. Based on these specifications, we present a wide range of architecture and compiler implementation techniques for efficiently supporting a given model. Finally, we evaluate the performance benefits of exploiting relaxed models based on detailed simulations of realistic parallel applications. Our results show that the optimizations enabled by relaxed models are extremely effective in hiding virtually the full latency of writes in architectures with blocking reads (i.e., processor stalls on reads), with gains as high as 80%. Architectures with non-blocking reads can further exploit relaxed models to hide a substantial fraction of the read latency as well, leading to a larger overall performance benefit. Furthermore, these optimizations complement gains from other latency hiding techniques such as prefetching and multiple contexts. We believe that the combined benefits in hardware and software will make relaxed models universal in future multiprocessors, as is already evidenced by their adoption in several commercial systems. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/95/685/CSL-TR-95-685.pdf %R CSL-TR-96-687 %Z Thu, 08 Feb 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Latency Tolerance for Dynamic Processors %A Bennett, James E. %A Flynn, Michael J. %D January 1996 %X While a number of dynamically scheduled processors have recently been brought to market, work on hardware techniques for tolerating memory latency has mostly targeted statically scheduled processors. This paper attempts to remedy this situation by examining the applicability of hardware latency tolerance techniques to dynamically scheduled processors. The results so far indicate that the inherent ability of the dynamically scheduled processor to tolerate memory latency reduces the need for additional hardware such as stream buffers or stride prediction tables. However, the technique of victim caching, while not usually considered as a latency tolerating technique, proves to be quite effective in aiding the dynamically scheduled processor in tolerating memory latency. For a fixed size investment in microprocessor chip area, the victim cache outperforms both stream buffers and stride prediction. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/687/CSL-TR-96-687.pdf %R CSL-TR-96-688 %Z Tue, 13 Feb 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T OS Support for Improving Data Locality on CC-NUMA Compute Servers %A Verghese, Ben %A Devine, Scott %A Gupta, Anoop %A Rosenblum, Mendel %D February 1996 %X The dominant architecture for the next generation of cache-coherent shared-memory multiprocessors is CC-NUMA (cache-coherent non-uniform memory architecture).
These machines are attractive as compute servers, because they provide transparent access to local and remote memory. However, the access latency to remote memory is 3 to 5 times the latency to local memory. Given the large remote access latencies, data locality is potentially the most important performance issue. In compute-server workloads, when processes are moved between nodes for load balancing, the OS needs to perform page-migration and page-replication to maintain data locality. Through trace-analysis and actual runs of realistic workloads, we study the potential improvements in performance provided by OS-supported dynamic migration and replication. Analyzing our kernel-based implementation of the policy, we provide a detailed breakdown of the costs and point out the functions using the most time. We study alternatives to using full-cache miss information to drive the policy, and show that sampling of cache misses can be used to reduce cost without compromising performance, and that TLB misses are inconsistent as an approximation for cache misses. Finally, our workload runs show that OS-supported dynamic page-migration and page-replication can substantially increase performance, by as much as 29% in some workloads. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/688/CSL-TR-96-688.pdf %R CSL-TR-96-689 %Z Tue, 20 Feb 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Variable Latency Pipelined Floating-Point Adder %A Oberman, Stuart F. %A Flynn, Michael J. %D February 1996 %X Addition is the most frequent floating-point operation in modern microprocessors. Due to its complex shift-add-shift-round dataflow, floating-point addition can have a long latency. To achieve maximum system performance, it is necessary to design the floating-point adder to have minimum latency, while still providing maximum throughput. This paper proposes a new floating-point addition algorithm which exploits the ability of dynamically-scheduled processors to utilize functional units which complete in variable time. By recognizing that certain operand combinations do not require all of the steps in the complex addition dataflow, the average latency is reduced. Simulation on SPECfp92 applications demonstrates that a speedup in average addition latency of 1.33 can be achieved using this algorithm, while still maintaining single cycle throughput. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/689/CSL-TR-96-689.pdf %R CSL-TR-96-691 %Z Wed, 13 Mar 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T PPP: A Gate-Level Power Simulator - A World Wide Web Application %A Bogliolo, Alessandro %A Benini, Luca %A DeMicheli, Giovanni %A Ricco, Bruno %D March 1996 %X Power consumption is an increasingly important constraint for complex ICs. Accurate and efficient power estimations are required at any level of abstraction to steer the design process. PPP is a Web-based integrated environment for synthesis and simulation of low-power CMOS circuits. We describe the simulation engine of PPP and we propose a new paradigm for tool integration. The simulation engine of PPP is a gate-level simulator that achieves accuracy comparable with electrical simulation, while keeping performance competitive with traditional gate-level techniques. This is done by using advanced symbolic models of the basic library cells that exploit the understanding of the main phenomena involved in power consumption.
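(Illustrative aside: a first-order C sketch of gate-level dynamic power estimation from switching activity, P = sum over gates of a_i * C_i * Vdd^2 * f; PPP's symbolic cell models are far more detailed, so the formula and parameter names here are assumptions for scale only.)

    /* First-order dynamic power: per-gate switching activity times switched
       capacitance times Vdd^2 times clock frequency, summed over all gates;
       short-circuit and leakage components are ignored. */
    double dynamic_power(const double *activity, /* toggles per cycle, per gate */
                         const double *cap,      /* switched capacitance (F) */
                         int ngates, double vdd, double freq)
    {
        double p = 0.0;
        for (int i = 0; i < ngates; i++)
            p += activity[i] * cap[i] * vdd * vdd * freq;
        return p;                                /* watts */
    }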
In order to maintain full compatibility with gate-level design tools, we use VERILOG-XL as the simulation platform. The accuracy obtained on benchmark circuits is always within 6% of SPICE, even for single-gate/single-pattern power analysis, thus providing the local information needed to optimize the design. Interface and tool integration issues have been addressed using a Web-based approach. The graphical interface of PPP is a dynamically generated tree of interactive HTML pages that allow the user to access and execute the tool through the Internet by using his/her own Web-browser. No software installation is required and all the details of data transfer and tool communication are hidden from the user. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/691/CSL-TR-96-691.pdf %R CSL-TR-96-690 %Z Mon, 01 Apr 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analysis and Synthesis of Concurrent Digital Systems Using Control-Flow Expressions %A Coelho, Claudionor Jose Nunes Jr. %D March 1996 %X We present in this thesis a modeling style and control synthesis technique for system-level specifications that are better described as a set of concurrent descriptions, their synchronizations and complex constraints. For these types of specifications, conventional synthesis tools will not be able to enforce design constraints because these tools are targeted to sequential components with simple design constraints. In order to generate controllers satisfying the constraints of system-level specifications, we propose a synthesis tool called Thalia that considers the degrees of freedom introduced by the concurrent models and by the system's environment. The synthesis procedure will be subdivided into the following steps: We first model the specification in an algebraic formalism called control-flow expressions, which considers most of the language constructs used to model systems reacting to their environment, i.e. sequential, alternative, concurrent, iterative, and exception handling behaviors. Such constructs are found in languages such as C, Verilog HDL, VHDL, Esterel and StateCharts. Then, we convert this model and a suitable representation for the environment into a finite-state machine, where the system is analyzed, and design constraints such as timing, resource and synchronization are incorporated. In order to generate the control-units for the design, we present two scheduling procedures. The first procedure, called static scheduling, attempts to find fixed schedules for operations satisfying system-level constraints. The second procedure, called dynamic scheduling, attempts to synchronize concurrent parts of a circuit description by dynamically selecting schedules according to a global view of the system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/690/CSL-TR-96-690.pdf %R CSL-TR-96-694 %Z Mon, 22 Apr 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analysis and Synthesis of Concurrent Digital Circuits Using Control-Flow Expressions %A Coelho, Claudionor Nunes Jr. %A DeMicheli, Giovanni %D April 1996 %X We present in this paper a novel modeling style and control synthesis technique for system-level specifications that are better described as a set of concurrent descriptions, their synchronizations and constraints. The proposed synthesis procedure considers the degrees of freedom introduced by the concurrent models and by the environment in order to satisfy the design constraints. Synthesis is divided into two phases.
In the first phase, the original specification is translated into an algebraic system, for which complex control-flow constraints and quantifiers of the design are introduced. In the second phase, we translate the algebraic formulation into a finite-state representation, and we derive an optimal control-unit implementation for each individual concurrent part. In the implementation of the controllers from the finite-state representation, we use flexible objective functions, which allow designers to better control the goals of the synthesis tool, and thus incorporate as much as possible their knowledge about the environment and the design. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/694/CSL-TR-96-694.pdf %R CSL-TR-96-696 %Z Mon, 10 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Computer Assisted Analysis of Multiprocessor Memory Systems %A Park, Seungjoon %D June 1996 %X In a shared memory multiprocessor architecture, a memory model describes the behavior of the memory system as observed at the user level. A cache coherence protocol aims to conform to a memory model by maintaining consistency among the multiple copies of cached data and the data in main memory. Memory models and cache coherence protocols can be quite complex and subtle, creating a real possibility of misunderstandings and actual design errors. In this thesis, we will present solutions to these problems. Though weaker memory models for multiprocessor systems allow higher-performance implementation techniques, they are also very subtle. Hence, it is vital to specify memory models precisely and to verify that the programs running under a memory model satisfy desired properties. Our approach to these problems is to write an executable specification of the memory model using a high-level description language for concurrent systems. This executable description provides a precise specification of the machine architecture for implementors and programmers. Moreover, the availability of formal verification tools allows users to experiment with the effects of the memory model on small assembly-language routines. Running the verifier can be very effective at clarifying the subtle details of the models and synchronization routines. Cache coherence protocols, like other protocols for distributed systems, simulate atomic transactions in environments where atomic implementations are impossible. Based on this observation, we propose a verification method which compares an implementation with a specification representing the desired abstract behavior. The comparison is done through an aggregation function, which maps the sequence of implementation steps for each transaction to the corresponding transaction step in the specification. The aggregation approach is applied to verification of the cache coherence protocol in the FLASH multiprocessor system. The protocol, consisting of more than a hundred implementation steps, is proved to conform to a reduced description with six kinds of atomic transactions. From the reduced behavior, it is very easy to prove crucial properties of the protocol including data consistency of cached copies. The aggregation method is also used to prove that the reduced protocol satisfies a desired memory consistency model.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/696/CSL-TR-96-696.pdf %R CSL-TR-96-692 %Z Wed, 12 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Delay Balancing of Wave Pipelined Multiplier Counter Trees Using Pass Transistor Multiplexers %A Kishigami, Hidechika %A Nowka, Kevin J. %A Flynn, Michael J. %D January 1996 %X Wave pipelining is an attractive technique used in high-speed digital circuits to speed up the pipeline clock-rate by eliminating the synchronizing elements between pipeline stages. Wave-pipelining has been successfully applied to the design of CMOS multipliers which have demonstrated clock-rate speed-ups of 4 to 7 times over their non-pipelined designs. In order to achieve high clock-rate by using wave-pipelining techniques, it is necessary to equalize (balance) all signal path delays of the circuit. In an earlier study, a multiplier was designed by using only 2-input NAND gates and inverters as primitives in order to reduce delay variations of the circuit. Alternatively, there are several reports that use pass-transistor logic as primitives for multipliers to achieve very low latency. Pass-transistor logic seems attractive for reducing circuit delay variations. In this report we describe a design of a wave-pipelined counter tree, which is a central part of a parallel multiplier, and detail a method to balance the delay of a (4,2) counter using pass-transistor multiplexers (PTMs) as primitives to achieve both higher clock-rate and smaller latency. Simulations of the wave-pipelined counter tree demonstrated 0.8ns clock-rate and 2.33ns latency through the use of pass-transistor multiplexers (PTMs) for a 0.8-um CMOS process. This data suggests that using pass-transistor multiplexers as primitives for wave-pipelined circuits is useful to achieve both higher clock-rate and lower latency. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/692/CSL-TR-96-692.pdf %R CSL-TR-96-693 %Z Wed, 12 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High-Performance CMOS System Design Using Wave Pipelining %A Nowka, Kevin J. %D January 1996 %X Wave pipelining, or maximum rate pipelining, is a circuit design technique that allows digital synchronous systems to be clocked at rates higher than can be achieved with conventional pipelining techniques. It relies on the predictable finite signal propagation delay through combinational logic for virtual data storage. Wave pipelining of combinational circuits has been shown to achieve clock rates 2 to 7 times those possible for the same circuits with conventional pipelining. Conventional pipelined systems allow data to propagate from a register through the combinational network to another register prior to initiating the subsequent data transfer. Thus, the maximum operating frequency is determined by the maximum propagation delay through the longest pipeline stage. Wave pipeline systems apply the subsequent data to the network as soon as it can be guaranteed that it will not interfere with the current data wave. The maximum operating frequency of a wave pipeline is therefore determined by the difference between the maximum propagation delay and the minimum propagation delay through the combinational logic. By minimizing variations in delay, the performance of wave pipelining is maximized.
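(Illustrative aside: the first-order timing relation behind this point, as a small C sketch; register and sampling overheads are folded into a single assumed term t_ovh, and the thesis refines these bounds for process and environmental variation.)

    /* First-order minimum clock periods: a conventional pipeline is limited
       by its longest path, a wave pipeline by the spread between longest and
       shortest paths -- hence the payoff of delay balancing. */
    double min_period_conventional(double t_max, double t_ovh)
    {
        return t_max + t_ovh;
    }

    double min_period_wave(double t_max, double t_min, double t_ovh)
    {
        return (t_max - t_min) + t_ovh;      /* raising t_min toward t_max helps */
    }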
Data wave interference in CMOS VLSI circuits is the result of the variation in the propagation delay due to path length differences, differences in the state of the network inputs and intermediate nodes, and differences in fabrication and environmental conditions. To maximize the performance of wave pipelined circuits, the path length variations through the combinational logic must be minimized. A method of modifying the transistor geometries of individual static CMOS gates so as to tune their delays has been developed. This method is used by CAD tools that minimize the path length variation. These tools are used to equalize delays within a wave pipelined logic block and to synchronize separate wave pipelined units which share a common reference clock. This method has been demonstrated to limit the variation in delay of CMOS circuits to less than 20%. Delay models have demonstrated that temperature variation, power supply variation, and noise limit the number of concurrent waves in CMOS wave pipelined systems to three or fewer. Run-to-run process variation can have a significant impact on CMOS VLSI signal propagation delay. The ratio of maximum to minimum delay along the same path for seven different runs of a 0.8-micron feature size fabrication process was found to be 1.35. Unless this variation is controlled, the speedup of wave pipelining is limited to a factor of two to three to ensure that devices from any of these runs will operate. When aggregated with variations due to environmental factors, the maximum speed-up of a wave pipeline is less than two. To counteract the effects of process variation, an adaptive supply voltage technique has been developed. An on-chip detector circuit determines when delays are faster than nominal, and the power supply is lowered accordingly. In this manner, ICs fabricated with fast processes are run at a lower supply voltage to ensure correct operation at the design target frequency. To demonstrate that wave pipeline technology can be applied to VLSI system design, a CMOS wave pipelined vector unit has been developed. Extensive use of wave pipelining was employed to achieve high clock rates in the functional units. The VLSI processor consists of a wave pipelined vector register file, a wave pipelined adder, a wave pipelined multiplier, load and store units, an instruction buffer, a scoreboard, and control logic. The VLSI vector unit contains approximately 47000 transistors and occupies an area of 43 sq mm. It has been fabricated in a 0.8-micron CMOS technology. Tests indicate wave pipelined operation at a maximum rate of 303MHz. An equivalent vector unit using traditional latch-based pipelining was designed and simulated. The latch-based design occupied 2% more die area, operated with a 35% longer clock period, and had multiply latency 8% longer and add latency 11% longer than the wave pipelined vector unit. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/693/CSL-TR-96-693.pdf %R CSL-TR-96-695 %Z Wed, 12 Jun 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Producer-Oriented versus Consumer-Oriented Prefetching: a Comparison and Analysis of Parallel Application Programs %A Ohara, Moriyoshi %D June 1996 %X Due to large remote-memory latencies, reducing the impact of cache misses is critical for large scale shared-memory multiprocessors. This thesis quantitatively compares two classes of software-controlled prefetch schemes for reducing the impact: consumer-oriented and producer-oriented schemes.
Examining the behavior of these schemes leads us to characterize the communication behavior of parallel application programs. Consumer-oriented prefetch has been shown to be effective for hiding large memory latencies. Producer-oriented prefetch (called deliver), on the other hand, has not been extensively studied. Our implementation of deliver uses a hardware mechanism that tracks the set of potential consumers based on past sharing patterns. Qualitatively, deliver has an advantage since the producer sends the datum as soon as, but not before, it is ready for use. In contrast, prefetch may fetch the datum too early so that it is invalidated before use, or may fetch it too late so that the datum is not yet available when it is needed by the consumer. Our simulation results indeed show that the qualitative advantage of deliver can yield a slight performance advantage when the cache size and the memory latency are very large. Overall, however, deliver turns out to be less effective than prefetch for two reasons. First, prefetch benefits from a "filtering effect," and thus generates less traffic than deliver. Second, deliver suffers more from cache interference than prefetch. The sharing and temporal characteristics of a set of parallel applications are shown to account for the different behavior of the two prefetch schemes. This analysis shows the inherent difficulties in predicting future communication behavior of parallel applications from the recent history of that behavior. This suggests that cache accesses involved in coherence are, in general, much less predictable from past behavior than other types of cache behavior. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/695/CSL-TR-96-695.pdf %R CSL-TR-96-698 %Z Tue, 16 Jul 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Technology Scaling Effects on Multipliers %A Al-Twaijry, Hesham %A Flynn, Michael J. %D July 1996 %X Booth encoding is a method of reducing the number of summands required to produce the multiplication result. This paper compares the performance/area tradeoffs for the different Booth algorithms when trees are used as the summation network. This paper shows that the simple non-Booth algorithm is not an efficient design, and that for small feature sizes the performance of the different Booth encoding schemes is comparable in terms of delay. The report also quantifies the effects of wires on the multiplier. As the feature size continues to decrease, wires will contribute an ever-increasing portion of the total delay. Booth 3 becomes more attractive since it is smaller. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/698/CSL-TR-96-698.pdf %R CSL-TR-96-700 %Z Tue, 23 Jul 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fast IEEE Rounding for Division by Functional Iteration %A Oberman, Stuart F. %A Flynn, Michael J. %D July 1996 %X A class of high performance division algorithms is functional iteration. Division by functional iteration uses multiplication as the fundamental operator. The main advantage of division by functional iteration is quadratic convergence to the quotient. However, unlike non-restoring division algorithms such as SRT division, functional iteration does not directly provide a final remainder. This makes fast and exact rounding difficult. This paper clarifies the methodology for correct IEEE-compliant rounding for quadratically-converging division algorithms.
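(Illustrative aside: a textbook C sketch of the quadratically-converging iteration class discussed here, for a divisor normalized to [0.5,1); the linear seed and iteration count are standard assumptions, and the concluding back multiplication is the step whose frequency the paper's techniques reduce.)

    /* Newton-Raphson division: x <- x*(2 - d*x) doubles the number of
       correct reciprocal bits per step (quadratic convergence). */
    double nr_divide(double a, double d)     /* assumes d in [0.5, 1) */
    {
        double x = 48.0/17.0 - (32.0/17.0) * d;  /* seed, rel. error <= 1/17 */
        for (int k = 0; k < 4; k++)
            x = x * (2.0 - d * x);           /* 4 steps reach ~double precision */
        double q = a * x;                    /* quotient, not yet IEEE-rounded */
        double r = a - q * d;                /* back multiplication: sign of r */
        (void)r;                             /* would drive the final rounding */
        return q;
    }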
It proposes an extension to previously reported techniques of using extended precision in the computation to reduce the frequency of back multiplications required to obtain the final remainder. Further, a technique applicable to all IEEE rounding modes is presented which replaces the final subtraction for remainder computation with very simple combinational logic. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/700/CSL-TR-96-700.pdf %R CSL-TR-96-699 %Z Mon, 09 Sep 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Efficient Multiprocessor Communications: Networks, Algorithms, Simulation, and Implementation %A Lu, Yen-Wen %D July 1996 %X As technology and processing power continue to improve, inter-processor communication becomes a performance bottleneck in a multiprocessor network. In this dissertation, an enhanced 2-D torus with a segmented reconfigurable bus (SRB) was proposed and analyzed to overcome the delay due to long-distance communications. A procedure for selecting an optimal segment length and segment alignment, based on minimizing the lifetime of a packet and reducing the interaction between segments, was developed to design an SRB network. Simulation shows that a torus with SRB is more than twice as efficient as a traditional torus. Efficient use of channel bandwidth is an important issue in improving network performance. The communication links between two adjacent nodes can be organized as a pair of opposite uni-directional channels, or combined into a single bi-directional channel. A modified channel arbitration scheme with hidden delay, called ``token-exchange,'' was designed for the bi-directional channel configuration. In spite of the overhead of channel arbitration, simulation shows that bi-directional channels have significantly better latency-throughput performance and can sustain higher data bandwidth relative to uni-directional channels of the same channel width. For example, under 2% hot-spot traffic, bi-directional channels can support 80% more bandwidth without saturation compared with uni-directional channels. An efficient, low-power, wormhole data router chip for 2-D mesh and torus networks with bi-directional channels and token-exchange arbitration was designed and implemented. The token-exchange delay is fully hidden and no latency penalty occurs when there is no traffic contention; the token-exchange delay is also negligible when the contention is high. Distributed decoders and arbiters are provided for each of four I/O ports, and a fully-connected 5x6 crossbar switch increases parallelism of data routing. The router also provides special hardware such as flexible header decoding and switching to support path-based multicasting. From measured results, multicasting with two destinations used only 1/3 of the energy required for unicasting. The wormhole router was fabricated using MOSIS/HP 0.6-micron technology. It delivers 1.6 Gb/s (50 MHz) at Vdd = 2.1 V, consuming an average power of 15 mW. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/699/CSL-TR-96-699.pdf %R CSL-TR-96-701 %Z Thu, 29 Aug 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Characterization of Quality and Traffic for Various Video Encoding Schemes and Various Encoder Control Schemes %A Dalgic, Ismail %A Tobagi, Fouad A. %D August 1996 %X Lossy video compression algorithms, such as those used in the H.261, MPEG, and JPEG standards, result in quality degradation seen in the form of digital tiling, edge busyness, and mosquito noise.
The encoder parameters (typically, the so-called quantizer scale) can be adjusted to trade off encoded video quality and bit rate. Clearly, when more bits are used to represent a given scene, the quality improves. However, for a given set of encoder parameter values, both the generated traffic and the resulting quality depend on the scene content. Therefore, in order to achieve certain quality and traffic objectives at all times, the encoder parameters must be appropriately adjusted according to the scene content. Currently, two schemes exist for setting the encoder parameters. The most commonly used scheme today is called Constant Bit Rate (CBR), where the encoder parameters are controlled to achieve a target bit rate over time by considering a hypothetical rate control buffer at the encoder's output which is drained at the target bit rate; the buffer occupancy level is used as feedback to control the quantizer scale. In a CBR encoded video stream, the quality varies in time, since the quantizer scale is controlled to achieve a constant bit rate regardless of the scene complexity. In the other existing scheme, called Open-Loop Variable Bit Rate (OL-VBR), all encoder parameters are simply kept fixed at all times. The motivation behind this scheme is presumably to provide more consistent video quality than CBR encoding. In this report, we characterize the traffic and quality for the CBR and OL-VBR schemes by using several video sequences of different spatial and temporal characteristics, encoded using the H.261, MPEG, and motion-JPEG standards. We investigate the effect of the controller parameters (i.e., for CBR, target bit rate and rate control buffer size, and for OL-VBR, the fixed quantizer scale) and video content on the resulting traffic and quality. We show that with the CBR and OL-VBR schemes, the encoder control parameters can be chosen so as to achieve or exceed a given quality objective at all times; however, this can only be done by producing more bits than needed during some of the scenes. In order to produce only as many bits as needed to achieve a given quality objective, we propose a video encoder control scheme which maintains the quality of the encoded video at a constant level, referred to as Constant Quality VBR (CQ-VBR). This scheme is based on a quantitative video quality metric which is used in a feedback control mechanism to adjust the encoder parameters. We determine the appropriate feedback functions for the H.261, MPEG, and motion-JPEG standards. We show that this scheme is indeed able to achieve a constant quality at all times; however, the resulting traffic occasionally contains bursts of relatively high magnitude (5-10 times the average) but short duration (5-15 frames). We then introduce a modification to this scheme, where in addition to the quality, the peak rate of the traffic is also controlled. We show that with the modified scheme, it is possible to achieve nearly constant video quality while keeping the peak rate within 2-3 times the average. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/701/CSL-TR-96-701.pdf %R CSL-TR-96-702 %Z Thu, 05 Sep 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance Evaluation of Ethernets and ATM Networks Carrying Video Traffic %A Dalgic, Ismail %A Tobagi, Fouad A. %D August 1996 %X In this report, the performance of Ethernets (10Base-T and 100Base-T) and ATM networks carrying multimedia traffic is presented.
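The CBR feedback loop of CSL-TR-96-701 above can be sketched as follows; the control law and all constants are invented for illustration and stand in for whatever a real encoder uses.

    # Illustrative CBR encoder control loop (constants are made up, and the
    # linear control law is a stand-in for a real encoder's rate control).
    def cbr_control(frame_bits, target_bps, fps=30, buffer_bits=500_000, gain=30.0):
        """Yield a quantizer scale per frame from a virtual buffer's occupancy."""
        drain_per_frame = target_bps / fps
        occupancy = buffer_bits / 2            # start half full
        for bits in frame_bits:
            occupancy += bits - drain_per_frame
            occupancy = max(0.0, min(buffer_bits, occupancy))
            # Fuller buffer -> coarser quantization -> fewer bits next frame.
            yield max(1, min(31, round(1 + gain * occupancy / buffer_bits)))

Feeding it a burst of large frames pushes the quantizer scale upward, which is exactly the mechanism behind the quality dips CBR exhibits on complex scenes.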
End-to-end delay requirements suitable for a wide range of multimedia applications are considered (ranging from 20 ms to 500 ms). Given the nature of the network considered and the maximum latency requirement, some data is lost. Data loss at the receiver causes quality degradations in the displayed video in the form of discontinuities, referred to as glitches. We define various quantities characterizing the glitches, namely, the total amount of information lost in glitches, their duration, and the rate at which glitches occur. We study these quantities for various network and traffic scenarios, using a computer simulation model driven by real video traffic generated by encoding video sequences. We also determine the maximum number of video streams that can be supported for a given maximum delay requirement and glitch rate. We consider and compare the results for various types of video contents (video conferencing, motion pictures, commercials), two encoding schemes (H.261 and MPEG-1), and two encoder control schemes [Constant Bit Rate (CBR) and Constant-Quality Variable Bit Rate (CQ-VBR)], considering also scenarios where the traffic consists of various mixtures of the above. We show that when the video content is highly variable, both 100Base-T Ethernet and ATM can support many more CQ-VBR streams than CBR streams. When the video content is not very variable, as in a videoconferencing sequence, the numbers of CBR and CQ-VBR streams that can be supported are comparable. For low values of end-to-end delay requirement, we show that ATM networks can support up to twice as many video streams of a given type as Ethernets for a channel capacity of 100Mb/s. For relaxed end-to-end delay requirements, both networks can support about the same number of video streams of a given type. We also determine the number of streams supportable for traffic scenarios consisting of mixtures of heterogeneous video traffic sources in terms of the video content, video encoding scheme and encoder control scheme, as well as the end-to-end delay requirement. We then consider multihop ATM network scenarios, and provide admission control guidelines for video when the network topology is an arbitrary mesh. Finally, we consider scenarios with mixtures of video and data traffic (with various degrees of burstiness), and determine the effect of one traffic type on the other. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/702/CSL-TR-96-702.pdf %R CSL-TR-96-705 %Z Mon, 09 Sep 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Rapide: A Language and Toolset for Simulation of Distributed Systems by Partial Orderings of Events. %A Luckham, David C. %D September 1996 %X This paper describes the RAPIDE concepts of system architecture, causal event simulation, and some of the tools for viewing and analysis of causal event simulations. The language and tools are illustrated by a small, detailed example. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/705/CSL-TR-96-705.pdf %R CSL-TR-96-706 %Z Wed, 25 Sep 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimum Placement and Routing of Multiplier Partial Product Trees %A Al-Twaijry, Hesham %A Flynn, Michael J. %D September 1996 %X An algorithm that builds a multiplier under the constraint of a limited number of wiring tracks has been designed and implemented. The resulting program is used to compare several designs of an IEEE floating point multiplier under several delay models.
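As a concrete instance of the Booth recoding evaluated in CSL-TR-96-698 and CSL-TR-96-706 above, here is a textbook radix-4 (Booth 2) recoder, which roughly halves the number of summands; this is standard recoding, not the reports' implementation.

    def booth2_digits(m, bits):
        """Radix-4 Booth recoding of a two's-complement multiplier.

        Returns digits in {-2,-1,0,1,2}, least-significant first. An n-bit
        multiplier yields about n/2 digits, i.e. about n/2 partial products
        instead of n -- the summand reduction the reports evaluate.
        """
        m &= (1 << bits) - 1
        padded = m << 1                     # implicit 0 below the LSB
        digits = []
        for i in range(0, bits, 2):
            window = (padded >> i) & 0b111  # overlapping 3-bit windows
            digits.append({0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
                           0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}[window])
        return digits

    def booth2_value(digits):
        return sum(d * 4**i for i, d in enumerate(digits))

    assert booth2_value(booth2_digits(93, 8)) == 93   # 4 digits, not 8 summands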
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/706/CSL-TR-96-706.pdf %R CSL-TR-96-703 %Z Thu, 10 Oct 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Test Point Insertion for Non-Feedback Bridging Faults %A Touba, Nur A. %A McCluskey, Edward J. %D August 1996 %X This paper studies pseudo-random pattern testing of bridging faults. Although bridging faults are generally more random pattern testable than stuck-at faults, examples are shown to illustrate that some bridging faults can be much less random pattern testable than stuck-at faults. A fast method for identifying these random-pattern-resistant bridging faults is described. It is shown that state-of-the-art test point insertion techniques, which are based on the stuck-at fault model, are inadequate. Data is presented which indicates that even after inserting test points that result in 100% single stuck-at fault coverage, many bridging faults are still not detected. A test point insertion procedure that targets both single stuck-at faults and non-feedback bridging faults is presented. It is shown that by considering both types of faults when selecting the location for test points, higher fault coverage can be obtained with little or no increase in overhead. Thus, the test point insertion procedure described here is a low-cost way to improve the quality of built-in self-test. While this paper considers only non-feedback bridging faults, the techniques that are described can be applied to feedback bridging faults in a straightforward manner. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/703/CSL-TR-96-703.pdf %R CSL-TR-96-704 %Z Thu, 10 Oct 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Synthesis Techniques for Pseudo-Random Built-In Self-Test %A Touba, Nur A. %D August 1996 %X Built-in self-test (BIST) techniques enable an integrated circuit (IC) to test itself. BIST reduces test and maintenance costs for an IC by eliminating the need for expensive test equipment and by allowing fast location of failed ICs in a system. BIST also allows an IC to be tested at its normal operating speed which is very important for detecting timing faults. Despite all of these advantages, BIST has seen limited use in industry because of area and performance overhead and increased design time. This dissertation presents automated techniques for implementing BIST in a way that minimizes area and performance overhead. A low-overhead approach for BIST is to use a linear feedback shift register (LFSR) to apply pseudorandom test patterns to the circuit-under-test. Unfortunately, many circuits contain random-pattern-resistant faults which limit the fault coverage that can be obtained for pseudo-random BIST. Several different approaches for solving this problem are presented. A logic synthesis procedure that performs testability-driven factoring to generate a random pattern testable design is presented. By considering random pattern testability during the factoring process, the overhead can be minimized. For hand-designed circuits or circuits that are not synthesizable, an innovative test point insertion procedure is described for inserting test points to make the circuit random pattern testable. A path tracing procedure is used for test point placement. A few of the existing primary inputs are ANDed together to form signals that drive the control points. These innovations result in fewer test points than previous methods. 
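The pseudo-random pattern source assumed throughout CSL-TR-96-703 and CSL-TR-96-704 is an LFSR; a minimal Fibonacci-style sketch follows, with an illustrative width and a known maximal-length tap set.

    def lfsr_patterns(seed=0b1010_1010, taps=(7, 5, 4, 3), width=8):
        """Generate pseudo-random test patterns from a Fibonacci LFSR.

        Taps (7, 5, 4, 3) correspond to x^8 + x^6 + x^5 + x^4 + 1, a
        maximal-length polynomial cycling through 2^8 - 1 states. This is
        the cheap on-chip pattern source that random-pattern-resistant
        faults defeat, motivating the test points and mapping logic above.
        """
        state = seed
        while True:
            yield state
            fb = 0
            for t in taps:
                fb ^= (state >> t) & 1
            state = ((state << 1) | fb) & ((1 << width) - 1)

    gen = lfsr_patterns()
    patterns = [next(gen) for _ in range(5)]   # first five test vectors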
If it is not possible or not desirable to modify the circuit-under-test, then a procedure is described for synthesizing mapping logic that can be placed at the output of the LFSR to transform the pseudorandom patterns so that they provide the required fault coverage. Much less overhead is required compared with weighted pattern testing methods. Lastly, a technique is described for placing bit-fixing logic at the serial output of an LFSR to embed deterministic test patterns for the random pattern resistant faults in the pseudorandom bit sequence. This method does not require any performance overhead beyond what is needed for scan. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/704/CSL-TR-96-704.pdf %R CSL-TR-96-711 %Z Thu, 12 Dec 96 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design Issues in High Performance Floating Point Arithmetic Units %A Oberman, Stuart Franklin %D December 1996 %X In recent years computer applications have increased in their computational complexity. The industry-wide usage of performance benchmarks, such as SPECmarks, forces processor designers to pay particular attention to implementation of the floating point unit, or FPU. Special purpose applications, such as high performance graphics rendering systems, have placed further demands on processors. High speed floating point hardware is a requirement to meet these increasing demands. This work examines the state-of-the-art in FPU design and proposes techniques for improving the performance and the performance/area ratio of future FPUs. In recent FPUs, emphasis has been placed on designing ever-faster adders and multipliers, with division receiving less attention. The design space of FP dividers is large, comprising five different classes of division algorithms: digit recurrence, functional iteration, very high radix, table look-up, and variable latency. While division is an infrequent operation even in floating point intensive applications, it is shown that ignoring its implementation can result in system performance degradation. A high performance FPU requires a fast and efficient adder, multiplier, and divider. The design question becomes how to best implement the FPU in order to maximize performance given the constraints of silicon die area. The system performance and area impact of functional unit latency is examined for varying instruction issue rates in the context of the SPECfp92 application suite. Performance implications are investigated for shared multiplication hardware, shared square root, on-the-fly rounding and conversion, and fused functional units. Due to the importance of low latency FP addition, a variable latency FP addition algorithm has been developed which improves average addition latency by 33% while maintaining single-cycle throughput. To improve the performance and area of linearly converging division algorithms, an automated process is proposed for minimizing the complexity of SRT tables. To reduce the average latency of quadratically-converging division algorithms, the technique of reciprocal caching is proposed, along with a method to reduce the latency penalty for exact rounding. A combination of the proposed techniques provides a basis for future high performance floating point units. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/711/CSL-TR-96-711.pdf %R CSL-TR-96-697 %Z Thu, 02 Jan 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Design of SMART: A Scheduler for Multimedia Applications %A Nieh, Jason %A Lam, Monica S.
%D June 1996 %X We have created SMART, a Scheduler for Multimedia And Real-Time applications. SMART supports both real-time and conventional computations and provides flexible and accurate control over the sharing of processor time. SMART is able to satisfy real-time constraints in an optimal manner and provide proportional sharing across all real-time and conventional tasks. Furthermore, when not all real-time constraints can be met, SMART satisfies each real-time task's proportional share of deadlines, and adjusts its execution rate dynamically. This technique is especially important for multimedia applications that can operate at different rates depending on the loading condition. This paper presents the design of SMART and provides measured performance results of its effectiveness based on a prototype implementation in the Solaris operating system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/697/CSL-TR-96-697.pdf %R CSL-TR-96-707 %Z Tue, 11 Feb 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Reducing Cache Miss Rates Using Prediction Caches %A Bennett, James E. %A Flynn, Michael J. %D October 1996 %X Processor cycle times are currently much faster than memory cycle times, and the trend has been for this gap to increase over time. The problem of increasing memory latency, relative to processor speed, has been dealt with by adding high speed cache memory. However, it is difficult to make a cache both large and fast, so that cache misses are expected to continue to have a significant performance impact. Prediction caches use a history of recent cache misses to predict future misses, and to reduce the overall cache miss rate. This paper describes several prediction caches, and introduces a new kind of prediction cache, which combines the features of prefetching and victim caching. This new cache is shown to be more effective at reducing miss rate and improving performance than existing prediction caches. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/707/CSL-TR-96-707.pdf %R CSL-TR-96-708 %Z Mon, 31 Mar 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Validation Tools for Complex Digital Designs %A Ho, Chian-Min Richard %D December 1996 %X The functional validation of a complex digital design is a laborious, ad-hoc and open-ended task. Many circuits are too complex to be formally verified in their entirety. Instead, simulation of a register transfer level (RTL) model is used. This research explores techniques to make the validation task more systematic, automated and efficient. This can be accomplished by using information embedded in the RTL model to extract the set of "interesting behaviors" of the design, represented as interacting finite state machines (FSM). If all such interesting behaviors of the RTL could be tested in simulation, the degree of confidence that the design is correct would be substantially higher. This work provides two tools towards this goal. First, a test vector generator is described that uses this information to produce a series of test vectors that exercise all the implemented behaviors of the design in RTL simulation. Secondly, the information can be used as the basis for coverage analysis of a pre-existing test vector suite. 
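The prediction-cache idea of CSL-TR-96-707 above can be sketched as a miss-history table; the table organization and size here are invented for illustration.

    from collections import OrderedDict

    class MissPredictor:
        """Toy prediction cache: maps a miss address to the next miss seen.

        On a miss, the predicted successor (if any) is returned as a
        prefetch candidate; the table is a small LRU standing in for the
        bounded miss history that hardware would keep.
        """
        def __init__(self, entries=64):
            self.table = OrderedDict()
            self.entries = entries
            self.last_miss = None

        def on_miss(self, addr):
            if self.last_miss is not None:
                self.table[self.last_miss] = addr      # record the transition
                self.table.move_to_end(self.last_miss)
                if len(self.table) > self.entries:
                    self.table.popitem(last=False)     # evict the LRU entry
            self.last_miss = addr
            return self.table.get(addr)                # prefetch hint or None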
Previous coverage metrics, such as toggles on a node in the circuit or code block execution counts, often give good first order indications of how thoroughly a circuit has been exercised but do not usually give an accurate picture of whether multiple or concurrent events have been exercised. In this thesis, a new method is proposed for analyzing test vector suite coverage based on projecting a minimized control state graph onto control signals that enter the datapath part of the design. The fundamental problem facing any technique that uses state exploration is state space explosion. Two techniques are proposed to minimize this problem: first, a dynamic state graph pruning algorithm based on static analysis of the model structure to provide an exact minimization; and second, approximation of the state graph with an estimation of the state space in a more compact representation. These techniques help delay the onset of state explosion, allowing useful information to be obtained and utilized, even for complex designs. Results and practical experiences of applying these techniques to the design of the node controller (MAGIC) of the Stanford FLASH Multiprocessor project are given. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/708/CSL-TR-96-708.pdf %R CSL-TR-96-710 %Z Mon, 14 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Executable Formal Models of Distributed Transaction Systems Based on Event Processing %A Kenney, John %D November 1996 %X This dissertation presents formal models of distributed transaction processing (DTP) that are executable and testable. These models apply a new technology, Rapide, an object-oriented executable architecture description language designed for specifying and prototyping distributed, time-sensitive systems. This dissertation shows how the Rapide technology can be applied to specify, prototype, and test DTP models. In particular, this dissertation specifies a reference architecture for the X/Open DTP industry standard. The reference architecture, written in Rapide, defines architectures and behaviors of systems that comply with the X/Open standard. This dissertation also applies a technique developed previously by Gennart and Luckham for testing applications for conformance with reference architectures. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/96/710/CSL-TR-96-710.pdf %R CSL-TR-97-717 %Z Mon, 07 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Synthesis of Sequential Circuits for Low Power Dissipation %A Benini, Luca %D February 1997 %X In high-performance digital CMOS systems, excessive power dissipation reduces reliability and increases the cost imposed by cooling systems and packaging. Power is obviously the primary concern for portable applications, since battery technology cannot keep up with the fast pace imposed by Moore's Law, and there is a large demand for devices with light batteries and long times between recharges. Computer-Aided Engineering is probably the only viable paradigm for designing state-of-the-art VLSI and ULSI systems, because it allows the designer to focus on the high-level trade-offs and to concentrate the human effort on the most critical parts of the design. We present a framework for the computer-aided design of low-power digital circuits. We propose several techniques for automatic power reduction based on paradigms which are widely used by designers.
Our main purpose is to provide the foundation for a new generation of CAD tools for power optimization under performance constraints. In the last decade, the automatic synthesis and optimization of digital circuits for minimum area and maximum performance has been extensively investigated. We leverage the knowledge base created by such research, but we acknowledge the distinctive characteristics of power as an optimization target. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/717/CSL-TR-97-717.pdf %R CSL-TR-97-713 %Z Thu, 17 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T From the Valley of Heart's Delight to Silicon Valley: A Study of Stanford University's Role in the Transformation %A Tajnai, Carolyn %D January 1997 %X This study examines the role of Stanford University in the transformation from the Valley of Heart's Delight to the Silicon Valley. At the dawn of the Twentieth Century, California's Santa Clara County was an agricultural paradise. Because of the benign climate and thousands of acres of fruit orchards, the area became known as the Valley of Heart's Delight. In the early 1890s, Leland and Jane Stanford donated land in the valley to build a university in memory of their son. Thus, Leland Stanford, Jr., University was founded. In the early 1930s, there were almost no jobs for young Stanford engineering graduates. This was about to change. Although there was no organized plan to help develop the economic base of the area around Stanford University, the concern about the lack of job opportunities for their graduates motivated Stanford faculty to begin the chain of events that led to the birth of Silicon Valley. Stanford University's role in the transformation of the Valley of Heart's Delight into Silicon Valley is history, but it is enduring history. Stanford continues to affect the local economy by spawning new and creative ideas, dreams, and ambitions. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/713/CSL-TR-97-713.pdf %R CSL-TR-97-714 %Z Thu, 17 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Parallelizing Compiler Techniques Based on Linear Inequalities %A Amarasinghe, Saman Prabhath %D January 1997 %X Shared-memory multiprocessors, built out of the latest microprocessors, are becoming a widely available class of computationally powerful machines. These affordable multiprocessors can potentially deliver supercomputer-like performance to the general public. To effectively harness the power of these machines, it is important to find all the available parallelism in programs. The Stanford SUIF interprocedural parallelizer we have developed is capable of detecting coarser granularity of parallelism in sequential scientific applications than previously possible. Specifically, it can parallelize loops that span numerous procedures and hundreds of lines of code, frequently requiring modifications to array data structures such as array privatization. Measurements from several standard benchmark suites demonstrate that aggressive interprocedural analyses can substantially advance the capability of automatic parallelization technology. However, locating parallelism is not sufficient for achieving high performance. It is critical to make effective use of the memory hierarchy. In parallel applications, false sharing and cache conflicts between processors can significantly reduce performance.
We have developed the first compiler that automatically performs a full suite of data transformations (a combination of transposing, strip-mining, and padding). The performance of many benchmarks improves drastically after the data transformations. We introduce a framework based on systems of linear inequalities for developing compiler algorithms. Many of the whole program analyses and aggressive optimizations in our compiler employ this framework. Using this framework, general solutions to many compiler problems can be found systematically. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/714/CSL-TR-97-714.pdf %R CSL-TR-97-719 %Z Tue, 15 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Automatic Computation and Data Decomposition for Multiprocessors %A Anderson, Jennifer-Ann Monique %D March 1997 %X Memory subsystem efficiency is critical to achieving high performance on parallel machines. The memory subsystem organization of modern multiprocessor architectures makes their performance highly sensitive to both the distribution of the computation and the layout of the data. A key issue in programming these machines is selecting the computation and data decomposition, the mapping of the computation and data, respectively, across the processors of the machine. A popular approach to the decomposition problem is to require programmers to perform the decomposition analysis themselves, and to communicate that information to the compiler using language extensions. This thesis presents a new compiler algorithm that automatically calculates computation and data decompositions for dense-matrix scientific codes. The core of the algorithm is based on a linear algebra framework for expressing and calculating decompositions. Since the best decompositions may change as different phases of the program are executed, the algorithm also considers re-organizing the data dynamically. The analysis is performed both within and across procedure boundaries so that entire programs can be analyzed. We evaluated the effectiveness of the algorithm by applying it to a suite of benchmark programs. We found that our decomposition analysis and optimization can lead to significant increases in program performance. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/719/CSL-TR-97-719.pdf %R CSL-TR-97-720 %Z Thu, 17 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Simulation Study of IP Switching %A Lin, Steven %A McKeown, Nick %D April 1997 %X Recently there has been much interest in combining the speed of layer-2 switching with the features of layer-3 routing. This has been prompted by numerous proposals, including: IP Switching, Tag Switching, ARIS, CSR, and IP over ATM. In this paper, we study IP Switching and evaluate the performance claims made by Newman et al. In particular, using nine network traces, we study how well IP Switching performs with traffic found in campus, corporate, and Internet Service Provider (ISP) environments. Our main finding is that IP Switching will lead to a high proportion of datagrams that are switched: over 75% in all of the environments we studied. We also investigate the effects that different flow classifiers and various timer values have on performance, and note that some choices can result in a large VC space requirement. Finally, we present recommendations for the flow classifier and timer values, as a function of the VC space of the switch and the network environment being served.
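The flow classifier and timer studied in CSL-TR-97-720 above can be sketched as follows; the packet-count trigger and idle timeout are tunable parameters of the kind the paper evaluates, with the values here chosen arbitrarily.

    import time

    class FlowClassifier:
        """Toy IP-switching classifier: promote a flow to a layer-2 VC once
        it has shown enough packets, and reclaim the VC after idleness."""

        def __init__(self, trigger_pkts=10, idle_timeout_s=60.0):
            self.trigger = trigger_pkts
            self.timeout = idle_timeout_s
            self.flows = {}                      # flow id -> (pkt count, last seen)

        def packet(self, flow_id, now=None):
            now = time.monotonic() if now is None else now
            count, _ = self.flows.get(flow_id, (0, now))
            self.flows[flow_id] = (count + 1, now)
            return count + 1 >= self.trigger     # True: switch it; False: route it

        def expire(self, now):
            """Reclaim VC space: drop flows idle longer than the timeout."""
            self.flows = {f: (c, t) for f, (c, t) in self.flows.items()
                          if now - t <= self.timeout}

A longer timeout switches more datagrams but holds VCs longer, which is the VC-space trade-off the report quantifies.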
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/720/CSL-TR-97-720.pdf %R CSL-TR-97-723 %Z Tue, 22 Apr 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hierarchical Storage Systems for Interactive Video-On-Demand %A Chan, Shueng-Han Gary %A Tobagi, Fouad A. %D April 1997 %X On-demand video servers based on hierarchical storage systems are able to offer high-capacity and low-cost video storage. In such a system, video files are stored in the tertiary level and transferred to the secondary level to be displayed. Designing such servers to allow user interaction with the played-back video is of great interest. We have conducted a comprehensive study on the architecture and operation of such a VOD server. Our objective is to understand its performance characteristics, so as to design a video server to meet specific application requirements. Applications of interest include distance-learning, movie-on-demand, interactive news, home-shopping, etc. The design of such a server actually involves many design choices pertaining to both architecture and operational procedures. We first study through simulation a baseline system which captures the essential performance characteristics of a hierarchical storage system. Then we extend our study beyond the baseline, covering numerous other system variations in terms of architectural parameters and operational procedures. We have also examined the effect of various application characteristics, such as file size and video popularity, on system performance. We demonstrate the usefulness of our results by applying them to the design of a video server taking into account current storage technologies. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/723/CSL-TR-97-723.pdf %R CSL-TR-97-718 %Z Wed, 17 Feb 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fault Tolerance: Methods of Rollback Recovery %A Sunada, Dwight %A Glasco, David %A Flynn, Michael %D March 1997 %X This paper describes the latest methods of rollback recovery for fault-tolerant distributed shared memory (DSM) multiprocessors. This report discusses (1) the theoretical issues that rollback recovery addresses, (2) the three major classes of methods for recovery, and (3) the relative merits of each class. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/718/CSL-TR-97-718.pdf %R CSL-TR-97-724 %Z Thu, 19 Jun 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T State Reduction Methods for Automatic Formal Verification %A Ip, C. Norris %D December 1996 %X Validation of industrial designs is becoming more challenging as technology advances. One of the most suitable debugging aids is automatic formal verification. This thesis presents several techniques for reducing the state explosion problem, that is, reducing the number of states that are examined. A major contribution of this thesis is the design of simple extensions to the Murphi description language, which enable us to convert two existing abstraction strategies into two fully automatic algorithms, making these strategies easy to use and safe to apply. These two algorithms rely on two facts about high-level designs: they frequently exhibit structural symmetry, and their behavior is often independent of the exact number of replicated components they contain. Another contribution is the design of a new state reduction algorithm, which relies on reversible rules (transitions that do not lose information) in a system description.
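A minimal sketch of the symmetry reduction that the Murphi extensions of CSL-TR-97-724 automate: canonicalize each state before storing it, so that all permutations of interchangeable replicated components collapse to one representative. The state encoding below is invented for illustration.

    def canonical(state):
        """Map a state to its symmetry-class representative by sorting the
        local states of replicated, interchangeable components."""
        return tuple(sorted(state))

    def reachable(initial, successors):
        """Explicit-state search storing only canonical representatives,
        which can shrink the stored graph by up to n! for n replicas."""
        seen = {canonical(initial)}
        frontier = [initial]
        while frontier:
            s = frontier.pop()
            for t in successors(s):
                c = canonical(t)
                if c not in seen:
                    seen.add(c)
                    frontier.append(t)
        return seen

    def succ(state):
        """One cache line moves from 'I' to 'S'; lines are interchangeable."""
        return [state[:i] + ('S',) + state[i + 1:]
                for i, v in enumerate(state) if v == 'I']

    print(len(reachable(('I', 'I'), succ)))  # 3 canonical states instead of 4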
This new reduction algorithm can be used simultaneously with the other two algorithms. These techniques, implemented in the Murphi verification system, have been applied to many applications, such as cache coherence protocols and distributed algorithms. In the cases of two important classes of infinite systems, infinite state graphs can be automatically converted to small finite state graphs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/724/CSL-TR-97-724.pdf %R CSL-TR-97-712 %Z Wed, 09 Jul 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hive: Operating System Fault Containment for Shared-Memory Multiprocessors %A Chapin, John %D July 1997 %X Reliability and scalability are major concerns when designing general-purpose operating systems for large-scale shared-memory multiprocessors. This dissertation describes Hive, an operating system with a novel kernel architecture that addresses these issues. Hive is structured as an internal distributed system of independent kernels called cells. This architecture improves reliability because a hardware or software error damages only one cell rather than the whole system. The architecture improves scalability because few kernel resources are shared by processes running on different cells. The Hive prototype is a complete implementation of UNIX SVR4 and is targeted to run on the Stanford FLASH multiprocessor. The research described in the dissertation makes three primary contributions: (1) it demonstrates that distributed system mechanisms can be used to provide fault containment inside a shared-memory multiprocessor; (2) it provides a specification for a set of hardware features, implemented in the Stanford FLASH, that are sufficient to support fault containment; and (3) it demonstrates how to take advantage of shared-memory hardware across cell boundaries at both application and kernel levels while preserving fault containment. The dissertation also analyzes the architectural and performance tradeoffs of multicellular kernels. Fault injection experiments conducted using the SimOS machine simulator demonstrate the reliability of the Hive prototype. Studies using both general-purpose and scientific workloads illustrate the performance tradeoffs of the multicellular kernel architecture. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/712/CSL-TR-97-712.pdf %R CSL-TR-97-730 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance Isolation and Resource Sharing on Shared-Memory Multiprocessors %A Verghese, Ben %A Gupta, Anoop %A Rosenblum, Mendel %D July 1997 %X Shared-memory multiprocessors are attractive as general-purpose compute servers. On the software side, they present programmers with the same programming paradigm as uniprocessors, and they can run unmodified uniprocessor binaries. On the hardware side, the tight coupling of multiple processors, memory, and I/O enables efficient fine-grain sharing of resources on these systems. This fine-grain sharing is important in compute servers because it allows idle resources to be easily utilized by active jobs, leading to better system throughput. However, current SMP operating systems do not provide an important feature that users of workstations enjoy, namely freedom from interference by the jobs of unrelated users. We show that this lack of isolation is caused by the resource allocation model carried over from single-user workstations, which is inappropriate for multi-user multiprocessor systems.
We propose "performance isolation", a new resource allocation model for multi-user multiprocessor compute servers. This model allows the isolation of the performance of groups of processes from the load on the rest of the system, provides performance comparable to a smaller system that corresponds to the resources used, and allows the sharing of idle resources for throughput comparable to a SMP OS. We implement the performance isolation model in the IRIX5.3 operating system for three important system resources: CPU time, memory, and disk bandwidth. Our implementation of fairness for disk bandwidth is novel. Running a number of workloads we show that this model is very successful at providing workstation-like latencies under heavy load and SMP-like latencies under light load. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/730/CSL-TR-97-730.pdf %R CSL-TR-97-729 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Remote Memory Access in Workstation Clusters %A Verghese, Ben %A Rosenblum, Mendel %D July 1997 %X Efficient sharing of memory resources in a cluster of workstations has the promise of greatly improving the performance and cost-effectiveness of the cluster when running large memory- intensive jobs. A point of interest is the hardware support required for good memory sharing performance. We evaluate the performance of two models: the software-only model that runs on a traditional distributed system configuration, and requires support from the operating system to access remote memory; and the hardware-intensive model that uses a specialized network interface to extend the memory system to allow direct access to remote memory. Using SimOS, we do a fair comparison of the performance of the two memory-sharing models for a set of interesting compute-server workloads. We find that the software-only model, with current remote page-fault latencies, does not provide acceptable memory-sharing performance. The hardware shared-memory system is able to provide stable performance across a range of latencies. If the remote page-fault latency can be reduced to 100 microseconds, the performance of the software- only model becomes acceptable for many, though not all, workloads. Considering the interconnection bandwidth required to sustain the software-only page-level memory sharing, our experiments show that a gigabit network is necessary for good performance. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/729/CSL-TR-97-729.pdf %R CSL-TR-97-725 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Designing reliable programs with RAPIDE %A Madhav, Neel %A Luckham, David C. %D April 1997 %X Rapide is a language for prototyping large, distributed systems. Rapide allows the scale design of a system to be constructed and analyzed before resources are applied to the construction of the actual system. Two important facets of designing reliable systems are (1) system architecture -- the components in the system and the communication paths between the componnts, and (2) system behavior -- the requirements on the components and the communication. Rapide facilitates the design of system architecture and behavior by (1) providing language features to realize system designs, (2) providing an expressive model for capturing the execution behavior of systems, and (3) providing techniques and tools for analyzing system execution behavior. 
This paper introduces the essential concepts of Rapide and gives an example of system design using Rapide. Rapide has four sublanguages -- (1) a type language, (2) an architecture definition language, (3) a constraint language, and (4) an executable language. The paper introduces the Rapide architecture sublanguage and the Rapide constraint sublanguage. The Rapide model of system execution is a set of significant events partially ordered by causality (also called posets). This paper discusses Rapide execution models and compares them with totally ordered, event-based models. Rapide provides tools to check constraints on posets, to browse posets, and to animate events on a system architecture. This paper briefly discusses the Rapide analysis tools. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/725/CSL-TR-97-725.pdf %R CSL-TR-97-715 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor %A Oplinger, Jeffrey %A Heine, David %A Liao, Shih-Wei %A Nayfeh, Basem A. %A Lam, Monica S. %A Olukotun, Kunle %D February 1997 %X Thread-level speculation (TLS) makes it possible to parallelize general purpose C programs. This paper proposes software and hardware mechanisms that support speculative thread-level execution on a single-chip multiprocessor. A detailed analysis of programs using the TLS execution model shows a promising bound on the performance of a TLS machine. In particular, TLS makes it feasible to find speculative doacross parallelism in outer loops that can greatly improve the performance of general-purpose applications. Exploiting speculative thread-level parallelism on a multiprocessor requires the compiler to determine where to speculate, and to generate SPMD (single program multiple data) code. We have developed a fully automatic compiler system that uses profile information to determine the best loops to execute speculatively, and to generate the synchronization code that improves the performance of speculative execution. The hardware mechanisms required to support speculation are simple extensions to the cache hierarchy of a single chip multiprocessor. We show that with our proposed mechanisms, thread-level speculation provides significant performance benefits. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/715/CSL-TR-97-715.pdf %R CSL-TR-97-728 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Defining a Security Reference Architecture %A Meldal, Sigurd %A Luckham, David %D June 1997 %X This report discusses the definition and modeling of reference architectures that specify the security aspects of distributed systems. NSA's MISSI (Multilevel Information System Security Initiative) security reference architecture is used as an illustrative example. We show how one would define such a reference architecture, and how one could use such a definition to model as well as check implementations for compliance with the reference. We demonstrate that an ADL should have the capability not only to specify interfaces, connections, and operational constraints, but also to specify how an architecture is related to other architectures or to implementations. A reference architecture such as MISSI is defined in Rapide [10] as a set of hierarchical interface connection architectures [9].
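The causal partial orders (posets) underlying Rapide, as described in CSL-TR-97-725 above, can be made concrete with vector timestamps, a standard stand-in rather than Rapide's own representation: one event causally precedes another iff its clock is componentwise no greater.

    def leq(u, v):
        """Componentwise comparison of vector timestamps."""
        return all(a <= b for a, b in zip(u, v))

    def causally_ordered(e1, e2):
        """In a poset model, e1 -> e2 iff e1's clock <= e2's and they differ;
        otherwise the events are independent (concurrent), a distinction a
        totally ordered trace cannot express."""
        return leq(e1, e2) and e1 != e2

    send  = (1, 0)   # event on process 0
    recv  = (1, 1)   # event on process 1 after receiving from process 0
    other = (0, 1)   # independent event on process 1

    assert causally_ordered(send, recv)        # send happens-before receive
    assert not causally_ordered(send, other)   # concurrent: neither precedes
    assert not causally_ordered(other, send)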
Each Rapide interface connection architecture is a reference architecture - an abstract architecture that allows a number of different implementations, but which enforces common structure and communication rules. The hierarchical reference architecture defines the MISSI policies at different levels - at the level of enclaves communicating through a network, at the level of each enclave being a local area network with firewalls and workstations, and at the level of the individual workstations. The reference architecture defines standard components, communication patterns, and policies common to MISSI compliant networks of computer systems. A network of computers may be checked for conformance against the reference architecture. The report also shows how one can generate architecture scenarios of networks of communicating computers. The scenarios are constructed as Rapide executable models, and the behaviors of the models can be checked for conformance with the reference architecture in these scenarios. The executable models demonstrate how the structure and security policies in the reference architecture may apply to networks of computers. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/728/CSL-TR-97-728.pdf %R CSL-TR-97-735 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Flexible Connectivity Management for Mobile Hosts %A Zhao, Xinhua %A Baker, Mary G. %D September 1997 %X Powerful lightweight portable computers, the availability of wireless networks, and the popularity of the Internet are driving the need for better networking support for mobile hosts. Users should be able to connect their portable computers to the Internet at any time and in any place, but the dynamic nature of such connectivity requires more flexible network management than has typically been available for stationary workstations. This report proposes techniques to address a unique feature of connectivity management on mobile hosts: its multiplicity, i.e., the need to support multiple packet delivery methods simultaneously and to support the use of multiple network devices for both availability and efficiency reasons. We have developed a set of techniques in the context of mobile IP for flexible, automatic network connectivity management for mobile hosts. We augment the routing layer of the network protocol stack with a Mobile Policy Table (MPT) to support multiple packet delivery mechanisms for different simultaneous flows based on the nature of the traffic. We also devise a set of mechanisms, including a backwards-compatible extension to the routing table, to facilitate the use of multiple network devices. We include performance results showing some of the potential benefits such increased flexibility provides for mobile hosts. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/735/CSL-TR-97-735.pdf %R CSL-TR-97-731 %Z Tue, 30 Sep 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Single Chip Multiprocessor Integrated with High Density DRAM %A Yamauchi, Tadaaki %A Hammond, Lance %A Olukotun, Kunle %D August 1997 %X A microprocessor integrated with DRAM on the same die has the potential to improve system performance by reducing memory latency and improving memory bandwidth. In this paper we evaluate the performance of a single chip multiprocessor integrated with DRAM when the DRAM is organized as on-chip main memory and as on-chip cache.
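The Mobile Policy Table (MPT) of CSL-TR-97-735 above can be sketched as an ordered per-flow policy lookup consulted before the normal routing table; the match keys and delivery-method names below are illustrative assumptions, not the report's actual table format.

    # Illustrative Mobile Policy Table (MPT): pick a packet-delivery method
    # per traffic class before the ordinary routing-table lookup runs.
    MPT = [
        # (predicate on flow, delivery method)
        (lambda f: f["proto"] == "tcp" and f["dport"] == 22, "reverse-tunnel"),
        (lambda f: f["interactive"],                          "route-optimized"),
        (lambda f: True,                                      "via-home-agent"),
    ]

    def delivery_method(flow):
        """First matching policy wins, mirroring ordered policy routing."""
        for match, method in MPT:
            if match(flow):
                return method

    print(delivery_method({"proto": "tcp", "dport": 22, "interactive": True}))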
We compare the performance of this architecture with that of a more conventional chip which only has SRAM-based on-chip cache. The DRAM-based architecture with four processors outperforms the SRAM-based architecture on floating point applications which are effectively parallelized and have large working sets. This performance difference is significantly better than that possible in a uniprocessor DRAM-based architecture, which performs only slightly faster than an SRAM-based architecture on the same applications. In addition, on multiprogrammed workloads, in which independent processes are assigned to every processor in a single chip multiprocessor, the large bandwidth of on-chip DRAM can handle the inter-access contention better. These results demonstrate that a multiprocessor takes better advantage of the large bandwidth provided by the on-chip DRAM than a uniprocessor. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/731/CSL-TR-97-731.pdf %R CSL-TR-97-726 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Low-Power Processor Design %A Gonzalez, Ricardo E. %D June 1997 %X Power has become an important aspect in the design of general purpose processors. This thesis explores how design tradeoffs affect the power and performance of the processor. Scaling the technology is an attractive way to improve the energy efficiency of the processor. In a scaled technology, a processor would dissipate less power for the same performance, or deliver higher performance for the same power. Some micro-architectural changes, such as pipelining and caching, can significantly improve efficiency. Unfortunately, many other architectural tradeoffs leave efficiency unchanged. This is because a large fraction of the energy is dissipated in essential functions and is unaffected by the internal organization of the processor. Another attractive technique for reducing power dissipation is scaling the supply and threshold voltages. Unfortunately, this makes the processor more sensitive to variations in process and operating conditions. Design margins must increase to guarantee operation, which reduces the efficiency of the processor. One way to shrink these design margins is to use feedback control to regulate the supply and threshold voltages. Adaptive techniques can also be used to dynamically trade excess performance for lower power. This results in lower average power and therefore longer battery life. Improvements are limited, however, by the energy dissipation of the rest of the system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/726/CSL-TR-97-726.pdf %R CSL-TR-97-737 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Stochastic Congestion Model for VLSI Systems %A Hung, Patrick %A Flynn, Michael J. %D October 1997 %X Designing with deep submicron feature size presents new challenges in complexity, performance, and productivity. Information on routing congestion and interconnect area is critical in the pre-RTL stage in order to forecast the whole die size, define the timing specifications, and evaluate the chip power consumption. In this report, we propose a stochastic model for VLSI interconnect routing, which can be used to estimate the routing congestion and the interconnect area in the pre-RTL stage. First, we define the uniform and geometric routing distributions, and introduce a simple and efficient algorithm to calculate the routing probabilities.
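One concrete reading of the uniform routing distribution of CSL-TR-97-737 (an assumption for illustration, not necessarily the report's exact definition): all monotone shortest grid routes between two pins are equally likely, so the probability that a route crosses a given cell is a ratio of lattice-path counts.

    from math import comb

    def crossing_prob(w, h, i, j):
        """P(a uniformly chosen monotone route from (0,0) to (w,h) passes (i,j)).

        Monotone routes are lattice paths; routes through (i,j) factor into
        paths reaching it times paths continuing on, giving a ratio of
        binomial coefficients.
        """
        if not (0 <= i <= w and 0 <= j <= h):
            return 0.0
        return comb(i + j, i) * comb(w - i + h - j, w - i) / comb(w + h, w)

    # Demand concentrates mid-region: the center of a 6x6 routing region is
    # crossed far more often than a cell near the edge.
    print(crossing_prob(6, 6, 3, 3))  # ~0.433
    print(crossing_prob(6, 6, 1, 5))  # ~0.039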
We then derive the routing probabilities among multiple functional blocks, and investigate the effects of routing obstacles. Finally, we map the chip to a Cartesian coordinate system, and model routability based on the supply and demand distributions of routing channels. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/737/CSL-TR-97-737.pdf %R CSL-TR-97-727 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Towards an Abstraction Hierarchy for CAETI Architectures, and Possible Applications %A Luckham, David %A Vera, James %A Belz, Frank %D April 1997 %X This document proposes a four level abstraction hierarchy for CAETI systems architectures for review and discussion by the CAETI community. Some possible applications are described briefly. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/727/CSL-TR-97-727.pdf %R CSL-TR-97-732 %Z Thu, 20 Nov 97 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Efficient Exception Handling Techniques for High-Performance Processor Architectures %A Rudd, Kevin W. %D October 1997 %X Providing precise exceptions has driven much of the complexity in modern processor designs. While this complexity is required to maintain the illusion of a processor based on a sequential architectural model, it also results in reduced performance during normal execution. The existing notion of precise exceptions is limited to processors based on a sequential architectural model and there have been few techniques developed that are applicable to processors that are not based on this model. Processors with exposed pipelines (typical of VLIW processors) do not conform to the sequential execution model. These processors have explicit overlaps in operation execution and thus cannot support the traditional notion of precise exceptions; most exception handling techniques for these processors require restrictive software scheduling. In this report, we generalize the notion of a precise exception and extend the applicability of precise exceptions to a wider range of architectures. We propose precise exception handling techniques that solve the problem of efficient exception handling for both sequential architectures as well as exposed pipeline architectures. We also show how these techniques can provide efficient support for speculative execution past multiple branches for both architectures as well as latency tolerance for exposed pipeline architectures. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/732/CSL-TR-97-732.pdf %R CSL-TR-97-738 %Z Mon, 12 Jan 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T On the Speedup Required for Combined Input and Output Queued Switching %A Prabhakar, Balaji %A McKeown, Nick %D November 1997 %X Architectures based on a non-blocking fabric, such as a crosspoint switch, are attractive for use in high-speed LAN switches, ATM switches and IP routers. These fabrics, coupled with memory bandwidth limitations, dictate that queues be placed at the input of the switch. But it is well known that input-queueing can lead to low throughput, and does not allow the control of latency through the switch. This is in contrast to output-queueing, which maximizes throughput, and permits the accurate control of packet latency through scheduling. We ask the question: Can a switch with combined input and output queueing be designed to behave identically to an output-queued switch? 
In this paper, we prove that if the switch uses virtual output queueing, and has an internal speedup of just four, it is possible for it to behave identically to an output-queued switch, regardless of the nature of the arriving traffic. Our proof is based on a novel scheduling algorithm, known as Most Urgent Cell First. This result makes possible switches that perform as if they were output-queued, yet use memories that run more slowly. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/738/CSL-TR-97-738.pdf %R CSL-TR-97-734 %Z Mon, 12 Jan 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Packet Switching Photonic Network Switch Design and Routing Algorithm %A Lee, Hyuk-Jun %A Morf, Martin %A Flynn, Michael %D December 1997 %X The maturity of photonic technology makes it possible to construct an all-optical network switch that avoids optical-to-electrical signal conversion for routing. To realize all-optical packet switching, current network topologies and routing algorithms have to be reexamined and modified to satisfy the requirements of all-optical network switching, such as fast routing decisions, feasibility of hardware implementation, and buffering. In this paper, we first review various switching architectures, including crossbar, Benes, and Batcher/Banyan. Second, an optical implementation of a multiple-output-port network switch is presented. At many levels of networking, from multiprocessor interconnection to wide-area networking, the multiple latencies resulting from this scheme could improve overall performance when combined with smart routing schemes. Finally, we present an interpretation of multistage networks using a symmetric group. A Cayley graph for a symmetric group and its coset graphs suggest an interesting alternative way to construct a new multistage interconnection network. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/734/CSL-TR-97-734.pdf %R CSL-TR-97-748 %Z Wed, 21 Jan 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Decision Diagrams and Pass Transistor Logic Synthesis %A Bertacco, V. %A Minato, S. %A Verplaetse, P. %A Benini, L. %A Micheli, and G. De %D December 1997 %X Since the relative importance of interconnections increases as feature size decreases, standard-cell-based synthesis becomes less effective when deep-submicron technologies become available. Intra-cell connectivity can be decreased by the use of macro-cells. In this work, we present methods for the automatic generation of macro-cells using pass transistors and domino logic. The synthesis of these cells is based on BDD and ZBDD representations of the logic functions. We address specific problems associated with the BDD approach (level degradation, long paths) and the ZBDD approach (sneak paths, charge sharing, long paths). We compare the performance of the macro-cell approach with that of the conventional standard-cell approach, based on accurate electrical simulation. This shows that the macro-cells perform well up to a certain complexity of the logic function. Functions of high complexity must be decomposed into smaller logic blocks that can be directly mapped to macro-cells.
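The BDD-to-pass-transistor mapping of CSL-TR-97-748 rests on Shannon expansion: each BDD node is a two-way select between cofactors, which is exactly what a complementary pass-transistor pair implements. A minimal cofactor sketch follows, without the sharing, reduction, and variable-ordering machinery a real BDD package needs.

    def shannon(f, var):
        """Return (f with var=0, f with var=1): the two cofactors a BDD node
        selects between -- in pass-transistor logic, the two branches a
        complementary transistor pair steers to the output."""
        return (lambda env: f({**env, var: 0}),
                lambda env: f({**env, var: 1}))

    def build_tree(f, order):
        """Unreduced decision tree over 'order'; a real package would hash
        and share isomorphic subgraphs to obtain a canonical, reduced BDD."""
        if not order:
            return f({})                     # terminal: 0 or 1
        var, rest = order[0], order[1:]
        lo, hi = shannon(f, var)
        return (var, build_tree(lo, rest), build_tree(hi, rest))

    maj = lambda e: int(e["a"] + e["b"] + e["c"] >= 2)   # 3-input majority
    print(build_tree(maj, ["a", "b", "c"]))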
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/748/CSL-TR-97-748.pdf %R CSL-TR-97-739 %Z Wed, 18 Feb 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hardware/Software Co-Design of Run-Time Schedulers for Real-Time Systems %A Mooney, Vincent John III %A Micheli, Giovanni De %D November 1997 %X We present the SERRA Run-Time Scheduler Synthesis and Analysis Tool, which automatically generates a run-time scheduler from a heterogeneous system-level specification in both Verilog HDL and C. Part of the run-time scheduler is implemented in hardware, which allows the scheduler to be predictable in meeting hard real-time constraints, while part is implemented in software, thus supporting features typical of software schedulers. SERRA's real-time analysis generates a priority assignment for the software tasks in the mixed hardware-software system. The tasks in hardware and software have precedence constraints, resource constraints, relative timing constraints, and a rate constraint. A heuristic scheduling algorithm assigns the static priorities such that a hard real-time rate constraint can be predictably met. SERRA supports the specification of critical regions in software, thus providing the same functionality as semaphores. We describe the task control/data-flow extraction, the synthesis of the control portion of the run-time scheduler in hardware, the real-time analysis, and the priority scheduler template. We also show how our approach fits into an overall tool flow and target architecture. Finally, we conclude with a sample application of the novel run-time scheduler synthesis and analysis tool to a robotics design example. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/739/CSL-TR-97-739.pdf %R CSL-TR-97-745 %Z Tue, 03 Aug 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Selection of Recent Advances in Computer Systems %A Mencer, Oskar %A Flynn, Michael %D July 1999 %X This paper presents a selection of recent research results in computer systems. The roadmap for CMOS technology for the next ten years shows a theoretical limit of 0.1 um for the channel of a MOSFET transistor, reached by 2007. Mainstream processors are adapting to multimedia applications with subword parallel instructions like Intel's MMX or HP's MAX instruction set extensions. Coprocessors and embedded processors are moving towards VLIW in order to save hardware costs. The memory system of the future is going to be the next generation of Rambus/RDRAM. Finally, Custom Computing Machines based on Field Programmable Gate Arrays are one of the promising future technologies for computing -- offering very high performance for highly parallelizable and pipelinable applications. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/745/CSL-TR-97-745.pdf %R CSL-TR-97-744 %Z Tue, 03 Mar 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The FLASH Multiprocessor: Designing a Flexible and Scalable System %A Kuskin, Jeffrey Scott %D November 1997 %X The choice of a communication paradigm, or protocol, is central to the design of a large-scale multiprocessor system. Unlike traditional multiprocessors, the FLASH machine uses a programmable node controller, called MAGIC, to implement all protocol processing. The architecture of the MAGIC chip allows FLASH to support multiple communication paradigms - in particular, cache-coherent shared memory and high-performance message passing - while minimizing both hardware and software overhead.
Each node in FLASH contains a microprocessor, a portion of the machine's global memory, a port to the interconnection network, an I/O interface, and MAGIC, the custom node controller. The MAGIC chip handles all communication both within the node and among nodes, using hardwired data paths for efficient data movement and a programmable processor optimized for executing protocol operations. The result is a system that is flexible and scalable, yet competitive in performance with a traditional multiprocessor that implements a single communication paradigm completely in hardware. The focus of this dissertation is the architecture, design, and performance of FLASH. Much of the motivation behind the FLASH system and the MAGIC node controller design stems from an examination of the characteristics of protocol code and the architecture of the DASH system, the predecessor to FLASH. This examination led to two major design goals: developing a node controller architecture that can attain high protocol processing performance while still maintaining flexibility, and reducing the logic and memory overheads associated with cache coherence. The MAGIC design achieves these goals by implementing on a single chip a programmable protocol engine with an instruction set optimized for the characteristics of protocol code, along with dedicated support logic to alleviate the most serious protocol processing performance bottlenecks - data movement, message dispatch, and lack of close coupling to the node board components. The design of the FLASH node complements the MAGIC design, matching the close coupling and high bandwidth support in MAGIC to provide a balanced node architecture. Next, the dissertation investigates the performance of cache coherence on FLASH. Performance results are presented from microbenchmarks run on the Verilog RTL of the MAGIC chip and from complete applications run on FlashLite, the FLASH system-level simulator. The microbenchmarks demonstrate that the architectural extensions added to the MAGIC design - particularly the instruction set optimizations to the programmable protocol processor - yield significantly lower latencies and protocol processor occupancies for servicing the most common types of memory operations. The application results are used to evaluate the performance costs of flexibility by comparing the performance of FLASH to that of a hardwired machine on representative parallel applications and multiprogramming workloads. These results show that poor application memory reference or load balancing characteristics cause the performance of the FLASH system to degrade more rapidly than the performance of the hardwired system; that is, FLASH's performance is less robust. For applications that incur a large number of remote misses or exhibit substantial hot-spotting, the increased remote access latencies or the occupancy of MAGIC lead to lower performance for the flexible design. Overall, however, the performance of FLASH can be competitive with the performance of the hardwired machine. Specifically, for a range of optimized parallel applications, the performance differences between the hardwired machine and FLASH are small, typically less than 10% at 32 processors and less than 15% at 64 processors. For these programs, either the processor cache miss rates are small or the latency of the programmable protocol processing can be hidden behind the memory access time.
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/744/CSL-TR-97-744.pdf %R CSL-TR-97-733 %Z Thu, 05 Mar 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T New Methods for Surface Reconstruction from Range Images %A Curless, Brian Lee %D June 1997 %X The digitization and reconstruction of 3D shapes has numerous applications in areas that include manufacturing, virtual simulation, science, medicine, and consumer marketing. In this thesis, we address the problem of acquiring accurate range data through optical triangulation, and we present a method for reconstructing surfaces from sets of data known as range images. The standard methods for extracting range data from optical triangulation scanners are accurate only for planar objects of uniform reflectance. Using these methods, curved surfaces, discontinuous surfaces, and surfaces of varying reflectance cause systematic distortions of the range data. We present a new ranging method based on analysis of the time evolution of the structured light reflections. Using this spacetime analysis, we can correct for each of these artifacts, thereby attaining significantly higher accuracy using existing technology. When using coherent illumination such as lasers, however, we show that laser speckle places a fundamental limit on accuracy for both traditional and spacetime triangulation. The range data acquired by 3D digitizers such as optical triangulation scanners commonly consists of depths sampled on a regular grid, a sample set known as a range image. A number of techniques have been developed for reconstructing surfaces by integrating groups of aligned range images. A desirable set of properties for such algorithms includes: incremental updating, representation of directional uncertainty, the ability to fill gaps in the reconstruction, and robustness in the presence of outliers and distortions. Prior algorithms possess subsets of these properties. In this thesis, we present an efficient volumetric method for merging range images that possesses all of these properties. Using this method, we are able to merge a large number of range images (as many as 70) yielding seamless, high-detail models of up to 2.6 million triangles. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/733/CSL-TR-97-733.pdf %R CSL-TR-97-716 %Z Tue, 10 Mar 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Checking Experiments for Scan Chain Latches and Flip-Flops %A Makar, Samy %D August 1997 %X New digital designs often include scan chains; the reason is high-quality, economical testing. A scan chain allows easy access to internal combinational logic by converting bistable elements, latches and flip-flops, into a shift register. Test patterns are scanned in, applied to the internal circuitry, and the results are scanned out for comparison. While many techniques exist for testing the combinational circuitry, little attention has been paid to testing the bistable elements themselves. The bistable elements are typically tested by shifting in a sequence of zeroes and ones. This test can miss many defects inside the bistable elements. A checking experiment is a sequence of inputs and outputs that contains enough information to extract the functionality of the circuit. A new approach, based on such sequences, can significantly reduce the number of defects missed.
Simulation results show that as many as 20 percent of the faults in bistable elements can be missed by typical tests; essentially all of these missed faults are detected by checking experiments. Since the checking experiment is a functional test, it is independent of the implementation of the bistable element. This is especially useful since designers often use different implementations of bistable elements to optimize their circuits for area and performance. Another benefit of a functional test is that it avoids the need for generating test patterns at the transistor level. Applying a complete checking experiment to a bistable element embedded inside a circuit can be very difficult, if not impossible. The new approach breaks up the checking experiment into a set of small sub-sequences. For each of these sub-sequences a test pattern is generated. These test patterns are scanned in, as in the case of the tests for combinational logic, appropriate changes to the control inputs of the bistable elements are applied, and the results are scanned out. The process of generating the patterns is automated by modifying an existing stuck-at test generator. A designer or test engineer need only provide a gate level description of the circuit to generate tests that guarantee a checking experiment for each bistable element in the design. Test size is an important economic factor in circuit design. The size of the checking-experiment-based test increases with circuit size at about the same rate as the traditional test, indicating that it is practical for large circuits. Checking-experiment-based tests are an effective economic means for testing the bistable elements in scan chain designs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/97/716/CSL-TR-97-716.pdf %R CSL-TR-98-751 %Z Wed, 18 Feb 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Vitis Propulsion: Theory and Practice %A Baker, Mary %A Honig, Sue %A Kercheval, Berry %A Seltzer, Margo %D February 1998 %X We have proof that red grapes scoot around more than green grapes when microwaved. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/751/CSL-TR-98-751.pdf %R CSL-TR-98-749 %Z Mon, 09 Mar 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Considerations in the Design of Hydra: A Multiprocessor-on-a-Chip Microarchitecture %A Hammond, Lance %A Olukotun, Kunle %D February 1998 %X As more transistors are integrated onto larger dies, single-chip multiprocessors integrated with large amounts of cache memory will soon become a feasible alternative to the large, monolithic uniprocessors that dominate today's microprocessor marketplace. Hydra offers a promising way to build a small-scale MP-on-a-chip using a fairly simple design that still maintains excellent performance on a wide variety of applications. This report examines key parts of the Hydra design -- the memory hierarchy, the on-chip buses, and the control and arbitration mechanisms -- and explains the rationale for some of the decisions made in the course of finalizing the design of this memory system, with particular emphasis given to applications that stress the memory system with numerous memory accesses. With the balance between complexity and performance that we obtain, we feel Hydra offers a promising model for future MP-on-a-chip designs. 
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/749/CSL-TR-98-749.pdf %R CSL-TR-98-758 %Z Fri, 26 Feb 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Matching Output Queueing with a Combined Input Output Queued Switch %A Chuang, Shang-Tse %A Goel, Ashish %A McKeown, Nick %A Prabhakar, Balaji %D April 1998 %X The Internet is facing two problems simultaneously: we need a faster switching/routing infrastructure, and we need to introduce guaranteed qualities of service (QoS). As a community, we have solutions to each: we can make the routers faster by using input-queued crossbars instead of shared memory systems; and we can introduce QoS using WFQ-based packet scheduling. But we don't know how to do both at the same time. Until now, the two solutions have been mutually exclusive - all of the work on WFQ-based scheduling algorithms has required that switches/routers use output-queueing, or centralized shared memory. We demonstrate that a Combined Input Output Queueing (CIOQ) switch running twice as fast as an input-queued switch can provide precise emulation of a broad class of packet scheduling algorithms, including WFQ and strict priorities. More precisely, we show that a "speedup" of 2 - 1/N is both necessary and sufficient for this precise emulation. We introduce a variety of algorithms that configure the crossbar so that emulation is achieved with a speedup of two, and consider their running time and implementation complexity. We believe that, in the future, these results will make possible the support of QoS in very high bandwidth routers. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/758/CSL-TR-98-758.pdf %R CSL-TR-98-753 %Z Fri, 26 Feb 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Resource Management Issues for Shared-Memory Multiprocessors %A Verghese, Ben %D March 1998 %X Shared-memory multiprocessors (SMPs) are attractive as general-purpose compute servers. On the software side, they present the same programming paradigm as uniprocessors, and they can run unmodified uniprocessor binaries. On the hardware side, the tight coupling of multiple processors, memory, and I/O provides enormous computing power in a single system, and enables the efficient sharing of these resources. As a compute server, this power can be exploited both by a collection of uniprocessor programs and by explicitly or automatically parallelized applications. This thesis addresses two important performance-related issues encountered in such systems: performance isolation and data locality. The solutions presented in this dissertation address these issues through careful resource management in the operating system. Current shared-memory multiprocessor operating systems provide very few controls for sharing the resources of the system among the active tasks or users. This is a serious limitation for a compute server that is to be used for multiple tasks or by multiple users. The current unconstrained sharing scheme allows the load placed by one user or task to adversely affect the performance seen by another. We show that this lack of isolation is caused by the resource allocation scheme (or lack thereof) carried over from single-user workstations. Multi-user multiprocessor systems require more sophisticated resource management, and we propose "performance isolation", a new resource management scheme for such systems.
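The "performance isolation" scheme proposed in CSL-TR-98-753 above can be pictured with a small allocation computation. The following Python sketch is a toy stand-in, not the thesis's mechanism: each user holds a guaranteed share of a resource, and capacity left idle by one user is redistributed to busy users in proportion to their shares, so an overloaded user can never shrink another user's guarantee. All names and the redistribution loop are assumptions of this illustration.

    def isolated_allocation(shares, demands, capacity=1.0):
        # shares/demands: dicts keyed by user. Grant guaranteed fractions
        # first; recycle whatever satisfied users leave unused.
        alloc = {u: 0.0 for u in shares}
        unmet = dict(demands)
        active, spare = set(shares), capacity
        while spare > 1e-12 and active:
            weight = sum(shares[u] for u in active)
            next_spare = 0.0
            for u in list(active):
                grant = spare * shares[u] / weight
                used = min(grant, unmet[u])
                alloc[u] += used
                unmet[u] -= used
                next_spare += grant - used       # idle capacity to recycle
                if unmet[u] <= 1e-12:
                    active.discard(u)            # user is satisfied
            spare = next_spare
        return alloc

    # An overloaded user cannot take the light user's guaranteed half:
    print(isolated_allocation({"heavy": 1, "light": 1}, {"heavy": 5.0, "light": 0.2}))
    # -> {'heavy': 0.8, 'light': 0.2}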
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/753/CSL-TR-98-753.pdf %R CSL-TR-98-759 %Z Thu, 23 Apr 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimized Multiprocessor Communication and Synchronization Using a Programmable Protocol Engine %A Heinlein, John %D March 1998 %X In recent years, multiprocessor designs have converged towards a unified hardware architecture despite supporting different communication abstractions. The implementation of these communication abstractions and the associated protocols in hardware is complex, inflexible, and error prone. For these reasons, some recent designs have employed a programmable controller to manage system communication. One particular focus of these designs is implementing cache coherence protocols in software. This dissertation argues that a programmable communication controller that provides cache coherence can also effectively support block transfer and synchronization protocols. This research is part of the FLASH project, a major focus of which is exploring the integration of multiple communication protocols in a single multiprocessor architecture. In our analysis, we examine the needs of protocols other than cache coherence to identify the requirements they share. The interface between the processor and controller is one critical issue in these protocols, so we propose techniques to export such protocols reliably, at low overhead, and without system calls. Unlike most prior studies, our approach supports a modern operating system with features like multiprogramming, protection, and virtual memory. Our study focuses in detail on two classes of communication that are important for large scale multiprocessors: block transfer and synchronization using locks and barriers. In particular, we attempt to improve the performance of these classes of communication as compared to implementations using only software on top of shared memory. For each protocol we identify the critical metrics of performance, explore the limitations of existing techniques, then present our implementation, which is tailored to leverage the programmable communication controller. We evaluate each protocol in isolation, in the context of microbenchmarks, and within a variety of applications. We find that embedding advanced communication and synchronization features in a programmable controller has a number of advantages. For example, the block transfer protocol improves transfer performance in some cases, enables the processor to perform other work in parallel, and reduces processor cache pollution caused by the transfer. The synchronization protocols reduce overhead and eliminate bottlenecks associated with synchronization primitives implemented using software on top of shared memory. Simulations of scientific applications running on FLASH show that, in many cases, synchronization support improves performance and increases the range of machine sizes over which the applications scale. Our study shows that embedded programmability is a convenient approach for supporting block transfer and synchronization, and that the FLASH system design effectively supports this approach. 
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/759/CSL-TR-98-759.pdf %R CSL-TR-98-760 %Z Mon, 25 May 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High Performance Inter-Chip Signalling %A Sidiropoulos, Stefanos %D April 1998 %X The achievable off-chip bandwidth of digital ICs is a crucial and often limiting factor in the performance of digital systems. In intra-system interfaces where both latency and bandwidth are important, source-synchronous parallel channels have been adopted as the most effective solution. This work investigates receiver and clocking circuit design techniques for increasing the signalling rate and robustness of such channels. One of the main problems arising in the reception of high speed signals is the adverse effects of high frequency noise. To alleviate these effects, a new class of receiver structures that utilize current integration is proposed. The integration of current on a capacitor based on the incoming signal polarity effectively averages the signal over its valid time period, therefore filtering out high frequency noise. An experimental transceiver prototype utilizing current integrating receivers was designed and fabricated in a 0.8 um CMOS technology. The prototype achieves a signalling rate of 740 Mbps/pin operating from a 3.3-V supply with a bit error rate of less than 10^-14. The second major challenge of inter-chip communication is the design of clock generation and synchronization circuits. Delay locked loops are an attractive alternative to VCO-based phase locked loops due to their simpler design, intrinsic stability, and absence of phase error accumulation. One of their main problems, however, is their limited phase capture range. A dual loop architecture that eliminates this problem is proposed. This architecture employs a core loop to generate finely spaced clock edges, which are then used by a peripheral loop to generate the output clock through phase interpolation. Due to its digital control, the dual loop can offer great flexibility in the implementation of phase acquisition algorithms. A dual DLL prototype was fabricated in a 0.8 um CMOS technology. The prototype achieves an 80 kHz - 400 MHz operating range, 12-ps rms jitter and 0.4-ps/mV jitter supply sensitivity. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/760/CSL-TR-98-760.pdf %R CSL-TR-98-755 %Z Fri, 06 Aug 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T ABSS v2.0: a SPARC Simulator %A Sunada, Dwight %A Glasco, David %A Flynn, Michael %D April 1998 %X This paper describes various aspects of the augmentation-based SPARC simulator (ABSS). We discuss (1) the problems that we solved in porting AugMINT to the SPARC platform to create ABSS, (2) the major sections of ABSS, and (3) the limitations of ABSS. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/755/CSL-TR-98-755.pdf %R CSL-TR-98-762 %Z Thu, 11 Jun 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hardware/Software Co-Design of Run-Time Systems %A Mooney, Vincent John III %D June 1998 %X Trends in system-level design show a clear move towards core-based design, where processors, controllers and other proprietary cores are reused and constitute essential building blocks. Thus, areas such as embedded system design and system-on-a-chip design are changing dramatically, requiring new design methodologies and Computer-Aided Design (CAD) tools.
This thesis presents a novel system-level scheduling methodology and CAD environment, the SERRA Run-Time Scheduler Synthesis and Analysis Tool. Unlike previous approaches to run-time scheduling, we split our run-time scheduler between hardware and software, as opposed to placing the scheduler all in one or the other. Thus, given an already partitioned input system specification in an HDL and a software language, SERRA automatically generates a run-time scheduler partly in hardware and partly in software, for a target architecture of a microprocessor core together with multiple hardware cores or modules. A heuristic scheduling algorithm solves for priorities of software tasks executing on a single microprocessor with a custom priority scheduler, interrupt service routine, and context switch code. Real-time analysis takes into account the split hardware/software implementation both of the scheduler and of the tasks. The scheduler supports standard requirements of both domains, such as relative timing constraints in hardware and semaphores in software. A designer who uses the SERRA CAD tool gains the advantage of efficient satisfaction of timing constraints for hardware/software systems within a framework that enables different hardware/software partitions to be quickly evaluated. Thus, a hardware/software partitioning tool could easily sit on top of SERRA, which would generate run-time systems for different hardware/software partitions chosen for evaluation. In addition, SERRA's more efficient design space exploration can improve time-to-market for a product. Finally, we present two case studies. First, we show a full analysis, synthesis, and simulation of a hardware/software implementation of a robotics control system for a PUMA arm. Second, we describe a sample prototype of the split run-time scheduler in an actual design, a force-feedback real-time haptic robot. For this application, the hardware part of the scheduler was implemented on programmable logic communicating with software using a standard communication protocol. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/762/CSL-TR-98-762.pdf %R CSL-TR-98-752 %Z Tue, 21 Jul 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Smoothness of Stationary Subdivision on Irregular Meshes %A Zorin, Denis %D January 1998 %X We derive necessary and sufficient conditions for tangent plane and C^k-continuity of stationary subdivision schemes near extraordinary vertices. Our criteria generalize most previously known conditions. We introduce a new approach to analysis of subdivision surfaces based on the idea of the universal surface. Any subdivision surface can be locally represented as a projection of the universal surface, which is uniquely defined by the subdivision scheme. This approach provides us with a more intuitive geometric understanding of subdivision near extraordinary vertices. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/752/CSL-TR-98-752.pdf %R CSL-TR-98-764 %Z Tue, 21 Jul 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Method for Analysis of C^1-Continuity of Subdivision Surfaces %A Zorin, Denis %D May 1998 %X A sufficient condition for C^1-continuity of subdivision surfaces was proposed by Reif [17] and extended to a more general setting in [22]. In both cases, the analysis of C^1-continuity is reduced to establishing injectivity and regularity of a characteristic map.
In all known proofs of C^1-continuity, explicit representation of the limit surface on an annular region was used to establish injectivity. We propose a new approach to this problem: we show that for a general class of subdivision schemes, regularity can be inferred from the properties of a sufficiently close linear approximation, and injectivity can be verified by computing the index of a curve. An additional advantage of our approach is that it allows us to prove C^1-continuity for all valences of vertices, rather than for an arbitrarily large, but finite number of valences. As an application, we use our method to analyze C^1-continuity of most stationary subdivision schemes known to us, including the interpolating Butterfly and Modified Butterfly schemes, as well as Kobbelt's interpolating scheme for quadrilateral meshes. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/764/CSL-TR-98-764.pdf %R CSL-TR-98-761 %Z Wed, 12 Aug 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Output Encoding Problem and A Solution Technique %A Mitra, Subhasish %A Avra, LaNae J. %A McCluskey, Edward J. %D November 1997 %X We present a new output encoding problem as follows: Given a specification table, such as a truth table or a finite state machine state table, where some of the outputs are specified in terms of 1s, 0s and don't cares, and others are specified symbolically, determine a binary code for each symbol of the symbolically specified output column such that the total number of output functions to be implemented after encoding the symbolic outputs and compacting the output columns is minimized. There are several applications of this output encoding problem, one of which is to reduce the area overhead while implementing scan or pseudo-random BIST in a circuit with one-hot signals. The algorithm can also be used as a pre-processing step during FSM state encoding. In this paper, we develop an exact algorithm to solve the above problem, prove its correctness, analyze the worst case time complexity of the algorithm and present experimental data to validate the claim that our encoding strategy helps to reduce the area of a synthesized circuit. In addition, we have investigated the possibility of using elementary gates to facilitate further merging of the output functions generated by the encoding bits with the output functions generated by the elementary gates. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/761/CSL-TR-98-761.pdf %R CSL-TR-98-768 %Z Thu, 13 Aug 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T EJAVA - Causal Extensions for Java %A Santoro, Alexandre %A Mann, Walter %A Madhav, Neel %A Luckham, David %D August 1998 %X Programming languages like Java provide designers with a variety of classes that simplify the process of building multithreaded programs. Though useful, especially in the creation of reactive systems, multithreaded programs present challenging problems such as race conditions and synchronization issues. Validating these programs against a specification is also not trivial since Java does not clearly indicate thread interaction. These problems can be solved by modifying Java so that it produces computations, collections of events with both causal and temporal ordering relations defined for them. Specifically, the causal ordering is ideal for identifying thread interaction.
This paper presents eJava, an extension to Java that is both event based and causally aware, and shows how it simplifies the process of understanding and debugging multithreaded programs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/768/CSL-TR-98-768.pdf %R CSL-TR-98-756 %Z Thu, 13 Aug 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Hardware-assisted Algorithms for Checkpoints %A Sunada, Dwight %A Glasco, David %A Flynn, Michael %D July 1998 %X We can classify the algorithms for establishing checkpoints on distributed-shared-memory multiprocessors (DSMMs) into 3 broad classes: the tightly synchronized method (TSM), the loosely synchronized method (LSM), and the unsynchronized method (USM). TSM-type algorithms force the immediate establishment of a checkpoint whenever a dependency between 2 processors arises. LSM-type algorithms record this dependency and, hence, do not require the immediate establishment of a checkpoint if a dependency does arise; when a processor chooses to establish a checkpoint, the processor will query the dependency records to determine other processors that must also establish a checkpoint. USM-type algorithms allow a processor to establish a checkpoint without regard to any other processor. Within this framework, we developed 4 hardware-based algorithms: distributed recoverable shared memory (DRSM), DRSM for communication checkpoints (DRSM-C), DRSM with a hybrid method (DRSM-H), and DRSM with logs (DRSM-L). DRSM-C is a TSM-type algorithm, and DRSM and DRSM-H are LSM-type algorithms. DRSM-L is a USM-type algorithm and is the first of its kind for a tightly-coupled DSMM where hardware in the form of a directory maintains cache coherence. We find that DRSM has the best performance in terms of minimizing the impact of establishing checkpoints (or logs) on the running applications, but DRSM along with DRSM-C has the most expensive hardware requirements. DRSM-L has the second best performance but has the least expensive hardware requirement. We conclude that DRSM-L is the best algorithm in terms of cost and performance. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/756/CSL-TR-98-756.pdf %R CSL-TR-98-769 %Z Mon, 24 Aug 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Resource Discovery in Ad hoc Networks %A Tang, Diane %A Chang, Chih-Yuan %A Tanaka, Kei %A Baker, Mary %D August 1998 %X Much of the current research in mobile networking investigates how to support a mobile user within an established infrastructure of routers and servers. Ad hoc networks come into play when no such established infrastructure exists. This paper presents a two-stage protocol to solve the resource discovery problem in ad hoc networks: how hosts discover what resources are available in the network and how they discover how to use the resources. This protocol does not require any established servers or other infrastructure. It only requires routing capabilities in the network. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/769/CSL-TR-98-769.pdf %R CSL-TR-98-772 %Z Mon, 23 Nov 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Designing a Partitionable Multiplier %A Lee, Hyuk-Jun %A Flynn, Michael %D October 1998 %X This report presents the design of a 64-bit integer multiplier core that can perform 32-bit or 16-bit parallel integer multiplications (PMUL) and 32-bit or 16-bit parallel integer multiplications followed by additions (PMADD).
The proposed multiplier removes sign and constant bits from its core and projects them to the boundaries to minimize the complexity of base cells. It also adopts an array-of-arrays architecture with unequal array sizes by decoupling partial product generation from carry save addition. This makes it possible to achieve high speed for 64-bit multiplication. Two architectures, both implemented in dual-rail domino logic, are tested for functionality in Verilog and simulated in HSPICE for a TSMC 0.35 um process. The first architecture is capable of both PMUL and PMADD. The estimated delay is 4.9 ns (excluding the final adder) at a 3.3 V supply and 25 C, and its estimated area is 6.5 mm^2. The estimated delay of the second architecture, only capable of PMUL, is 4.5 ns. Its estimated area is 5.2 mm^2. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/772/CSL-TR-98-772.pdf %R CSL-TR-98-773 %Z Wed, 16 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Scalable Services for Video-On-Demand %A Chan, Shuen-Han Gary %A Tobagi, Fouad A. %D December 1998 %X Video-on-demand (VOD) refers to video services in which users can request any video program from a server at any time. VOD has important applications in entertainment, education, information, and advertising, such as movie-on-demand, distance learning, home shopping, interactive news, etc. In order to provide VOD services accommodating a large number of video titles and concurrent users, a VOD system has to be scalable -- scalable in storage and scalable in streaming capacity. Our goal is to design such a system with low cost, low complexity, and a high level of service quality (in terms of, for example, user delay experienced or user loss rate). Storage scalability is achieved by using a hierarchical storage system, in which video files are stored in tertiary libraries or jukeboxes and transferred to a secondary level (of magnetic or optical disks) for display. We address the design of such a system by specifying the required architectural parameters (the bandwidth and storage capacity in each level) and operational procedures (such as request scheduling and file replacement schemes) in order to meet certain performance goals. Scalability in streaming capacity can be achieved by means of request batching, in which requests for a video arriving within a period of time are grouped together (i.e., "batched") and served with a single multicast stream. The goal here is to trade off the multicasting cost against the user delay in the system. We study a number of batching schemes (in terms of user delay experienced, the number of users collected in each batch, etc.), and how system profit can be maximized given users' reneging behaviour. Both storage and streaming scalabilities can be achieved with a distributed servers architecture, in which video files are accessed from servers distributed in a network. We examine a number of caching schemes in terms of their requirements in storage and streaming bandwidth. Given certain cost functions in storage and streaming, we address when and how much of a video file should be cached in order to minimize the system cost. We show that a distributed servers architecture can achieve great cost savings while offering users low start-up delay.
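The request-batching idea in the CSL-TR-98-773 abstract above (group the requests arriving within a window, serve each group with one multicast stream) can be captured in a few lines. This Python sketch is a simplified illustration under an assumed timing model, not the report's schemes; it exposes the streams-versus-delay trade-off the report studies.

    def batch_requests(arrivals, window):
        # arrivals: sorted request times (seconds) for one video title.
        # A batch opens at the first unserved request and fires `window`
        # seconds later; everyone who arrived by then shares one stream.
        streams, total_delay, i = 0, 0.0, 0
        while i < len(arrivals):
            fire = arrivals[i] + window          # multicast start time
            streams += 1
            while i < len(arrivals) and arrivals[i] <= fire:
                total_delay += fire - arrivals[i]
                i += 1
        return streams, total_delay

    # Five requests served by two multicast streams instead of five unicasts:
    print(batch_requests([0, 1, 2, 30, 31], window=5.0))   # -> (2, 21.0)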
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/773/CSL-TR-98-773.pdf %R CSL-TR-98-775 %Z Tue, 22 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Design of High-Speed Serial Links in CMOS %A Yang, Chih-Kong Ken %D December 1998 %X Demand for bandwidth in serial links has been increasing as the communications industry demands higher quantity and quality of information. Whereas traditional gigabit-per-second links have been built in bipolar or GaAs technologies, this research aims to push the use of CMOS process technology in such links. Intrinsic gate speed limitations are overcome by parallelizing the data. The on-chip frequency is maintained at a fraction (1/16) of the off-chip data rate. Clocks with carefully controlled phases tapped from a local ring oscillator are driven to a bank of input samplers to convert the serial bit stream into parallel data. Similarly, the overlap of multiple-phased clocks is used to synchronize the multiplexing of the parallel data onto the transmission line. To perform clock/data recovery, data is further oversampled with finer phase separation and passed to digital logic. The digital logic operates upon the samples to detect transitions in the bit stream to track the bit boundaries. This tracking can operate at the cycle rate of the digital logic, allowing robustness to systematic phase noise. The challenge lies in the capturing of the high frequency data stream and generating low jitter, accurately spaced clock edges. A test chip is built demonstrating the transmission and recovery of a 4.0-Gb/s bit stream with a < 10^-14 bit-error rate using a 3x oversampled system in a 0.5-um MOSIS CMOS process. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/775/CSL-TR-98-775.pdf %R CSL-TR-98-774 %Z Wed, 30 Dec 98 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Fault-Tolerant Systems in A Space Environment: The CRC ARGOS Project %A Shirvani, Philip P. %A McCluskey, Edward J. %D December 1998 %X This report describes the ARGOS project at Stanford CRC. The primary goals of this project are to collect data on the errors that occur in digital integrated circuits in a space environment, to determine the tradeoffs between fault-avoidance and fault-tolerance, and to see if radiation hardening can be avoided by using fault tolerance techniques. Our experiments will be carried out on two processor boards on the ARGOS experimental satellite. One of the boards uses radiation-hardened components while the other uses only commercial off-the-shelf (COTS) parts. Programs and data can be uploaded to the boards during the mission. This capability allows us to evaluate different software fault-tolerance techniques. This report reviews various error detection techniques. Software techniques that do not require any special hardware are discussed. The framework of the software that we are developing for error data collection is presented. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/98/774/CSL-TR-98-774.pdf %R CSL-TR-99-776 %Z Thu, 11 Feb 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Novel Checkpointing Algorithm for Fault Tolerance on a Tightly-Coupled Multiprocessor %A Sunada, Dwight %A Glasco, David %A Flynn, Michael %D January 1999 %X The tightly-coupled multiprocessor (TCMP), where specialized hardware maintains the image of a single shared memory, offers the highest performance in a computer system. In order to deploy a TCMP in the commercial world, the TCMP must be fault tolerant.
Researchers have designed various checkpointing algorithms to implement fault tolerance in a TCMP. To date, these algorithms fall into 2 principal classes, in both of which processors can be checkpoint-dependent on each other. We introduce a new apparatus and algorithm that represents a 3rd class of checkpointing scheme. Our algorithm is distributed recoverable shared memory with logs (DRSM-L) and is the first of its kind for TCMPs. DRSM-L has the desirable property that a processor can establish a checkpoint or roll back to the last checkpoint in a manner that is independent of any other processor. In this paper, we describe DRSM-L, show the optimal value of its principal design parameter, and present results indicating its performance under simulation. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/776/CSL-TR-99-776.pdf %R CSL-TR-99-777 %Z Thu, 11 Feb 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The Mobile People Architecture %A Appenzeller, Guido %A Lai, Kevin %A Maniatis, Petros %A Roussopoulos, Mema %A Swierk, Edward %A Zhao, Xinhua %A Baker, Mary %D January 1999 %X People are the outsiders in the current communications revolution. Computer hosts, pager terminals, and telephones are addressable entities throughout the Internet and telephony systems. Human beings, however, still need application-specific tricks to be identified, like email addresses, telephone numbers, and ICQ IDs. The key challenge today is to find people and communicate with them personally, as opposed to communicating merely with their possibly inaccessible machines---cell phones that are turned off, or PCs on faraway desktops. We introduce the Mobile People Architecture, designed to meet this challenge. The main goal of this effort is to put the person, rather than the devices that the person uses, at the endpoints of a communication session. This architecture introduces the concept of routing between people. To that effect, we define the Personal Proxy, which has a dual role: as a Tracking Agent, the proxy maintains the list of devices or applications through which a person is currently accessible; as a Dispatcher, the proxy directs communications and uses Application Drivers to massage communication bits into a format that the recipient can see immediately. It does all this while protecting the location privacy of the recipient from the message sender. Finally, we substantiate our architecture with ideas about a future prototype that allows the easy integration of new application protocols. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/777/CSL-TR-99-777.pdf %R CSL-TR-99-778 %Z Tue, 09 Mar 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Analysis of HTTP/1.1 Performance on a Wireless Network %A Cheng, Stephen %A Lai, Kevin %A Baker, Mary %D February 1999 %X We compare the performance of HTTP/1.0 and 1.1 on a high latency, low bandwidth wireless network. HTTP/1.0 is known to have low throughput and consume excessive network and server resources on today's graphics-intensive web pages. A high latency, low bandwidth network only magnifies these problems. HTTP/1.1 was developed to remedy these problems. We show that on a Ricochet wireless network, HTTP/1.1 doubles throughput over HTTP/1.0 and decreases the number of packets sent by 60%.
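Much of the packet saving reported in CSL-TR-99-778 above comes from HTTP/1.1's persistent connections, which avoid a TCP handshake and slow start per object. The Python sketch below contrasts the two styles using the standard http.client module; the host and object paths are placeholders for illustration, not from the report.

    import http.client

    OBJECTS = ("/", "/style.css")     # placeholder paths on a placeholder host

    # HTTP/1.0 style: a fresh TCP connection per object fetched.
    for path in OBJECTS:
        conn = http.client.HTTPConnection("example.com")
        conn.request("GET", path)
        conn.getresponse().read()
        conn.close()

    # HTTP/1.1 style: one persistent connection carries every object,
    # eliminating the repeated handshakes that dominate on a high-latency link.
    conn = http.client.HTTPConnection("example.com")
    for path in OBJECTS:
        conn.request("GET", path)
        conn.getresponse().read()     # drain each response before reusing
    conn.close()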
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/778/CSL-TR-99-778.pdf %R CSL-TR-99-779 %Z Mon, 22 Mar 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T CHOKe - A simple approach for providing Quality of Service through stateless approximation of fair queueing %A Pan, Rong %A Prabhakar, Balaji %D March 1999 %X We consider the problem of providing a fair bandwidth allocation to each of n flows that share an outgoing link at a congested router. The buffer at the outgoing link is a simple FIFO, commonly shared by packets belonging to the n flows. We devise a simple packet dropping scheme, CHOKe, that discriminates against the flows which submit more packets/sec than is allowed by their fair share. By doing this, the scheme aims to approximate the fair queueing policy. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/779/CSL-TR-99-779.pdf %R CSL-TR-99-780 %Z Mon, 19 Apr 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Coarse Grain Carry Architecture for FPGA %A Lee, Hyuk-Jun %A Flynn, Michael %D February 1999 %X In this report we investigated several methods to improve the performance of FPGAs for general-purpose computing. In the early stage of this research we identified the fine grain size of current FPGAs as the major performance bottleneck. To increase the grain size, we introduced a coarse grain carry architecture that can increase the granularity of arithmetic operations, including addition and multiplication. We used throughput density as a cost/performance metric to justify the benefit of the new architecture. We achieved up to roughly 5 times larger throughput density for selected applications. Along with that, we also introduced a dual-rail carry structure to improve the performance of the carry chain, which usually sets the cycle time of an FPGA design. A carry select adder built from the dual-rail carry structure reduces the carry chain delay by a factor of two. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/780/CSL-TR-99-780.pdf %R CSL-TR-99-781 %Z Mon, 19 Apr 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T An Architecture for Distributed, Interactive, Multi-Stream, Multi-Participant Audio and Video %A Schmidt, Brian K. %D April 1999 %X Today's computer users are becoming increasingly sophisticated, demanding richer and fuller machine interfaces. This is evidenced by the fact that viewing and manipulating a single stream of full-size video along with its associated audio stream is becoming commonplace. However, multiple media streams will become a necessity to meet the increasing demands of future applications. An example which requires multiple media streams is an application that supports multi-viewpoint audio and video, which allows users to observe a remote scene from many different perspectives so that a sense of immersion is experienced. Although desktop audio and video open many exciting possibilities, their use in a computer environment only becomes interesting when computational resources are expended to manipulate them in an interactive manner. We feel that user interaction will also become increasingly complex. In addition, future applications will make significant demands on the network in terms of bandwidth, quality-of-service guarantees, latency, and connection management. Based on these trends we feel that an architecture designed to support future multimedia applications must provide support for several key features.
The need for numerous media streams is clearly the next step forward in terms of creating a richer environment. Support for non-trivial, fine-grain interaction with the media data is another important requirement, and distributing the system across a network is imperative so that multiple participants can become involved. Finally, as a side effect of the network and multi-participant requirements, integral support for and use of multicast will be a prime architectural component. The goal of our work is to design and implement a complete system architecture capable of supporting applications with these requirements. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/781/CSL-TR-99-781.pdf %R CSL-TR-99-782 %Z Fri, 16 Apr 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T A Compiler for Creating Evolutionary Software and Application Experience %A Schmidt, Brian K. %A Lam, Monica S. %D April 1999 %X Recent studies have shown that significant amounts of value repetition occur in modern applications. Due to global initialized data, immediate values, address calculations, redundancy in external input, etc., the same value is used at the same program point as much as 80% of the time. Naturally, attention has begun to focus on how compilers and specialized hardware can take advantage of this value locality. Unfortunately, there is significant overhead associated with dynamically recognizing predictable values and optimizing for them, and all too often this cost dramatically outweighs the benefits. There are various levels at which value locality can be observed and used for optimization, ranging from register value re-use to function memoization. We are concerned with predictability of program variable values across multiple runs of a given program. In this paper we present a complete system that automatically translates ordinary sequential programs into evolutionary software, software that evolves to improve its performance using execution information from previous runs. This concept can have a significant impact on software engineering, as it can be used to replace the manual performance tuning phase in the application development lifecycle. Not only does it relieve the developer of a tedious and error-prone task, but it also has the important side effect of keeping applications free from obscure hand optimizations which muddle the code and make it difficult to maintain or port. This concept can also be used to produce efficient applications where static performance tuning is not adequate. Our system automatically identifies targets for program specialization and instruments the code to gather high-level profiling information. Upon completion, the program automatically re-compiles itself when the new profile information suggests that it is profitable. The programmer is completely unaware of this process, as the software tailors itself to its environment. We have demonstrated the utility of our system by using it to optimize graphics applications that are built upon a general-purpose graphics library. While much of this work is based on well-established techniques, this is the first practical system which takes advantage of predictability in such a way that the overhead does not overwhelm the benefit.
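As a toy analogue of the cross-run value specialization described in CSL-TR-99-782 above, the Python decorator below records the argument values a function sees in a profile file that persists across runs, and installs a cached fast path once a single value dominates. The file name, thresholds, and decorator interface are assumptions of this sketch; the report's system works at the compiler level, not via a wrapper.

    import json, os

    PROFILE = "value_profile.json"        # hypothetical cross-run profile store

    def specializing(fn, hot_fraction=0.8, min_calls=10):
        # Count argument tuples across runs; once one tuple dominates,
        # cache its result so the common case becomes a dictionary lookup
        # (a stand-in for recompiling a specialized version of fn).
        profile = json.load(open(PROFILE)) if os.path.exists(PROFILE) else {}
        counts = profile.setdefault(fn.__name__, {})
        cache = {}

        def wrapper(*args):
            key = repr(args)
            counts[key] = counts.get(key, 0) + 1
            if key in cache:
                return cache[key]             # specialized fast path
            result = fn(*args)
            total = sum(counts.values())
            if total >= min_calls and counts[key] >= hot_fraction * total:
                cache[key] = result           # this value is highly predictable
            with open(PROFILE, "w") as f:     # persist profile for the next run
                json.dump(profile, f)
            return result

        return wrapper

    @specializing
    def render(width):                        # imagine an expensive routine
        return sum(i * i for i in range(width))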
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/782/CSL-TR-99-782.pdf %R CSL-TR-99-784 %Z Tue, 03 Aug 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T The M-log-Fraction Transform (MFT) for Computer Arithmetic %A Mencer, Oskar %A Flynn, Michael J. %A Morf, Martin %D July 1999 %X State-of-the-art continued fraction (CF) arithmetic enables us to compute rational functions so that input and output values are represented as simple continued fractions. The main problem of previous work is the conversion between simple continued fractions and binary numbers. The M-log-Fraction Transform (MFT), introduced in this work, enables us to instantly convert between binary numbers and M-log-Fractions. Conversion is related to the distance between the '1's of the binary number. Applying M-log-Fractions to continued fraction arithmetic algorithms reduces the complexity of the CF algorithm to shift-and-add structures, and more specifically, digit-serial arithmetic algorithms for (homographic) rational functions. We show two applications of the MFT: (1) a high radix rational arithmetic unit computing (ax+b)/(cx+d) in a shift-and-add structure; (2) the evaluation of rational approximations (or continued fraction approximations) in a multiplication-based structure. In (1) we obtain algebraic formulations of the entire computation, including the next-digit-selection function. For high radix operation, we can therefore partition the selection table into arithmetic blocks, making high radix implementations feasible. (2) overlaps the final division of a rational approximation with the multiply-add iterations. The MFT bridges the gap between continued fractions and the binary number representation, enabling the design of a new class of efficient rational arithmetic units and efficient evaluation of rational approximations. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/784/CSL-TR-99-784.pdf %R CSL-TR-99-785 %Z Fri, 06 Aug 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Checkpointing Apparatus and Algorithms for Fault-Tolerant Tightly-Coupled Multiprocessors %A Sunada, Dwight %D July 1999 %X The apparatus and algorithms for establishing checkpoints on a tightly-coupled multiprocessor (TCMP) fall naturally into three broad classes: tightly synchronized method, loosely synchronized method, and unsynchronized method. The algorithms in the class of the tightly synchronized method force the immediate establishment of a checkpoint whenever a dependency between two processors arises. The algorithms in the class of the loosely synchronized method record this dependency and, hence, do not require the immediate establishment of a checkpoint if a dependency does arise; when a processor chooses to establish a checkpoint, the processor will query the dependency records to determine other processors that must also establish a checkpoint. The algorithms in the class of the unsynchronized method allow a processor to establish a checkpoint without regard to any other processor. Within this framework, we develop four apparatus and algorithms: distributed recoverable shared memory (DRSM), DRSM for communication checkpoints (DRSM-C), DRSM with half of the memory (DRSM-H), and DRSM with logs (DRSM-L). DRSM-C is an algorithm in the class of the tightly synchronized method, and DRSM and DRSM-H are algorithms in the class of the loosely synchronized method. DRSM-L is an algorithm in the class of the unsynchronized method and is the first of its kind for a TCMP.
DRSM-L has the best performance in terms of minimizing the impact of establishing checkpoints (or logs) on the running applications and has the least expensive hardware. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/785/CSL-TR-99-785.pdf %R CSL-TR-99-786 %Z Fri, 12 Nov 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T High-Speed Interconnect Schemes for a Pipelined FPGA %A Lee, Hyuk-Jun %A Flynn, Michael J. %D August 1999 %X This paper presents two high-speed interconnect schemes for a pipelined FPGA utilizing a locally synchronized postcharging technique. By avoiding a globally synchronized clock, we reduce the power consumption significantly. By postcharging the interconnect and overlapping the postcharging delay with the logic delay, we successfully hide the postcharge time. Long channel devices significantly reduce the area penalty due to delay elements. The timing simulation is done using HSPICE for a TSMC 0.35 um process, and area is measured by drawing key elements in MAGIC and using the area model developed in [2]. The postcharge scheme shows a 30% delay reduction over the precharge scheme and up to 310% and 230% delay reductions over the conventional NMOS pass transistor scheme and the tri-state buffer scheme. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/786/CSL-TR-99-786.pdf %R CSL-TR-99-788 %Z Fri, 12 Nov 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Managing Event Processing Networks %A Perrochon, Louis %A Kasriel, Stephane %A Luckham, David C. %D October 1999 %X This technical report presents Complex Event Processing (CEP), a fundamental new technology that will enable the next generation of middleware-based distributed applications. CEP gains information on distributed systems and uses this knowledge for monitoring, failure analysis or prediction of activities. A very promising route in CEP research is that of Event Processing Networks, one of the main areas of research of the Program Analysis and Verification Group at Stanford University. Event Processing Networks are one way of describing and building CEP, by successively filtering meaningful information and aggregating the corresponding events into higher levels of abstraction. This report describes in detail the foundations and aims of Complex Event Processing, then introduces the concept of Event Processing Networks, and finally describes the architecture of the CEP system. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/788/CSL-TR-99-788.pdf %R CSL-TR-99-787 %Z Tue, 16 Nov 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T CHOKe - A stateless queue management scheme for approximating fair bandwidth allocation %A Pan, Rong %A Prabhakar, Balaji %A Psounis, Konstantinos %D September 1999 %X We investigate the problem of providing a fair bandwidth allocation to each of n flows that share an outgoing link at a congested router. The buffer at the outgoing link is a simple FIFO, commonly shared by packets belonging to the n flows. We devise a simple packet dropping scheme, CHOKe, that discriminates against the flows which submit more packets/sec than is allowed by their fair share. By doing this, the scheme aims to approximate the fair queueing policy. Since it is stateless and easy to implement, CHOKe controls unresponsive or misbehaving flows with minimal overhead.
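The CHOKe rule summarized in CSL-TR-99-787 above needs no per-flow state: each arrival is compared with a packet drawn at random from the buffer, and a flow occupying many slots is likely to be matched and doubly penalized. A minimal Python sketch follows; the thresholds and the simplified drop rule between them are assumptions of this illustration (the actual scheme drops probabilistically between the two thresholds, as RED does).

    import random
    from collections import deque

    class ChokeQueue:
        # FIFO with CHOKe admission control; pkt = (flow_id, payload).
        def __init__(self, capacity=100, min_th=20, max_th=80):
            self.q = deque()
            self.capacity, self.min_th, self.max_th = capacity, min_th, max_th

        def enqueue(self, pkt):
            if len(self.q) <= self.min_th:    # uncongested: admit freely
                self.q.append(pkt)
                return True
            i = random.randrange(len(self.q))
            if self.q[i][0] == pkt[0]:        # random match, same flow:
                del self.q[i]                 # drop the queued packet...
                return False                  # ...and the arrival
            if len(self.q) >= min(self.max_th, self.capacity):
                return False                  # severe congestion: drop arrival
            self.q.append(pkt)
            return True

        def dequeue(self):
            return self.q.popleft() if self.q else None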
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/787/CSL-TR-99-787.pdf %R CSL-TR-99-789 %Z Tue, 16 Nov 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Flexible Use of Memory for Replication/Migration in Cache-Coherent DSM Multiprocessors %A Soundararajan, Vijayaraghavan %D November 1999 %X Shared-memory multiprocessors are being used increasingly as compute servers. These systems enable efficient usage of computing resources through the aggregation and tight coupling of CPUs, memory, and I/O. One popular design for such machines is a bus-based architecture. However, as processors get faster, the shared bus becomes a bandwidth bottleneck. CC-NUMA (Cache-Coherent with Non-Uniform Memory Access time) machines remove this architectural limitation and provide a scalable shared-memory architecture. One significant characteristic of the CC-NUMA architecture is that the latency to access remote data is considerably larger than the latency to access local data. On such machines, good data locality can reduce memory stall time and is therefore critical for high performance. In this thesis we study the various options available to system designers to transparently decrease the fraction of data misses serviced remotely. This work is done in the context of the Stanford FLASH multiprocessor. We utilize the programmability of the FLASH memory controller to explore a number of techniques for improving data locality: base cache-coherence (CC); a Remote Access Cache (RAC), in which a portion of local memory is used to cache remotely-allocated data at cache-line granularity; a Cache-Only Memory Architecture (COMA-F), in which all of local memory is used as a cache under hardware control; and OS-assisted page migration/replication (MigRep), in which the operating system migrates or replicates pages according to observed cache miss patterns. We then propose a novel hybrid scheme, MIGRAC, that combines the benefits of RAC and MigRep. We evaluate complete implementations of these schemes on the same platform using compute-server workloads (including OS effects), thereby providing a more consistent and detailed evaluation than has been done before. We find that a simple RAC can improve performance significantly over CC (up to 64% gains). COMA-F improves locality but its additional complexity limits its gains versus CC (only 14% improvement). MigRep performs well (up to 33% gains) but does not handle fine-grain sharing as effectively as RAC or COMA-F. Finally, our MIGRAC approach performs well relative to RAC (up to 57% faster) and MigRep (up to 24% faster) and is robust. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/789/CSL-TR-99-789.pdf %R CSL-TR-99-783 %Z Mon, 29 Nov 99 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors %A Hung, Patrick %A Flynn, Michael J. %D July 1999 %X Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well as resource conflicts within a processor. Moreover, the additional ILP complexity can impose significant overhead on cycle time and latency.
This technical report uses a generic processor model to investigate the optimum level of ILP for superscalar and VLIW processors. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/99/783/CSL-TR-99-783.pdf %R CSL-TR-00-790 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Reciprocal Approximation Theory with Table Compensation %A Liddicoat, Albert A. %A Flynn, Michael J. %D January 2000 %X [Sch93] demonstrates the reuse of a multiplier partial product array (PPA) to approximate higher order functions such as the reciprocal, division, and square root. Schwarz generalizes this technique to any higher order function that can be expressed as A*B=C. With this technique, the height of the PPA grows exponentially with the desired result precision. Schwarz added compensation terms within the PPA to reduce the worst case error. This work investigates the approximation theory of higher order functions without the bounds of multiplier reuse. Additional techniques are presented to increase the worst case precision for a fixed-height PPA. A compensation table technique is presented in this work; it combines the approximation computation with a compensation table to produce a result with fixed precision. The area-time tradeoff for three design points is studied. Increasing the computation decreases the area needed to implement the function but also increases the latency. Finally, the applicability of this technique to the bipartite ROM reciprocal table is discussed. We expect that this technique can be applied to the bipartite ROM reciprocal table to significantly reduce the hardware area needed at a minimal increase in latency. In addition, this work focuses on hardware reconfigurability and the ability of the hardware unit to perform multiple higher order functions efficiently. The PPA structure can be used to approximate several higher order functions that can be expressed as a multiply. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/790/CSL-TR-00-790.pdf %R CSL-TR-00-791 %Z Mon, 14 Feb 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Precision of Semi-Exact Redundant Continued Fraction Arithmetic for VLSI %A Mencer, Oskar %A Morf, Martin %A Flynn, Michael J. %D February 2000 %X Continued fractions (CFs) enable straightforward representation of elementary functions and rational approximations. We improve the positional algebraic algorithm, which computes homographic functions. The improved algorithm for the linear fractional transformation produces exact results, given regular continued fraction input. If the input is in redundant continued fraction form, our improved linear algorithm increases the percentage of exact results with 12-bit state registers from 78% to 98%. The maximal error of non-exact results is also reduced. Indeed, by detecting a small number of cases, we can add a final correction step to improve the guaranteed accuracy of non-exact results. We refer to the fact that a few results may not be exact as "Semi-Exact" arithmetic. We detail the adjustments to the positional algebraic algorithm concerning register overflow, the virtual singularities that occur during the computation, and the errors due to non-regular, redundant CF inputs.
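The abstract does not reproduce the improved positional algebraic algorithm, but the classical Gosper-style digit-serial evaluation of a homographic form (ax+b)/(cx+d) on regular continued fraction input conveys the flavor of the computation. The following Python sketch assumes integer coefficients and positive regular CF terms, and is an illustration rather than the report's algorithm.

    def homographic_cf(x_terms, a, b, c, d, max_terms=32):
        """Emit the regular CF of (a*x + b)/(c*x + d), where x is given
        by its regular continued fraction terms x_terms."""
        out = []
        it = iter(x_terms)
        while len(out) < max_terms:
            if c != 0 and d != 0 and a // c == b // d:
                q = a // c                    # both bounds agree: emit a term
                out.append(q)
                a, b, c, d = c, d, a - q * c, b - q * d
            else:
                t = next(it, None)
                if t is None:                 # input exhausted: value is a/c
                    while c != 0 and len(out) < max_terms:
                        q = a // c
                        out.append(q)
                        a, c = c, a - q * c
                    break
                a, b, c, d = a * t + b, a, c * t + d, c   # absorb one term
        return out

For example, homographic_cf([1, 2], 1, 0, 0, 1) consumes the CF of 3/2 and reproduces [1, 2]; the report's contribution is, in effect, to make absorb/emit steps of this kind work on redundant CF digits in shift-and-add form.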
%U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/791/CSL-TR-00-791.pdf %R CSL-TR-00-792 %Z Tue, 15 Feb 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Performance of Data-Intensive Algorithms on FPGAs in an Object-oriented Programming Environment %A Mencer, Oskar %A Morf, Martin %A Flynn, Michael J. %D February 2000 %X Recently, there have been academic and industrial efforts to combine traditional computing environments with reconfigurable logic. Each application, or part of an application, has an optimal implementation within the design space of microprocessors, reconfigurable logic, and hardwired VLSI circuits. Programmability, Performance, and Power are the major metrics that have to be taken into account when deciding between the available technologies. Performance advantages of FPGAs over processors for specific applications have been shown in previous research. We show the potential of current low-power FPGAs to outperform current state-of-the-art processors in Performance over Power by more than half an order of magnitude. Programmability remains a tough issue. As a starting point, we define a hardware object interface in C++, PAM-Blox. PAM-Blox is an open, object-oriented environment for programming FPGAs that encourages design sharing and code reuse. PAM-Blox simplifies the creation of optimized high-performance designs. Encouraging a distributed effort to share hardware objects over the internet in the spirit of open software is a first step towards improving the programmability of FPGAs. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/792/CSL-TR-00-792.pdf %R CSL-TR-00-793 %Z Thu, 24 Feb 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T Allocation and Interface Synthesis Algorithms for Component-based Design %A Smith, James %D February 2000 %X Since 1965, the size of transistors has been halved and their speed of operation has been doubled every 18 to 24 months, a phenomenon known as Moore's Law. This has allowed rapid increases in the amount of circuitry that can be included on a single die. However, as the availability of hardware real estate escalates at an exponential rate, the complexity involved in creating circuitry that utilizes that real estate grows at an exponential, or higher, rate. Component-based design methodologies promise to reduce the complexity of this task and the time required to design integrated circuits by raising the level of abstraction at which circuitry is specified, synthesized, verified, or physically implemented. This thesis develops algorithms for synthesizing integrated circuits by mapping high-level specifications onto existing components. To perform this task, word-level polynomial representations are introduced as a mechanism for canonically and compactly representing the functionality of complex components. Polynomial representations can be applied to a broad range of circuits, including combinational, sequential, and datapath-dominated circuits. They provide the basis for efficiently comparing the functionality of a circuit specification and a complex component. Once a set of existing components is determined to be an appropriate implementation of a specification, interfaces between these components must be designed. This thesis also presents an algorithm for automatically deriving an HDL model of an interface between two or more components given an HDL model of those components.
The combination of polynomial representations and interface synthesis algorithms provides the basis for a component-based design methodology. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/793/CSL-TR-00-793.pdf %R CSL-TR-00-804 %Z Tue, 05 Sep 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T IdentiScape: Tackling the Personal Online Identity Crisis %A Maniatis, Petros %A Baker, Mary %D June 2000 %X Traditional systems refer to a mobile person using the name or address of that person's communication device. As personal communications become more diverse and popular, this solution is no longer adequate, since mobile people frequently move between different devices and use different communications applications. This lack of identifiers for mobile people causes problems ranging from the inconvenient to the downright dangerous: to locate a person, callers must try multiple email addresses, cell phone numbers, land line phone numbers, or instant messaging IDs; callers leave sensitive messages on shared voicemail boxes; and they send communications intended for the previous owner of a telephone number to the next owner. To solve this naming problem, we should be able to name people as the ultimate endpoints of personal communications, regardless of the applications or devices they use. In this paper, we develop a naming scheme for mobile people: we derive its requirements and describe its design and implementation in the context of personal communications. IdentiScape, our prototype personal naming scheme, includes a name service which provides globally available identifiers that persist over time and an online identity repository service which can be locally owned and managed. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/804/CSL-TR-00-804.pdf %R CSL-TR-00-807 %Z Tue, 05 Sep 00 00:00:00 GMT %I Stanford University, Computer Systems Laboratory %T SUIF Explorer: An Interactive and Interprocedural Parallelizer %A Liao, Shih-Wei %D August 2000 %X Shared-memory multiprocessors that use the latest microprocessors are becoming widely used both as compute servers and as desktop computers. But the difficulty of developing parallel software is a major obstacle to the effective use of multiprocessors to solve a single task. To increase the productivity of multiprocessor programmers, we developed an interactive interprocedural parallelizer called SUIF Explorer. Our experience with SUIF Explorer also helps to identify missing interprocedural analyses that can significantly improve an automatic parallelizer. As a parallel programming tool, the Explorer actively guides programmers through the parallelization process using a set of advanced static and dynamic analyses and visualization techniques. Our interprocedural program analyses provide high-quality information that restricts the need for user assistance. The Explorer is also the first tool to apply slicing analysis to aid the programmer in uncovering program properties for interactive parallelization. These static and dynamic analyses minimize the number of lines of code requiring programmer assistance to produce parallel codes for real-world applications. As a tool for finding missing compiler techniques, SUIF Explorer helps compiler researchers design the next-generation parallelizer. Our experience with the Explorer shows that interprocedural array liveness analysis is an enabler of several important optimizations, such as privatization and array contraction.
We developed and evaluated an efficient context-sensitive and flow-sensitive interprocedural array liveness algorithm and integrated it into the parallelizer. We use the liveness information to enable contraction of arrays that are not live at loop exits, which results in a smaller memory footprint and better cache utilization. The resulting codes run faster on both uni- and multi-processors. Another key interprocedural analysis which we developed and evaluated is the array reduction analysis. Our reduction algorithm extends beyond previous approaches in its ability to locate reductions to array regions, even in the presence of arbitrarily complex data dependences. To exploit the multiprocessors effectively, the algorithm can locate interprocedural reductions, reduction operations that span multiple procedures. In summary, we successfully apply the Explorer to help the user develop parallel codes effectively and to help the compiler researcher develop the next-generation parallelizer. %U ftp://reports.stanford.edu/pub/cstr/reports/csl/tr/00/807/CSL-TR-00-807.pdf %R KSL-TR-89-68 %Z Mon, 25 Apr 94 00:00:00 GMT %I Stanford University, Department of Computer Science, Knowledge Systems Laboratory %T The Parallel Solution of Classification Problems %A Maegawa, Hirotoshi %D April 1994 %X We developed a problem solving framework called ConClass capable of classifying continuous real-time problems dynamically and concurrently on a distributed system. ConClass provides an efficient development environment for describing and decomposing a classification problem and synthesizing solutions. In ConClass, the designed concurrency of decomposed subproblems effectively corresponds to the actual distributed computation components. This scheme is useful for designing and implementing efficient distributed processing, making it easier to anticipate and evaluate the system behavior. The ConClass system has an object replication feature in order to prevent a particular object from being overloaded. An efficient execution mechanism is implemented without using schedulers or synchronization schemes liable to become bottlenecks. In order to deal with an indeterminate amount of problem data, ConClass dynamically creates object networks to justify hypothesized solutions and thus achieves dynamic load distribution. We confirmed the efficiency of the parallel distributed processing and load balancing of ConClass with an experimental application. %U ftp://reports.stanford.edu/pub/cstr/reports/ksl/tr/89/68/KSL-TR-89-68.pdf %R NA-M-80-02 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A generalized eigenvalue approach for solving Riccati equations %A Van Dooren, Paul M. %D July 1980 %X A numerically stable algorithm is derived to compute orthonormal bases for any deflating subspace of a regular pencil $\lambda$B-A. The method is based on an update of the QZ-algorithm, in order to obtain any desired ordering of eigenvalues in the quasi-triangular forms constructed by this algorithm. As applications we discuss a new approach to solve Riccati equations arising in linear system theory. The computation of deflating subspaces with specified spectrum is shown to be of crucial importance here.
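The Riccati application in NA-M-80-02 amounts to reordering the generalized Schur form so that an orthonormal basis for the stable deflating subspace can be read off the decomposition. The sketch below illustrates that use for the continuous-time equation $A^T X + XA - XGX + Q = 0$ via SciPy's reordered QZ applied to the pencil $\lambda$E - H with H the Hamiltonian matrix and E = I; it shows the deflating-subspace idea, not the report's updated QZ algorithm, and the setup is assumed for illustration.

    import numpy as np
    from scipy.linalg import ordqz, solve

    def care_via_deflating_subspace(A, G, Q):
        """Solve A'X + XA - XGX + Q = 0 from the stable deflating
        subspace of the pencil lambda*E - H, with E = I."""
        n = A.shape[0]
        H = np.block([[A, -G], [-Q, -A.T]])
        E = np.eye(2 * n)
        # Reorder so the n eigenvalues in the open left half plane come
        # first; the leading n columns of Z then span the corresponding
        # deflating (here: invariant) subspace.
        HH, EE, alpha, beta, Qz, Z = ordqz(H, E, sort='lhp', output='real')
        U1, U2 = Z[:n, :n], Z[n:, :n]
        return solve(U1.T, U2.T).T            # X = U2 @ inv(U1)

Under the usual stabilizability and detectability assumptions the selected subspace has dimension exactly n and the recovered X is the symmetric stabilizing solution.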
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/02/NA-M-80-02.pdf %R NA-M-80-03 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Computation of zeros of linear multivariable systems %A Emami-Naeini, Abbas %A Van Dooren, Paul M. %D July 1980 %X Several algorithms have been proposed in the literature for the computation of the zeros of a linear system described by a state-space model {$\lambda$I - A,B,C,D}. In this report we discuss the numerical properties of a new algorithm and compare it with some earlier techniques for computing zeros. The new approach is shown to handle both nonsquare and/or degenerate systems without difficulty, whereas earlier methods would either fail or require special treatment for these cases. The method is also shown to be backward stable in a rigorous sense. Several numerical examples are given in order to compare the speed and accuracy of the algorithm with its nearest competitors. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/03/NA-M-80-03.pdf %R NA-M-80-05 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Efficient solution of the biharmonic equation %A Bjorstad, Petter E. %D September 1980 %X A new method for the numerical solution of the first biharmonic problem in a rectangular region is outlined. The theoretical complexity of the method is $N^2$ + O(N) storage and O($N^2$) arithmetic operations (in order to achieve a prescribed accuracy on an N by N grid). Numerical results from a computer code that requires $aN^2 + bN^2\log N + O(N)$ operations, with $b \ll a$, are presented using both a scalar and a vector computer. Extensions and some applications of the method for solving eigenvalue problems and certain nonlinear problems are mentioned. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/05/NA-M-80-05.pdf %R NA-M-80-06 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A new implementation of sparse Gaussian elimination %A Schreiber, Robert S. %D September 1980 %X An implementation of sparse ${LDL}^T$ and LU factorization and back-substitution, based on a new scheme for storing sparse matrices, is presented. The new method appears to be as efficient in terms of work and storage as existing schemes. It is more amenable to efficient implementation on fast pipelined scientific computers. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/06/NA-M-80-06.pdf %R NA-M-80-08 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Rational Chebyshev approximation on the unit disk %A Trefethen, Lloyd N. %D October 1980 %X In a recent paper we showed that error curves in polynomial Chebyshev approximation of analytic functions on the unit disk tend to approximate perfect circles about the origin. Making use of a theorem of Caratheodory and Fejer, we derived in the process a method for calculating near-best approximations rapidly by finding the principal singular value and corresponding singular vector of a complex Hankel matrix. This paper extends these developments to the problem of Chebyshev approximation by rational functions, where non-principal singular values and vectors of the same matrix turn out to be required.
The theory is based on certain extensions of the Caratheodory-Fejer result which are also currently finding application in the fields of digital signal processing and linear systems theory. It is shown among other things that if f($\epsilon z$) is approximated by a rational function of type (m,n) for $\epsilon$ > 0, then under weak assumptions the corresponding error curves deviate from perfect circles of winding number $m+n+1$ by a relative magnitude O(${\epsilon}^{m+n+2}$) as $\epsilon\ \rightarrow\ 0$. The "CF approximation" that our method computes approximates the true best approximation to the same high relative order. A numerical procedure for computing such approximations is described and shown to give results that confirm the asymptotic theory. Approximation of $e^z$ on the unit disk is taken as a central computational example. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/08/NA-M-80-08.pdf %R NA-M-80-09 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Finite-difference methods for singular perturbation and Navier-Stokes problems %A Schreiber, Robert S. %D November 1980 %X The linear equation $\epsilon u_{xx} + xu_x$ = 0, 0 < x < 1, is proposed as a model for investigating interesting features of the behavior of difference methods for realistic multidimensional nonlinear elliptic problems, especially Navier-Stokes problems. We give an analytic and experimental comparison of several difference schemes for this model problem. An unusual scheme for the Navier-Stokes equations is suggested by these results. An experiment shows that this scheme performs better than a more obvious one. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/80/09/NA-M-80-09.pdf %R NA-M-81-11 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Bifurcation problems for discrete variational inequalities %A Mittelmann, Hans Detlef %D April 1981 %X The buckling of a beam or a plate subject to obstacles is typical of the variational inequalities that are considered here. Bifurcation is known to occur from the first eigenvalue of the linearized problem. For a discretization, the bifurcation point and the bifurcating branches may be obtained by solving a constrained optimization problem. An algorithm is proposed and its convergence is proved. The buckling of a clamped beam subject to point obstacles is considered in the continuous case and some numerical results for this problem are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/81/11/NA-M-81-11.pdf %R NA-M-81-12 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Group velocity in finite difference schemes %A Trefethen, Lloyd N. %D April 1981 %X The relevance of group velocity to the behavior of finite difference models of time-dependent partial differential equations is surveyed and illustrated. Applications involve the propagation of wave packets in one and two dimensions, numerical dispersion, the behavior of parasitic waves, and the stability analysis of initial boundary-value problems. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/81/12/NA-M-81-12.pdf %R NA-M-81-13 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Large time step shock-capturing techniques for scalar conservation laws %A LeVeque, Randall J.
%D July 1981 %X For a scalar conservation law $u_t = {f(u)}_x$ with $f''$ of constant sign, the first order upwind difference scheme is a special case of Godunov's method. The method is equivalent to solving a sequence of Riemann problems at each step and averaging the resulting solution over each cell in order to obtain the numerical solution at the next time level. The difference scheme is stable (and the solutions to the associated sequence of Riemann problems do not interact) provided the Courant number $\nu$ is less than 1. By allowing and explicitly handling such interactions, it is possible to obtain a generalized method which is stable for $\nu$ much larger than 1. In many cases the resulting solution is considerably more accurate than solutions obtained by other numerical methods. In particular, shocks can be correctly computed with virtually no smearing. The generalized method is rather unorthodox and still has some problems associated with it. Nonetheless, preliminary results are quite encouraging. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/81/13/NA-M-81-13.pdf %R NA-M-81-14 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T An efficient algorithm for bifurcation problems of variational inequalities %A Mittelmann, Hans Detlef %D September 1981 %X For a class of variational inequalities on a Hilbert space $H$, bifurcating solutions exist and may be characterized as critical points of a functional with respect to the intersection of the level surfaces of another functional and a closed convex subset $K$ of $H$. In a recent paper we have used a gradient-projection type algorithm to obtain the solutions for discretizations of the variational inequalities. A related but Newton-based method is given here. Global and asymptotically quadratic convergence is proved. Numerical results show that it may be used very efficiently in following the bifurcating branches and that it compares favorably with several other algorithms. The method is also attractive for a class of nonlinear eigenvalue problems ($K = H$) for which it reduces to a generalized Rayleigh-quotient iteration, so some results are included for path following in turning point problems. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/81/14/NA-M-81-14.pdf %R NA-M-81-16 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Numerical methods based on additive splittings for hyperbolic partial differential equations %A LeVeque, Randall J. %A Oliger, Joseph E. %D October 1981 %X We derive and analyze several methods for systems of hyperbolic equations with wide ranges of signal speeds. These techniques are also useful for problems whose coefficients have large mean values about which they oscillate with small amplitude. Our methods are based on additive splittings of the operators into components that can be approximated independently on the different time scales, some of which are sometimes treated exactly. The efficiency of the splitting methods is seen to depend on the error incurred in splitting the exact solution operator. This is analyzed and a technique is discussed for reducing this error through a simple change of variables. A procedure for generating the appropriate boundary data for the intermediate solutions is also presented.
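The additive-splitting idea in NA-M-81-16 can be shown in a few lines for a linear semi-discrete system u' = (L_slow + L_fast)u: advance each component with a method matched to its time scale and accept the splitting error that the report analyzes. A minimal first-order (Lie) splitting sketch, with dense matrices and an exact treatment of the fast component; the setup is assumed for illustration.

    import numpy as np
    from scipy.linalg import expm

    def lie_split_step(L_slow, L_fast, u, dt):
        """One first-order splitting step for u' = (L_slow + L_fast) u:
        a cheap explicit step for the slow scale, exact propagation of
        the fast scale. The O(dt) splitting error comes from the two
        components not commuting."""
        u = u + dt * (L_slow @ u)     # forward Euler on the slow part
        u = expm(dt * L_fast) @ u     # exact exponential on the fast part
        return u

The report's analysis quantifies exactly this kind of splitting error and shows how a change of variables can reduce it; the sketch only fixes the shape of the computation.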
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/81/16/NA-M-81-16.pdf %R NA-M-82-03 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Generalized iterative methods for semidefinite linear systems %A Schreiber, Robert S. %D June 1982 %X In this paper, we consider iterative solution procedures for solving singular linear systems Ax = b, $b \in Range(A)$, where A is an n by n, Hermitian, positive semidefinite matrix. Our aim is to consider variants of the block Jacobi, SOR, and SSOR iterations. The fundamental paper of Keller ([1965]) considers methods based on splittings A = B - C with B a nonsingular matrix. Here we allow B to be singular. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/82/03/NA-M-82-03.pdf %R NA-M-83-01 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Stability analysis of finite difference schemes for the advection-diffusion equation %A Chan, Tony F. %D January 1983 %X We present a collection of stability results for finite difference approximations to the advection-diffusion equation $u_t\ = a u_x\ + b u_{xx}$. The results are for centered difference schemes in space and include explicit and implicit schemes in time up to fourth order and schemes that use different space and time discretizations for the advective and diffusive terms. The results are derived from a uniform framework based on the Schur-Cohn theory of Simple von Neumann Polynomials and are necessary and sufficient for the stability of the Cauchy problem. Some of the results are believed to be new. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/83/01/NA-M-83-01.pdf %R NA-M-83-02 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Adaptive mesh refinement for hyperbolic partial differential equations %A Berger, Marsha J. %A Oliger, Joseph E. %D March 1983 %X We present an adaptive method based on the idea of multiple, component grids for the solution of hyperbolic partial differential equations using finite difference techniques. Based upon Richardson-type estimates of the truncation error, refined grids are created or existing ones removed to attain a given accuracy for a minimum amount of work. Our approach is recursive in that fine grids can themselves contain even finer grids. The grids with finer mesh width in space also have a smaller mesh width in time, making this a mesh refinement algorithm in time and space. We present the algorithm, data structures and grid generation procedure, and conclude with numerical examples in one and two space dimensions. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/83/02/NA-M-83-02.pdf %R NA-M-83-27 %Z Mon, 09 Oct 00 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The use of pre-conditioning over irregular regions %A Golub, Gene H. %A Mayers, David F. %D June 1983 %X Some ideas and techniques for solving elliptic PDEs over irregular regions are discussed. The basic idea is to break up the domain into subdomains and then to use the pre-conditioned conjugate gradient method for obtaining the solution over the entire domain. The solution of Poisson's equation over a $T$-shaped region is described in some detail and a numerical example is given.
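The abstract of NA-M-83-27 does not spell out the subdomain solves, but the surrounding machinery is the standard preconditioned conjugate gradient iteration, sketched below; apply_Minv stands in for whatever fast solve on the subdomains (for example, a fast Poisson solver on each rectangular piece) the preconditioner provides, and the name is an assumption for illustration.

    import numpy as np

    def pcg(A, b, apply_Minv, tol=1e-8, maxit=200):
        """Preconditioned conjugate gradients for A x = b with A
        symmetric positive definite (all quantities are floats)."""
        x = np.zeros_like(b)
        r = b.copy()
        z = apply_Minv(r)
        p = z.copy()
        rz = r @ z
        for _ in range(maxit):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                break
            z = apply_Minv(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

The quality of the decomposition shows up entirely through apply_Minv: the closer the preconditioner is to A, the fewer iterations the loop above needs.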
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/83/27/NA-M-83-27.pdf %R NA-M-84-30 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Mesh-independent spectra in the moving finite element equations %A Wathen, Andrew J. %D August 1984 %X We derive the Moving Finite Element (MFE) equations for the solution of a scalar evolutionary equation in $d$ space dimensions ($d \geq\ 1$) and introduce the elementwise approach to MFE. This approach yields a decomposition of the mesh- and solution-dependent matrix $A$ in the (semi-discretised) non-linear system of ordinary differential equations $A(y)\dot{y} = g(y)$ which forms the basis for proofs of eigenvalue clustering. With a simple, specific block diagonal preconditioner, $D$, it is shown that the eigenvalue spectrum of the preconditioned MFE matrix $D^{-1} A$ is contained in [$\frac{1}{2} , 1 + \frac{d}{2}$] independently of the mesh configuration, the solution and the number of nodes. A more specific result is established for the case $d$ = 1. These results guarantee extremely rapid solution techniques using, for example, conjugate gradient methods. We show how the analysis extends to systems of partial differential equations when a separate moving mesh is used for each component. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/84/30/NA-M-84-30.pdf %R NA-M-85-32 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Simultaneous computation of stationary probabilities with estimates of their sensitivity %A Golub, Gene H. %A Meyer, Carl D. Jr. %D March 1985 %X For an n-state finite, homogeneous, ergodic Markov chain with transition matrix P = [$p_{ij}$], the stationary distribution is the unique row vector $\pi$ satisfying $\pi P = \pi , \sum {\pi}_i\ = 1$. Letting $A_{n\times n}$ and $e_{n\times 1}$ denote the matrices A = I - P and e = ${[1, 1, ..., 1]}^T$, the stationary distribution $\pi$ can be characterized as the unique solution to the linear system of equations defined by $\pi$A = 0 and $\pi$e = 1. The theory of finite Markov chains has long been a fundamental tool in the analysis of social and biological phenomena. More recently the ideas embodied in Markov chain models along with the analysis of a stationary distribution have proven to be useful in applications which do not fall directly into the traditional Markov chain setting. Some of these applications include the analysis of queueing networks (Kaufman [1984]), the analysis of compartmental ecological models (Funderlic and Mankin [1981]), and least squares adjustment of geodetic networks (Brandt [1983]). Recently, the behavior of the numerical solution of systems of nonlinear reaction-diffusion equations has been analyzed by making use of the stationary distribution of a finite Markov chain in conjunction with the concept of group matrix inversion (Galeone [1983]). An ergodic chain manifests itself in the transition matrix P, which must be row stochastic and irreducible. Of central importance is the sensitivity of the stationary distribution $\pi$ to perturbations in the transition probabilities in P. The sensitivity of $\pi$ is most easily gauged by considering the transition probabilities in P to be differentiable functions. One approach, adopted by Conlisk [preprint, 1983], Schweitzer [1968], and Funderlic and Heath [1971], is to examine the partial derivatives $\partial\pi /\partial p_{ij}$.
Our strategy is to consider the transition probabilities $p_{ij}$(t) as differentiable functions of a single parameter t and study the stationary distribution $\pi$(t) as a function of t. We present a new and very simple formulation for the derivative, d$\pi$(t)/dt, of the stationary distribution directly in terms of the derivatives d$p_{ij}$(t)/dt and entries from $\pi$(t) and a matrix $A^{\#}$(t), called the group inverse of A(t) = I - P(t). After the derivative d$\pi$(t)/dt has been obtained, we demonstrate its applicability by using it to deduce the relative sensitivity of a discrete Markov chain. This is followed by a first order perturbation analysis. Finally, it is demonstrated how a QR factorization can be used to simultaneously compute $\pi$ along with estimates which gauge the sensitivity of $\pi$ to perturbations in P. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/85/32/NA-M-85-32.pdf %R NA-M-85-33 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Multitasking the conjugate gradient on the CRAY X-MP/48 %A Meurant, Gerard A. %D August 1985 %X We show how to efficiently implement the preconditioned conjugate gradient method on a four-processor CRAY X-MP/48. We solve block tridiagonal systems using block preconditioners well suited to parallel computation. Numerical results are presented that exhibit nearly optimal speed-up and high Mflops rates. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/85/33/NA-M-85-33.pdf %R NA-M-86-36 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The truncated SVD as a method for regularization %A Hansen, Per Christian %D October 1986 %X The truncated singular value decomposition (SVD) is considered as a method for regularization of ill-posed linear least squares problems. In particular, the truncated SVD solution is compared with the usual regularized solution. Necessary conditions are given under which the two methods will yield similar results. This investigation suggests the truncated SVD as a favorable alternative to standard-form regularization in the case of ill-conditioned matrices with a well-determined rank. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/86/36/NA-M-86-36.pdf %R NA-M-86-37 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A survey of matrix inverse eigenvalue problems %A Boley, Daniel L. %A Golub, Gene H. %D November 1986 %X In this paper, we present a survey of some recent results regarding direct methods for solving certain symmetric inverse eigenvalue problems. The problems we discuss in this paper are those of generating a symmetric matrix, either Jacobi, banded, or some variation thereof, given only some information on the eigenvalues of the matrix itself and some of its principal submatrices. Much of the motivation for the problems discussed in this paper came from an interest in the inverse Sturm-Liouville problem. A preliminary version of this report was issued as a technical report of the Computer Science Department, University of Minnesota, TR 86-20, May 1986. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/86/37/NA-M-86-37.pdf %R NA-M-87-01 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The convergence of inexact Chebyshev and Richardson iterative methods for solving linear systems %A Golub, Gene H. %A Overton, Michael L.
%D February 1987 %X The Chebyshev and second-order Richardson methods are classical iterative schemes for solving linear systems. We consider the convergence analysis of these methods when each step of the iteration is carried out inexactly. This has many applications, since a preconditioned iteration requires, at each step, the solution of a linear system, which may be solved inexactly using an "inner" iteration. We derive an error bound which applies to the general nonsymmetric inexact Chebyshev iteration. We show how this simplifies slightly in the case of a symmetric or skew-symmetric iteration, and we consider both the cases of underestimating and overestimating the spectrum. We show that in the symmetric case, it is actually advantageous to underestimate the spectrum when the spectral radius and the degree of inexactness are both large. This is not true in the case of the skew-symmetric iteration. We show how similar results apply to the Richardson iteration. Finally, we describe numerical experiments which illustrate the results and suggest that the Chebyshev and Richardson methods, with reasonable parameter choices, may be more effective than the conjugate gradient method in the presence of inexactness. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/87/01/NA-M-87-01.pdf %R NA-M-87-02 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Estimates of eigenvalues for iterative methods %A Golub, Gene H. %A Kent, Mark D. %D February 1987 %X We describe procedures for determining estimates of the eigenvalues of operators used in various iterative methods for the solution of linear systems of equations. We also show how to determine upper and lower bounds for the error in the approximate solution of linear equations using essentially the same information as that needed for the eigenvalue calculations. The methods described depend strongly upon the theory of moments and Gauss quadrature. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/87/02/NA-M-87-02.pdf %R NA-M-87-04 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The convergence rate of inexact preconditioned steepest descent algorithms for solving linear systems %A Munthe-Kaas, Hans %D March 1987 %X The steepest descent algorithm is a classical iterative method for solving a linear system Ax=b, where A is a positive definite symmetric matrix. A common way to accelerate an iterative scheme is to precondition the method, i.e. to solve a simpler system Mz=r in each stage of the iteration. We analyze the effect of solving the preconditioner inexactly. A lower bound for the convergence rate is derived, and we show under what conditions this lower bound is obtained. Finally we describe some numerical experiments which show that in practical situations the lower bound may be too pessimistic. An amusing result is that in some cases small errors may lead to $\underline{higher}$ convergence rates than if the preconditioner is solved exactly! %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/87/04/NA-M-87-04.pdf %R NA-M-87-05 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Some history of the conjugate gradient and Lanczos algorithms: 1948-1976 %A Golub, Gene H. %A O'Leary, Dianne P.
%D June 1987 %X This manuscript gives some of the history of the conjugate gradient and Lanczos algorithms and an annotated bibliography for the period 1948-1976. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/87/05/NA-M-87-05.pdf %R NA-M-87-06 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Numerical assessment of the validity of two-dimensional plate models %A Miara, Bernadette %D June 1987 %X The objective of this paper is to verify numerically the convergence of the solution to the three-dimensional problem of a clamped plate towards the solution to the corresponding "limit" two-dimensional problem when the thickness of the plate goes to zero. Standard finite element discretizations of the three-dimensional problem fail to show this convergence [M. Vidrascu, 1978], as they lead to ill-conditioned linear systems when the discretization parameter is of the order of the thickness. We will therefore use a spectral approximation of the solution of the three-dimensional problem. First, we shall review the three-dimensional and two-dimensional linear models of a clamped plate and give the convergence results obtained by P.-G. Ciarlet and P. Destuynder [1979], [1981]. Then we will discuss two kinds of spectral approximations: the Galerkin and Tau approximations. Finally we give the numerical results obtained by the Tau approximation. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/87/06/NA-M-87-06.pdf %R NA-M-89-01 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Iterative methods for cyclically reduced non-self-adjoint linear systems %A Elman, Howard C. %A Golub, Gene H. %D February 1989 %X We study iterative methods for solving linear systems of the type arising from two-cyclic discretizations of non-self-adjoint two-dimensional elliptic partial differential equations. A prototype is the convection-diffusion equation. The methods consist of applying one step of cyclic reduction, resulting in a "reduced system" of half the order of the original discrete problem, combined with a reordering and a block iterative technique for solving the reduced system. For constant coefficient problems, we present analytic bounds on the spectral radii of the iteration matrices in terms of cell Reynolds numbers that show the methods to be rapidly convergent. In addition, we describe numerical experiments that supplement the analysis and that indicate that the methods compare favorably with methods for solving the "unreduced" system. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/01/NA-M-89-01.pdf %R NA-M-89-03 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The restricted singular value decomposition: properties and applications %A De Moor, Bart L. R. %A Golub, Gene H. %D April 1989 %X The restricted singular value decomposition (RSVD) is the factorization of a given matrix, relative to two other given matrices. It can be interpreted as the ordinary singular value decomposition with different inner products in the row and column spaces. Its properties and structure are investigated in detail, as well as its connection to generalized eigenvalue problems, canonical correlation analysis and other generalizations of the singular value decomposition.
Applications that are discussed include the analysis of the extended shorted operator, unitarily invariant norm minimization with rank constraints, rank minimization in matrix balls, the analysis and solution of linear matrix equations, rank minimization of a partitioned matrix and the connection with generalized Schur complements, and constrained linear and total linear least squares problems with mixed exact and noisy data, including a generalized Gauss-Markov estimation scheme. Two constructive proofs of the RSVD in terms of other generalizations of the ordinary singular value decomposition are provided as well. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/03/NA-M-89-03.pdf %R NA-M-89-05 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Generalized singular value decompositions: a proposal for a standardized nomenclature %A De Moor, Bart L. R. %A Golub, Gene H. %D April 1989 %X An alphabetic and mnemonic system of names for several matrix decompositions related to the singular value decomposition is proposed: the OSVD, PSVD, QSVD, RSVD, SSVD, TSVD. The main purpose of this note is to propose a standardization of the nomenclature and the structure of these matrix decompositions. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/05/NA-M-89-05.pdf %R NA-M-89-06 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T On the structure and geometry of the product singular value decomposition %A De Moor, Bart L. R. %D May 1989 %X The product singular value decomposition is a factorization of two matrices, which can be considered as a generalization of the ordinary singular value decomposition, at the same level of generality as the quotient (generalized) singular value decomposition. A constructive proof of the product singular value decomposition is provided, which exploits the close relation with a symmetric eigenvalue problem. Several interesting properties are established. The structure and the non-uniqueness properties of the so-called contragredient transformation, which appears as one of the factors in the product singular value decomposition, are investigated in detail. Finally, a geometrical interpretation of the structure is provided in terms of principal angles between subspaces. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/06/NA-M-89-06.pdf %R NA-M-89-07 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Iterative methods for cyclically reduced non-self-adjoint linear systems II %A Elman, Howard C. %A Golub, Gene H. %D June 1989 %X We perform an analytic and experimental study of line iterative methods for solving linear systems arising from finite difference discretizations of non-self-adjoint elliptic partial differential equations on two-dimensional domains. The methods consist of performing one step of cyclic reduction, followed by solution of the resulting reduced system by line relaxation. We augment previous analyses of one-line methods, and we derive a new convergence analysis for two-line methods, showing that both classes of methods are highly effective for solving the convection-diffusion equation. In addition, we compare the experimental performance of several variants of these methods, and we show that the methods can be implemented efficiently on parallel architectures.
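The reduction step that NA-M-89-01 and NA-M-89-07 build on is easiest to see in the one-dimensional scalar analogue of their block systems: eliminate the odd unknowns of a tridiagonal system, solve the half-size system in the even unknowns, then recover the odd unknowns locally. A serial sketch, using a dense solve for the reduced system purely for brevity; the reports work with block systems from two-dimensional red-black discretizations.

    import numpy as np

    def cyclic_reduction_solve(a, b, c, f):
        """Solve a[i]x[i-1] + b[i]x[i] + c[i]x[i+1] = f[i]
        (with a[0] = c[-1] = 0) by one step of cyclic reduction."""
        n = len(b)
        x = np.zeros(n)
        ev = np.arange(0, n, 2)
        m = len(ev)
        B = np.zeros((m, m))
        g = np.zeros(m)
        for k, i in enumerate(ev):            # build the reduced system
            B[k, k] = b[i]
            g[k] = f[i]
            if i - 1 >= 0:                    # eliminate odd unknown i-1
                alpha = a[i] / b[i - 1]
                B[k, k] -= alpha * c[i - 1]
                g[k] -= alpha * f[i - 1]
                B[k, k - 1] = -alpha * a[i - 1]
            if i + 1 < n:                     # eliminate odd unknown i+1
                gamma = c[i] / b[i + 1]
                B[k, k] -= gamma * a[i + 1]
                g[k] -= gamma * f[i + 1]
                if k + 1 < m:
                    B[k, k + 1] = -gamma * c[i + 1]
        x[ev] = np.linalg.solve(B, g)         # reduced (half-size) solve
        for i in range(1, n, 2):              # back-substitute odd unknowns
            xp = x[i + 1] if i + 1 < n else 0.0
            x[i] = (f[i] - a[i] * x[i - 1] - c[i] * xp) / b[i]
        return x

In the reports the same elimination is applied blockwise to one color of a red-black ordered grid, and the reduced system is attacked by line relaxation rather than by a direct solve.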
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/07/NA-M-89-07.pdf %R NA-M-89-09 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T On generating polynomials which are orthogonal over several intervals %A Golub, Gene H. %A Fischer, Bernd %D August 1989 %X We consider the problem of generating the recursion coefficients of orthogonal polynomials for a given weight function. The weight function is assumed to be the weighted sum of weight functions, each supported on its own interval. Some of these intervals may coincide, overlap, or be contiguous. We discuss three algorithms. Two of them are based on modified moments, whereas the other is based on an explicit expression for the desired coefficients. Several examples, illustrating the numerical performance of the various methods, are presented. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/09/NA-M-89-09.pdf %R NA-M-89-12 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Backward error assertions for checking solutions to systems of linear equations %A Boley, Daniel L. %A Golub, Gene H. %A Makar, Samy R. %A Saxena, Nirmal R. %A McCluskey, Edward J. %D November 1989 %X This paper presents an assertion scheme based on backward error analysis for error detection in algorithms that solve a system of linear equations, Ax = b. This Backward Error Assertion Model can be easily instrumented in a Watchdog processor environment. The complexity of verifying assertions is O($n^2$), compared to the O($n^3$) complexity of algorithms solving Ax = b. Unlike other proposed error detection methods, this assertion model does not require any encoding of the matrix A. Experimental results under various error models are presented to validate the effectiveness of these assertions. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/89/12/NA-M-89-12.pdf %R NA-M-90-01 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Line iterative methods for cyclically reduced discrete convection-diffusion problems %A Elman, Howard C. %A Golub, Gene H. %D February 1990 %X We perform an analytic and empirical study of line iterative methods for solving the discrete convection-diffusion equation. The methodology consists of performing one step of the cyclic reduction method, followed by iteration on the resulting reduced system using line orderings of the reduced grid. Two classes of iterative methods are considered: block stationary methods, such as the block Gauss-Seidel and SOR methods, and preconditioned generalized minimum residual methods with incomplete LU preconditioners. New analysis extends convergence bounds for constant coefficient problems to problems with separable variable coefficients. In addition, analytic results show that iterative methods based on incomplete LU preconditioners have faster convergence rates than block Jacobi relaxation methods. Numerical experiments examine additional properties of the two classes of methods, including the effects of direction of flow, discretization, and grid ordering on performance. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/90/01/NA-M-90-01.pdf %R NA-M-90-06 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The nonsymmetric Lanczos algorithm and controllability %A Boley, Daniel L. %A Golub, Gene H.
%D May 1990 %X We give a brief description of a non-symmetric Lanczos algorithm that does not require strict bi-orthogonality among the generated vectors. We show how the vectors generated are algebraically related to the "Controllable Space" and "Observable Space" of a related linear dynamical system. The algorithm described is particularly appropriate for large sparse systems. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/90/06/NA-M-90-06.pdf %R NA-M-90-07 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Adaptive Lanczos methods for recursive condition estimation %A Ferng, William R. %A Golub, Gene H. %A Plemmons, Robert J. %D June 1990 %X Estimates for the condition number of a matrix are useful in many areas of scientific computing, including recursive least squares computations, optimization, eigenanalysis, and general nonlinear problems solved by linearization techniques where matrix modification techniques are used. The purpose of this paper is to propose an adaptive Lanczos estimator scheme, which we call ale, for tracking the condition number of the modified matrix over time. Applications to recursive least squares (RLS) computations using the covariance method with sliding data windows are considered. ale is fast for the relatively small n-parameter problems arising in RLS methods in control and signal processing, and is adaptive over time, i.e., estimates at time t are used to produce estimates at time t + 1. Comparisons are made with other adaptive and non-adaptive condition estimators for recursive least squares problems. Numerical experiments are reported indicating that ale yields a very accurate recursive condition estimator. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/90/07/NA-M-90-07.pdf %R NA-M-91-05 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Iterative solution of linear systems %A Freund, Roland W. %A Golub, Gene H. %A Nachtigal, Noel M. %D November 1991 %X Recent advances in the field of iterative methods for solving large linear systems are reviewed. The main focus is on developments in the area of conjugate gradient-type algorithms and Krylov subspace methods for non-Hermitian matrices. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/91/05/NA-M-91-05.pdf %R NA-M-91-06 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T How to generate unknown orthogonal polynomials out of known orthogonal polynomials %A Golub, Gene H. %A Fischer, Bernd %D November 1991 %X We consider the problem of generating the three-term recursion coefficients of orthogonal polynomials for a weight function $v(t) = r(t)w(t)$, obtained by modifying a given weight function $w$ by a rational function $r$. Algorithms for the construction of the orthogonal polynomials for the new weight $v$ in terms of those for the old weight $w$ are presented. All the methods are based on modified moments. As applications we present Gaussian quadrature rules for integrals in which the integrand has singularities close to the interval of integration, and the generation of orthogonal polynomials for the (finite) Hermite weight $e^{-t^{2}}$, supported on a finite interval [$-b,b$].
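For comparison with the modified-moment algorithms of NA-M-91-06, the classical discretized Stieltjes procedure generates the same three-term recursion coefficients directly from inner products; the report's methods are instead based on modified moments. A sketch, assuming a smooth weight on a finite interval and Gauss-Legendre quadrature for the integrals.

    import numpy as np

    def stieltjes_recurrence(weight, lo, hi, n, quad_pts=400):
        """Recursion coefficients alpha_k, beta_k of the monic polynomials
        orthogonal w.r.t. `weight` on [lo, hi], for
        p_{k+1}(t) = (t - alpha_k) p_k(t) - beta_k p_{k-1}(t)."""
        x, w = np.polynomial.legendre.leggauss(quad_pts)
        t = 0.5 * (hi - lo) * x + 0.5 * (hi + lo)    # map nodes to [lo, hi]
        w = 0.5 * (hi - lo) * w * weight(t)          # quadrature times weight
        alpha, beta = np.zeros(n), np.zeros(n)
        p_prev, p = np.zeros_like(t), np.ones_like(t)
        norm_prev = 1.0
        for k in range(n):
            norm = w @ (p * p)                       # <p_k, p_k>
            alpha[k] = (w @ (t * p * p)) / norm      # <t p_k, p_k>/<p_k, p_k>
            beta[k] = w.sum() if k == 0 else norm / norm_prev
            p_prev, p = p, (t - alpha[k]) * p - beta[k] * p_prev
            norm_prev = norm
        return alpha, beta

For the finite Hermite weight $e^{-t^{2}}$ on [$-b,b$] mentioned in the abstract, stieltjes_recurrence(lambda t: np.exp(-t**2), -b, b, n) should return alpha coefficients that vanish by symmetry, a convenient sanity check.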
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/91/06/NA-M-91-06.pdf %R NA-M-91-03 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Direct block tridiagonalization of single-input single-output systems %A Golub, Gene H. %A Kagstrom, Bo T. %A Dooren, Paul M. Van %D July 1991 %X In this paper we derive a direct method for block tridiagonalizing a single-input single-output system triple $\{A,b,c\}$. The method is connected to the nonsymmetric Lanczos procedure developed in [Wilkinson, 1965], [Boley/Golub, 1990], and [Boley/Elhay/Golub/Gutknecht, 1990], and also leads to canonical representations of such triples. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/91/03/NA-M-91-03.pdf %R NA-M-91-04 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Fast iterative solution of stabilised Stokes systems. Part I: Using simple diagonal preconditioners %A Wathen, Andrew J. %A Silvester, David J. %D October 1991 %X Mixed finite element approximation of the classical Stokes problem describing slow viscous incompressible flow gives rise to symmetric indefinite systems for the discrete velocity and pressure variables. Iterative solution of such indefinite systems is feasible and is an attractive approach for large problems. The use of stabilisation methods for convenient (but unstable) mixed elements introduces stabilisation parameters. We show how these can be chosen to obtain rapid iterative convergence. We propose a conjugate gradient-like method (the method of preconditioned conjugate residuals) which is applicable to symmetric indefinite problems, describe the effects of stabilisation on the algebraic structure of the discrete Stokes operator and derive estimates of the eigenvalue spectrum of this operator on which the convergence rate of the iteration depends. Here we discuss the simple case of diagonal preconditioning. Our results apply to both locally and globally stabilised mixed elements as well as to elements which are inherently stable. We demonstrate that convergence rates comparable to those achieved using the diagonally scaled conjugate gradient method applied to the discrete Laplacian are approachable for the Stokes problem. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/91/04/NA-M-91-04.pdf %R NA-M-92-01 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A look-ahead algorithm for the solution of general Hankel systems %A Freund, Roland W. %A Zha, Hongyuan %D January 1992 %X The solution of linear systems of equations with Hankel coefficient matrices can be computed with only $O(n^2)$ arithmetic operations, as compared to $O(n^3)$ operations for the general case. However, the classical Hankel solvers require the nonsingularity of all leading principal submatrices of the Hankel matrix. The known extensions of these algorithms to general Hankel systems can handle only exactly singular submatrices, but not ill-conditioned ones, and hence they are numerically unstable. In this paper, a stable procedure for solving general nonsingular Hankel systems is presented, using a look-ahead technique to skip over singular or ill-conditioned submatrices. The proposed approach is based on a look-ahead variant of the nonsymmetric Lanczos process that was recently developed by Freund, Gutknecht, and Nachtigal.
We first derive a somewhat more general formulation of this look-ahead Lanczos algorithm in terms of formally orthogonal polynomials, which then yields the look-ahead Hankel solver as a special case. We prove some general properties of the resulting look-ahead algorithm for formally orthogonal polynomials. These results are then utilized in the implementation of the Hankel solver. We report some numerical experiments for Hankel systems with ill-conditioned submatrices. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/01/NA-M-92-01.pdf %R NA-M-92-02 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Recent advances in Lanczos-based iterative methods for nonsymmetric linear systems %A Freund, Roland W. %A Golub, Gene H. %A Nachtigal, Noel M. %D January 1992 %X In recent years, there has been a true revival of the nonsymmetric Lanczos method. On the one hand, the possible breakdowns in the classical algorithm are now better understood, and so-called look-ahead variants of the Lanczos process have been developed, which remedy this problem. On the other hand, various new Lanczos-based iterative schemes for solving nonsymmetric linear systems have been proposed. This paper gives a survey of some of these recent developments. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/02/NA-M-92-02.pdf %R NA-M-92-04 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T On the convergence of line iterative methods for cyclically reduced non-symmetrizable linear systems %A Elman, Howard C. %A Golub, Gene H. %A Starke, Gerhard C. %D May 1992 %X We derive analytic bounds on the convergence factors associated with block relaxation methods for solving the discrete two-dimensional convection-diffusion equation. The analysis applies to the reduced systems derived when one step of block Gaussian elimination is performed on red-black ordered two-cyclic discretizations. We consider the case where centered finite difference discretization is used and one cell Reynolds number is less than one in absolute value and the other is greater than one. It is shown that line ordered relaxation exhibits very fast rates of convergence. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/04/NA-M-92-04.pdf %R NA-M-92-05 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Adaptive Chebyshev iterative methods for nonsymmetric linear systems based on modified moments %A Calvetti, Daniela %A Golub, Gene H. %A Reichel, Lothar %D May 1992 %X Large, sparse nonsymmetric systems of linear equations with a matrix whose eigenvalues lie in the right half plane may be solved by an iterative method based on Chebyshev polynomials for an interval in the complex plane. Knowledge of the convex hull of the spectrum of the matrix is required in order to choose parameters upon which the iteration depends. Adaptive Chebyshev algorithms, in which these parameters are determined by using eigenvalue estimates computed by the power method or modifications thereof, have been described by Manteuffel [1978]. This paper presents adaptive Chebyshev iterative methods, in which eigenvalue estimates are computed from modified moments determined during the iterations. 
The computation of eigenvalue estimates from modified moments requires less computer storage than when eigenvalue estimates are computed by a power method and yields faster convergence for many problems. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/05/NA-M-92-05.pdf %R NA-M-92-09 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T An implementation of a generalized Lanczos procedure for structural dynamic analysis on distributed memory computers %A Mackay, David R. %A Law, Kincho H. %D August 1992 %X This paper describes a parallel implementation of a generalized Lanczos procedure for structural dynamic analysis on a distributed memory parallel computer. One major cost of the generalized Lanczos procedure is the factorization of the (shifted) stiffness matrix and the forward and backward solution of triangular systems. In this paper, we discuss load assignment of a sparse matrix and propose a strategy for inverting the principal block submatrix factors to facilitate the forward and backward solution of triangular systems. We also discuss the different strategies in the implementation of mass matrix-vector multiplication on parallel computers and how they are used in the Lanczos procedure. The Lanczos procedure implemented includes partial and external selective reorthogonalizations and spectral shifts. Experimental results are presented to illustrate the effectiveness of the parallel generalized Lanczos procedure. The issues of balancing the computations among the basic steps of the Lanczos procedure on distributed memory computers are discussed. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/09/NA-M-92-09.pdf %R NA-M-92-10 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A parallel row-oriented sparse solution method for finite element structural analysis %A Law, Kincho H. %A Mackay, David R. %D August 1992 %X This paper describes a parallel implementation of $LDL^T$ factorization on a distributed memory parallel computer. Specifically, the parallel $LDL^T$ factorization procedure is based on a row-oriented sparse storage scheme. In addition, a strategy is proposed for the parallel solution of triangular systems of equations. The strategy is to compute the inverses of the dense principal diagonal block submatrices of the factor $L$, stored in a row-oriented structure. Experimental results for a number of finite element models are presented to illustrate the effectiveness of the parallel solution schemes. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/10/NA-M-92-10.pdf %R NA-M-92-11 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T A new approach for solving perturbed symmetric eigenvalue problems %A Carey, Cheryl M. M. %A Chen, Hsin-Chu %A Golub, Gene H. %A Sameh, Ahmed H. %D September 1992 %X In this paper, we present a new approach to the solution of a series of slightly perturbed symmetric eigenvalue problems $(A + BS_{i}B^{T})x = \lambda x$, $0 \leq i \leq m$, where $A = A^{T} \in R^{n\times n}$, $B \in R^{n\times p}$, and $S_{i} = S_{i}^{T} \in R^{p\times p}$, $p \ll n$. The matrix $B$ is assumed to have full column rank.
The main idea of our approach lies in a specific choice of starting vectors used in the block Lanczos algorithm so that the effect of the perturbations is confined to lie in the first diagonal block of the block tridiagonal matrix that is produced by the block Lanczos algorithm. Subsequently, for the perturbed eigenvalue problems under our consideration, the block Lanczos scheme needs to be applied to the original (unperturbed) matrix only once; the first diagonal block is then updated for each perturbation, so that for low-rank perturbations the algorithm presented in this paper results in significant savings. Numerical examples based on finite element vibration analysis illustrate the advantages of this approach. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/11/NA-M-92-11.pdf %R NA-M-92-12 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Matrix shapes invariant under the symmetric QR algorithm %A Arbenz, Peter %A Golub, Gene H. %D September 1992 %X It is shown which zero patterns of symmetric matrices are preserved under the QR algorithm. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/12/NA-M-92-12.pdf %R NA-M-92-13 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T The canonical correlations of matrix pairs and their numerical computation %A Golub, Gene H. %A Zha, Hongyuan %D September 1992 %X This paper is concerned with the analysis of canonical correlations of matrix pairs and their numerical computation. We first develop a decomposition theorem for matrix pairs having the same number of rows which explicitly exhibits the canonical correlations. We then present a perturbation analysis of the canonical correlations, which compares favorably with the classical first order perturbation analysis. Then we propose several numerical algorithms for computing the canonical correlations of general matrix pairs; emphasis is placed on the case of large sparse or structured matrices. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/13/NA-M-92-13.pdf %R NA-M-92-14 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Cyclic reduction/multigrid %A Golub, Gene H. %A Tuminaro, Ray S. %D September 1992 %X We consider the use of the multigrid method in conjunction with a cyclic reduction preconditioner for convection-diffusion equations. This preconditioner corresponds to algebraically eliminating all the unknowns associated with the red points on a standard mesh colored in a checker-board fashion. It is shown that the multigrid method applied to the resulting operator often converges much faster than when applied to the original equations. Fourier analysis of a constant coefficient model problem as well as numerical results for nonconstant coefficient examples are used to validate the conclusions. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/14/NA-M-92-14.pdf %R NA-M-92-15 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Fast solution of the Helmholtz equation with radiation condition by imbedding %A Ernst, Oliver G. %D October 1992 %X No abstract available.
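The red-black elimination that defines the cyclic reduction preconditioner of NA-M-92-14 above amounts to taking a Schur complement with respect to the red unknowns. The Python sketch below performs one such reduction step on a one-dimensional convection-diffusion model problem; the mesh, coefficients, and problem size are illustrative assumptions, not taken from the report.

    import numpy as np

    # One step of cyclic reduction viewed as red-black Schur complement
    # elimination, on a 1-D convection-diffusion model problem.
    n = 8                                    # number of interior grid points
    h, beta = 1.0 / (n + 1), 0.5             # mesh width, convection coefficient
    A = (np.diag(2.0 * np.ones(n))
         + np.diag((-1.0 - beta * h / 2) * np.ones(n - 1), -1)
         + np.diag((-1.0 + beta * h / 2) * np.ones(n - 1), 1))
    f = np.ones(n)

    red, black = np.arange(0, n, 2), np.arange(1, n, 2)   # checker-board split
    Arr, Arb = A[np.ix_(red, red)], A[np.ix_(red, black)]
    Abr, Abb = A[np.ix_(black, red)], A[np.ix_(black, black)]

    # Eliminate the red unknowns: reduced (Schur complement) system in x_black.
    S = Abb - Abr @ np.linalg.solve(Arr, Arb)
    g = f[black] - Abr @ np.linalg.solve(Arr, f[red])
    x = np.empty(n)
    x[black] = np.linalg.solve(S, g)
    x[red] = np.linalg.solve(Arr, f[red] - Arb @ x[black])

    print(np.allclose(A @ x, f))             # True: the reduction is exact

In the multigrid setting of the report, the smoother and coarse-grid correction are then applied to the reduced operator $S$ rather than to $A$ itself.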
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/15/NA-M-92-15.pdf %R NA-M-92-16 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Model problems in numerical stability theory for initial value problems %A Stuart, Andrew M. %A Humphries, Antony R. %D November 1992 %X In the past, numerical stability theory for initial value problems in ordinary differential equations has been dominated by the study of problems with essentially trivial dynamics. Whilst this has resulted in a coherent and self-contained body of knowledge, it has not thoroughly addressed the problems of real interest in applications. Recently there have been a number of studies of numerical stability for wider classes of problems admitting more complicated dynamics. This on-going work is unified and possible directions for future work are outlined. In particular, striking similarities between this new developing stability theory and the classical non-linear stability theory are emphasised. The classical theories of $A$, $B$, and algebraic stability for Runge-Kutta methods are briefly reviewed, and it is emphasised that the classes of equations to which these theories apply - linear decay and contractive problems - only admit trivial dynamics. Four other categories of equations - gradient, dissipative, conservative and Hamiltonian systems - are considered. Relationships and differences between the possible dynamics in each category, which range from multiple competing equilibria to fully chaotic solutions, are highlighted and it is stressed that the wide range of possible behaviour allows a large variety of applications. Runge-Kutta schemes which preserve the dynamical structure of the underlying problem are sought, and indications of a strong relationship between the developing stability theory for these new categories and the classical existing stability theory for the older problems are given. Algebraic stability, in particular, is seen to play a central role. The effects of error control are considered, and multi-step methods are discussed briefly. Finally, various open problems are described. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/16/NA-M-92-16.pdf %R NA-M-92-18 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T An analysis of local error control for dissipative, contractive and gradient dynamical systems %A Stuart, Andrew M. %A Humphries, Antony R. %D November 1992 %X The dynamics of numerical methods with local error control are studied for three classes of ordinary differential equations: dissipative, contractive and gradient systems. Dissipative dynamical systems are characterised by having a bounded absorbing set $\cal B$ which all trajectories eventually enter and remain inside. The exponentially contractive problems studied have a unique, globally attracting equilibrium point and thus they are also dissipative since the absorbing set $\cal B$ may be chosen to be a ball of arbitrarily small radius around the equilibrium point. The gradient systems studied are those for which the set of equilibria comprises isolated points and all trajectories are bounded so that each trajectory converges to an equilibrium point as $t \rightarrow \infty$. If the set of equilibria is bounded, then the gradient systems are also dissipative. The aim is to find conditions under which numerical methods with local error control replicate these large-time dynamical features.
The results are proved without recourse to asymptotic expansions for the truncation error. Standard embedded Runge-Kutta pairs are analysed together with several non-standard error control strategies. These non-standard strategies are easy to implement and have desirable properties within certain of the classes of problems studied. Both error per step and error per unit step strategies are considered. Certain embedded pairs are identified for which the sequence generated can be viewed as coming from a small perturbation of an algebraically stable scheme, with the size of the perturbation proportional to the tolerance $\tau$. Such embedded pairs are defined to be algebraically stable, and explicit algebraically stable pairs are identified. Conditions on the tolerance $\tau$ are identified under which appropriate discrete analogues of the properties of the underlying differential equation may be proved for certain algebraically stable embedded pairs. In particular, it is shown that for dissipative problems the discrete dynamical system has an absorbing set ${\cal B}_{\tau}$ and is hence dissipative. For exponentially contractive problems the radius of ${\cal B}_{\tau}$ is proved to be proportional to a positive power of $\tau$. For gradient systems the numerical solution enters and remains in a small ball about one of the equilibria and the radius of the ball $\rightarrow 0$ as $\tau \rightarrow 0$. Thus the local error control mechanisms confer desirable global properties on the numerical solution. It is shown that for error per unit step strategies the conditions on the tolerance $\tau$ are independent of initial data whilst for error per step strategies the conditions are initial data dependent. Thus error per unit step strategies are considerably more robust. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/18/NA-M-92-18.pdf %R NA-M-92-20 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T Use of linear algebra kernels to build an efficient finite element solver %A Elman, Howard C. %A Lee, Dennis K.-Y. %D December 1992 %X For scientific codes to achieve good performance on computers with hierarchical memories, it is necessary that the ratio of memory references to arithmetic operations be low. In this paper, we show that Level 3 BLAS linear algebra kernels can be used to satisfy this requirement to produce an efficient implementation of a parallel finite element solver on a shared memory parallel computer with a fast cache memory. %U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/20/NA-M-92-20.pdf %R NA-M-92-21 %Z Sun, 28 Jan 96 00:00:00 GMT %I Stanford University, Department of Computer Science, Numerical Analysis Project %T On the error computation for polynomial based iteration methods %A Fischer, Bernd %A Golub, Gene H. %D December 1992 %X In this note we investigate the Chebyshev iteration and the conjugate gradient method applied to the system of linear equations $Ax = f$ where $A$ is a symmetric, positive definite matrix. For both methods we present algorithms which approximate during the iteration process the $k$th error $\varepsilon_k = \|x - x_k\|_{A}$. The algorithms are based on the theory of modified moments and Gaussian quadrature. The proposed schemes are also applicable to other polynomial iteration schemes. Several examples, illustrating the performance of the described methods, are presented.
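The quantity that NA-M-92-21 estimates is the A-norm error of the conjugate gradient iterate. The Python sketch below runs plain conjugate gradients and tracks $\varepsilon_k = \|x - x_k\|_{A}$ directly against a precomputed solution; this is only a yardstick for experiments, not the report's moment-based estimator, whose point is to approximate $\varepsilon_k$ during the iteration without knowing $x$. The test matrix and sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)            # an SPD test matrix (an assumption)
    f = rng.standard_normal(n)
    x_true = np.linalg.solve(A, f)         # reference solution for the yardstick

    # Standard conjugate gradient iteration for Ax = f.
    x = np.zeros(n)
    r = f - A @ x
    p = r.copy()
    for k in range(25):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
        e = x_true - x
        print(k, np.sqrt(e @ (A @ e)))     # the kth A-norm error eps_k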
%U ftp://reports.stanford.edu/pub/cstr/reports/na/m/92/21/NA-M-92-21.pdf %R SEL-TR-83-003 %Z Thu, 03 Dec 98 00:00:00 GMT %I Stanford University, Stanford Electronic Laboratories %T Timing Models for MOS Circuits %A Horowitz, Mark A. %D December 1983 %X Performance is an important aspect of integrated circuit design, and depends in part on the speed of the underlying circuits. This thesis presents a new method of analyzing MOS circuit delay, based on a single-time-constant approximation. The timing models characterize the circuit by a single parameter, which depends on the resistance and capacitance of the circuit elements. To ensure the single-time-constant approximation is valid for a particular circuit, the timing models provide both an estimate and bounds for the output waveform. For circuits where the bounds are poor, an improved timing model is derived. These simple models provide insight into circuit performance issues, as well as a means of determining the circuit delay. The timing models are first developed for linear networks and then are extended to model MOS circuits driven by a step input. By using the single-time-constant approximation, the output waveform of a complex MOS circuit can be modelled by the output of a circuit consisting of a single MOS transistor and a single capacitor. Finally, a new circuit model of a gate is used to derive the output waveform of a circuit driven by an arbitrary input. The resulting timing model does not depend strongly on the shape of the input: the output waveform only depends on the input's slope at the gate's switching voltage. %U ftp://reports.stanford.edu/pub/cstr/reports/sel/tr/83/003/SEL-TR-83-003.pdf
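The single-parameter characterization in SEL-TR-83-003 collapses an RC network to one time constant. The familiar Elmore-style sum $\tau = \sum_i C_i (R_1 + \cdots + R_i)$ for an RC ladder is one standard way of computing such a parameter, and the Python sketch below uses it; this is our illustration of the general idea, not the thesis's own model, which also supplies bounds on the output waveform. Component values are illustrative assumptions.

    import itertools

    # Single-time-constant estimate for an RC ladder: driver resistance R[0],
    # then R[1], ..., with node capacitances C[0], C[1], ... along the line.
    R = [1000.0, 500.0, 500.0]          # ohms, from driver toward the load
    C = [10e-15, 20e-15, 30e-15]        # farads at each node

    upstream = list(itertools.accumulate(R))          # R_1 + ... + R_i
    tau = sum(c + 0.0 for c in [0.0]) + sum(c * r for c, r in zip(C, upstream))
    print(f"tau = {tau * 1e12:.1f} ps")               # 50% delay is about 0.69 * tau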