# COMPUTER SYSTEMS LABORATORY STANFORD UNIVERSITY STANFORD, CA 94305455 # Optimization of Combinational Logic Circuits Based on Compatible Gates Maurizio Damiani Jerry Chih-Yuan Yang Giovanni De Micheli Technical Report: CSL-TR-93-584 September 1993 This research is sponsored by NSF and DEC under a PYI award and by ARPA and NSF under contract MIP 9115432. # Optimization of Combinational Logic Circuits Based on Compatible Gates Maurizio Damiani \* Jerry Chih- Yuan Yang Giovanni De Micheli Technical Report: CSL-TR-93-584 September, 1993 Computer Systems Laboratory Departments of Electrical Engineering and Computer Science Stanford University, Stanford CA 94305-4055 #### **Abstract** This paper presents a set of new techniques for the optimization of multiple-level combinational Boolean networks. We describe first a technique based upon the selection of appropriate multiple-output subnetworks (consisting of so-called compatible gates) whose local functions can be optimized simultaneously. We then generalize the method to larger and more arbitrary subsets of gates. Because simultaneous optimization of local functions can take place, our methods are more powerful and general than Boolean optimization methods using don't cares, where only single-gate optimization can be performed. In addition, our methods represent a more efficient alternative to optimization procedures based on Boolean relations because the problem can be modeled by a unate covering problem instead of the more difficult binate covering problem. The method is implemented in program ACHILLES and compares favorably to Sis. Key Words and Phrases: Combinational logic synthesis, don't care methods. <sup>\*</sup>Now with the Dipartimento di Elettronica ed Informatica, Universitá di Padova, Via Gradenigo 6/A, Padova, Italy. Copyright © 1993 by Maurizio Damiani and Jerry Chih-Yuan Yang and Giovanni De Micheli # **Contents** | 1 | Introduction | 1 | |---|---------------------------------------------------|-----------| | 2 | Terminology 2.1 Previous Work | 2 | | 3 | Compatible Gates 3.1 Optimizing Compatible Gates | <b>7</b> | | 4 | Finding Compatible Gates | 12 | | 5 | Unate Optimization 5.1 Optimizing Unate Subsets | <b>17</b> | | 6 | Implementation and Results | 22 | | 7 | Conclusion | 23 | | 8 | Acknowledgement | 23 | # Optimization of Combinational Logic Circuits Based on Compatible Gates Maurizio Damiani Jerry Chih- Yuan Yang Giovanni De Micheli Computer Systems Laboratory Departments of Electrical Engineering and Computer Science Stanford University, Stanford CA 94305 #### 1 Introduction Logic synthesis has been traditionally divided into two-level and multiple-level synthesis. Two-level synthesis has been intensely researched from theoretical and engineering perspectives, and efficient algorithms for exact [1, 2, 3, 4] and approximate [5, 6, 7] solutions are available. Exact optimization algorithms for multiple-level logic networks have also been considered [8]. They are, however, generally impractical even for medium-sized networks. For this reason, many efficient approximation algorithms have been developed over the past decade. Such algorithms can be classified according to the algebraic/Boolean type of operations they perform. Algebraic techniques, such as factoring and kerneling, are described in [9]. As algebraic methods do not take full advantage of the properties of Boolean algebra, a spectrum of Boolean optimization techniques has been developed in parallel. Such techniques consist mainly of iteratively refining an initial network by identifying subnetworks to be optimized, deriving their associated degrees of freedom (expressed by so-called **don't care conditions)**, and replacing such subnetworks by simpler, optimized ones. The independent optimization of the local function of a network, called **single-gate optimization**, lies at one end of the spectrum. It has been shown [10, 11] that the degrees of freedom associated with a single gate can be represented by a **don't care set**. Once this set is obtained, two-level synthesis algorithms can be used to optimize the subnetwork [11]. The concurrent optimization of several local functions, called **multiple-gate optimization**, lies at the other end of the spectrum. Multiple-gate optimization has been shown to offer potentially better quality networks as compared to single-gate optimization because of the additional degrees of freedom associated with the re-design of larger blocks of logic. Exact methods for multiple-gate optimization, first analyzed in [12], have been shown to best exploit the degrees of freedom. Unfortunately these methods suffer from two major disadvantages. First, even for small subnetworks, the number of primes that have to be derived can be remarkably large; second, given the set of primes, it entails the solution of an often complex *binate covering* **problem**, for which efficient algorithms are still the subject of investigation. As a result, the overall efficiency of the method is limited, and only relatively small networks can currently be handled. Heuristic approximations to multiple-gate optimization include the use of **compatible don't cares** [10] which allows us to extend **don't care** based optimization to multiple functions by suitably restricting the individual **don't care** sets associated with each function. Although such methods are applicable to large networks, the restriction placed on **don't care** sets reduces the degrees of freedom and hence possibly the quality of the results. The binate nature of the covering problem arises essentially from the arbitrariness of the subnetwork selected for optimization. In this paper, we develop alternative techniques for the optimization of multiple-output subnetworks. These techniques are based upon a temporary transformation of a network into an internally unate one, and on an accurate choice of the subnetworks to be optimized. The difficult binate covering step is avoided, and yet an optimization quality superior to don't care -based methods with comparable efficiency is achieved because multiple local functions can be optimized simultaneously. To this regard, first we introduce the notion of compatible set of gates as a subset of gates whose optimization can be solved exactly by classical two-level synthesis algorithms. We show that the simultaneous optimization of compatible gates allows us to reach optimal solutions not achievable by conventional don't care methods. We then leverage upon these results and present an algorithm for the optimization of more general subnetworks in an internally unate network. The algorithms have been implemented and tested on several benchmark circuits, and the results in terms of literal savings as well as CPU time are very promising. # 2 Terminology Let $\mathcal{B}$ denote the Boolean set $\{0, 1\}$ . A k-dimensional Boolean vector $\mathbf{x} = [x_1, \dots, x_k]^T$ is an element of the set $\mathcal{B}^k$ (bold-facing is hereafter used to denote vector quantities. In particular, the symbol 1 denotes a vector whose components are all 1). A $n_i$ -input, $n_o$ -output Boolean function F is a mapping F: $\mathcal{B}^{n_i} \to \mathcal{B}^{n_o}$ . We use x to denote the set of primary inputs, and F to denote the set of primary output functions. A **literal function**, or **literal**, is the function expressed by a variable or its complement. A **cube** c is the product of some literals. A logic network is a collection of local single-output functions called **gates**. The set of local functions is denoted by y(x) where $y_i$ is the variable associated with the output of each gate $g_i$ , and in general can be expressed as a function of primary inputs. Figure 1 illustrates the relationship of various terms. The **cofactors** (or **residues**) of a function F with respect to a variable $x_i$ are the functions Figure 1: Example of a Logic Network. $\mathbf{F}_{x_i} = \mathbf{F}(x_1, \dots, x_i = 1, \dots, x_n)$ and $\mathbf{F}_{x_i'} = \mathbf{F}(x_1, \dots, x_i = \mathbf{0}, \dots, x_n)$ . The universal quantification or **consensus** of a function $\mathbf{F}$ with respect to a variable $x_i$ is the function $\forall_{x_i} \mathbf{F} = \mathbf{F}_{x_i} \mathbf{F}_{x_i'}$ . The **existential quantification** or **smoothing** of a function $\mathbf{F}$ with respect to $x_i$ is defined as $\exists_{x_i} \mathbf{F} = \mathbf{F}_{x_i'} + \mathbf{F}_{x_i'}$ . A scalar function $F_1$ **contains** $F_2$ (denoted by $F_1 \geq F_2$ ) if $F_2 = 1$ implies $F_1 = 1$ . The containment relation holds for two vector functions if it holds component-wise. A function **F** is termed **positive unate** in $x_i$ if $\mathbf{F}_{x_i} \geq \mathbf{F}_{x_i'}$ , and **negative unate** if $\mathbf{F}_{x_i} \leq \mathbf{F}_{x_i'}$ . Otherwise the function is termed **binate** in $x_i$ . A function **F** positive (negative) unate in a variable $x_i$ can always be expressed without using the literal $x_i'$ (xi) [13]. The desired terminal behavior of a combinational network is **specified** by two functions, ON(x) and DC(x), the latter in particular representing the input combinations that either do not occur or such that the value of some of the network outputs is regarded as irrelevant [11]. The functions ON and **DC** identify the set of possible terminal behaviors for the network: specifications are met by an implementation, realizing a function $\mathbf{F}(\mathbf{x})$ if and only if $\mathbf{F}(\mathbf{x}) = \mathbf{ON}(\mathbf{x})$ for every input x not in **DC**. Another, equivalent, description of the set of terminal behaviors is in terms of the functions $\mathbf{F}_{\min} = \mathrm{ON}$ . **DC'** and $\mathbf{F}_{max} = \mathrm{ON} + \mathbf{DC}$ . Specifications are met by $\mathbf{F}$ if $$\mathbf{F}_{min} \le \mathbf{F} \le \mathbf{F}_{max} \tag{1}$$ We consider hereafter specifications directly in terms of a pair $\mathbf{F}_{min}$ , $\mathbf{F}_{max}$ . #### 2.1 Previous Work Most Boolean methods for multiple-level logic synthesis rely upon two-level synthesis engines. For this reason and in order to establish some essential terminology, we first review some basic concepts of two-level synthesis. #### Two-level Synthesis Consider the synthesis of a (single-output) network whose output y is to satisfy Eq. (1), imposing a realization of y as a sum of cubes $c_k$ : $$\mathbf{F}_{min} \le Y = \sum_{k=1}^{N} c_k \le F_{max} \tag{2}$$ The upper bound in Eq. (2) **b** k difan only if each cube $c_k$ satisfies the inequality $$c_k \le F_{max} \tag{3}$$ Any such cube is termed an **implicant.** An implicant is termed **prime** if no Literal can be removed from it without violating the inequality (3). For the purpose of logic optimization, only prime implicants need be considered [13, 7]. Each implicant $c_k$ has an associated **cost** $w_k$ , which depends on the technology under consideration. For example, in PLA minimization all implicants take the same area, and therefore have identical cost; in a multiple-level context, the number of literals can be taken as cost measure [9]. The cost of a sum of implicants is usually taken as the sum of the individual costs. Once the list of primes has been built, a minimum-cost cover of $\mathbf{F}_{min}$ is determined by solving: minimize: $$\sum_{k=1}^{N} \alpha_k w_k$$ ; subject to: $F_{min} \leq \sum_{k=1}^{N} \alpha_k c_k$ (4) where the Boolean parameters $\alpha_k$ are used in this context to **parameterize** the search space: they are set to 1 if $c_k$ appears in the cover, and to 0 otherwise. The approach is extended easily to the synthesis of multiple-output circuits by defining **multiple-output primes** [13, 7]. A multiple-output prime is a prime of the product of some components of $\mathbf{F}_{max}$ . These components are termed the **influence set** of the prime. Branch-and-bound methods can be used to solve exactly the covering problem. Engineering solutions have been thoroughly analyzed, for example, in [7], and have made two-level synthesis feasible for very large problems. Eq. (4) can be rewritten as $$\forall_{x_1,\dots,x_n} \left( \sum_{k=1}^N \alpha_k c_k(\mathbf{x}) + F'_{min}(\mathbf{x}) \right) = 1$$ (5) The left-hand side of Eq. (5) represents a Boolean function $F_{\alpha}$ of the parameters $\alpha_i$ only; the constraint equation (4) is therefore equivalent to $$F_{\alpha} = \mathbf{1} \tag{6}$$ The conversion of Eq. (4) into Eq. (6) is known in the literature as **Petrick's method** [13]. Two properties of two-level synthesis are worth remarking in the context of this paper. First, once the list of primes has been built, we are guaranteed that no solution will violate the upper bound in Eq. (1), so that only the lower bound needs to be considered (as explicited by Eq. (4)). Similarly, only the upper bound needs to be considered during the extraction of primes. Second, the effect of adding/removing a cube from a partial cover of $\mathbf{F}_{min}$ is always predictable: that partial cover is increased/decreased. This property eases the problem of sifting the primes during the covering step, and it is reflected by the unateness of $F_{\alpha}$ : intuitively, by switching any parameter $\alpha_i$ from 0 to 1, we cannot decrease our chances of satisfying Eq. (6). These are important attributes of the problem that need to be preserved in its generalizations. #### Don't care -based Multiple-level Optimization Two-level optimization is the basic engine in **don't care** -based multiple-level logic optimization, where it is used to iteratively optimize single-output gates in the network. Consider a single-output subnetwork, with local output y, to be re-synthesized. The primary output F of the overall network can be expressed in terms of the signal y: $$\mathbf{F} = \mathbf{F}(\mathbf{x}, y) = y' \mathbf{F}_{y'} t y \mathbf{F}_{y} = (y\mathbf{1} t \mathbf{F}_{y'})(y'\mathbf{1} + \mathbf{F}_{y})$$ $$(7)$$ By replacing Eq. (7) in Eq. (1), it follows that y must satisfy: $$F_{min} \le y' F_{y't} y F_y \le F_{max}$$ (8) A constraint on y similar to Eq. (1) can be obtained from Eq. (8) as follows. The upper bound in Eq. (8) holds if and only if $y'\mathbf{F}_{y'} \leq \mathbf{F}_{max}$ and $y\mathbf{F}_{y} \leq \mathbf{F}_{max}$ , i.e. $$y' \le \mathbf{F}_{max} + \mathbf{F}'_{y'}; \quad y \le \mathbf{F}_{max} + \mathbf{F}'_{y} \tag{9}$$ Eq. (9) can be rewritten as $$\mathbf{F}'_{max}\mathbf{F}_{y'} \le y\mathbf{1} \le \mathbf{F}_{max \ t} \ \mathbf{F}'_{y} \tag{10}$$ Similarly, the lower bound holds if and only if $\mathbf{F}_{y'}+$ yl $\geq \mathbf{F}_{min}$ and $\mathbf{F}_{y}+$ y'l $\geq \mathbf{F}_{min}$ , i.e. $$\mathbf{F}_{min}\mathbf{F}_{y'} \le y\mathbf{1} \le \mathbf{F}'_{min} + \mathbf{F}_{y} \tag{11}$$ Eq. (10) and (11) can be merged together, to obtain: $$\mathbf{F}_{min}\mathbf{F}'_{y'} + \mathbf{F}'_{max}\mathbf{F}_{y'} \le y\mathbf{1} \le (\mathbf{F}_{max} + \mathbf{F}'_{y})(\mathbf{F}'_{min \ t} \mathbf{F}_{y})$$ (12) Eq. (12) represents the exact degrees of freedom available in the synthesis of the signal y, and is formally identical to Eq. (1): the value of y is undetermined corresponding to those points for which the lower bound differs from the upper bound. Such points are the local **don't cares** for y, and Figure 2: Boolean Relations Optimization Example. are denoted by $DC_{,(x)}$ . 0 nce the bounds (or, equivalently, the **don't cares**) for y are computed, ordinary two-level synthesis algorithms can be applied.' #### **Boolean Relations-based Multiple-level Optimization** **Don't care** -based methods allow the optimization of only one single-output subnetwork at a time. It has been shown in [12] that this strategy may potentially produce lower-quality results with respect to a more general approach attempting the simultaneous optimization of multiple-output subnetworks. Let $\mathbf{y} = [y_1, y_2, \dots, y_m]$ denote the outputs of a subnetwork, to be re-synthesized, and let $\mathbf{F}(\mathbf{x}, \mathbf{y})$ denote the network outputs, expressed in terms of the variables $y_i$ . From equation(1), the functional constraints on $\mathbf{y}$ are expressed by $$\mathbf{F}_{min} \le \mathbf{F}(\mathbf{x}, \mathbf{y}) \le \mathbf{F}_{max} \tag{13}$$ An equation like Eq. (13) describes a **Boolean Relation**<sup>2</sup>. The synthesis problem consists of finding a minimum-cost realization of $y_1, \ldots, y_m$ such that Eq. (13) holds. An exact solution algorithm, targeting two-level realizations, is presented in [12]. We illustrate the additional difficulties of the covering step with respect to the ordinary two-level synthesis process by means of the following example. Example 1 Consider the optimization of gates $g_1$ and $g_2$ , with outputs $y_1$ and $y_2$ , in the circuit of Figure 2. Assuming no external don't care conditions, $F_{min} = F_{max} = a'b' + (ac + bd) \oplus (a'c' + a'b')$ , while $F = y_1 \oplus y_2 + a'b'$ . Eq. (13) then takes the form: $$a'b' + (ac + bd) \oplus (a'c' + a'b') \le y_1 \oplus y_2 + a'b'$$ $\le a'b' + (ac + bd) \oplus (a'c' + a'b')$ <sup>&</sup>lt;sup>1</sup>In practice, y is re-synthesized by taking advantage also of the other internal signals available in the network. Implicants and primes are in this context expressed in terms of primary inputs and other network variables. <sup>&</sup>lt;sup>2</sup>An alternative formulation of a Boolean Relation is by means of a characteristic equation: $R(\mathbf{x}, \mathbf{y}) = 1$ , where R is a Boolean function. It could be shown that the two formulations are equivalent. By the symmetry of the network with respect to $y_1$ and $y_2$ , cubes a'c', ac, bd, a'b' would be listed as implicants for both $y_1$ and $y_2$ . Consider constructing now a cover for $y_1$ and $y_2$ from such implicants. An initial partial cover, for example obtained by requiring the cover of the minterm abcd of $F_{min}$ , may consist of the cube ac assigned to $y_1$ . Consider now adding bd to $y_2$ , in order to cover the minterm abc'd of $F_{min}$ . Corresponding to the minterm abcd, now $y_1 \oplus y_2 = 0$ while $F_{min} = 1$ ; that is, the lower bound of Eq. (13) is violated. Similarly, with the input assignment a = 0, b = 1, c = 0, d = 1, the network output changed from the correct value 0 to 1, while $F_{max} = 0$ . Thus, also the upper bound is violated. Contrary to the case of unate covering problems, where the addition of an implicant to a partial cover can never cause the violation of any junctional constraints, here the addition of a single cube has caused the violation of both bounds in Eq. (13). $\Box$ There are two difficulties in solving a Boolean relation: First, when trying to express Eq. (13) in a form similar to Eq. (12), the upper and lower bounds on each $y_i$ may depend on other variables $y_j$ . This results in a binate covering step. Fast binate covering solvers are the subject of ongoing research [14]; nevertheless, the binate nature of the covering step reflects an intrinsic complexity which is not found in the unate case. In particular, as shown in the previous example, the effect of adding / removing a prime to a partial solution is no longer trivially predictable, and both bounds in Eq. (13) may be violated by the addition of a single cube. As a consequence, branch-and-bound solvers may (and usually do) undergo many more backtracks than with a unate problem of comparable size, resulting in a substantially increased CPU time. # 3 Compatible Gates The analysis of Boolean relations points out that binate problems arise because of the generally binate dependence of $\mathbf{F}$ on the variables $y_i$ . We introduce the notion of **compatible gates** in order to perform multiple-gate optimization while avoiding the binate covering problem. In the rest of the paper, given a network output expression $\mathbf{F}(\mathbf{x}, \mathbf{y})$ , $\mathbf{x}$ is the set of input variables and $\mathbf{y}$ is the set of gate outputs to be optimized. This relationship is shown in Figure 3. Definition 1 In a Boolean network, let $\mathbf{p}_j = p_j(x_1, \ldots, x_n)$ and $\mathbf{q} = q(x_1, \ldots, x_n)$ , where $j = 1, 2, \ldots, m$ , be junctions that do not depend on $y_1, \ldots, y_m$ . A subset of gates $\mathbf{S} = \{g_1, \ldots, g_m\}$ with outputs $y_1, \ldots, y_m$ and junctions is said to be compatible if the network input-output behavior $\mathbf{F}$ can be expressed as: $$\mathbf{F} = \sum_{j=1}^{m} y_j \mathbf{p}_j + \mathbf{q} \tag{14}$$ modulo a phase change in the variables $y_i$ or $\mathbf{F}$ . Figure 3: Network with Selected Gates Figure 4: Gates $g_1$ and $g_2$ are compatible. As shown in Sect. (3.1) below, compatible gates can be optimized jointly without solving binate covering problems. Intuitively, compatible gates are selected such that their optimization can only affect the outputs in a monotonic or unate way, and thereby forcing the covering problem to be unate. # Example 2 Consider the two-output circuit in Figure 4. Gates $g_1$ and $g_2$ are compatible because F and H can be written as $$F = (x_1 + x_3 t x_4')y_1 + (x_1 + x_2' + x_3)y_2$$ $$H = 0y_{1:t} 0y_{2:t} ((x_{1:t:x3} + x_4')(x_1 + x_2' + x_3))'$$ The compatibility of a set S of gates is a Boolean property. In order to ascertain it, one would have to verify that all network outputs can indeed be expressed as in Definition (3). This task is potentially very CPU-intensive. In Section 4, we present algorithms for constructing subsets of compatible gates from the network topology only. #### 3.1 Optimizing Compatible Gates The functional constraints for a set of compatible gates can be obtained by replacing Eq. (14) into Eq. (13). From Eq. (14) we obtain: $$\mathbf{F}_{min} \le \sum_{j=1}^{m} y_j \mathbf{p}_j + \mathbf{q} \le \mathbf{F}_{max} \tag{15}$$ Eq. (15) can be solved using steps similar to that of two-level optimization. In particular, the optimization steps **consist** of **implicant** extraction and the **covering** steps. #### Implicant Extraction Assuming that $\mathbf{q} \leq \mathbf{F}_{max}$ , the upper bound of Eq. (15) hald if nothly if for each product $y_j \mathbf{p}_j$ the inequality $$y_j \mathbf{p}_j \leq \mathbf{F}_{max}$$ is verified, i.e. if and only if $$y_j \mathbf{1} \le \mathbf{F}_{max} + \mathbf{p}_j'; \ \mathbf{j} = 1, \dots, \mathbf{m}$$ (16) or, equivalently, $$y_j \le F_{max,j}; \quad j = 1, \dots, m \tag{17}$$ where $F_{max,j}$ is the product of all the components of $\mathbf{F}_{max} + \mathbf{p}'_j$ . A cube c can thus appear in a two-level expression of $y_j$ if and only if $c \leq F_{max,j}$ . As this constraint is identical to Eq. (3), the prime-extraction strategies [13, 7] of ordinary two-level synthesis can be used. Example 3 Consider the optimization problem for gates $g_1$ and $g_2$ in Fig. (4). From Example (1), $$Pl = (x_{1t} \ x_{3t} \ x_4')'$$ $$P2 = (x_1 + x_2' + x_3)'$$ We assume no external don't care set. Consequently, $F_{min} = F_{max} = x_1x_2x_3' + x_2x_3x_4 + x_1'x_2'(x_3 + x_4')$ . The Karnaugh maps of $F_{min}$ and $F_{max}$ are shown in Fig. (5a), along with those of $p_1$ and $p_2$ . Fig. (5b) shows the maps of $F_{max,1} = F_{max} + p_1'$ and $F_{max,2} = F_{max} + p_2'$ , used for the extraction of the primes of $y_1$ and $y_2$ , respectively. The list of all multiple-output primes is given in Table (1). Note that primes 1 through 5 can be used by both $y_1$ and $y_2$ . $\square$ | | Primes | Influence sets | |-------|----------------|-------------------------------------| | $c_1$ | $x_1'x_2'x_3$ | $y_1, y_2$ | | $c_2$ | $x_1'x_2'x_4'$ | $y_1, y_2$ | | $c_3$ | $x_1'x_3x_4$ | $y_1, y_2$ | | $c_4$ | $x_2x_4$ | $y_1, y_2$ | | $c_5$ | $x_1x_2x_3'$ | $oldsymbol{y}_1$ , $oldsymbol{y}_2$ | | $c_6$ | $x_2x_3'$ | Y2 | | $c_7$ | $x_1'x_3'x_4'$ | | | $c_8$ | $x_1'x_2'$ | <u>y</u> 1 | | $c_9$ | $x_1'x_4$ | $y_1$ | Table 1: Multiple-output primes for Example (3). #### **Covering Step** Let N indicate the number of primes. For example, in the problem of Example (3), N = 9. We then impose a sum-of-products representation associated with each variable $y_i$ : $$y_j = \sum_{k=1}^N \alpha_{jk} c_k \tag{18}$$ with the only restriction that $\alpha_{jk} = 0$ if $y_j$ is not in the influence set of $c_k$ . Since the upper bound of Eq. (15) is now satisfied by construction (i.e. by implicant computation), the minimization of $y_1, \ldots, y_m$ can be formulated as a minimum-cost covering problem $$\mathbf{F}_{min} \le \mathbf{q} + \sum_{j=1}^{m} \sum_{k=1}^{N} \alpha_{jk} c_k \mathbf{p}_j \tag{19}$$ whose similarity with Eq. (4) is evident, the products $c_{jk}$ $\mathbf{p}_j$ now playing the role of the primes of two-level synthesis. Example 4 In the optimization problem of Example (3), we are to solve the covering problem $$F_{\min} \leq p_1 y_1 + p_2 y_2$$ Using the set of primes found in Example (3), $y_1$ and $y_2$ are expressed by $$y_1 = \alpha_{1,1}c_{1+} \quad \alpha_{1,2}c_{2+} \quad \alpha_{1,3}c_{3+} \quad \alpha_{1,4}c_{4+} \quad \alpha_{1,5}c_5 + \\ \alpha_{1,8}c_{8+} \quad \alpha_{1,9}c_9$$ $$y_{2}$$ = $\alpha_{2,1}c_{1}$ + $\alpha_{2,2}c_{2}$ + $\alpha_{2,3}c_{3}$ + $\alpha_{2,4}c_{4}$ + $\alpha_{2,5}c_{5}$ + $\alpha_{2,6}c_{6}$ + $\alpha_{2,7}c_{7}$ Figure 5: (a): Maps of $F_{min}$ , $F_{max}$ , $p_1$ , $p_2$ . (b) Maps of $F_{max,1}$ , $F_{max,2}$ and of the product $F_{max,1}F_{max,2}$ . Primes of $y_1$ and $y_2$ are shown in the maps of $F_{max,1}$ and $F_{max,2}$ , respectively. The map of $F_{max,1}F_{max,2}$ shows the primes common to $y_1$ and $y_2$ . Figure 6: A minimum-cost solution for the covering of $\mathbf{F}_{min}$ . The optimum solution has cost 6 and is given by $y_1 = x_1'x_2' + x_2x_4$ ; $y_2 = x_2x_3'$ , corresponding to the assignments $$\alpha_{1,1} = \alpha_{1,2} = \alpha_{1,3} = \alpha_{1,5} = \alpha_{1,9} = 0; \ \alpha_{1,4} = \alpha_{1,8} = 1$$ $$\alpha_{2,1} = \alpha_{2,2} = \alpha_{2,3} = \alpha_{2,4} = \alpha_{2,5} = \alpha_{2,7} = 0; \ \alpha_{2,6} = 1$$ The initial cost, in terms of literals, was 12. The solution corresponds to the cover shown in Fig. (6), and resulting in the circuit of Fig. (7). $\Box$ It is worth contrasting, in the above example, the role of $y_1$ and $y_2$ in covering $F_{min}$ . Before optimization, $p_1y_1$ covered the minterms $x_1x_2x_3'x_4'$ , $x_1x_2x_3'x_4$ , $x_1x_2x_3x_4$ of $F_{min}$ , while $p_2y_2$ covered Figure 7: Network resulting from the simultaneous optimization of compatible gates $g_1$ and $g_2$ . Figure 8: **Don't care** conditions associated with $y_1$ and $y_2$ : only 1 literal can be removed. $x_1'x_2'x_3'x_4'$ , $x_1'x_2'x_3x_4'$ , $x_1'x_2x_3x_4$ , $x_1'x_2'x_3x_4$ . After optimization, $y_1$ and $y_2$ essentially "switched role" in the cover: $p_2$ $y_2$ is now used for covering $x_1$ $x_2$ $x_3'$ $x_4'$ , $x_1$ $x_2$ $x_3'$ $x_4$ , while $p_1$ $y_1$ covers all other minterms. In the general case, the possibility for any of $y_1, \ldots, y_m$ to cover a minterm of $F_{min}$ is evident from Eq. (15). Standard single-gate optimization methods based on **don't cares** [11] regard the optimization of each gate $g_1, \ldots, g_m$ as separate problems, and therefore this degree of freedom is not used. For example, in the circuit of Fig. (4), the optimization of $g_1$ is distinct from that of $g_2$ . The **don't care** conditions associated to (say) $y_1$ are those minterms for which either $p_1 = 0$ or such that $p_2y_2 = 1$ , and are shown in the map of Fig. (8), along with the initial cover. It can immediately be verified that $y_1$ can only be optimized into $x_1x_2x_3'$ •1- $x_2x_4$ , saving only one literal. The **don't cares** for $y_2$ are also shown in Fig. (8). No optimization is possible in this case. Note also that the optimization result is (in this particular example) independent from the order in which $g_1$ and $g_2$ are optimized. Unlike the compatible gates case, it is impossible for the covers of $y_1$ and $y_2$ to "switch" role in covering $F_{min}$ . # **4 Finding Compatible Gates** In this section, we describe an algorithm for finding compatible gates based on network topology. Figure 9: Internally Unate Example: (a) Network not internally unate due to gate $g_i$ ; (b) Internally unate network after duplication (Duplicated gates are shaded). Definition 2 A network is termed unate with respect to a gate g if all reconvergent paths from g have the same parity of inversions. A network is internally unate if it is unate with respect to each of its gates. All paths from g to a primary output $z_i$ in an internally unate network have parity $\pi_i$ , which is defined to be the parity of g with respect to $z_i$ . In the subsequent analysis, we make the assumption that the network is first transformed into its equivalent NOR-only form. In this case, the parity of a path is simply the parity of the path length. In defining Equation (14) for compatible gates, it is evident that the dependency of $\mathbf{F}$ on $y_1, \ldots, y_m$ must be unate. In order to increase the chances of finding sets of compatible gates, it is thus convenient to transform a network into an internally unate one. This is done by duplicating those gates whose fanouts contain reconvergent paths with different inversion parity. The resulting network is therefore at most twice the size of the original one. In practice, the increase is smaller. Example 5 Given a logic network shown in Figure (9.a). The network is not internally unate because the reconvergent paths from gate $g_i$ to the output y do not have the same parity of inversions. We duplicate gate $g_i$ and its fan-in cone into $g'_i$ , shown by the shaded gates in Figure (9.b). Now gates $g_i$ and $g'_i$ are unate since there are no reconvergent paths from these gates. The network is now internally unate. The increase in size is in the number of gates in the fan-in cone of gate $g_i$ . Theorem (4.1) below provides a sufficient conditions for a set S of gates to be compatible. Without loss of generality, the theorem is stated in terms of networks with one primary output. The following auxiliary definitions are required: Definition 3 The famout gate set and famout edge set of a gate g, indicated by FO(g) and FOE(g), respectively, are the set of gates and interconnections contained in at least one path from g to the primary output. The fanout gate set and fanout edge set of a set of gates $S = \{g_1, \ldots, g_m\}$ , indicated by FO(S) and FOE(S), respectively, are: $$FO(S) = \bigcup_{i=1}^{m} FO(g_i); \quad FOE(S) = \bigcup_{i=1}^{m} FOE(g_i)$$ (20) Theorem 4.1 In a NO R-only, single-output network, let $S = \{g_1, \ldots, g_m\}$ be a set of gates all with parity $\pi$ , and not in each others' fanout. Let $y_1, \ldots, y_m$ denote their respective outputs. Assume that for all gates in FO(S) with parity $\pi$ has at most one interconnection in FOE(S), The outputs of all gates in the network can then be expressed by equations in the form of Eq. (14): $$\mathbf{F} = \sum_{j=1}^{m} y_j \mathbf{p}_j + \mathbf{q}$$ Namely, the output of each gate g in the network can be expressed by one of the following two rules: **Rule** 1: for gates of even parity, $$g = q_g + \sum_{j=1}^{m} p_{jg} y_j \tag{21}$$ Rule 2: for gates of odd parity, $$g = \left(q_g + \sum_{i=1}^{m} p_{ig} y_i\right)' \tag{22}$$ Consequently, S is a set of compatible gates, #### **Proof:** Assume the network gates to be sorted topologically, so that each gate precedes its fanout gates in the ordered list. Let **NGATES** denote the total number of gates. We prove the above proposition inductively, by showing that if it holds for the first r-1 gates, then it must hold for the $r^{th}$ gate, $r=1,\ldots,NGATES$ . Consider the first gate, $g_1$ . If $g_1 \in S$ , its output is simply $y_1$ , which can be obtained from Eq. (21), by setting $q_{g_1} = 0, p_{1g_1} = 1, p_{jg_1} = 0; j = 2, \ldots, m$ . If $g_1$ does not belong to S, by the properties of topological ordering, its inputs can only be among the primary inputs, and consequently its output is still expressed by Eq. (21), by setting $p_{jg_1} = 0$ . Consider now the $r^{th}$ gate, g., Again, if g., $\in$ S, the output is expressed by a single variable in $\{y_1, \ldots, y_m\}$ , and therefore it satisfies the proposition. If $g_r$ does not belong to $\mathcal{S}$ , we note that all its inputs are either primary inputs or gates $g_{r'}$ , r' < r, for which the proposition is true by the inductive assumption. We distinguish two cases: 1. g., is of even parity. Consequently, all its inputs have odd parity, and only one of them is a function of the internal variables $y_i$ . For simplicity, let $g_0$ denote the output that (possibly) depends on $y_1, \ldots, y_m$ . The output of g., is then expressed by $$\left( (q_{g_0} + \sum_{j=1}^m p_{jg_0} y_j)' + \sum_{g_i \in FI(g_r)} q_{g_i} \right)' = q_{g_r} + \sum_{j=1}^m p_{jg_r} y_j$$ where $$q_{g_r} = q_{g_0} \prod_{g_i \in FI(g_r)} q'_{g_i}$$ ; $p_{jg_r} = p_{jg_0} \prod_{g_i \in FI(g_r)} q'_{g_i}$ 2. g,. is of odd parity, and consequently all its inputs are from gates of even parity and are expressed by Eq. (21); therefore the output of $g_r$ is expressed by $$\left(\sum_{g_i \in FI(g_r)} (q_{g_i} + \sum_{j=1}^m p_{jg_i} y_j)\right)' = \left(q_{g_r} + \sum_{j=1}^m p_{jg_r} y_j\right)'$$ where $$q_{g_r} = \sum_{g_i \in FI(g_r)} q_{g_i} \; ; \; \; p_{jg_r} = \sum_{g_i \in FI(g_r)} p_{jg_i}$$ By induction, the output of each gate (in particular, each primary output) is expressed by Eq. (21) or (22); therefore, the gates in S are compatible. It can also be verified that in a multiple-output network, the gates of a set S are compatible if and only if Property 1) or 2) of Theorem (4.1) hold with respect to each output. Example 6 In the internally unate, NOR-only network of Figure 10, consider the set $S = \{g_1, g_2, g_4\}$ . All gates of S are ojodd parity and not in each other's fanout. Moreover, $FO(S) = \{g_5, g_7, g_8, g_9, g_{10}, g_{11}, g_{12}\}$ and for all gates in FO(S) of odd parity (namely, $g_8, g_9, g_{10}$ ), there is only one input interconnection that belongs to FOE(S). S then represents a compatible set by rule (1) of Theorem (4.1). Similarly, the set $S = \{g_3, g_4\}$ is compatible by rule (1), as in this case $FO(S) = \{g_6, g_7, g_{10}, g_{12}\}$ , and the gates of FO(S) with even parity (namely, $g_6$ and $g_7$ ) have only one input interconnection in FOE(S). Other compatible sets are, for example, $\{g_1, g_{10}\}$ (by rule (1)) and $\{g_5, g_7\}$ (by rule (2)). It is worth noting that some gates (in this case, $g_4$ ) can appear in more compatible sets. $\Box$ Figure 10: Example of Compatible Gates. Theorem (4.1) also provides a technique for constructing a set of compatible gates directly from the network topology, starting from a "seed" gate g and a parameter *(rule)* that specifies the desired criterion of Theorem (4.1) (either 1 or 2) to be checked during the construction. The algorithm is as follows: ``` COMPATIBLES(g, rule) label-f \ anout(g, FO); S = \{g\}; for(i = 0; i \leq NGATES; i + +) \{ if \ ((is\_labeled(g_i) = FALSE) \ \& \ (parity(g_i) = parity(g))) \{ label-f \ anout(g_i, TMP); compatible = df \ s\_check(g_i, parity(g), rule) if(\ compatible) \{ label-f \ anout(g_i, FO); s = s \ u \ \{g_i\}; \} ``` **COMPATIBLES** starts by labeling "FO" the fanout cone of g, as no gates in that cone can belong to a compatible set containing g. Labeled gates represents elements of the set FO(S). All gates $g_i$ that are not yet labeled and have the correct parity are then examined for insertion in S. To this purpose, the fanout of $g_i$ that is not already in FO(S) is temporarily labeled "TMP", and then visited by **dfs,check** in order to check the satisfaction of **rule**. The procedure **dfs,check** performs a depth-first traversal on gate $g_i$ . The traversal returns 0 whenever gates already in FO(S) are reached, or a violation of **rule** is detected. Otherwise, if the traversal reaches the primary outputs, then 1 is returned indicating that $g_i$ is compatible. If $g_i$ is compatible, it becomes part of S and its fanout is merged with FO(S). Example 7 Refer again to Figure(10) for this example. Consider constructing a set of compatible gates around $g_1$ , using rule (1). Gates $g_5$ , $g_8$ , $g_9$ , $g_{11}$ , $g_{12}$ are labeled first, because they belong to $FO(g_1)$ . The fs unlabeled gate is therefore $g_2$ . The depth-first scan of its fanout reaches $g_5$ first, which has parity opposite to $g_1$ . The check of the fanin of $g_5$ is therefore not needed. Gates $g_7$ and $g_{10}$ are then reached. In particular, since $g_{10}$ has the same parity as $g_1$ , its fanin is checked to verify that there is indeed only one interconnection (in this case, $(g_7, g_{10})$ ) to gates in S. df s-check returns in this case a value TRUE for the compatibility of $g_2$ to $g_1$ . $\square$ ### 5 Unate Optimization In the previous section we showed that in the case of compatible gates, the functional constraints expressed by Eq. (13) can be reduced to an upper bound (expressed by Eq. (17)) on the individual variables $y_i$ and by a global covering constraint, expressed by Eq. (19). These could be solved by a two-step procedure similar to that of two-level optimization. We now generalize this result to the optimization of arbitrary subsets S of unate gates. #### 5.1 Optimizing Unate Subsets Assume, for the sake of simplicity, that **F** is positive unate with respect to $\{y_1, \ldots, y_n\}$ . We can perform optimization on the subset of unate gates in a style that is totally analogous to compatible gates by dividing it into **implicant extraction** and **covering** steps. #### Implicant Extraction In this step, for each $y_i$ to be optimized, a set of **maximal functions** is extracted. In particular, the maximal functions of each each $y_i$ can be expressed as Eq. (23), which is similar to Eq. (17). $$y_i \le G_{max,j}; j = 1, \dots, m \tag{23}$$ From Eq. (23), appropriate implicants can then be extracted. Intuitively, the maximal functions are the largest functions that can be used while satisfying the bound $\mathbf{F} \leq \mathbf{F}_{max}$ . Therefore, they represent the upper bounds on $y_i$ . We introduce the following definition: #### **Definition 4 A set** of local functions $$\{G_{max,1}(\mathbf{x}), G_{max,2}(\mathbf{x}), \dots, G_{max,m}(\mathbf{x})\}$$ is said to be maximal if $$\mathbf{F}(\mathbf{x}, G_{max,1}(\mathbf{x}), G_{max,2}(\mathbf{x}), \dots, G_{max,m}(\mathbf{x})) \le \mathbf{F}_{max}(\mathbf{x})$$ $\forall \mathbf{x}$ (24) and the inequality (24) is violated when any $G_{max,j}$ is replaced by a larger function $\tilde{F} > G_{max,j}$ . The idea behind the notion of maximal functions is that by substituting each $y_j$ by any function $\phi_j \leq F_{max,j}$ , we are guaranteed that the upper bound $$\mathbf{F}(\mathbf{x}, \phi_1(\mathbf{x}), \dots, \phi_m(\mathbf{x})) \le \mathbf{F}_{max}(\mathbf{x}) \tag{25}$$ will not be violated. The conditions $$y_i \leq G_{max,i}$$ therefore represent sufficient conditions for this bound to hold. The following theorem provides means for finding a set of maximal functions. It also shows that computing such functions has complexity comparable with computing ordinary **don't care** sets. Theorem 5.1 Let us consider a network with a totally ordered set of gates. Let $S = \{g_1, \ldots, g_m\}$ be a set of unate gates. The set of maximal functions, as defined by Eq. (24), with respect to a set of unate gates S can be obtained by: $$\mathbf{G}_{max,j} = y_j + DC_j \tag{26}$$ where $y_j$ denotes the output function of $g_j$ in the unoptimized network. $DC_j$ represents the don't care set associated with $g_j$ calculated with the following rule: the output functions for gates $g_1, \ldots, g_{j-1}$ are set to $G_{max,k}$ , $\mathbf{k} = 1, \ldots, \mathbf{j} - 1$ , and the output functions for gates $g_j, \ldots, g_m$ , are set to $y_k$ ; $\mathbf{k} = j, \ldots, m$ . **Proof:** The proof is divided into two parts. First, it is shown that the bounds $G_{max,j} = y_j + DC_j$ satisfy Eq. (24). It is then shown, by contradiction, that these bounds are indeed maximal. To prove the first part, suppose that maximal functions for the variables $y_1, \ldots, y_{j-1}$ have already been computed. They are such that $$\mathbf{F}(\mathbf{x}, G_{max,1}, \dots, G_{max,j-1}, y_i, y_{j+1}, \dots, y_m) \leq \mathbf{F}_{max}$$ The constraint equation on $y_i$ can then be expressed by: $$\mathbf{F}_{min} \leq \mathbf{F}(\mathbf{x}, G_{max,1}, \dots, G_{max,j-1}, \tilde{y}_j, y_{j+1}, \dots, y_m) \leq \mathbf{F}_{max}$$ where $ilde{y}_j$ satisfies $$y_j$$ . $DC'_j \leq \tilde{y}_j \leq y_j + DC_j$ where $DC_j$ is the **don't care** set associated to $y_j$ , under the theorem's assumptions. It is then guaranteed that $$\mathbf{F}(\mathbf{x}, G_{max,1}, \dots, G_{max,j-1}, G_{max,j}, y_{j+1}, \dots, y_m) \leq \mathbf{F}_{max}$$ for $j = 1, \ldots, m$ . To prove maximality, it is sufficient to show that $G_{max,j}$ cannot be replaced by any larger function. Suppose, by contradiction, that a different bound $\tilde{\mathbf{F}}$ can be used, such that for some input combination $\mathbf{x}_0$ we have $G_{max,j}(\mathbf{x}_0) = 0$ but $\tilde{\mathbf{F}}_j(\mathbf{x}_0) = 1$ . Notice that $G_{max,j}(\mathbf{x}_0) = \mathbf{0}$ implies that $y_j(\mathbf{x}_0) = 0$ and $DC(\mathbf{x}_0) = 0$ . Corresponding to $\mathbf{x}_0$ , it must then be $$\mathbf{F}(\mathbf{x}_0, G_{max,1}(\mathbf{x}_0), \dots, G_{max,j-1}(\mathbf{x}_0), y_j(\mathbf{x}_0), \dots y_m(\mathbf{x}_0)) = 0$$ $$\mathbf{F}(\mathbf{x}_0, G_{max,1}(\mathbf{x}_0), \dots, G_{max,j-1}(\mathbf{x}_0), 1, \dots, y_m(\mathbf{x}_0)) = 1$$ and $$\mathbf{F}_{max}(\mathbf{x}_0) = 0$$ (or otherwise a change in the value of $y_j$ could not affect the $\mathbf{F}$ and result in $DC_j(\mathbf{x}_0) = \mathbf{0.0}$ If a larger bound $\tilde{\mathbf{F}}_j$ could be used, this would mean that we could replace $y_j$ by a function $\phi_j$ such that, in particular, $\phi_j(\mathbf{x}_0) = 1$ , and all functions $y_1, \ldots, y_{j-1}$ by $G_{max,1}, \ldots, G_{max,j-1}$ . But in this case, we would have $\mathbf{F}(\mathbf{x}_0, G_{max,1}(\mathbf{x}_0), \ldots, G_{max,j-1}(\mathbf{x}_0), 1, \ldots, y_m(\mathbf{x}_0)) = 1$ while $G_{max}(\mathbf{x}_0) = 0$ , violating the specifications. Note that the computation of each maximal function corresponds to finding the local *don't care* for the associated vertex. Therefore, the maximal functions computation has the same complexity as computing the *don't care* conditions for each gate. This theorem states that the maximal function for vertex i depends on the maximal functions already calculated (j < i). This means that unlike the case of compatible gates, the maximal function for a given vertex may be not unique. Example 8 For the network of Fig. (11), assuming no external don't care conditions, we find the maximal functions for $y_1$ , $y_2$ , and $y_3$ . The DC,, terms correspond to the observability don't care at $y_j$ , computed using the $F_{max}$ of the previous gates. $$y_1 = x_1 x_3' x_4; \quad y_2 = x_3' (x_4 + x_2); \quad y_3 = x_3' x_2 + x_1' x_2'$$ Maximal functions derived by Theorem (5.1) are : $$G_{max,1} = x_1 x_3' x_4 + D c$$ , , = $x_3' x_4 + (x_3' + x_4) x_1' x_2'$ $$G_{max,2}$$ = $x'_3(x_4 + x_{21} + DC_{y_2}(y_1 = G_{max,1})$ = $x_4 + x'_3x'_2 + x_1x'_2 + x_{332}$ Figure 11: Network for Example (8). $$G_{max,3} = x_3'x_2' + x_1'x_2 + DC_{y_3}(y_1 = G_{max,1}, y_2 = G_{max,2})$$ $$= x_3'x_2 + x_1'x_2' + x_4x_2'$$ cl #### **Covering Step** Eq. (23) allows us to find a set of multiple-output primes for $y_1, \ldots, y_m$ . The covering step then consists of finding a minimum-cost sum such that the lower bound of Eq. (13) holds. We now present a reduction for transforming the covering step to the one presented for compatible gates. We first illustrate the reduction by means of an example. Example 9 In Fig. (1 1), consider the combination of inputs x resulting in $\mathbf{F}_{min}(\mathbf{x}) = 1$ . To each such combination we can associate the set of values of $y_1, y_2, y_3$ such that $\mathbf{F}(\mathbf{x}, \mathbf{y}) = 1$ . For instance, for the entry $x_1x_2x_3x_4 = 1001$ , it must be $F_{x_1x_2'x_3'x_4}(\mathbf{y}) = y_1 + y_2y_3 = 1$ . Let us now denote with $\mathbf{G}(\mathbf{y})$ the left-hand side of this constraint, i.e. $\mathbf{G}(\mathbf{y}) = y_1 + y_2y_3$ . Notice that $\mathbf{G}(\mathbf{y})$ is unate in each $y_j$ and changes depending on the combination of values currently selected for $x_1, x_2, x_3, x_4$ . Any constraint G(y) = 1 can be represented in a canonical form: $$G(\mathbf{y}) = (G_{y_1'y_2'y_3'} + y_1 + y_2 + y_3)(G_{y_1'y_2'y_3} + y_1 + y_2)$$ $$\dots (G_{y_1y_2y_3'} + y_3)G_{y_1y_2y_3} = 1$$ which, in turn, is equivalent to the 8 constraints $$G_{y_1'y_2'y_3'} + y_1 + y_2 + y_3 = 1$$ $$G_{y_1'y_2'y_3} + y_1 + y_2 = 1$$ ... $$G_{y_1y_2y_3'} + y_3 = 1$$ $$G_{y_1y_2y_3} = 1$$ $$= 1$$ By introducing an auxiliary variable $z_j$ for each $y_j$ , we can rewrite Eq. (27) as: $$G(\mathbf{z}) + z_1' y_1 + z_2' y_2 + z_3' y_3 = 1 \quad \forall z_1, z_2, z_3$$ or, equivalently, $$G'(\mathbf{z}) \le z_1' y_1 + z_2' y_2 + z_3' y_3$$ In this particular example, we get $$(z_1 + z_2 z_3)' \le z_1' y_1 + z_2' y_2 + z_3' y_3$$ Example (9) shows a transformation that converts the covering problem of arbitrary unate gates into a form that is similar to optimization of compatible gates, i.e. Eq. (15). More generally, corresponding to each combination x such that $\mathbf{F}_{min}(\mathbf{x}) = 1$ , the constraint $\mathbf{F}(\mathbf{x}, \mathbf{y}) = 1$ can be re-expressed as $$\mathbf{F}(\mathbf{x}, \mathbf{z}) + z'_1 y_1 + z'_2 y_2 + \cdots + z'_m y_m = 1$$ The resulting covering problem to find the minimum-cost solution is analogous to the compatible gates case. The transformation is formalized in the following theorem: Theorem 5.2 Given F(x), let y be a set of unate gates with respect to F. Let $z = [z_1, \ldots, z_m]$ denote m auxiliary Boolean variables. The lower bound of Eq. (13) holds if and only if $$\mathbf{F}_{min} \leq \mathbf{F}(\mathbf{x}, \mathbf{z}) \, \underset{j=1}{\overset{m}{\leftarrow}} y_j(z_j' \mathbf{1}) \, \forall \mathbf{z}$$ (28) #### **Proof:** We first show by contradiction that $$\mathbf{F}(\mathbf{x}, \mathbf{y}) \le \mathbf{F}(\mathbf{x}, \mathbf{z}) + \sum_{j=1}^{m} y_j(z_j' \mathbf{1})$$ (29) Eq. (29) can be violated only by a combination $\mathbf{x}_0$ , $\mathbf{y}_0$ , $\mathbf{z}_0$ such that one component of $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0)$ takes value 1, the same component of $\mathbf{F}(\mathbf{x}_0, \mathbf{z}_0)$ takes value 0, and the rightmost term of Eq. (29) takes value zero. In any such combination, there must be at least one value $y_{i,0} = 1$ and $z_{i,0} = 0$ (or otherwise, by the unateness of $\mathbf{F}$ , we would have $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \leq \mathbf{F}(\mathbf{x}_0, \mathbf{z}_0)$ ). But if there exists an index i such that $y_{i,0} = 1$ , $z_{i,0} = 0$ , then the rightmost term of Eq. (29) takes value 1, and the right-hand side of the inequality holds, a contradiction. Therefore, $\mathbf{F}_{min}(\mathbf{x}) \leq \mathbf{F}(\mathbf{x}, \mathbf{y})$ together with Eq. (29) implies $$\mathbf{F}_{min}(\mathbf{x}) \leq \mathbf{F}(\mathbf{x}, \mathbf{z}) + \sum_{j=1}^{m} y_j(z_j' \mathbf{1})$$ To complete the proof, it must now be shown that $\mathbf{F}_{min}(\mathbf{x}) \leq \mathbf{F}(\mathbf{x}, \mathbf{z}) + \sum_{j=1}^{m} y_j(z_j'\mathbf{1})$ , $\forall \mathbf{z}$ implies $\mathbf{F}_{min}(\mathbf{x}) \leq \mathbf{F}(\mathbf{x}, \mathbf{y})$ . Suppose, by contradiction, that this is not true. There exists then a value $\mathbf{x}_0, \mathbf{y}_0$ such that some component of $\mathbf{F}_{min}(\mathbf{x})$ takes value 1, $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0)$ takes value 0, but $\mathbf{F}_{min}(\mathbf{x}_0) \leq \mathbf{F}(\mathbf{x}_0, \mathbf{z}) + \sum_{j=1}^{m} y_{j,0}(z_j'\mathbf{1})$ , $\forall \mathbf{z}$ . In this case, it must be $\mathbf{F}(\mathbf{x}_0, \mathbf{z}) + \sum_{j=1}^{m} y_{j,0}(z_j'\mathbf{1}) = 1$ , regardless of $\mathbf{z}$ . But this implies that, for $\mathbf{z} = \mathbf{y}_0$ , $\mathbf{F}(\mathbf{x}_0, \mathbf{z}) = 1$ , i.e. $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) = 1$ , a contradiction. Eq. (28) has the same format of Eq. (15), with q and $\mathbf{p}_j$ being replaced by $\mathbf{F}(\mathbf{x}, \mathbf{z})$ and $z_j'\mathbf{1}$ , respectively. Theorem (5.2) thus allows us to reduce the covering step to the one used for compatible gates. Theorems (5.1) and (5.2) show that the algorithms presented in Section 3 can be used to optimize arbitrary sets of gates with the same parity, without being restricted to sets of compatible gates only. ### 6 Implementation and Results The implementation of the algorithms presented in Sections 3 and 5 is as follows. The original networks are first transformed into a unate, NOR-only description. All internal functions are represented using BDDs [15]. For each unoptimized gate $g_i$ , the following heuristic is used. First, we try to find a set of compatible gates for $g_i$ , called S<sub>i</sub>. In the case where not enough compatible gates can be found, we find a set of gates that are unate with respect to $g_i$ , called $S_a$ . In the case where $S_c$ is optimized, we use Eq. (14) to extract the functions $\mathbf{p}_j$ and q. In particular, q is computed by setting $y_j$ to 0. The functions $\mathbf{p}_j$ are then computed by setting $y_j$ to 1, with $y_i$ ; $i \neq j$ stuck-at 0. In the case of optimizing arbitrary unate network $S_a$ , Theorem (5.1) is used to determine the maximal functions for each $y_j$ . Note that optimizing $S_c$ is preferable because for a set of m compatible gates, m+1 computations for $p_j$ and q are needed to obtain all the required **don't cares**. For $S_a$ , two computations (with $y_j$ stuck-at-0 and stuck-at-1) are required for the extraction of the **don't care** set of each variable $y_j$ , resulting in a total of 2m computations. A set of primes for the gate outputs is then constructed. Because of the large possible set of primes, we limit our prime selection to single-literal primes only. The BDD of $\mathbf{F}(\mathbf{x}, \mathbf{z})$ is then built, and the covering problem solved. Networks are then iteratively optimized until no improvement occurs, and eventually folded back to a binate form. The algorithms presented in this paper were implemented in C program called **ACHILLES**, and tested against a set of **MCNC** synthesis benchmarks. Table (2) provides a comparison of **ACHILLES** with SIS using **script.rugged**. The column **Initial Stat**. lists the network statistics before optimization, where **Int**. is number of internal interconnections and **gates** is the gate count. The column **Interconn**. shows number of interconnections after optimization. The **gates** column compares final gate counts. **Literal** column shows the final literals in factored form. The results in the table show that **ACHILLES** performs better than SIS for all | | Initial Stat. | | Interconn. | | Literals(fac) | | Gates | | CPU time | | |---------|---------------|-------|------------|------|---------------|-----|----------|-----|----------|--------| | Circuit | Int. | Gates | Achilles | SIS | Achilles | SIS | Achilles | SIS | Achilles | SIS | | cm85a | 108 | 63 | 67 | 77 | 42 | 46 | 31 | 34 | 1.5 | 1.2 | | cm162a | 113 | 60 | 99 | 102 | 47 | 49 | 41 | 52 | 1.8 | 1.3 | | pm1 | 130 | 60 | 67 | 78 | 47 | 52 | 31 | 36 | 1.6 | 1.3 | | 9symml | 375 | 152 | 288 | 325 | 163 | 186 | 88 | 101 | 108.4 | 64.2 | | alu2 | 924 | 262 | 366 | 570 | 303 | 362 | 215 | 231 | 309.7 | 403.0 | | alu4 | 1682 | 521 | 902 | 1128 | 612 | 703 | 420 | 487 | 1612.6 | 1718.5 | | apex6 | 1141 | 745 | 1009 | 1315 | 687 | 743 | 589 | 639 | 115.1 | 30.3 | | c499 | 945 | 530 | 913 | 945 | 505 | 552 | 498 | 530 | 202.1 | 133.6 | | C880 | 797 | 458 | 643 | 731 | 355 | 409 | 295 | 342 | 340.6 | 30.7 | | _C1908 | 936 | 489 | 828 | 891 | 518 | 542 | 445 | 482 | 422.1 | 138.8 | Table 2: Optimization Results. Runtimes are in seconds on DEC5000/240. figures of merits. In particular, ACHILLES does 11% better than Sis in factored literals. Note that **script.rugged** was chosen because it is the most robust script of the **Sis** script suite, and it matches closely to our type of optimization. Our objective was to compare optimization results based only on Boolean operations, namely compatible gates versus **don't cares**. The **script.rugged** calls **full\_simplify**[16], which computes observability **don't cares** to optimize the network. The table shows that the **ACHILLES** runtimes are competitive with that of **Sis.** In this implementation, we are more interested in the quality of the optimization than the efficiency of the algorithms, therefore an **exact** covering solver is used. We can improve the runtime in the future by substituting a faster heuristic or approximate solvers (such as used in Espresso [7]). #### 7 Conclusion In this paper we presented a comparative analysis of approaches to multi-level logic optimization, and described new algorithms for simultaneous multiple-gate optimization. The algorithms are based on the notion of **compatible gates** and unate networks. We identify the main advantage of the present approach over previous solutions in its capability of exact minimization of suitable multiple-output networks, by means of traditional two-level optimization algorithms. Experimental results show an improvement of 11% over existing methods. # 8 Acknowledgement This research is sponsored by NSF and DEC under a PYI award and by ARPA and NSF under contract MIP 9115432. #### References - [1] W. Quine, "The problem of simplifying truth functions," *American Mathematical Monthly*, vol. 59, no. 8, pp. 521-531, 1952. - [2] E. J. McCluskey, "Minimization of Boolean functions," *Bell Syst. tech J.*, vol. 35, no. 5, pp. 1417-1444, Nov. 1956. - [3] P. McGeer, J. Sanghavi, and R. K. Brayton, "Espresso-signature: A new exact minimizer for logic functions," in *DAC*, *Proceedings of the Design Automation Conference*, pp. 618-624, June 1993. - [4] 0. Coudert, J. Madre, and H. Fraisse, "A new viewpoint on two-level logic minimization," in **DAC, Proceedings** of **the Design Automation Conference**, pp. 625–630, June 1993. - [5] S. J. Hong, R. G. Cain, and D. L. Ostapko, "Mini: A heuristic approach to logic minimization," IBM Journal of Research and Development, 1974. - [6] R. K. Brayton, G. D. Hachtel, C. T. McMullen, and A. Sangiovanni-Vincentelli, Logic Minimization Algorithms for VLSI Synthesis. Kluwer Academic, 1984. - [7] R. Rudell and A. Sangiovanni-Vincentelli, "Multiple-valued minimization for PLA optimization," *IEEE Transactions on CAD/ICAS*, vol. 6, no. 5, pp. 727–750, Sept. 1987. - [8] E. L. Lawler, "An approach to multilevel boolean minimization," *ACM Journal*, vol. 11, no. 3, pp. 283-295, July 1964. - [9] R. K. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. R. Wang, "MIS: A multiple-level logic optimization system," *IEEE Transactions on CAD/ICAS*, vol. 6, no. 6, pp. 1062-1081, Nov. 1987. - [10] S. Muroga, Y. Kambayashi, H. C. Lai, and J. N. Culliney, "The transduction method design of logic networks based on permissible functions," *IEEE Transactions on Computers*, vol. 38, no. 10, pp. 1404–1424, Oct. 1989. - [11] K. A. Bartlett, R. K. Brayton, G. D. Hachtel, R. M. Jacoby, C. R. Morrison, R. L. Rudell, A. Sangiovanni-Vincentelli, and A. R. Wang, "Multilevel logic minimization using implicit don't cares," *IEEE Transactions on CAD/ICAS*, vol. 7, no. 6, pp. 723–740, June 1988. - [12] R. Brayton and F. Somenzi, "An exact minimizer for boolean relations," in *ICCAD*, *Proceedings of the International Conference on Computer-Aided Design*, pp. 316-319, Nov. 1989. - [13] E. J. McCluskey, *Logic Design Principles With Emphasis on Testable Semicustom Circuits*. Prentice-Hall, 1986. - [14] S. W. Jeong and F. Somenzi, "A new algorithm for the binate covering problem and its application to the minimization of boolean relations," in *ICCAD*, *Proceedings of the International Conference on Computer-Aided Design*, pp. 417-420, 1992. - [15] R. E. Bryant, "Graph-based algorithms for boolean function manipulation," *IEEE Transactions on Computers*, vol. 35, no. 8, pp. 677-691, Aug. 1986. - [16] H. Savoj, R. K. Brayton, and H. Touati, "Extracting local don't cares and network optimization," in *ICCAD*, *Proceedings of the International Conference on Computer-Aided Design*, pp. 514-517, Nov. 1991.