BIB-VERSION:: CS-TR-v2.0
ID:: STAN//CSL-TR-89-397
ENTRY:: May 5, 1999
ORGANIZATION:: Stanford University, Computer Systems Laboratory
TITLE:: Design and Clocking of VLSI Multipliers
TYPE:: Thesis
TYPE:: Technical Report
AUTHOR:: Santoro, Mark Ronald
DATE:: October 1989
PAGES:: 118
ABSTRACT:: This thesis presents a versatile new multiplier architecture that can provide better performance than conventional linear array multipliers at a fraction of the silicon area. The high performance is obtained by using a new binary tree structure, the 4-2 tree. The 4-2 tree is symmetric and far more regular than other multiplier trees while offering comparable performance, making it better suited for VLSI implementations. To reduce area, a partial, pipelined 4-2 tree is used with a 4-2 carry-save accumulator placed at its outputs to iteratively sum the partial products as they are generated. Maximum performance is obtained by accurately matching the iterative clock to the pipeline rate of the 4-2 tree, using a stoppable on-chip clock generator. To prove the new architecture, a test chip called SPIM was fabricated in a 1.6 µm CMOS process. SPIM contains 41,000 transistors with an array size of 2.9 × 5.3 mm. Running at an internal clock frequency of 85 MHz, SPIM performs the 64-bit mantissa portion of a double extended precision floating-point multiply in under 120 ns. To make the new architecture commercially interesting, several high-performance rounding algorithms compatible with IEEE Standard 754 for binary floating-point arithmetic have also been developed.
END:: STAN//CSL-TR-89-397