Report Number: CSL-TR-89-397
Institution: Stanford University, Computer Systems Laboratory
Title: Design and Clocking of VLSI Multipliers
Author: Santoro, Mark Ronald
Date: October 1989
Abstract: This thesis presents a versatile new multiplier architecture, which can provide better performance than conventional linear array multipliers at a fraction of the silicon area. The high performance is obtained by using a new binary tree structure, the 4-2 tree. The 4-2 tree is symmetric and far more regular than other multiplier trees while offering comparable performance, making it better suited for VLSI implementations. To reduce area, a partial, pipelined 4-2 tree is used with a 4-2 carry-save accumulator placed at its outputs to iteratively sum the partial products as they are generated. Maximum performance is obtained by accurately matching the iterative clock to the pipeline rate of the 4-2 tree, using a stoppable on-chip clock generator. To prove the new architecture, a test chip, called SPIM, was fabricated in a 1.6 µm CMOS process. SPIM contains 41,000 transistors with an array size of 2.9 × 5.3 mm. Running at an internal clock frequency of 85 MHz, SPIM performs the 64-bit mantissa portion of a double extended precision floating-point multiply in under 120 ns. To make the new architecture commercially interesting, several high-performance rounding algorithms compatible with IEEE Standard 754 for binary floating-point arithmetic have also been developed.
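Illustrative note (not part of the original report): the idea of a 4-2 stage feeding a 4-2 carry-save accumulator can be sketched in software as below. This is a minimal word-level model under stated assumptions, not SPIM's actual datapath. The function names, the choice of four partial products per pass, and the plain (non-Booth) partial-product generation are all hypothetical choices made for the example. A 4-2 adder is modeled here as two chained 3:2 carry-save steps, which reduce four operands to a sum word and a carry word without propagating carries.

```python
def carry_save_add(a, b, c):
    """3:2 step: returns (sum, carry) with sum + carry == a + b + c."""
    partial_sum = a ^ b ^ c                        # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1     # majority bits, weighted x2
    return partial_sum, carry


def four_two_add(a, b, c, d):
    """4-2 step built from two 3:2 steps: (sum, carry) with sum + carry == a+b+c+d."""
    s1, c1 = carry_save_add(a, b, c)
    return carry_save_add(s1, d, c1)


def iterative_multiply(x, y, bits=64):
    """Multiply non-negative integers, four partial products per pass.

    Assumption for illustration only: each pass reduces four partial
    products with one 4-2 step, then folds the result into a running
    carry-save accumulator with a second 4-2 step. Bits of y above
    position `bits` are ignored.
    """
    acc_sum, acc_carry = 0, 0
    for base in range(0, bits, 4):
        # Generate four shifted partial products for this pass.
        pps = [(x << (base + i)) if (y >> (base + i)) & 1 else 0
               for i in range(4)]
        tree_sum, tree_carry = four_two_add(*pps)
        # Accumulator: running (sum, carry) plus the two new tree outputs.
        acc_sum, acc_carry = four_two_add(acc_sum, acc_carry,
                                          tree_sum, tree_carry)
    # A single carry-propagate addition is needed only at the very end.
    return acc_sum + acc_carry
```

In this model the loop body corresponds to one iteration of the accumulator; carries are resolved only once, in the final addition.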
http://i.stanford.edu/pub/cstr/reports/csl/tr/89/397/CSL-TR-89-397.pdf