Report Number: CSL-TR-89-397
Institution: Stanford University, Computer Systems Laboratory
Title: Design and Clocking of VLSI Multipliers
Author: Santoro, Mark Ronald
Date: October 1989
Abstract: This thesis presents a versatile new multiplier
architecture, which can provide better performance
than conventional linear array multipliers at a
fraction of the silicon area. The high performance
is obtained by using a new binary tree structure,
the 4-2 tree. The 4-2 tree is symmetric and far
more regular than other multiplier trees while
offering comparable performance, making it better
suited for VLSI implementations. To reduce area, a
partial, pipelined 4-2 tree is used with a 4-2
carry-save accumulator placed at its outputs to
iteratively sum the partial products as they are
generated. Maximum performance is obtained by
accurately matching the iterative clock to the
pipeline rate of the 4-2 tree, using a stoppable
on-chip clock generator.
To prove the new architecture, a test chip called
SPIM was fabricated in a 1.6 µm CMOS process.
SPIM contains 41,000 transistors with an array size
of 2.9 × 5.3 mm. Running at an internal clock
frequency of 85 MHz, SPIM performs the 64-bit mantissa
portion of a double extended precision floating-point
multiply in under 120 ns. To make the new architecture
commercially interesting, several high-performance rounding
algorithms compatible with IEEE Standard 754 for
binary floating-point arithmetic have also been developed.
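
The iterative scheme described in the abstract (a partial 4-2 tree feeding
a 4-2 carry-save accumulator) can be illustrated with a small word-level
sketch. This is not the SPIM design itself: the function names, the group
size of four partial products per iteration, and the absence of Booth
encoding and pipelining are all assumptions made for illustration. It
shows only that a 4-2 stage reduces four operands to a sum/carry pair
without carry propagation, and that a single carry-propagate add is
needed at the end.

```python
def csa(a, b, c):
    """3-2 carry-save adder on whole words: a + b + c == s + cy."""
    s = a ^ b ^ c
    cy = ((a & b) | (a & c) | (b & c)) << 1
    return s, cy


def compress_4_2(a, b, c, d):
    """4-2 stage built from two 3-2 stages: a + b + c + d == s + cy."""
    s1, c1 = csa(a, b, c)
    return csa(s1, c1, d)


def multiply_iterative(x, y, width=64):
    """Multiply x by y (0 <= y < 2**width) with carry-save accumulation.

    Hypothetical configuration: each iteration forms four partial
    products, reduces them to two words with one 4-2 stage (the
    "partial tree"), then folds those two words into the running
    sum/carry pair with a second 4-2 stage (the "accumulator").
    """
    acc_s, acc_c = 0, 0
    for base in range(0, width, 4):
        # One partial product per multiplier bit in this group.
        pps = [(x << (base + i)) if (y >> (base + i)) & 1 else 0
               for i in range(4)]
        tree_s, tree_c = compress_4_2(*pps)
        acc_s, acc_c = compress_4_2(acc_s, acc_c, tree_s, tree_c)
    # Single carry-propagate addition at the end.
    return acc_s + acc_c
```

With these assumptions, a 64-bit multiplier operand takes 16 passes
through the loop; a real design would choose the tree size and encoding
to trade area against the number of iterations.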
http://i.stanford.edu/pub/cstr/reports/csl/tr/89/397/CSL-TR-89-397.pdf