Report Number: CSL-TR-96-689
Institution: Stanford University, Computer Systems Laboratory
Title: A Variable Latency Pipelined Floating-Point Adder
Author: Oberman, Stuart F.
Author: Flynn, Michael J.
Date: February 1996
Abstract: Addition is the most frequent floating-point operation in modern microprocessors. Due to its complex shift-add-shift-round dataflow, floating-point addition can have a long latency. To achieve maximum system performance, it is necessary to design the floating-point adder to have minimum latency, while still providing maximum throughput. This paper proposes a new floating-point addition algorithm which exploits the ability of dynamically scheduled processors to utilize functional units which complete in variable time. By recognizing that certain operand combinations do not require all of the steps in the complex addition dataflow, the average latency is reduced. Simulation on SPECfp92 applications demonstrates that a speedup of 1.33 in average addition latency can be achieved using this algorithm, while still maintaining single-cycle throughput.
http://i.stanford.edu/pub/cstr/reports/csl/tr/96/689/CSL-TR-96-689.pdf