Report Number: CSL-TR-93-556
Institution: Stanford University, Computer Systems Laboratory
Title: Support for Speculative Execution in High-Performance Processors
Author: Smith, Michael David
Date: November 1992
Abstract: Superscalar and superpipelining techniques increase the
overlap between the instructions in a pipelined processor,
and thus these techniques have the potential to improve
processor performance by decreasing the average number of cycles
between the execution of adjacent instructions. Yet, to
obtain this potential performance benefit, an instruction
scheduler for such a high-performance processor must find the
independent instructions within the instruction stream of an
application to execute in parallel. For non-numerical
applications, there is an insufficient number of independent
instructions within a basic block, and consequently the
instruction scheduler must search across the basic block
boundaries for the extra instruction-level parallelism required
by the superscalar and superpipelining techniques. To exploit
instruction-level parallelism across a conditional branch,
the instruction scheduler must support the movement of
instructions above a conditional branch, and the processor must
support the speculative execution of these instructions.
We define boosting, an architectural mechanism for speculative
execution, which allows us to uncover the instruction-level
parallelism across conditional branches without adversely
affecting the instruction count of the application or the
cycle time of the processor. Under boosting, the compiler is
responsible for analyzing and scheduling instructions, while
the hardware is responsible for ensuring that the effects of
a speculatively-executed instruction do not corrupt the
program state when the compiler is incorrect in its speculation.
To experiment with boosting, we built a global instruction
scheduler, which is specifically tailored for the non-numerical
environment, and a simulator, which determines the cycle-count
performance of our globally-scheduled programs. We also
analyzed the hardware requirements for boosting in a typical
load/store architecture. Through the cycle-count
simulations and an understanding of the cycle-time impact of
the hardware support for boosting, we found that only a small
amount of hardware support for speculative execution is
necessary to achieve good performance in a small-issue
processor.
http://i.stanford.edu/pub/cstr/reports/csl/tr/93/556/CSL-TR-93-556.pdf