Report Number: CSL-TR-91-480
Institution: Stanford University, Computer Systems Laboratory
Title: Strategies for branch target buffers
Author: Bray, Brian K.
Author: Flynn, M. J.
Date: June 1991
Abstract: Achieving high instruction issue rates depends on the ability to dynamically predict branches. We compare two schemes for dynamic branch prediction: a separate branch target buffer and an instruction cache based branch target buffer. For instruction caches of 4KB and greater, instruction cache based branch prediction performance is a strong function of line size, and a weak function of instruction cache size. An instruction cache based branch target buffer with a line size of 8 (or 4) instructions performs about as well as a separate branch target buffer structure which has 64 (or 256, respectively) entries. Software can rearrange basic blocks in a procedure to reduce the number of taken branches, thus reducing the amount of branch prediction hardware needed. With software assistance, predicting all branches as not branching performs as well as a 4 entry branch target buffer without assistance, and a 4 entry branch target buffer with assistance performs as well as a 32 entry branch target buffer without assistance. The instruction cache based branch target buffer also benefits from the software, but only for line sizes of more than 4 instructions.
http://i.stanford.edu/pub/cstr/reports/csl/tr/91/480/CSL-TR-91-480.pdf