[libre-riscv-dev] [Bug 206] Implement branch prediction

Thu Mar 12 00:13:46 GMT 2020

http://bugs.libre-riscv.org/show_bug.cgi?id=206

--- Comment #27 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
from comp.arch TODO get link to preserve attribution

Paul A. Clayton wrote:
On Wednesday, March 11, 2020 at 4:31:56 PM UTC-4, Stefan Monnier wrote:
> > the difference between a global history table and a local history table:
> > a global table takes some sort of hash (or lru) of the entire PC, whilst the
> > local table will take say the 10 LSBs?
>
> IIRC the difference is that a local history table is indexed by (a hash
> of) the PC of the branch instruction, whereas the global history table
> is indexed by (a hash of) the trace of branch decisions taken before
> reaching the branch instruction.
>
> So a 10bit index could be respectively:
> - for an LHT, the 10 LSb of the address of the branch instruction
> - for a GHT, the direction (taken/not taken) of the last 10 branch
>   instructions executed.
>
> Then again, maybe I'm just confused.

Traditional gshare global tables are indexed by an XORing of
branch address bits (typically just the least significant
bits of the branch address) and a taken/not-taken global history
string. More recent advances have hashed larger history strings
and used what is called path history (information from addresses
of branches) rather than just direction (taken/not-taken) history
with more complex hashing of the index.
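
The traditional gshare indexing described above can be sketched as
follows. This is only an illustration: the 10-bit index, the 2-bit
counters, the 4-byte instruction alignment and every name in it are
assumptions made for the example, not taken from any particular design.

```python
# Illustrative gshare sketch. Table size, counter width and all names
# are assumptions for this example.
INDEX_BITS = 10
TABLE_SIZE = 1 << INDEX_BITS
MASK = TABLE_SIZE - 1


def gshare_index(pc, global_history):
    """XOR the low branch-address bits with the global direction history."""
    # Drop the 2 low PC bits, assuming 4-byte aligned instructions.
    return ((pc >> 2) ^ global_history) & MASK


class Gshare:
    def __init__(self):
        # 2-bit saturating counters, initialised to weakly not-taken (1).
        self.counters = [1] * TABLE_SIZE
        # Global history: one bit per recent branch, newest in bit 0.
        self.history = 0

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[gshare_index(pc, self.history)] >= 2

    def update(self, pc, taken):
        i = gshare_index(pc, self.history)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the resolved direction into the global history register.
        self.history = ((self.history << 1) | int(taken)) & MASK
```

Because the history takes part in the index, a single branch that
strictly alternates taken/not-taken ends up using two different
counters, one per preceding history pattern, and each can saturate in
its own direction.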

TAGE-based predictors use partial tags to reduce aliasing and
(I suspect) accelerate training and use GEometric history
lengths (which provide fast training and the ability to use
very long history strings).
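
As a rough sketch of the geometric history lengths mentioned above:
the tables of such a predictor see history lengths that grow by a
constant ratio. The function names, the rounding and the XOR folding
used to squeeze a long history into a short index or partial tag are
assumptions for illustration, not a description of any specific TAGE
implementation.

```python
def geometric_history_lengths(shortest, longest, num_tables):
    """History lengths forming a geometric series from shortest to longest.

    Assumes num_tables >= 2 and rounds each length to the nearest integer.
    """
    ratio = (longest / shortest) ** (1.0 / (num_tables - 1))
    return [int(shortest * ratio ** i + 0.5) for i in range(num_tables)]


def fold(history, length, width):
    """Compress the newest `length` history bits to `width` bits by XOR folding.

    `history` is an integer with the most recent branch outcome in bit 0.
    """
    history &= (1 << length) - 1
    folded = 0
    while history:
        folded ^= history & ((1 << width) - 1)
        history >>= width
    return folded


def table_index(pc, history, length, index_bits):
    """Hypothetical per-table index: PC bits XORed with the folded history."""
    return ((pc >> 2) ^ fold(history, length, index_bits)) & ((1 << index_bits) - 1)
```

For example, geometric_history_lengths(5, 640, 8) spans 5 to 640
history bits with only eight tables, each length being double the one
before it. A partial tag could be formed the same way as the index,
using a different fold width so that tag and index do not alias
identically.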

(One idea I would like to test — I doubt it would be helpful —
is using bimode-like storage for a TAGE with different predictions
using different slots. This would effectively require doubling
look-up width, but with overlaid skewed associativity the
problem of capacity bias would be reduced. I did try XORing a
tag bit with the prediction or the hysteresis bit for a per
address prediction and it turned out that constructive/non-
destructive aliasing is more common than I supposed [but the
test also did not filter not-taken branches using a BTB]. I
also suspect that dynamic sizing of TAGE tables for various
lengths could also be beneficial.)

Most local history predictors in current use (excluding, e.g.,
that in the Alpha 21264) only use a saturating counter per
address rather than tracking local history strings and
predicting based on those either with a separate table or a
hardwired algorithm.
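
To make the distinction concrete, here is a minimal sketch of the two
styles: one saturating counter per address, and a two-level scheme in
which a per-address history string selects a counter in a separate
table. Table sizes, the 2-bit counter width and all names are
assumptions for illustration and do not model any shipping core.

```python
class Bimodal:
    """One 2-bit saturating counter per (low-bits-indexed) branch address."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # start weakly not-taken

    def predict(self, pc):
        return self.counters[(pc >> 2) & self.mask] >= 2

    def update(self, pc, taken):
        i = (pc >> 2) & self.mask
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)


class LocalHistory:
    """Two-level: a per-branch history string selects a counter in a second table."""

    def __init__(self, index_bits=10, history_bits=10):
        self.pc_mask = (1 << index_bits) - 1
        self.hist_mask = (1 << history_bits) - 1
        self.histories = [0] * (1 << index_bits)    # level 1: local histories
        self.counters = [1] * (1 << history_bits)   # level 2: 2-bit counters

    def predict(self, pc):
        h = self.histories[(pc >> 2) & self.pc_mask]
        return self.counters[h] >= 2

    def update(self, pc, taken):
        i = (pc >> 2) & self.pc_mask
        h = self.histories[i]
        c = self.counters[h]
        self.counters[h] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the outcome into this branch's own history string.
        self.histories[i] = ((h << 1) | int(taken)) & self.hist_mask


def count_mispredictions(predictor, pc, outcomes):
    """Run predict-then-update over a sequence of outcomes and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

On a loop branch that is taken three times and then falls through,
the counter-only predictor keeps mispredicting the loop exit once
warmed up, whereas the history-based one can learn the repeating
pattern, at the cost of the extra table and a second look-up.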

-- 
You are receiving this mail because:
You are on the CC list for the bug.