[libre-riscv-dev] Scoreboard vs Tomasulo

Sat May 16 16:52:10 BST 2020

https://www.cs.umd.edu/~meesh/cmsc411/website/projects/dynamic/modern.html

this entire section is misleading in several ways. it's not wrong per se,
it's just not the only solution.

the *other* solution is to have "precise shadowing".

this is a system (described at the end of Mitch's 2nd chapter) which allows
result *computation* to run ahead but where result *commit* is held back
until the shadow is released.

(the gate count required to do so is extremely low.)

this on the face of it would seem to stop FUs from progressing because they
cannot do so until they receive their operands... and um how could they do
that if writing to the regfile is prohibited?

the answer is very simple: that is what Operand Forwarding is for, aka
"Common Data Bus(es)".

so by preventing write commit, we can run ahead for a considerable time, no
stalling at all.
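
as a rough illustration of the idea above, here is a toy python model.
it is only a sketch: the class and method names (ShadowedRegfile, forward,
add) are made up for this example and do not come from Mitch's chapters or
from our codebase. it shows a dependent op picking up its operand by
forwarding while the regfile itself is never written.

```python
class ShadowedRegfile:
    """Toy model (hypothetical names): results are computed and forwarded,
    but never written to the architectural regfile while shadowed."""

    def __init__(self, regs):
        self.regs = dict(regs)   # committed (architectural) state
        self.forward = {}        # dest reg -> computed, uncommitted result

    def read(self, reg):
        # operand forwarding: an uncommitted result wins over the regfile
        return self.forward.get(reg, self.regs[reg])

    def add(self, dest, src1, src2):
        # compute runs ahead; writing to self.regs is prohibited here
        self.forward[dest] = self.read(src1) + self.read(src2)


core = ShadowedRegfile({"r1": 1, "r2": 2, "r3": 0, "r4": 0})
core.add("r3", "r1", "r2")   # r3 result sits on the forwarding path
core.add("r4", "r3", "r3")   # picks up r3 by forwarding, not from regfile
print(core.forward)          # uncommitted results
print(core.regs)             # regfile is still unchanged
```

the second add never stalls even though r3 has not been committed, which
is the whole point.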

when the branch prediction resolves, we then either:

* release the shadow, which allows a ton of results to begin committing.
order is not strictly relevant except as preserved by the hazard detection.

* call the "go die" line which drops absolutely every downstream result and
partial computation underway on the floor.  the PC then issues ops down the
alternative path.
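
the two outcomes above can be sketched like this. again this is a toy
model under my own assumptions: ShadowBuffer, hold, release_shadow and
go_die are illustrative names only, and real hardware does this with a
handful of gates per Function Unit, not a dictionary.

```python
class ShadowBuffer:
    """Toy model (hypothetical names) of results held under a branch shadow."""

    def __init__(self, regs):
        self.regs = dict(regs)   # committed state
        self.held = {}           # dest reg -> speculative, uncommitted result

    def hold(self, dest, value):
        # result computed under the shadow: commit is not allowed yet
        self.held[dest] = value

    def release_shadow(self):
        # prediction was correct: every held result may now commit
        self.regs.update(self.held)
        self.held.clear()

    def go_die(self):
        # prediction was wrong: drop all held results on the floor
        self.held.clear()


good = ShadowBuffer({"r3": 0, "r4": 0})
good.hold("r3", 3)
good.hold("r4", 6)
good.release_shadow()        # results land in the regfile

bad = ShadowBuffer({"r3": 0, "r4": 0})
bad.hold("r3", 3)
bad.hold("r4", 6)
bad.go_die()                 # regfile is exactly as it was before the branch
```

in the "go die" case nothing needs to be undone, because nothing was ever
written.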

(i did actually try dual path execution: theoretically our engine can do
it, i just could not get it to work last year, it was taking too long to
debug, and i had to focus on other things).

the thing that is particularly misleading about this section is
that it instills in the reader's mind the false impression that there is a
mandatory correlation between speculative execution and committing
results in order.

this is just false.

if the commit can be stalled until it is known for certain that it is
perfectly *safe* to commit (the operation is 100% going to succeed)

    AND

if there are no hazards remaining

   THEN

who _cares_ what order results are committed? answer: nobody.
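
the condition above can be checked with a small sketch. the assumptions
are mine: each result is a dict with made-up "shadowed" and "hazards"
fields, and "no hazards remaining" is modelled by giving every result a
different destination register (so no write-after-write conflicts are
left). the code commits the same results in every possible order and
collects the distinct final regfile states.

```python
from itertools import permutations


def can_commit(result):
    # safe to commit: shadow released AND no hazards remaining
    return not result["shadowed"] and not result["hazards"]


def commit_all(order):
    regs = {"r3": 0, "r4": 0, "r5": 0}
    for result in order:
        if can_commit(result):
            regs[result["dest"]] = result["value"]
    return regs


# hypothetical results, all out from under the shadow and hazard-free
results = [
    {"dest": "r3", "value": 3, "shadowed": False, "hazards": set()},
    {"dest": "r4", "value": 6, "shadowed": False, "hazards": set()},
    {"dest": "r5", "value": 9, "shadowed": False, "hazards": set()},
]

# commit in every possible order and collect the distinct final states
final_states = {
    tuple(sorted(commit_all(order).items()))
    for order in permutations(results)
}
print(len(final_states))     # number of distinct outcomes across all orders
```

a result that is still shadowed, or that still has a hazard outstanding,
simply fails can_commit and stays out of the regfile.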

so... useful inasmuch as it is written by an academic and thus is really
readable.

sadly though it contains the usual glaring errors, misunderstandings and
omissions which are pervasive throughout the *entire* academic world,
even among those who claim to have studied 6600 scoreboards,
even among those who claim to be experts on RISC design.

for the most part that comes from a cursory reading of the patent on 6600 Q
Tables, rather than studying Thornton's book in great depth and taking the
time to translate the (now archaic) transistor level diagrams into modern
IEEE standard electronics symbols - something that Mitch went to a lot of
trouble to do.

i trust that this gives you something of an eyeopener into the extent of
the misapprehensions that the Academic community is under, regarding out of
order engine design.

the closest working implementation i have seen to what we are doing
is a conversion of an existing RV32 design to superscalar and multi issue
by a chinese developer who has not revealed his name online.

he also did multi-bit unary transitive relationships and dependency
tracking in exactly the same way that we will, having re-derived 6600
scoreboards with no reference to any academic literature.

however he did not at the same time increase load store data bandwidth, and
on top of RV32I it is a hell of a lot of extra work to bring up to where we
can use it... and it's all in verilog.

l.
