[libre-riscv-dev] Scoreboards

Tue May 14 18:11:13 BST 2019

On Mon, May 13, 2019 at 11:04 PM Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:
> the 6600 instead can have the instruction order preserved as a
> bit-wise linked list of write-dependencies, overloading the FU-FU
> bit-matrix to do so.  i.e. if each instruction is given an (apparently

I've been playing around with some thought experiments.  It looks like one can
give the 6502 a 50% to 100% performance boost by adopting a scoreboard and
no fewer than 3 identical FUs to execute instructions with.  However, as
I've been diagramming on the whiteboard, I've never had a need to record
write dependencies or use an FU-FU matrix.

This is one of the things about the Q matrix I just "don't get."  Why is it
needed?  It seems like even that is superfluous.  I've read and re-read
Mitch's chapters numerous times, and I just get lost every single time.
Can you help elucidate what value it provides?

>  so i deduce that you chose to stick with binary register numbers
> instead of converting them to unary?

At this point, I'm just drawing boxes and connecting them with arrows, so
it's all still symbolic.  I consider binary vs unary an implementation
detail at this level of abstraction.  If my understanding is correct (and
it may not be), gate count notwithstanding, they both should yield
identical results.
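
As a rough illustration of that equivalence, here is a minimal Python sketch.
It assumes an 8-entry register file and models a "conflict" as nothing more
than two register references naming the same register; the function names
(to_unary, conflict_binary, conflict_unary) are made up for this example and
do not come from Mitch's chapters or Thornton's book.

```python
NUM_REGS = 8  # assumed register-file size, chosen only for the example


def to_unary(reg: int, num_regs: int = NUM_REGS) -> int:
    """Unary (one-hot) encoding: register r becomes a word with only bit r set."""
    if not 0 <= reg < num_regs:
        raise ValueError("register number out of range")
    return 1 << reg


def conflict_binary(dest: int, src: int) -> bool:
    """Binary form: compare the two register numbers for equality."""
    return dest == src


def conflict_unary(dest_vec: int, src_vec: int) -> bool:
    """Unary form: AND the two one-hot vectors and test for any set bit."""
    return (dest_vec & src_vec) != 0


# Exhaustively check that both encodings agree for every register pair.
for d in range(NUM_REGS):
    for s in range(NUM_REGS):
        assert conflict_binary(d, s) == conflict_unary(to_unary(d), to_unary(s))
```

The answer is the same either way; what changes is the hardware cost, since
the binary form needs a comparator per pair while the unary form needs only
an AND per bit plus an OR-reduce.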

> hmmm if the names were those that are used in mitch's book chapters
> i'd have an easier time understanding.  also, mitch himself could
> comment.

After re-reading both Thornton's book and Mitch Alsup's chapters, I decided
to synthesize what I've known before with what I think I just learned.  I
derived my design from my understanding of first principles, using terms
I'm familiar with from the contemporary popular press, which even Mitch's
enhanced vocabulary doesn't match (regrettably).  For example, when I read
GO_WRITE, my brain registers that as, "OK, now it's time to drive the bus
to write to the register file."  But, that's not actually what happens;
when GO_WRITE is asserted, it *appears* to mean that the results have
*already* been written to the register file and it's now OK for the FU to
become idle.  It's deeply counter-intuitive to me.  I wanted my names to
more accurately reflect what was happening as *I* understood things; since
I'm most familiar with synchronous, edge-triggered designs found in FPGAs,
signals indicate what /will/ happen, not what /has/ happened.  I figured
once I had that, I could extrapolate and relabel signals with greater
understanding later on.

The closest (but apparently not quite perfect) analogies between my signals
and Mitch's are:

| Mitch's Signal                                  | My Signal                                                                               |
|-------------------------------------------------+-----------------------------------------------------------------------------------------|
| (Hinted at schematically; but left unspecified) | WBD (register writeback data bus)                                                       |
| (Hinted at schematically; but left unspecified) | WBS (register writeback register select)                                                |
| (Not specified at all)                          | WBVALID (register writeback bus has valid data)                                         |
| BUSY                                            | BUSY_FU                                                                                 |
| GO_READ                                         | (generated internally based on BUSY_FU and what's driven on the register writeback bus) |
| GO_WRITE                                        | RETIRE_FU                                                                               |
| ISSUE                                           | ISSUE_FU                                                                                |
| RELEASE_REQUEST                                 | RETIRE_REQUEST_FU                                                                       |
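
To make the GO_READ row a little more concrete, here is one possible reading
of it as a small Python model.  This is only a sketch under assumptions: the
class FunctionUnit, its methods, and the idea that an FU captures missing
operands by snooping WBS/WBD while WBVALID is high are all guesses made for
illustration, not a description of the actual whiteboard design.

```python
class FunctionUnit:
    """Hypothetical FU that derives its own go_read from the writeback bus."""

    def __init__(self):
        self.busy = False      # models BUSY_FU
        self.pending = set()   # source registers still being computed elsewhere
        self.operands = {}     # register number -> captured value

    def issue(self, ready, pending):
        """Models ISSUE_FU: latch operands that are ready, note the missing ones."""
        self.busy = True
        self.operands = dict(ready)
        self.pending = set(pending)

    def snoop(self, wbvalid, wbs, wbd):
        """Watch the writeback bus; capture WBD when WBS names a pending source."""
        if self.busy and wbvalid and wbs in self.pending:
            self.operands[wbs] = wbd
            self.pending.discard(wbs)

    @property
    def go_read(self):
        """Internal GO_READ equivalent: busy and no source still outstanding."""
        return self.busy and not self.pending

    def retire(self):
        """Models RETIRE_FU: the FU returns to idle."""
        self.busy = False
        self.pending.clear()
        self.operands.clear()


fu = FunctionUnit()
fu.issue(ready={1: 10}, pending={2})  # r1 is available, r2 is in flight
fu.snoop(True, 3, 99)                 # writeback to r3: not ours, ignored
waiting = fu.go_read                  # still False here
fu.snoop(True, 2, 32)                 # writeback to r2: captured
```

After the second snoop, fu.go_read is true and fu.operands holds both source
values, with no separate GO_READ wire coming from the scoreboard.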

>  those both then go each through a priority-picker (one each for read
> and write, separately), and that's how you know to pick one AND ONLY
> one Function Unit to be allowed to read (access to the READ ports of
> the regfile), and one AND ONLY one to be allowed to write (access to
> the WRITE ports of the regfile).

I can see this kind of logic for protecting the write port, but you could
have 2n read ports on the register file (where n = # of FUs), yes?
That seems like it'd save a few cycles (at the expense of more wires coming
from the register file and/or more block RAMs used to implement the file in
an FPGA).
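
The trade-off can be sketched in a few lines of Python.  The assumptions here
are that every FU reads two source operands, that requests arrive as a bit
vector with one bit per FU, and that the picker uses fixed lowest-index
priority; priority_pick and grant_reads are names invented for this example.

```python
def priority_pick(requests: int) -> int:
    """Return a one-hot word keeping only the lowest set bit of `requests`."""
    return requests & -requests


def grant_reads(requests: int, num_fus: int, read_ports: int) -> int:
    """Grant read access in priority order until the read ports run out.

    Each granted FU is assumed to consume two read ports (two operands).
    """
    granted = 0
    ports_left = read_ports
    remaining = requests & ((1 << num_fus) - 1)  # ignore bits beyond num_fus
    while remaining and ports_left >= 2:
        pick = priority_pick(remaining)
        granted |= pick
        remaining &= ~pick
        ports_left -= 2
    return granted


# Three FUs all request a read in the same cycle.
one_at_a_time = grant_reads(0b111, num_fus=3, read_ports=2)  # picker serialises
all_at_once = grant_reads(0b111, num_fus=3, read_ports=6)    # 2n ports, n = 3
```

With only two read ports just one FU is granted per cycle, so the other two
wait; with 2n ports every requester is granted at once and the read-side
picker has nothing left to arbitrate.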

-- 
Samuel A. Falvo II