[libre-riscv-dev] [Bug 296] idea: cyclic buffer between FUs and register file
bugzilla-daemon at libre-soc.org
Fri May 1 03:10:01 BST 2020
https://bugs.libre-soc.org/show_bug.cgi?id=296
--- Comment #1 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
an idea for a cyclic hopper:
here is an
https://groups.google.com/d/msg/comp.arch/qeMsE7UxvlI/6nvrtmBoAQAJ
> Since you separated the signals, can I suggest placing a hopper between
> reading register specifiers out of FUs and the RF ports. Each cycle a
> number of specifiers are dropped into the hopper, and each cycle as many
> RF read ports as you have are read and then delivered to instructions
> at the calculation units. Then each calculation unit picks instructions
> into execution.
*sigh* unfortunately we are a little bit hamstrung by having lost the register
binary indices, using instead the Dependency Matrix unary encoding to
directly enable the regfile row. i mean, we _could_ use the unary encoding
as the register specifier...
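to make "unary as the register specifier" concrete, here is a minimal python sketch (not the actual Dependency Matrix code). the function names and the 32-entry regfile size are assumptions made purely for illustration:

```python
def to_unary(regnum, nregs=32):
    """binary register index -> unary (one-hot) vector, one bit per regfile row.
    nregs=32 is an assumed regfile size, not taken from the design."""
    assert 0 <= regnum < nregs
    return 1 << regnum


def from_unary(onehot):
    """unary (one-hot) vector -> binary register index.
    only valid when exactly one bit is set."""
    assert onehot != 0 and onehot & (onehot - 1) == 0
    return onehot.bit_length() - 1
```

the point being that the unary form already identifies the regfile row on its own, so the binary index is not strictly needed to name a register.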
by "hopper" do i take it that you mean a cyclic shift register where each
bucket contains the register specifier, type of operation (R/W), and, say,
the FU number?
or, actually, if the register specifier is in unary...
oo i have an idea. it may well be the same idea.
* for all registers being requested by all FUs, merge the unary register
  indices into a single "these registers need to be read" vector. i think,
  in fact, that the Dependency Matrices might actually have this already
  (Read Vector and Write Vector Pending)
* for reads, throw that vector at a regfile "fetcher", which, using the read
  vector, simply fetches as many registers as there are available regfile
  ports. let's say that there are 4R regfile ports
* the 4 read results, along with a *single* bit unary encoding of their
  corresponding reg number, are dropped into a cyclic 4-wide buffer (cyclic
  4-wide shift reg)
* all FUs "GoRead1" are connected to port 1 of the cyclic buffer
* all FUs "GoRead2" are connected to port 2...
* all FUs "GoRead3" are connected to port 3
* entries *remain* in the cyclic buffer for as long as the corresponding bit
  in the Read Vector remains set, continuing to cycle through, moving from
  port 1 to 2, 2 to 3, 3 to 4, 4 back to 1, on each clock
* on any given clock cycle, *all* FUs may read that "broadcasted" entry from
  their corresponding cyclic buffer "hopper", as long as the unary reg index
  matches up
* when the reg index matches up, GO_RD1/2/3 is fired at that FU.
* when the FU has read from that port, it may drop the REQ_RD1/2/3
* this results in the (usual) dropping of the unary bit from the Read Vector
  back at the Dependency Matrix
* this in turn also triggers the cyclic buffer to drop the reg value, freeing
  up that port for a refill opportunity.
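the steps above can be sketched as a small behavioural model in python. this is only a sketch under assumptions, not HDL and not the real design: the function names, the list-of-tuples slot representation, and the "lowest pending bit first" fetch order are all made up for illustration.

```python
NPORTS = 4  # "let's say that there are 4R regfile ports"


def merge_read_vector(fu_requests):
    """OR together the unary (one-hot) register requests of all FUs."""
    vec = 0
    for req in fu_requests:
        vec |= req
    return vec


def fetch(read_vector, regfile, slots):
    """fill each free slot with (unary reg index, value) for a pending
    register that is not already circulating in the buffer.
    assumption: lowest-numbered pending register is fetched first."""
    in_flight = 0
    for entry in slots:
        if entry is not None:
            in_flight |= entry[0]
    todo = read_vector & ~in_flight
    for i in range(len(slots)):
        if slots[i] is None and todo:
            low = todo & -todo  # isolate lowest set bit (still one-hot)
            slots[i] = (low, regfile[low.bit_length() - 1])
            todo &= ~low
    return slots


def drop_done(read_vector, slots):
    """an entry stays only while its bit in the Read Vector is still set."""
    return [e if e is not None and (e[0] & read_vector) else None
            for e in slots]


def rotate(slots):
    """one clock: port 1 -> 2, 2 -> 3, 3 -> 4, 4 -> back to 1."""
    return slots[-1:] + slots[:-1]


# two FUs want r3, one wants r7: r3 only needs to be fetched once
regfile = list(range(100, 132))
vec = merge_read_vector([1 << 3, 1 << 7, 1 << 3])
slots = fetch(vec, regfile, [None] * NPORTS)
slots = rotate(slots)
# once every FU has dropped its REQ_RD for r3, its bit leaves the vector
slots = drop_done(vec & ~(1 << 3), slots)
```

in this model a slot freed by drop_done is simply picked up again by the next call to fetch, which is the "refill opportunity" from the last bullet.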
it's by no means perfect: the cyclic buffer moving only one value at a time
means that if an FU has 3 GO_RD/REQ_RD lines, it could have to wait for up
to 3 cycles for the data to cycle round the shift-buffer... *per
read-register*.
if however a concerted effort is made to ensure that a REQ_RD1 always tries
to drop the read-request into cyclic-buffer entry 1, REQ_RD2 always tries to
drop into entry 2 and so on, then the reads should just pass straight
through.
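the latency argument can be written down as a one-line calculation. a sketch only, assuming the 4-wide buffer from above, 1-based port numbering, and one step per clock; `wait_cycles` is a hypothetical helper name:

```python
def wait_cycles(entry_slot, fu_port, nports=4):
    """clocks until a value dropped into entry_slot arrives at fu_port,
    given that entries move 1 -> 2 -> 3 -> 4 -> 1, one step per clock.
    slots and ports are numbered from 1, as in the description."""
    return (fu_port - entry_slot) % nports


# worst case: the value landed one slot "past" the port that wants it
worst = max(wait_cycles(s, p) for s in range(1, 5) for p in range(1, 4))

# aligned case: REQ_RDn always dropped into entry n
aligned = [wait_cycles(n, n) for n in range(1, 4)]
```

with 4 slots the worst case comes out at 3 clocks (e.g. a value in slot 2 wanted on port 1), and the aligned case is 0 clocks for every port, which is the "pass straight through" behaviour.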
this does have the advantage that if there are multiple FUs waiting for the
same register, then as a "broadcast" bus system, they can in fact get the
same value simultaneously.
writes are very similar, and there is the additional advantage that the
written value can be read from the cyclic buffer by any FU that "sees" it,
prior to the value landing at the actual regfile.
thus the cyclic buffer also happens to serve double-duty as a forwarding
bus. or, more to the point, the role and distinction between "register bus"
and "forwarding bus" is now decoupled / moot.
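a tiny python sketch of the forwarding behaviour, under the same assumptions as before (slots hold (unary reg index, value) tuples, ports here numbered from 0 for list indexing; `snoop` is a made-up name):

```python
def snoop(slots, port, wanted):
    """what an FU sees on its port this clock: the value if the broadcast
    entry's unary reg index matches the one it is waiting for, else None."""
    entry = slots[port]
    if entry is not None and entry[0] == wanted:
        return entry[1]
    return None


# an FU result for r5 has been dropped into the buffer but has not yet
# been written to the regfile (regfile contents here are placeholders)
regfile = [0] * 32
slots = [(1 << 5, 42), None, None, None]

forwarded = snoop(slots, 0, 1 << 5)  # a waiting FU picks up the new value
stale = regfile[5]                   # the regfile itself still holds the old one
```

the reading FU does not care whether the entry came from a regfile read port or from another FU's result, which is why the register bus / forwarding bus distinction goes away.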
was that sort-of what you meant by "hopper"? :)
l.