[libre-riscv-dev] multi issue
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon May 27 14:17:27 BST 2019
There was a discussion on comp.arch where multi issue was raised:
https://groups.google.com/d/msg/comp.arch/LXWtd1L9JoY/7P7yifihBQAJ
Also a similar discussion on hw-dev where someone mentioned that they have
created a superscalar design.
Initially I believed that multi issue with the 6600 style scoreboard could
only be achieved by working out which instructions in a batch had exclusive
use of their registers.
The logic was that the Matrices would then have no clashes, and the
instructions could begin in parallel.
Mitch pointed out that the dependency relationships are transitive, and so
each instruction in the batch can be made to depend on all the registers of
the previous instructions in the batch.
It's that simple.
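
A minimal sketch of that idea in Python, purely as illustration. It assumes
registers are represented as bitmasks and that "depend on all the registers
of the previous instructions" means OR-ing in the read and write vectors of
everything earlier in the batch. The names regmask and batch_dependencies
are hypothetical, not taken from any existing code.

```python
def regmask(*regs):
    """Build a bitmask with one bit set per register number (illustrative helper)."""
    mask = 0
    for r in regs:
        mask |= 1 << r
    return mask


def batch_dependencies(batch):
    """Accumulate register vectors across a batch, in program order.

    batch: list of (read_mask, write_mask) tuples, one per instruction.
    Returns a list of (read_mask, write_mask) in which each instruction also
    carries the registers of every earlier instruction in the same batch.
    This is an assumed reading of the transitive-dependency trick.
    """
    out = []
    acc_rd = 0
    acc_wr = 0
    for rd, wr in batch:
        acc_rd |= rd   # inherit reads of all earlier instructions
        acc_wr |= wr   # inherit writes of all earlier instructions
        out.append((acc_rd, acc_wr))
    return out


# Three instructions issued together as one batch.
batch = [
    (regmask(1, 2), regmask(3)),   # r3 <- r1 op r2
    (regmask(3, 4), regmask(5)),   # r5 <- r3 op r4
    (regmask(6), regmask(7)),      # r7 <- op r6
]
deps = batch_dependencies(batch)
```

Note that the third instruction shares no registers with the first two, yet
in this sketch it still inherits their vectors, which is what lets the
whole batch be presented to the Matrices at once.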
Commit order must, btw, also be included, with each instruction casting a
commit shadow over all later ones. This additionally ensures that each batch
preserves instruction order; it is, however, a separate mechanism from the
transitive read and write dependencies.
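
One possible reading of the commit shadow, sketched in Python. The
assumption here (mine, not from the comp.arch thread) is that an
instruction stays shadowed for as long as any earlier instruction in the
batch has not yet committed. The function name commit_ready is
hypothetical.

```python
def commit_ready(completed, committed):
    """Decide which instructions in a batch may commit right now.

    completed[i]: instruction i has produced its result.
    committed[i]: instruction i has already committed.
    Assumption: every uncommitted instruction casts a shadow over all later
    ones, so only the oldest uncommitted instruction can ever be ready.
    """
    ready = []
    shadowed = False
    for done, gone in zip(completed, committed):
        ready.append(done and not gone and not shadowed)
        if not gone:
            shadowed = True   # this instruction shadows everything after it
    return ready


# Instruction 2 finished early but must wait for 0 and 1.
first = commit_ready([True, False, True], [False, False, False])
# Once 0 has committed and 1 has completed, 1 becomes ready.
second = commit_ready([True, True, True], [True, False, False])
```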
Note that it is still necessary to work out whether the Function Units to
which those instructions are allocated are busy or available, and that
multiple units may be available for a given operation. However, this is a
relatively straightforward task, requiring a multi priority picker.
We need a multi priority picker for quite a lot of tasks, in fact. One is
needed for the register file port selection, for example.
Basically it can be done by having a "top" priority unit that is separate
from a "second" priority unit, and so on. The second-priority unit would
note which signal the 1st priority unit had selected, and EXCLUDE it from
the bits that it considers.
It would be preferable not to have this done in a recursive fashion;
however, it is the simpler option.
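
A behavioural sketch of the chained picker in Python (not HDL). It assumes
the lowest-numbered bit has the highest priority, and the function names
are hypothetical. Each stage picks one bit, and the next stage sees the
input with that bit masked out.

```python
def priority_pick(bits):
    """Return a one-hot mask of the lowest set bit, or 0 if none is set."""
    return bits & -bits


def multi_priority_pick(bits, n):
    """Chain n single priority pickers.

    Stage k sees the request bits with the selections of stages 0..k-1
    excluded, so no two stages ever select the same bit.
    Returns a list of n one-hot masks (0 where nothing was left to pick).
    """
    picks = []
    remaining = bits
    for _ in range(n):
        sel = priority_pick(remaining)
        picks.append(sel)
        remaining &= ~sel   # exclude this selection from later stages
    return picks


# Requests on bits 2, 3 and 5; pick the top two.
top_two = multi_priority_pick(0b101100, 2)
```

The loop is the serial version described above: stage k cannot settle until
stage k-1 has, which is why it is simple but not ideal.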
None of these things are particularly difficult, and none of them have a
huge gate cost. The point is that it is perfectly achievable now that we
have a working Dependency Matrix system, so we might as well do a proper job.
That means that, in theory, we could do an ASIC that could take on high
performance workloads. Technically there is nothing, aside from the
multiporting needed for the register file and the increase in memory
bandwidth, that would stop us from doing a quad issue or even wider
design.
L.
--
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68