[libre-riscv-dev] SVprefix v0.2
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Feb 18 08:10:18 GMT 2019
OK, so twin predication, which is an extremely powerful way to express all
of gather, scatter, splat, insert and extract, is missing from SVprefix.
Twin predication in the original SV is present on all 2op instructions,
including LD, ST, MV, FCVT, FCLASS, FSGN, and the int to fp conversions.
Without twin predication that entire suite of instructions is missing some
very powerful benefits, which have to be substituted by explicit
instructions.
Since it is impractical to add explicit instructions for every one of those
cases, only a handful could ever be considered, and the general idea is to
avoid adding any at all.
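To make the gather/scatter claim concrete, here is a minimal Python sketch of a twin-predicated MV in non-zeroing (skipping) mode. It is an illustration only, not the SV specification: the name `twin_pred_mv`, the flat `regs` list standing in for the register file, and the predicates passed as integer bitmasks are all assumptions made for this example, and only the vector-to-vector case is modelled.

```python
def twin_pred_mv(regs, rd, rs, vl, src_pred, dest_pred):
    """Twin-predicated vector MV, non-zeroing (masked-out elements are skipped).

    Hypothetical model: regs is a flat list standing in for the register
    file; src_pred and dest_pred are integer bitmasks, bit i covering
    element i.
    """
    i = j = 0
    while True:
        # Skip source elements whose predicate bit is clear.
        while i < vl and not (src_pred >> i) & 1:
            i += 1
        # Skip destination elements whose predicate bit is clear.
        while j < vl and not (dest_pred >> j) & 1:
            j += 1
        if i >= vl or j >= vl:
            break
        regs[rd + j] = regs[rs + i]
        i += 1
        j += 1
    return regs


# Gather: sparse source mask, full destination mask packs elements together.
gathered = twin_pred_mv([100 + k for k in range(16)], 8, 0, 4, 0b1010, 0b1111)

# Scatter: full source mask, sparse destination mask spreads elements out.
scattered = twin_pred_mv([100 + k for k in range(16)], 8, 0, 4, 0b1111, 0b1010)
```

With these inputs `gathered[8:12]` is `[101, 103, 110, 111]` and `scattered[8:12]` is `[108, 100, 110, 101]`: the same instruction packs or spreads elements depending only on which of the two masks is sparse.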
Twin predication can be added by giving an alternative meaning to the
predicate bits for 2-op instructions:
  00: no predication
  01: single predication on src using pr1
  10: single predication on dest using pr2
  11: twin predication using pr1 and pr2
If more bits are available they can invert some of the pr1/pr2
combinations etc. However that is not as high a priority as src/dest
predication.
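The two-bit encoding above can be read as two independent enables: the low bit turns on source predication and the high bit turns on destination predication. A rough Python sketch of that decode follows. The function name `decode_pred_mode` is hypothetical, and it assumes pr1 and pr2 have already been read and are available as integer bitmasks, with an unpredicated side treated as an all-ones mask.

```python
def decode_pred_mode(mode, pr1, pr2, vl):
    """Return (src_mask, dest_mask) for the proposed 2-bit predicate field.

    Assumption: pr1 and pr2 are predicate values already read as integer
    bitmasks; an unpredicated side behaves as if every element were active.
    """
    if mode not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("mode must be a 2-bit value")
    all_active = (1 << vl) - 1
    # Bit 0 set (01 or 11): source is predicated by pr1.
    src_mask = pr1 & all_active if mode & 0b01 else all_active
    # Bit 1 set (10 or 11): destination is predicated by pr2.
    dest_mask = pr2 & all_active if mode & 0b10 else all_active
    return src_mask, dest_mask


# The four encodings with VL=4, pr1=0b0101 and pr2=0b0011.
modes = {m: decode_pred_mode(m, 0b0101, 0b0011, 4) for m in range(4)}
```

Treating "no predication" as an all-ones mask means one code path can serve all four encodings; whether hardware would do the same is a separate question.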
That in turn also removes the need for the gather/scatter state in LD/ST,
and for a separate gather/scatter instruction.
Also there is no bit for specifying zeroing or non-zeroing mode in
predication. If there is no room, non-zeroing (skipping elements) would I
feel be preferable, as it allows an operation under a predicate mask to be
interleaved with one under the inverted mask, to give 100% ALU utilisation
on OoO.
That choice, once decided, would need to be documented.
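The difference between the two modes, and why skipping suits the mask/inverted-mask interleave, can be sketched as below. This is a toy model under stated assumptions: `pred_op` is a hypothetical helper, vectors are plain Python lists, and the mask is an integer bitmask. It says nothing about how an OoO engine would actually schedule the two operations.

```python
def pred_op(dest, a, b, mask, fn, zeroing=False):
    """Single-predicated elementwise op on lists (hypothetical helper).

    Active elements get fn(a[i], b[i]). Inactive elements are left alone
    in non-zeroing mode and overwritten with 0 in zeroing mode.
    """
    out = list(dest)
    for i in range(len(out)):
        if (mask >> i) & 1:
            out[i] = fn(a[i], b[i])
        elif zeroing:
            out[i] = 0
    return out


a, b = [1, 2, 3, 4], [10, 20, 30, 40]
mask = 0b0101
inv_mask = ~mask & 0b1111  # the complementary elements


def add(x, y):
    return x + y


def sub(x, y):
    return x - y


# Non-zeroing: add on the masked elements, sub on the rest, same destination.
step1 = pred_op([0, 0, 0, 0], a, b, mask, add)
skipping = pred_op(step1, a, b, inv_mask, sub)

# Zeroing: the second op wipes out what the first one wrote.
step1z = pred_op([0, 0, 0, 0], a, b, mask, add, zeroing=True)
zeroed = pred_op(step1z, a, b, inv_mask, sub, zeroing=True)
```

Here `skipping` ends up as `[11, -18, 33, -36]`, every element written exactly once across the two ops, while `zeroed` is `[0, -18, 0, -36]`.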
Twin predication is I realise very odd. I have never encountered a design
that has it. I do not know why. I do know that its implementation may
require a serial algorithm for certain combinations (or that getting full
parallelism may be tricky).
Certainly, the OoO instruction issue will need to be stalled when a twin
predication op is encountered, as not only is it impossible to know how
many instructions will need to be issued, you have no idea which registers
will be needed either.
I know we discussed a potential way round that (reserving a *range* of
registers). It may still apply here, just that there are two ranges to
reserve, not one.
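As a small illustration of the issue problem and the range workaround: the element operations actually generated depend on the run-time predicate values, whereas the two register ranges depend only on the register numbers and VL. The sketch below uses hypothetical names (`element_pairs`, `reserved_ranges`) and assumes vector source and destination, with predicates as integer bitmasks and skipping (non-zeroing) behaviour.

```python
def element_pairs(rs, rd, vl, src_pred, dest_pred):
    """List the (src_reg, dest_reg) pairs a twin-predicated op would touch.

    The k-th active source element pairs with the k-th active destination
    element; pairing stops when either side runs out.
    """
    src = [rs + i for i in range(vl) if (src_pred >> i) & 1]
    dest = [rd + j for j in range(vl) if (dest_pred >> j) & 1]
    return list(zip(src, dest))


def reserved_ranges(rs, rd, vl):
    """Conservative reservation: two ranges, known without the predicates."""
    return range(rs, rs + vl), range(rd, rd + vl)


# Same instruction, different predicate values: different count and registers.
pairs_a = element_pairs(0, 8, 4, 0b1010, 0b0111)
pairs_b = element_pairs(0, 8, 4, 0b1111, 0b1000)

# The two reserved ranges cover every pair in both cases.
src_range, dest_range = reserved_ranges(0, 8, 4)
covered = all(s in src_range and d in dest_range for s, d in pairs_a + pairs_b)
```

`pairs_a` is `[(1, 8), (3, 9)]` and `pairs_b` is `[(0, 11)]`, so neither the count nor the registers are known until both masks have been read. The two ranges over-reserve but are available at decode time.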
L.
--
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68