[Libre-soc-isa] [Bug 213] SimpleV Standard writeup needed

Thu Oct 8 06:05:35 BST 2020

https://bugs.libre-soc.org/show_bug.cgi?id=213

--- Comment #35 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #34)

> the idea is that the compare would produce 1 bit per vector lane and
> essentially directly generate a predicate mask into an integer register. For
> that to work, the compare would need extra bits (normally in the branch
> instruction for scalar powerpc) to know which of lt, le, eq, ne, etc. it
> should use, those bits come from the prefix.
> 
> As long as it's one bit per lane, scalar integer ops are even better than cr
> ops for the required bit manipulations.

i came up with an architectural plan to implement the hidden bitsetting in 6600
style OoO and to be honest it was a bit of a pig.

an exception in the middle required a very messy design.

CRs on the other hand, by being treated as actual "real" registers, respected
and each given their own Dependency Matrix column, are far easier to handle.

exceptions in the middle of that are no problem: just restart the VL for-loop
where it left off.
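
a rough python sketch of that restart behaviour, purely as illustration.
ElementTrap, vector_cmp, start and fault_at are all hypothetical names, not
anything from the SimpleV spec, and the tuple per element is only a stand-in
for a CR field. the point it shows is that already-written per-element results
live in ordinary state, so resuming only needs the element index:

```python
class ElementTrap(Exception):
    """Hypothetical stand-in for an exception raised part-way through a vector op."""

    def __init__(self, step):
        super().__init__(f"trap at element {step}")
        self.step = step  # element index to resume from


def vector_cmp(a, b, crs, vl, start=0, fault_at=None):
    """Compare a[i] with b[i] for i in start..vl-1, writing one CR-like field each.

    crs models ordinary architectural state, so fields written before an
    interruption stay valid and do not need to be recomputed.
    """
    for i in range(start, vl):
        if i == fault_at:
            raise ElementTrap(i)  # assumed: trap hits before element i is written
        crs[i] = (a[i] < b[i], a[i] > b[i], a[i] == b[i])  # (LT, GT, EQ)
    return crs


a, b = [1, 5, 3, 9], [2, 5, 1, 9]
crs = [None] * 4
try:
    vector_cmp(a, b, crs, vl=4, fault_at=2)
except ElementTrap as trap:
    # the handler would run here; afterwards the loop restarts at the saved index
    vector_cmp(a, b, crs, vl=4, start=trap.step)
```

after the resume, crs holds the same contents as an uninterrupted run.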

bottom line is that PowerISA has condition registers which store results that
you then decide which bits to test to make different branches, i.e. the compare
is separated from the branch *by* the CR.
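
as a minimal sketch of that separation (python used only for illustration;
CRField and cmp_to_cr are made-up names, and the SO bit is simply passed
through as an assumption rather than modelled from XER): the compare writes
all the bits once, and each later "branch" picks which bit to test.

```python
from typing import NamedTuple


class CRField(NamedTuple):
    """One 4-bit CR field, in PowerISA bit order: LT, GT, EQ, SO."""
    lt: bool
    gt: bool
    eq: bool
    so: bool


def cmp_to_cr(a: int, b: int, so: bool = False) -> CRField:
    """Model of a signed compare: records all outcomes, decides nothing."""
    return CRField(lt=a < b, gt=a > b, eq=a == b, so=so)


cr = cmp_to_cr(3, 7)

# the branch side: one compare result, several different tests on it
branch_if_lt = cr.lt
branch_if_ge = not cr.lt
branch_if_ne = not cr.eq
```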

this is conceptually similar to RV FP compare, except that RV wastes an entire
64 bit int reg to do it (RV FP cmp, i.e. FEQ/FLT/FLE, stores 1 or 0 in an int
reg, which you then follow up with an integer branch-on-zero such as BEQZ/BNEZ)

PowerISA *specifically* has these 4-bit CRs and i feel we should go with the
flow on that rather than try to invent an alternative condition scheme that
does not mesh with what the original PowerISA designers envisaged (for scalar).

think of it this way: a single-bit predicate from a compare effectively throws
away the other 2 bits that the same op would produce if using CR, doesn't it?

so to replicate that exact same behaviour it would be necessary to call at
least 3 vector compares (single bit predicate) and even use 3 separate int regs
to do so just to get what could have been done with only a single vector CR
based compare.
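
to make the count concrete, a small sketch (python, illustrative only;
vector_cr_cmp and vector_pred_cmp are hypothetical helpers, not proposed
instructions, and SO is left out for brevity). one CR-style vector compare
yields LT/GT/EQ per element, whereas the single-bit form needs three passes
and three integer masks to carry the same information:

```python
import operator


def vector_cr_cmp(a, b):
    """One pass: every element gets a (LT, GT, EQ) result, like a CR field."""
    return [(x < y, x > y, x == y) for x, y in zip(a, b)]


def vector_pred_cmp(a, b, op):
    """One pass: a single condition per element, packed into an integer mask."""
    mask = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if op(x, y):
            mask |= 1 << i  # bit i corresponds to element i
    return mask


a, b = [1, 5, 3, 9], [2, 5, 1, 9]

crs = vector_cr_cmp(a, b)  # a single vector compare

# three separate compares, three separate int regs
lt_mask = vector_pred_cmp(a, b, operator.lt)
gt_mask = vector_pred_cmp(a, b, operator.gt)
eq_mask = vector_pred_cmp(a, b, operator.eq)

# only with all three masks can the per-element CR contents be rebuilt
rebuilt = [
    (bool(lt_mask >> i & 1), bool(gt_mask >> i & 1), bool(eq_mask >> i & 1))
    for i in range(len(a))
]
```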

unless i have misunderstood, this does not sound like a step forwards! :)
