[Libre-soc-isa] [Bug 533] design new CR instructions suitable for predication

bugzilla-daemon at libre-soc.org
Sun Nov 29 12:43:39 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=533

--- Comment #5 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #2)
> What I was thinking of is more like an instruction that reads a single 4-bit
> CR field and writes a 1 or a 0 to an integer register based on if the 4-bit
> CR field matches some condition. That instruction when vectorized would
> produce a bit-vector in the integer register with 1 bit per element.
> 
> The instruction would have a 4-bit immediate with bits a, b, c, and d. The
> output of the instruction would be:
> (a & cr_lt) | (b & cr_eq) | (c & cr_gt) | (d & cr_unordered)

an enhancement of that, taking cues from branch, is to use a mask-and-eq
(mask-and-xor) test on each bit, e.g. for the lt bit:

   (mask0 & (a == cr_lt))

the only catch is that it requires 8 bits of immediate (4 mask, 4 compare)
rather than 4, and it would have to be checked whether that's ok.
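
a minimal sketch of the mask-and-eq idea, for illustration only. the thread
does not say how the four per-bit results are combined, so OR-ing them (as in
the quoted formula) is an assumption here, as are the function name and the
(lt, eq, gt, unordered) bit order.

```python
# illustrative only: names, bit order and the OR-combine step are assumptions
def cr_mask_eq(cr, mask, val):
    """Return 1 if any mask-enabled CR bit equals its compare bit, else 0.

    cr, mask, val: 4-tuples of 0/1 in the order (lt, eq, gt, unordered).
    """
    return int(any(m == 1 and v == c for m, v, c in zip(mask, val, cr)))


LT = (1, 0, 0, 0)
EQ = (0, 1, 0, 0)

# test "eq bit is clear": enable only the eq position, compare against 0
not_eq_mask, not_eq_val = (0, 1, 0, 0), (0, 0, 0, 0)
print(cr_mask_eq(LT, not_eq_mask, not_eq_val))  # 1
print(cr_mask_eq(EQ, not_eq_mask, not_eq_val))  # 0
```

unlike the plain 4-bit AND form, this can test directly for a CR bit being 0,
which matters when the CR field holds the result of bitwise CR ops rather
than a compare.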

bear in mind that CRs are not just used for eq/lt/gt, they're used for
chains of complex bitwise operations.

however... question is: should this new instruction be made more complex,
substituting for multiple crand/or/other ops?

> (icr if that's the right order for cr bits, but you get the idea).

yehyeh.

> This allows producing any pattern of ones and zeros assuming the cr is set
> to the result of an integer or fp compare op. For int compares, we set d =
> 0. For int/fp compares, the rest of the bits select the output value that
> should be generated for a particular compare result:
> lt -> output = a
> eq -> output = b
> gt -> output = c
> unordered (fp only) -> output = d
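
the quoted proposal can be sketched as below. this is a rough model, not a
spec: the function names are made up, the bit order (lt, eq, gt, unordered)
follows the quoted text rather than any real CR layout, and putting element 0
in the least-significant bit of the result is an assumption.

```python
# illustrative only: names, bit order and element-to-bit mapping are assumptions
def cr_to_bit(cr, imm):
    """(a & cr_lt) | (b & cr_eq) | (c & cr_gt) | (d & cr_unordered).

    cr:  4-tuple of 0/1, order (lt, eq, gt, unordered)
    imm: 4-tuple of 0/1, the immediate bits (a, b, c, d)
    """
    return int(any(i & c for i, c in zip(imm, cr)))


def cr_vector_to_mask(cr_fields, imm):
    """Vectorised form: one result bit per element, packed into an integer."""
    result = 0
    for element, cr in enumerate(cr_fields):
        result |= cr_to_bit(cr, imm) << element
    return result


LT, EQ, GT = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)

# imm = (a=1, b=1, c=0, d=0) selects "less than or equal"
print(bin(cr_vector_to_mask([LT, EQ, GT, LT], (1, 1, 0, 0))))  # 0b1011
```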

(In reply to Luke Kenneth Casson Leighton from comment #4)

> SV vectorised, because strictly speaking it's a normal arithmetic op (2 CR
> src, 1 int dest), and normal arithmetic SV ops are treated as

correction: it only reads 1 CR and writes 1 int.

what about the other way round, i.e. when writing?  mfcr and mfocrf
don't do "spreadout" like this.  what about using the same 4-bit (8-bit?)
mask to take 1 bit of int and target multiple bits in CR?

also, argh: it can't be done on the full 32-bit CR because that violates the
whole principle of SV Vector Length (targeting the full 32-bit CR means
multiplying by 8).
