[Libre-soc-bugs] [Bug 558] gcc SV intrinsics concept

Mon Jan 11 23:38:50 GMT 2021

https://bugs.libre-soc.org/show_bug.cgi?id=558

--- Comment #44 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #42)
> (In reply to Jacob Lifshay from comment #40)
> > I think that CRs should be handled entirely by the gcc backend, they will
> > have mask values allocated to them automatically.
> 
> in the "final" version (the final target) i could not agree more. results
> (all results) can have a CR associated with them: when Vectorised this
> becomes a Vector of associated CRs and i would anticipate there not being
> much in the way of changes to gcc-ppc64 scalar code to transform it to
> support "scalar is now vector yes even CRs"

I meant that the concept of CRs doesn't show up at all in the compiler
frontend/IR (except for inline assembly) and only appears when the generic IR
is translated to PowerPC-specific code in the backend, as part of instruction
selection and register allocation. That's how CRs are currently handled in LLVM
anyway.

Masks are just vectors of bits (conceptually a single bit per element), and the
compiler frontend only ever sees the integer register representation.
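
As a rough illustration of that integer representation (a sketch only: the
helper names pack_mask, unpack_mask and masked_add are made up for this
example, and the bit order, element 0 in the least significant bit, is an
assumption rather than something specified above):

```python
def pack_mask(bits):
    """Pack one boolean per element into a single integer.

    Assumed layout: element i lives in bit i (LSB first).
    """
    mask = 0
    for i, bit in enumerate(bits):
        if bit:
            mask |= 1 << i
    return mask


def unpack_mask(mask, length):
    """Recover the per-element booleans from the integer form."""
    return [bool((mask >> i) & 1) for i in range(length)]


def masked_add(a, b, dest, mask):
    """Element-wise add; elements whose mask bit is clear keep dest's value."""
    return [
        x + y if (mask >> i) & 1 else d
        for i, (x, y, d) in enumerate(zip(a, b, dest))
    ]
```

In this model the frontend only ever handles the plain integer returned by
pack_mask; where the bits physically live is left to the backend.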

The CRs are treated as a single vector, not a group of 4 vectors.

This matches the current compiler-level representation of a CR register,
which as far as I can deduce is just a single bit per CR field (not 4),
together with an indication, calculated statically by the instruction
selector, of which bit to use:

int a < b produces:
cmp dest_cr, a, b
with dest_cr.lt being the bit to use

int a <= b produces:
cmp dest_cr, a, b
with !dest_cr.gt being the bit to use

int a == b produces:
cmp dest_cr, a, b
with dest_cr.eq being the bit to use

int a != b produces:
cmp dest_cr, a, b
with !dest_cr.eq being the bit to use

int a > b produces:
cmp dest_cr, a, b
with dest_cr.gt being the bit to use

int a >= b produces:
cmp dest_cr, a, b
with !dest_cr.lt being the bit to use

float a < b produces:
fcmpu dest_cr, a, b
with dest_cr.lt being the bit to use

float a <= b produces:
fcmpu dest_cr, a, b
cror dest_cr.eq, dest_cr.lt, dest_cr.eq
with dest_cr.eq being the bit to use

float a == b produces:
fcmpu dest_cr, a, b
with dest_cr.eq being the bit to use

float a != b produces:
fcmpu dest_cr, a, b
with !dest_cr.eq being the bit to use

float a > b produces:
fcmpu dest_cr, a, b
with dest_cr.gt being the bit to use

float a >= b produces:
fcmpu dest_cr, a, b
cror dest_cr.eq, dest_cr.gt, dest_cr.eq
with dest_cr.eq being the bit to use
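
The selection scheme listed above can be modelled as a small table lookup.
This is only a sketch under stated assumptions: CRField, cmp, fcmpu and cror
here are simplified Python stand-ins for the instructions, not compiler or
ISA definitions, and the fourth CR bit is modelled as "unordered" for fcmpu
and ignored for cmp. For float >= the sketch ORs the gt bit into eq, since
a >= b holds exactly when a > b or a == b.

```python
import math
from collections import namedtuple

# Simplified model of one CR field (assumption: 4th bit doubles as "unordered").
CRField = namedtuple("CRField", ["lt", "gt", "eq", "so"])


def cmp(a, b):
    """Integer compare: exactly one of lt/gt/eq is set."""
    return CRField(a < b, a > b, a == b, False)


def fcmpu(a, b):
    """Float compare: if either operand is NaN, only the 4th bit is set."""
    if math.isnan(a) or math.isnan(b):
        return CRField(False, False, False, True)
    return CRField(a < b, a > b, a == b, False)


def cror(cr, dest, src1, src2):
    """OR two bits of the field and write the result to bit `dest`."""
    return cr._replace(**{dest: getattr(cr, src1) or getattr(cr, src2)})


# predicate -> (bit to read, whether the bit is inverted)
INT_SELECT = {
    "<": ("lt", False), "<=": ("gt", True), "==": ("eq", False),
    "!=": ("eq", True), ">": ("gt", False), ">=": ("lt", True),
}

# predicate -> (bit OR'd into eq by cror, or None; bit to read; inverted?)
FLOAT_SELECT = {
    "<": (None, "lt", False), "<=": ("lt", "eq", False),
    "==": (None, "eq", False), "!=": (None, "eq", True),
    ">": (None, "gt", False), ">=": ("gt", "eq", False),
}


def int_compare(op, a, b):
    bit, invert = INT_SELECT[op]
    return getattr(cmp(a, b), bit) != invert


def float_compare(op, a, b):
    or_src, bit, invert = FLOAT_SELECT[op]
    cr = fcmpu(a, b)
    if or_src is not None:
        cr = cror(cr, "eq", or_src, "eq")
    return getattr(cr, bit) != invert
```

The float table cannot reuse the inverted-bit trick from the integer table
for <= and >=: with a NaN operand neither lt, gt nor eq is set, so !gt would
wrongly report a <= b as true. That is why those two cases need the extra
cror.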

Collisions with other instructions that use CRs are easily handled by the
register allocator: all we need to do is tell it which registers overlap with
which other registers, and it handles the rest without needing modification
(except for supporting allocation of ranges of registers instead of single
registers).
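
A toy model of that overlap information, and of allocating a range of
registers against it, might look like the following. It is a sketch only:
the function names and the "crN" / "crN.bit" naming are invented for the
example, it is not how gcc or LLVM describe register aliasing, and the
default of 8 fields is just the scalar Power ISA count.

```python
CR_BITS = ("lt", "gt", "eq", "so")


def build_overlaps(num_fields=8):
    """Map each register name to the set of names it overlaps (itself included).

    Each CR field overlaps its four bits, and each bit overlaps its field.
    """
    overlaps = {}
    for n in range(num_fields):
        field = f"cr{n}"
        bits = [f"{field}.{b}" for b in CR_BITS]
        overlaps[field] = {field, *bits}
        for bit in bits:
            overlaps[bit] = {bit, field}
    return overlaps


def conflicts(reg, live, overlaps):
    """True if `reg` overlaps any register that is currently live."""
    return any(other in overlaps[reg] for other in live)


def allocate_range(count, live, overlaps, num_fields=8):
    """Return the first run of `count` consecutive free CR fields, or None."""
    for start in range(num_fields - count + 1):
        candidate = [f"cr{n}" for n in range(start, start + count)]
        if not any(conflicts(r, live, overlaps) for r in candidate):
            return candidate
    return None
```

For example, with cr0 and the single bit cr2.eq live, a request for two
consecutive fields skips cr0 through cr2 and returns cr3 and cr4.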
