[Libre-soc-bugs] [Bug 558] gcc SV intrinsics concept

Mon Jan 11 05:00:56 GMT 2021

https://bugs.libre-soc.org/show_bug.cgi?id=558

--- Comment #36 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
ok so to recap.

* VSX is a SIMD system with a fixed byte width and no support for predication
* SV is a Cray-style variable-length Vector system with dynamic variable
register allocation and advanced predication
* we are NOT going to try to "map" SV onto VSX at the hardware level: VSX,
being based on SIMD, is a known harmful concept.
* we ARE going to implement SIMD at the BACKEND:  this will NOT be visible to
the user.  the user will ONLY interact with the SIMD ALUs via the
Variable-Length SV ISA.

the mismatch between those two means that any efforts to try to examine the VSX
OpenPOWER support in gcc and "adapt" it to SV are also pretty much 100%
guaranteed to fail.  even examining the VSX support will lead to confusion and
misleading ideas.

* VSX with its fixed 16 byte range may only be matched by loop patterns that
match exactly the 16 bytes.  only 2 DWORDs fit into 16 bytes but 4 WORDs fit.
* SV has a variable vector length where MVL is specified in ELEMENTs.  MVL=8
takes 8 64bit registers for 8 64 bit computations, 4 64bit registers for 8 32
bit computations, down to only ONE 64bit register for 8x 8bit computations.
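
the register arithmetic above can be sketched in a few lines of C.  this is
only an illustration: sv_regs_for_mvl is a hypothetical helper name, not part
of any SV specification or gcc API, and it assumes 64-bit registers with
elements packed contiguously.

```c
/* Hypothetical helper (illustration only, not an SV or gcc API):
 * number of 64-bit registers covered by `mvl` elements that are each
 * `elwidth_bits` wide, assuming elements are packed contiguously. */
static unsigned sv_regs_for_mvl(unsigned mvl, unsigned elwidth_bits)
{
    unsigned total_bits = mvl * elwidth_bits;

    /* round up to a whole number of 64-bit registers */
    return (total_bits + 63u) / 64u;
}
```

with MVL=8 this gives 8 registers for 64-bit elements, 4 for 32-bit and 1 for
8-bit, matching the figures above.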

* just as described in the "SIMD Considered Harmful" article, at the end of a
loop VSX is forced to engage in insidious "cleanup" to deal with the remaining
elements when there are fewer of them than the fixed power-of-two width
* with SV the loop itself deals with it.
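
a rough scalar C model of the difference, purely as a sketch: the inner loops
stand in for one SIMD or one vector instruction, setvl_model is a hypothetical
stand-in for setvl (assumed here to return the smaller of the remaining element
count and MVL), and SIMD_LANES and MVL are values picked for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SIMD_LANES 4  /* 4 x 32-bit words in a fixed 16-byte register */
#define MVL 8         /* assumed maximum vector length, in elements */

/* Fixed-width SIMD style: whole 4-lane blocks, then a separate cleanup loop. */
static void add_fixed_simd(int32_t *dst, const int32_t *a, const int32_t *b,
                           size_t n)
{
    size_t i = 0;

    for (; i + SIMD_LANES <= n; i += SIMD_LANES)
        for (size_t lane = 0; lane < SIMD_LANES; lane++) /* one SIMD op */
            dst[i + lane] = a[i + lane] + b[i + lane];

    for (; i < n; i++) /* the "cleanup": leftover elements, done one by one */
        dst[i] = a[i] + b[i];
}

/* Hypothetical model of setvl: VL = min(remaining, MVL). */
static size_t setvl_model(size_t remaining)
{
    return remaining < MVL ? remaining : MVL;
}

/* Variable-length style: the same loop body handles the final short pass. */
static void add_variable_length(int32_t *dst, const int32_t *a,
                                const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; ) {
        size_t vl = setvl_model(n - i);

        for (size_t e = 0; e < vl; e++) /* one vector op of length vl */
            dst[i + e] = a[i + e] + b[i + e];
        i += vl;
    }
}
```

for n=11 the first version runs two full blocks plus three cleanup iterations,
whereas the second simply runs with VL=8 and then VL=3.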

thus, Alexandre, can you see, based on the understanding you've gained over the
past week (we are not mapping to VSX internally, MVL sets an allocation of regs
for use by VL) that there is virtually nothing within the gcc support for VSX
that can be used?

i.e. that the expectation you had (that the direction i was describing would
be a waste of time and need throwing away) is flawed?

rather, it is the entire VSX SIMD code that needs to be disregarded, and other
backends examined for clues:

* AVX512 (because it has predication)
* SVE2, if it has landed, because it likewise
  has predication, and it has fail-first on LD.
* RVV because it is the first modern
  Cray Vector ISA in several decades.

now, whilst i very much wanted the following type of construct due to its
startling similarity to the vector loop optimisation:

   __attribute__(SV.sat, etc) {
        statements
        ...
   }

it may instead be enough to carry the attributes on types, variables and
functions.  place the saturation attribute on the variable, and on assignment
from another variable saturation occurs.
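
to make "saturation occurs on assignment" concrete, here is a plain C model of
what the compiler would have to emit.  the attribute itself does not exist:
assign_saturating_i8 is a hypothetical name standing in for an assignment to an
int8_t variable carrying such an attribute, and signed saturation to the
destination type's range is an assumption.

```c
#include <stdint.h>

/* Hypothetical model: what `dst = src` could mean if `dst` were an int8_t
 * carrying a (not yet existing) saturation attribute.  The value is clamped
 * to the destination range instead of being truncated. */
static int8_t assign_saturating_i8(int32_t src)
{
    if (src > INT8_MAX)
        return INT8_MAX;
    if (src < INT8_MIN)
        return INT8_MIN;
    return (int8_t)src;
}
```

so assigning 300 would give 127 rather than the truncated value 44.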

also interestingly there already exists a vector_size attribute which LITERALLY
directly maps to MVL:
 https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html#Vector-Extensions
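
a minimal example of the existing gcc vector extension, for reference.  one
caveat: gcc documents vector_size as a size in bytes, so an element count like
MVL would be vector_size divided by the element size.  the names v4si,
V4SI_ELEMENTS and add_v4si are chosen here for illustration only.

```c
#include <stdint.h>

/* GCC vector extension: vector_size is given in BYTES, so 16 bytes of
 * int32_t is a vector of 4 elements. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Element count implied by the type: 16 / 4 = 4. */
#define V4SI_ELEMENTS (sizeof(v4si) / sizeof(int32_t))

/* Element-wise add written with ordinary operators; the compiler chooses
 * the instructions for the target. */
static v4si add_v4si(v4si a, v4si b)
{
    return a + b;
}
```

no intrinsics appear in the source: the type carries the vector information
and the ordinary + operator does the rest.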

my feeling is that rather than add several tens of thousands of intrinsics
autogenerated (the software equivalent of propagating "SIMD considered
harmful"), __attribute__ could well provide the means to bring in SVP64
Context, with very little needed to be added with the exception of setvl.

of course, unusual intrinsics yes: the new set-before-first opcode, the
integer-max opcode etc etc these will all need adding...  as *scalar* opcodes. 
later the vector-based ones such as conflictd and so on.

attribute can add:

* subvl
* indicate which variable is to be a predicate source
* saturation
* even mapreduce mode

interestingly it would not need to introduce elwidth overrides: these would be
derived from the types, e.g. uint64_t interacting with uint32_t etc etc.
