[libre-riscv-dev] register requirements of SimpleV

lkcl lkcl at libre-riscv.org
Wed Oct 10 21:57:39 BST 2018


On Wed, Oct 10, 2018 at 9:45 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> Note that the semantics implied by the Vulkan standard is SIMT, so if I
> were to translate that directly to SV, the inner loop would be the entire
> shader.

 bit of cross-over, so have a look at what i just wrote about
subsetting the CSR table by being able to flip between 4 different
STATEs.

> After some more thought, a solution that doesn't involve as much change to
> SV as supporting grouped elements would be to use struct-of-arrays
> representation instead of array-of-structs representation. That will lead
> to inefficiencies when VL is low (reverting to scalar operations for
> fixed-length vectors as they effectively have 1 group, which we might want
> to special-case in the compiler).

 i'm not totally following the reasoning here: i'd probably need to
see some pseudo-code, perhaps on that same page as before?
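
 for reference, a minimal sketch of the two layouts being compared.
this is not the pseudo-code asked for, only an illustration: the
vec3 data, the names to_soa / scale_soa and the use of a python list
slice to stand in for VL are all assumptions made for the example.

```python
# hypothetical data: four 3-component (x, y, z) vectors.
# array-of-structs: each entry is one whole vector.
aos = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0), (10.0, 11.0, 12.0)]


def to_soa(aos):
    """Transpose array-of-structs into struct-of-arrays."""
    xs, ys, zs = (list(column) for column in zip(*aos))
    return {"x": xs, "y": ys, "z": zs}


def scale_soa(soa, k, vl):
    """One 'vector operation' per component, each covering vl elements."""
    return {name: [k * v for v in vals[:vl]] for name, vals in soa.items()}


soa = to_soa(aos)
full = scale_soa(soa, 2.0, vl=4)    # three ops, four elements each
single = scale_soa(soa, 2.0, vl=1)  # three ops, one element each
```

 with vl=1 (a single fixed-length vector) every per-component
operation touches exactly one element, which is the "reverting to
scalar operations" case mentioned in the quoted text.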

> The reasoning for the sign-extension/1-extension is to allow
> register-renaming implementations to not need an additional source register
> for the old contents of rd.

 oh, so no need to "read" rd and pass it in (no read-modify-write)?
yes, but bear in mind that predication has zeroing and non-zeroing
modes, so in a SIMT/SIMD architecture, reading and passing in the old
contents is going to have to be done anyway (if there's any kind of
register-renaming).
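
 a minimal sketch of why non-zeroing predication needs the old
contents of rd. the function name predicated_add and the modelling
of registers as python lists are assumptions for illustration, not
anything taken from the SV spec.

```python
def predicated_add(rd_old, rs1, rs2, pred, zeroing):
    """Element-wise add under a predicate mask.

    zeroing=True : masked-out elements of the result become 0.
    zeroing=False: masked-out elements keep the old value of rd,
                   so rd_old is effectively an extra source operand.
    """
    result = []
    for i in range(len(rs1)):
        if pred[i]:
            result.append(rs1[i] + rs2[i])
        elif zeroing:
            result.append(0)
        else:
            result.append(rd_old[i])  # read-modify-write of rd
    return result


old = [9, 9, 9, 9]
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [1, 0, 1, 0]

zeroed = predicated_add(old, a, b, mask, zeroing=True)
merged = predicated_add(old, a, b, mask, zeroing=False)
```

 only the non-zeroing call ever looks at rd_old, which is the case
where a renaming implementation has to carry the old register in.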

 a DSP-like architecture wouldn't need to.  a multi-issue superscalar wouldn't.

> I picked those particular extension modes as
> they match the existing semantics for sub-word operations in RV
> (NaN-boxing (filling high bits with ones) for FP, sign-extension for
> integer).
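
 a small sketch of the two existing RV behaviours referred to above,
modelled on RV64: ADDW sign-extends its 32-bit result to 64 bits, and
a single-precision value held in a 64-bit FP register has its upper
32 bits set to ones (what the RISC-V spec calls NaN-boxing). the
python function names are made up for the example.

```python
import struct


def addw(a, b):
    """Model of RV64 ADDW: 32-bit add, result sign-extended to 64 bits."""
    r = (a + b) & 0xFFFFFFFF
    if r & 0x80000000:
        r |= 0xFFFFFFFF00000000
    return r


def nan_box_f32(x):
    """Model of a float32 stored in a 64-bit FP register (upper bits all 1)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return 0xFFFFFFFF00000000 | bits


overflowed = addw(0x7FFFFFFF, 1)
boxed_one = nan_box_f32(1.0)
```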

 ok that bit makes sense... it's just that it'd be a special-case (i
think) or would at least require inter-lane communication in a
SIMD/SIMT microarchitecture, or require a post-processing step.

 think about it: you have a SIMD ALU, each element effectively
separate: *after* the results are calculated, you not only need to
cascade the last element's top sign bit across the other elements, you
also need to *identify* that last element in the first place.
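
 a rough sketch of that post-processing step, under assumptions that
are not in the thread: a 64-bit register, equal-width lanes packed
from the low end, and "last element" meaning the highest-numbered
element actually written. pack_sign_extended is a hypothetical name.

```python
XLEN = 64  # assumed register width


def pack_sign_extended(elems, elwidth, xlen=XLEN):
    """Pack lane results into one register, then fill the unused upper
    bits with copies of the top bit of the last element."""
    lanes = xlen // elwidth
    assert 0 < len(elems) <= lanes
    mask = (1 << elwidth) - 1

    # each lane computes and writes its own result independently
    reg = 0
    for i, e in enumerate(elems):
        reg |= (e & mask) << (i * elwidth)

    # step 1: identify the last element (needs to know how many lanes ran)
    last = elems[-1] & mask
    sign = (last >> (elwidth - 1)) & 1

    # step 2: cascade its sign bit across all the unused upper lanes
    if sign:
        used_bits = len(elems) * elwidth
        reg |= ((1 << xlen) - 1) & ~((1 << used_bits) - 1)
    return reg


negative_last = pack_sign_extended([0x0001, 0x8000], 16)
positive_last = pack_sign_extended([0x0001, 0x7FFF], 16)
```

 both steps happen after the per-lane results exist and depend on
information from outside any single lane, which is the inter-lane
communication (or post-processing) cost being described.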

 l.


