[libre-riscv-dev] register requirements of SimpleV
lkcl
lkcl at libre-riscv.org
Wed Oct 10 21:57:39 BST 2018
On Wed, Oct 10, 2018 at 9:45 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> Note that the semantics implied by the Vulkan standard are SIMT, so if I
> were to translate that directly to SV, the inner loop would be the entire
> shader.
bit of cross-over, so have a look at what i just wrote about
subsetting the CSR table by being able to flip between 4 different
STATEs.
> After some more thought, a solution that doesn't involve as much change to
> SV as supporting grouped elements would be to use struct-of-arrays
> representation instead of array-of-structs representation. That will lead
> to inefficiencies when VL is low (reverting to scalar operations for
> fixed-length vectors as they effectively have 1 group, which we might want
> to special-case in the compiler).
i'm not totally following the reasoning here: i'd probably need to
see some pseudo-code, perhaps on that same page as before?
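for reference, one possible reading of the two layouts, as a rough
python sketch. the vec3 data, the function names and the use of plain
lists to stand in for vector registers are all assumptions made for
illustration; none of it comes from the SV spec.

```python
# hypothetical data: a list of vec3 values (x, y, z).

def aos_to_soa(aos):
    """Turn [(x0, y0, z0), (x1, y1, z1), ...] into ([x0, x1, ...], [y0, ...], [z0, ...])."""
    xs = [v[0] for v in aos]
    ys = [v[1] for v in aos]
    zs = [v[2] for v in aos]
    return xs, ys, zs


def scale_soa(soa, k):
    """Struct-of-arrays: one vector op per component.

    Each inner list stands in for one vector of length VL.
    """
    return tuple([k * e for e in comp] for comp in soa)


def scale_aos(aos, k):
    """Array-of-structs: each element is a whole group (x, y, z)."""
    return [tuple(k * c for c in v) for v in aos]
```

with a single vec3 (one group), the struct-of-arrays form degenerates
to three vectors of length 1, i.e. three scalar operations, which
would be the low-VL inefficiency described in the quoted text.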
> The reasoning for the sign-extension/1-extension is to allow
> register-renaming implementations to not need an additional source register
> for the old contents of rd.
oh, so no need to "read" rd and pass it in (i.e. no read-modify-write)?
yes, bear in mind that predication has zeroing and non-zeroing modes,
so in a SIMT/SIMD architecture, reading and passing in the old contents
of rd is going to have to be done anyway for the non-zeroing mode (if
there's any kind of register-renaming).
a DSP-like architecture wouldn't need to. a multi-issue superscalar wouldn't.
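a minimal python sketch of why the non-zeroing mode needs the old rd
as an extra source. the function name and the list-per-register model
are assumptions for illustration, not SV pseudo-code.

```python
def predicated_add(rd_old, rs1, rs2, pred, zeroing):
    """Element-wise add under a predicate mask.

    rd_old is only consulted when zeroing is False.
    """
    out = []
    for i in range(len(rs1)):
        if pred[i]:
            out.append(rs1[i] + rs2[i])
        elif zeroing:
            # zeroing mode: masked-out element is cleared, rd is not read
            out.append(0)
        else:
            # non-zeroing mode: masked-out element keeps its old value,
            # so the old rd has to be read and passed in
            out.append(rd_old[i])
    return out
```

with pred = [1, 0, 1, 0] the zeroing call never touches rd_old, whereas
the non-zeroing call copies elements 1 and 3 from it.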
> I picked those particular extension modes as
> they match the existing semantics for sub-word operations in RV
> (NaN-packing (filling high bits with ones) for FP, sign-extension for
> integer).
ok that bit makes sense... it's just that it'd be a special-case (i
think) or would at least require inter-lane communication in a
SIMD/SIMT microarchitecture, or require a post-processing step.
think about it: you have a SIMD ALU, each element effectively
separate: *after* the results are calculated, you need to not only
cascade the last element's top sign bit across the other elements, you
need to *identify* that last element in the first place.
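to make that post-processing step concrete, a rough python sketch,
assuming elements are packed low-first into a 64-bit register and the
unused upper bits are filled from the sign bit of the last element.
XLEN, the function name and the packing order are assumptions made
for illustration.

```python
XLEN = 64  # assumed register width


def pack_with_sign_fill(elems, width):
    """Pack elems (each `width` bits) low-first into an XLEN-bit value,
    then fill the unused upper bits with the last element's sign bit."""
    used = len(elems) * width
    assert 0 < used <= XLEN
    mask = (1 << width) - 1
    reg = 0
    for i, e in enumerate(elems):
        reg |= (e & mask) << (i * width)
    # step 1: identify the last element and take its top (sign) bit
    sign = ((elems[-1] & mask) >> (width - 1)) & 1
    # step 2: cascade that bit across every bit above the used ones
    if sign:
        reg |= ((1 << XLEN) - 1) & ~((1 << used) - 1)
    return reg
```

both steps depend on how many elements there are, so in a SIMD ALU
they cannot be done by any one lane looking only at its own result.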
l.