[libre-riscv-dev] [Bug 139] Add LD.X and ST.X? Strided
bugzilla-daemon at libre-riscv.org
Sun Oct 6 09:39:59 BST 2019
http://bugs.libre-riscv.org/show_bug.cgi?id=139
--- Comment #42 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #37)
> (In reply to Jacob Lifshay from comment #36)
>
> > > can you please describe it in pseudocode in a simple loop?
> >
> > sure.
>
> ah superb. now i get it.
>
> >
> > let mut sel_field = selector >> (FIELD_SIZE * i);
> > sel_field &= FIELD_MASK;
> > let src = if (sel_field & 0b100) != 0 {
that should have been == instead of !=, so a field with bit 2 clear selects
rs1 and a field with bit 2 set selects rs3.
>
> This is the bit where I was talking about "if sel_field != 0b111" (to
> represent "masked out" / "ignore").
yeah, but 0b111 can't be used for that: it conflicts with rs3.w for swizzle2
and with the constants needed for normalized integers for swizzle.
>
> > rs1
> > } else {
> > rs3
> > };
>
> After this if you have "if rs1 == x0 continue" then swizzle may be
> implemented as simply setting rs3 to x0
that only works for swizzle, not for fswizzle, unless you think having a
special case for fp reg 0 is acceptable.
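
For reference, a minimal sketch of the corrected loop as runnable Rust. It is only a model under stated assumptions, not a spec: the 3-bit FIELD_SIZE and FIELD_MASK values follow from a 12-bit selector covering 4 elements, while the function name swizzle2, the use of the low two bits as the x/y/z/w element index, and the u32 element type are illustrative guesses.

```rust
// Assumed layout: 3-bit fields, so four fields fill the 12-bit selector.
const FIELD_SIZE: u32 = 3;
const FIELD_MASK: u16 = 0b111;

/// Hypothetical model of swizzle2 on a single subvector of up to 4 elements.
/// `dest_subvl` is the number of destination elements to produce.
fn swizzle2(rs1: [u32; 4], rs3: [u32; 4], selector: u16, dest_subvl: usize) -> Vec<u32> {
    assert!(dest_subvl <= 4, "only 4 fields fit in a 12-bit selector");
    (0..dest_subvl)
        .map(|i| {
            let mut sel_field = selector >> (FIELD_SIZE * i as u32);
            sel_field &= FIELD_MASK;
            // Corrected test: bit 2 clear picks rs1, bit 2 set picks rs3.
            let src = if (sel_field & 0b100) == 0 { &rs1 } else { &rs3 };
            // Assumption: the low two bits index the source element (x, y, z, w).
            src[(sel_field & 0b011) as usize]
        })
        .collect()
}
```

With rs1 = [10, 11, 12, 13], rs3 = [20, 21, 22, 23] and selector 0b010_101_111_000, the four fields read (lowest first) rs1.x, rs3.w, rs3.y, rs1.z. Note that 0b111 is a perfectly ordinary field here (rs3.w), which is why it can't double as "masked out".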
> destsubvl must *not* be a CSR, it can be a fmt3 subencoding.
yup, thinking of putting it in the bits left over in funct7 or funct3.
>
> This does however leave out being able to do setting to 0, 1 or other
> constants. I wonder why other ISAs do not have the constants as part of the
> ISA.
>
> OH hang on. If SUBVL is to be ignored on rs2 (special treatment) we could
> also hypothetically use (set) the predicate on rs2 as the rs1/rs3
> sub-element selector mask.
the cleaner way to do that: when rs2 is in scalar mode, the same 12-bit
selector applies to every subvector. if a different selector is needed for
each subvector, rs2 can be a u16 vector.
a simple u16 vector is much better than trying to pack the selectors into a
u64, since the selector is likely to need to be calculated per-VL-index.
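
A rough sketch of what I mean, again only a model: the Selector enum and the for_index method are made-up names for illustration, and masking to the low 12 bits of each u16 is an assumption about how the unused upper bits would be treated.

```rust
/// Hypothetical model of the rs2 operand: either one selector shared by all
/// subvectors (scalar mode) or one u16 selector per VL index (vector mode).
enum Selector<'a> {
    Scalar(u16),
    Vector(&'a [u16]),
}

impl Selector<'_> {
    /// Returns the 12-bit selector that applies to subvector `vl_index`.
    fn for_index(&self, vl_index: usize) -> u16 {
        let raw = match self {
            // Scalar mode: every subvector sees the same selector.
            Selector::Scalar(s) => *s,
            // Vector mode: each subvector has its own u16 element.
            Selector::Vector(v) => v[vl_index],
        };
        // Assumption: only the low 12 bits (4 fields of 3 bits) are used.
        raw & 0x0fff
    }
}
```

Unlike a packed u64, the vector form has no built-in limit on VL, and each element can be computed by ordinary per-element vector arithmetic.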
>
> The only downside of that being that it is limited to 64 bits, so 64/SUBVL
> is 16 when SUBVL is 4, you can only cover up to 16 long VL.
>
> More special treatment, limited range. Not looking attractive. 3 bits for
> the rs2 selector is much better.
>
> L.