[libre-riscv-dev] Instruction sorta-prefixes for easier high-register access

Thu Jan 31 08:09:15 GMT 2019

On Wed, Jan 30, 2019, 23:18 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> Ok so moving on to scalar-vector, in SV original, a bit in the CSRs
> specifies whether the register is scalar or vector, and 2 more bits specify
> the elwidth override.
>
> When elwidth is overridden, even for scalar ONLY the parts of the physical
> regfile up to the elwidth are read (or written).
>
> So if elwidth is 8, only the LSByte of the regfile record for that register
> is read/written.
>
> If however elwidth is default, and LD.B is used, you get the standard
> behaviour: 1 byte read but it is zero-extended to 64 bits.
>
> We need some rules for SVprefix, in the extremely limited available bits.
>
> We so far agree that 1 bit be used as a prefix to regnums. 0 means scalar,
> can't recall if that means x0-x31. 1 means vector, with bottom 2 bits being
> 0 and next 5 bits being the rs/rd 5 bits.
>
I had had the scalar/vector bit inverted, but that doesn't matter. Scalar
does mean x0-x31 or f0-f31. I think we should treat x0 (but not f0)
specially to mean all zeros, so even if rs1 is a vector, rs1 being x0 would
mean that the input was all zero.
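
To make the regnum rule concrete, here is a rough Python sketch of how I read it. Everything in it is an assumption for illustration: the function name decode_reg, the tuple it returns, the polarity of the prefix bit (0 = scalar, as in the quoted text), and the idea that a vector regnum is just the 5-bit field shifted left by 2.

```python
def decode_reg(vector_bit, field5, is_fp=False):
    """Map a 5-bit rs/rd field plus its prefix bit to (kind, regnum).

    Hypothetical helper: names and return shape are illustrative only.
    """
    assert vector_bit in (0, 1)
    assert 0 <= field5 < 32
    # x0 (but not f0) always means "all zeros", even when marked as vector.
    if field5 == 0 and not is_fp:
        return ("zero", 0)
    if vector_bit == 0:
        # Scalar: plain x0-x31 / f0-f31.
        return ("scalar", field5)
    # Vector: bottom 2 bits are 0, next 5 bits are the rs/rd field.
    return ("vector", field5 << 2)
```

So decode_reg(1, 5) would name vector register 20, while decode_reg(1, 0) is the all-zeros input regardless of the prefix bit.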

>
> Elwidth to be taken from standard RV OP, no problem there.
>
> However we need to define whether the scalar elements should be zero/sign
> extended or if they should be compressed together, and likewise for vector.

I suggest packing the elements together (compressed) when vl-mul isn't 1;
otherwise the value is zero-extended, sign-extended, or nan-boxed.

so for scalar with vl-mul=1:
i8: zero extended (u8 is more common than s8)
i16: sign extended
i32: sign extended
i64: sign extended
f16/f32/f64: nan-boxed
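
As a sketch of the table above (assuming a 64-bit register file; the helper names zero_extend, sign_extend, nan_box and write_scalar are made up for illustration and are not part of any spec):

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def zero_extend(value, width):
    # Keep the low `width` bits; everything above is zero.
    return value & ((1 << width) - 1)


def sign_extend(value, width):
    # Replicate bit (width-1) into all higher bits of a 64-bit register.
    value &= (1 << width) - 1
    sign = 1 << (width - 1)
    return ((value ^ sign) - sign) & MASK


def nan_box(bits, width):
    # Fill all bits above the narrow float with ones.
    low = bits & ((1 << width) - 1)
    return (MASK ^ ((1 << width) - 1)) | low


def write_scalar(value, kind):
    """Register image for a scalar result with vl-mul=1.

    `kind` is one of "i8", "i16", "i32", "i64", "f16", "f32", "f64".
    """
    width = int(kind[1:])
    if kind[0] == "f":
        return nan_box(value, width)
    if width == 8:
        return zero_extend(value, 8)   # i8 is the zero-extended case
    return sign_extend(value, width)
```

For example, write_scalar(0x80, "i8") stays 0x80, while write_scalar(0x8000, "i16") becomes 0xFFFF_FFFF_FFFF_8000 and write_scalar(0x3C00, "f16") becomes 0xFFFF_FFFF_FFFF_3C00.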

we could also use some method to encode sign/zero extension for scalar
vl-mul=1 results (have 2 vl-mul=1 encodings in vlp?).

for scalar with vl-mul > 1:
packed like vector with same vl-mul and VL=1, but the padding from the end
of the vector to the end of the register should be filled with
zeros/sign-extended (not recommended)/nan-boxed:

so:
    li x12, 0x0123_4567_89AB_CDEF
    li x16, 0x0182_0304
    li x20, 0x1122_3344
    add.b.sss x12, x16, x20, vl-mul=3
sets x12 to:
    0xA4_3648 for zero extension
    0xFFFF_FFFF_FFA4_3648 for sign extension (not recommended)
by adding the 3 least-significant bytes of x16 and x20 pairwise
(0x04+0x44=0x48, 0x03+0x33=0x36, 0x82+0x22=0xA4) and sign or zero
extending the packed result.
note that x13 is not modified.
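
The same example as a small Python model, in case it helps checking the numbers. This is only a sketch: packed_add_b is a made-up name, and I'm assuming each byte lane wraps modulo 256 with no carry into the next lane (none of the lanes in this example overflow, so it doesn't affect the result here).

```python
def packed_add_b(rs1, rs2, vl_mul, extend="zero", xlen=64):
    """Model of a scalar byte add with vl-mul > 1 (illustrative only).

    Adds the `vl_mul` least-significant bytes of rs1 and rs2 lane by lane,
    then fills the rest of the xlen-bit register by zero or sign extension.
    """
    result = 0
    for i in range(vl_mul):
        a = (rs1 >> (8 * i)) & 0xFF
        b = (rs2 >> (8 * i)) & 0xFF
        # Assumption: each lane wraps independently, no inter-lane carry.
        result |= ((a + b) & 0xFF) << (8 * i)
    used = 8 * vl_mul
    if extend == "sign" and used < xlen and (result >> (used - 1)) & 1:
        # Copy the top bit of the last element into the padding bits.
        result |= ((1 << xlen) - 1) ^ ((1 << used) - 1)
    return result


x16 = 0x0182_0304
x20 = 0x1122_3344
print(hex(packed_add_b(x16, x20, 3)))           # 0xa43648
print(hex(packed_add_b(x16, x20, 3, "sign")))   # 0xffffffffffa43648
```

The old contents of x12 don't appear in the result at all, since the padding up to the end of the register is overwritten.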

I think sign extension of vectors is too expensive (need to extend from
every byte) so I recommend requiring zero extension/nan-boxing for
vl-mul > 1.

basically it's as if vl-mul > 1 makes vectors-with-length-VL/scalars of
vectors with length vl-mul:
Pseudo LLVM IR:
<VL x <vl-mul x float>>

> Or, if that extra elwidth bit (or two) is needed.
>
Since we can't use OP/OP-32 as 1 bit of elwidth (because of possible future
standard extensions that use OP/OP-32 to change more than operation
bit-width), I think we will need 2 elwidth bits.

>
> What I would like to advocate is that scalar regs not be altered, whether
> src or dest, from standard RV behaviour.
>
> And that it is Vector regs (when the reg prefix bit is 1) that have the
> altered width behaviour.
>
> So, a FP16 FADD of a prefixed-scalar to a vector would, if stored in a
> scalar x1-x31, result in NaN boxing to the full 64 bits, however if the
> dest was a Vector it would NOT be boxed, only the actual FP16 would go into
> the regfile, NOT setting an additional 48 bits to all 1s.
>
For vector rd, I agree that elements past VL should be left unchanged. It
will make it more difficult for register-renaming/Tomasulo implementations,
though, since they will need to read from rd for unpredicated cases (they
need to read from rd for predicated cases anyway).

>
> Thoughts?
>