[libre-riscv-dev] spike-sv non-default element widths

Sun Oct 21 20:53:23 BST 2018

On Sun, Oct 21, 2018, 09:42 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> ok so i am just going through the bitwidth polymorphism, and the
> algorithm envisaged (work out rs1-rs2 max bitwidth, perform operation,
> extend/truncate to rd bitwidth, store) is potentially flawed when it
> comes to add-immediate.
>
> immediate operands are 12-bit and sign-extended.  so let's say rs1 =
> 8-bit.  the algorithm says, effectively, that only 12-bit add need be
> carried out.  obviously that's not going to work where rs1 is 127
> (0x7f) and the immediate is 0x3ff (2^10-1), the result will wrap.
>
I'd say that the immediate is truncated to the 8-bit size of rs1 and then
splatted into a vector, which is then added to rs1. If there is an overflow,
it simply wraps around, just as addw does.
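
A minimal Python sketch of that rule, not the spike-sv code: the names
sign_extend and addi_elwidth are made up for illustration, and the vector
is modelled as a plain list of unsigned element values.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def addi_elwidth(rs1_elements, imm12, elwidth):
    """Add a 12-bit immediate to every element at rs1's element width.

    Assumption: the immediate is sign-extended from 12 bits and then
    reduced to elwidth bits, which truncates it for elwidth < 12 and
    sign-extends it for elwidth >= 12.
    """
    mask = (1 << elwidth) - 1
    imm = sign_extend(imm12, 12) & mask
    splat = [imm] * len(rs1_elements)  # same immediate for every element
    # Wrap each sum to the element width, as addw wraps to 32 bits.
    return [(a + b) & mask for a, b in zip(rs1_elements, splat)]


print([hex(x) for x in addi_elwidth([0x7F, 0x01], 0x3FF, 8)])
```

With rs1 = 0x7f and an immediate of 0x3ff, the immediate truncates to 0xff
and the 8-bit sum wraps to 0x7e.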

>
> RV spec says this about ADDIW:
>
> ADDIW is an RV64I instruction that adds the sign-extended 12-bit
> immediate to register rs1
> and produces the proper sign-extension of a 32-bit result in rd.
>
> morphing that to polymorphic, the bit that can't be morphed is the
> "32-bit" of rd, because where for add and addw it was possible to take
> the size of the add explicitly from RS2, in the case of immediate
> opcodes, that's 12-bit and there *is* no register to specify what the
> width of the result shall be.  saying that the result is to be 32-bit
> (hard-coded) defeats the object of polymorphic widths.
>
> i'm quite tempted to say that the spec be changed so that rd, if its
> bitwidth is over-ridden, uniformly applies across the board: rd's
> bitwidth applies to rs1 (i.e. rs1 shall be sign/zero-extended to rd's
> bitwidth, rs2 likewise *and* 12-bit immediates likewise), however that
> loses out on the important possibility of using the information from
> rs1+rs2's bitwidths to create truncated results that are then
> sign/zero-extended into a larger rd's bitwidth.
>
It just depends on whether it's more important to cast before the add or
after it. I recommend sign/zero-extending before the add: if a cast after
the add is needed, a single explicit cast instruction can be used, whereas
the other way round two are necessary.
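
To make the two orderings concrete, here is a small Python sketch. The
function names are hypothetical, only the signed case is shown, the
immediate is taken as an already-signed integer, and rd is assumed to be at
least as wide as rs1.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def to_unsigned(value, bits):
    """Reduce a Python int to its `bits`-wide unsigned register image."""
    return value & ((1 << bits) - 1)


def add_extend_first(rs1, rs1_bits, imm, rd_bits):
    """Sign-extend rs1 to rd's width, then add at rd's width."""
    return to_unsigned(sign_extend(rs1, rs1_bits) + imm, rd_bits)


def add_then_extend(rs1, rs1_bits, imm, rd_bits):
    """Add at rs1's width (wrapping), then sign-extend the result to rd."""
    narrow = to_unsigned(rs1 + imm, rs1_bits)
    return to_unsigned(sign_extend(narrow, rs1_bits), rd_bits)


def cast(value, from_bits, to_bits):
    """Stand-in for an explicit cast: truncate, then sign-extend."""
    return to_unsigned(sign_extend(value, from_bits), to_bits)


before = add_extend_first(0x7F, 8, 1, 16)   # 0x0080
after = add_then_extend(0x7F, 8, 1, 16)     # 0xff80
print(hex(before), hex(after), hex(cast(before, 8, 16)))
```

For 8-bit rs1 = 0x7f plus 1 into a 16-bit rd, extending first gives 0x0080
and adding first gives 0xff80. One cast applied to the extend-first result
reproduces the add-first result.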

>
> should we just say that for immediates, the bitwidth of rs1 is to be
> used in the operation?  i.e it would go like this:
>
> 8-bit rs1, default-width rs2:
>     8-bit add + truncate(imm, 8), sign-extend to 64-bit
>
> 8-bit rs1, default/2-width rs2 (32-bit)
>     8-bit add + truncate(imm, 8), sign-extend to 32-bit
>
> 16-bit rs1, default-width rs2:
>     16-bit add + sign-extend(imm, 16), sign-extend result to 64-bit
>
> 16-bit rs1, default/2-width rs2 (32-bit)
>     16-bit add + sign-extend(imm, 16), sign-extend result to 32-bit
>
> 32-bit rs1, default-width rs2:
>     32-bit add + sign-extend(imm, 32), sign-extend result to 64-bit
>
> 32-bit rs1, default/2-width rs2 (32-bit)
>     32-bit add + sign-extend(imm, 32), sign-extend result to 32-bit
> (null operation here)
>
> however there's also ADDI (separate and distinct from ADDIW), so some
> of the possibilities are not being covered, i.e. i am concerned that,
> according to the current draft spec, ADDI would end up being *exactly*
> the same functionality as ADDIW, and likewise for ADDID.
>

The *w versions are needed to get 16-bit operations on rv64, and the *d
versions are needed for 128-bit operations, since there are 5 supported
sizes (8, 16, 32, 64, 128) but only 4 size settings in the CSRs (8-bit,
default/2, default, default*2).
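
The arithmetic behind the 16-bit case can be sketched in Python. The helper
size_settings is hypothetical, and the "default" widths used below (64 for
plain ops on rv64, 32 for the *w forms) are assumptions; what "default"
means for the *d forms is left as a parameter.

```python
def size_settings(default_bits):
    """Element widths selectable by the four CSR settings for a given default."""
    return {
        "8-bit": 8,
        "default/2": default_bits // 2,
        "default": default_bits,
        "default*2": default_bits * 2,
    }


PLAIN_RV64_DEFAULT = 64  # assumption: plain ops default to XLEN on rv64
W_DEFAULT = 32           # assumption: *w ops default to 32 bits

plain = sorted(size_settings(PLAIN_RV64_DEFAULT).values())
w_forms = sorted(size_settings(W_DEFAULT).values())
print("plain:", plain)
print("*w:   ", w_forms)
```

Under those assumptions the plain ops select 8, 32, 64 and 128 bits, so 16
bits only becomes selectable through the *w forms.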

>
> apologies, still quite a bit of thought needed here.
>
That's fine.

Jacob