[libre-riscv-dev] sv.setvl encoding

Fri Jun 28 07:27:16 BST 2019

On Thu, Jun 27, 2019 at 8:54 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> On Thu, Jun 27, 2019 at 11:38 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
> >
> > I added the encoding I had originally planned under the "original encoding"
> > header.
> >
> > https://libre-riscv.org/simple_v_extension/specification/sv.setvl/
>
>  ok so SV is never going beyond RV64 without a major redesign as VL
> and MVL are compacted into 6 bits (RV128 has XLEN=1<<7).

Actually, I think that STATE can be refactored a bit to work just fine
with longer VL values:

So, here's the list of STATE fields from the current version of the SV spec:
* MVL -- keep
* VL -- keep
* destoffs - the destination element offset of the current parallel
  instruction being executed
  -- keep
* srcoffs - for twin-predication, the source element offset as well
  -- keep, though it could be removed: it can be recomputed (slowly)
  from destoffs, since the instruction being executed has known
  predicates. The microarchitecture can cache the computed value in a
  register that is not programmatically visible, and invalidate that
  cache whenever the STATE register is written by any means other than
  the normal counting through elements while executing vector
  instructions.
* SUBVL
  -- keep
* svdestoffs - the subvector destination element offset of the current
  parallel instruction being executed
  -- keep
* svsrcoffs - for twin-predication, the subvector source element offset
  as well
  -- remove. Not needed, since predicates operate on whole subvectors,
  not individual subvector elements, so svdestoffs can be used instead.
  For swizzle operations (and other similar operations) the source
  subvector index is looked up from the dest subvector index in the
  instruction immediate anyway.

So, counting the number of bits needed, using VLLEN as the number of bits in VL:
* MVL: VLLEN
* VL: VLLEN
* destoffs: VLLEN
* srcoffs: VLLEN
* SUBVL: 2
* svdestoffs: 2
* svsrcoffs: 0 (removed)

Adding them all together, we get VLLEN * 4 + 4. The largest value of
VLLEN that can be used on RV32 is 7 (7 * 4 + 4 = 32 bits). For RV64,
VLLEN can be further increased to 15 (15 * 4 + 4 = 64 bits), which is
more than sufficient.
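
As a quick sanity check of the bit budget above, here is a minimal
Python sketch. The function names state_bits and max_vllen are made up
for illustration and are not part of the spec:

```python
def state_bits(vllen):
    """Total STATE bits for a given VLLEN, per the field list above."""
    # MVL, VL, destoffs and srcoffs take VLLEN bits each;
    # SUBVL and svdestoffs take 2 bits each; svsrcoffs is removed.
    return 4 * vllen + 2 + 2


def max_vllen(xlen):
    """Largest VLLEN whose STATE still fits in an XLEN-bit register."""
    return (xlen - 4) // 4


for xlen in (32, 64):
    vllen = max_vllen(xlen)
    print(f"XLEN={xlen}: VLLEN={vllen}, STATE uses {state_bits(vllen)} bits")
```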

MVL, VL, and SUBVL (the commonly programmatically accessed fields of
STATE) have separate CSRs that are aliases of the corresponding fields
in STATE, so the most (only?) common use of programmatically accessing
STATE directly is context switching (either intra-/inter-function or
between threads/processes), where the actual field locations don't
matter. I therefore think that splitting the VLLEN-bit wide fields
into the lower 7 bits (which go in the lower 32 bits of STATE) and the
upper bits (which go into bits 32-63 of STATE) would be a good idea.

I propose using the following layout:

RV32, RV64, and RV128:
bits 0-6: lower 7 bits of VL
bits 7-13: lower 7 bits of MVL
bits 14-15: SUBVL
bits 16-17: svdestoffs
bits 18-24: lower 7 bits of srcoffs
bits 25-31: lower 7 bits of destoffs
RV64 and RV128:
bits 32-39: upper 8 bits of VL
bits 40-47: upper 8 bits of MVL
bits 48-55: upper 8 bits of srcoffs
bits 56-63: upper 8 bits of destoffs
RV128:
bits 64-127: reserved (must be 0)
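
To make the split layout concrete, here is a rough Python sketch of
packing and unpacking STATE. It is only an illustration under
assumptions: the names pack_state, unpack_state, LOW_FIELDS and
HIGH_FIELDS are hypothetical, SUBVL and svdestoffs are treated as raw
2-bit field values, and the RV128 reserved bits are simply left at 0.

```python
# (name, shift, width) for bits 0-31: present on RV32, RV64 and RV128.
LOW_FIELDS = [
    ("vl", 0, 7),
    ("mvl", 7, 7),
    ("subvl", 14, 2),
    ("svdestoffs", 16, 2),
    ("srcoffs", 18, 7),
    ("destoffs", 25, 7),
]
# (name, shift, width) for bits 32-63: present on RV64 and RV128 only.
HIGH_FIELDS = [
    ("vl", 32, 8),
    ("mvl", 40, 8),
    ("srcoffs", 48, 8),
    ("destoffs", 56, 8),
]
SPLIT = 7  # number of low bits of each split field kept in bits 0-31


def pack_state(fields, xlen=64):
    """Pack a dict of field values into a STATE integer."""
    split_names = {name for name, _, _ in HIGH_FIELDS}
    state = 0
    for name, shift, width in LOW_FIELDS:
        value = fields[name]
        # Split fields are 7 bits wide on RV32 and 15 bits wide otherwise.
        total = width + 8 if (name in split_names and xlen >= 64) else width
        if not 0 <= value < (1 << total):
            raise ValueError(f"{name}={value} does not fit in {total} bits")
        state |= (value & ((1 << width) - 1)) << shift
    if xlen >= 64:
        for name, shift, _width in HIGH_FIELDS:
            # The range check above guarantees this fits in 8 bits.
            state |= (fields[name] >> SPLIT) << shift
    return state


def unpack_state(state, xlen=64):
    """Inverse of pack_state: recover the field values from STATE."""
    fields = {}
    for name, shift, width in LOW_FIELDS:
        fields[name] = (state >> shift) & ((1 << width) - 1)
    if xlen >= 64:
        for name, shift, width in HIGH_FIELDS:
            fields[name] |= ((state >> shift) & ((1 << width) - 1)) << SPLIT
    return fields
```

With this arrangement the lower 32 bits of an RV64 STATE have the same
field positions as an RV32 STATE, so a value whose fields all fit in 7
bits packs to the same integer on both.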

>
>  so, the field named MAXVL, practically speaking, need only be 6 bits, not 11.
>
>  the MAXVL field fits across the rs2 field rather than the top
> immediate bits (normally a funct5/6/7).  which would complicate
> brownfield encoding that wanted an rs2.

It's the opcode and funct7 that are actually used to determine the
instruction for almost all RISC-V instructions, therefore, I think we
should use the lower bits of the immediate in I-type to encode MAXVL.
This also has the benefit of simple extension of VL/MAXVL since the
bits immediately above the MAXVL field aren't used. If a new
instruction wants to be able to use rs2, it simply uses the encoding
with bit 31 set, which already indicates that rs2 is wanted in the V
extension.