[libre-riscv-dev] [Bug 139] Add LD.X and ST.X? Strided

Wed Oct 9 07:39:03 BST 2019

http://bugs.libre-riscv.org/show_bug.cgi?id=139

--- Comment #51 from Jacob Lifshay <programmerjake at gmail.com> ---
I'll try to explain my reasoning behind why I think DESTSUBVL should not be
calculated from the swizzle as well as why having special ignore/unchanged
settings in the swizzle is a bad idea:

I think we should design the instruction set to be as fail-safe as possible
(similar to safe code in Rust), such that whatever value gets passed into the
instructions will not cause them to write outside the assigned destination
registers or (unexpectedly to the compiler) leave the result registers only
partially written.

Basically, the instructions won't silently cause undefined behavior (such as
overwriting more registers than allocated). I consider causing a trap in error
conditions as defined behavior, since the program can use that trap to call an
error handler, whereas overwriting near-by registers is silent (in the sense
that it doesn't trap) undefined behavior.

Therefore, since DESTSUBVL changes how many registers are written to, DESTSUBVL
should come from a value that the compiler knows is constant.

The swizzle value is likely to (at some point) come from all the way across the
program, where it's much more likely that a bug/exploit will cause it to take
on an invalid value. Having all swizzle instructions check the swizzle value to
ensure it's valid and has the right output size will greatly improve security
and reduce crashes due to buggy code, because an invalid value either causes a
trap and is found during development, or just changes the result value, not the
result shape.
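
A minimal sketch of that fail-safe behavior, written in Python as a behavioral
model rather than anything taken from the spec. It assumes a hypothetical
encoding of four 3-bit selectors packed into 12 bits, and a hypothetical rule
that selector fields beyond DESTSUBVL must be zero; SwizzleTrap,
pack_selectors and swizzle are illustrative names only. DESTSUBVL is passed
separately, as the compile-time constant, so a corrupted swizzle value can only
trap or change which elements get picked, never how many get written.

```python
SELECTOR_BITS = 3  # assumption: 4 selectors x 3 bits = 12 bits
MAX_SUBVL = 4


class SwizzleTrap(Exception):
    """Stands in for a hardware trap on an invalid swizzle value."""


def pack_selectors(fields):
    """Pack a list of selector fields into one integer (hypothetical layout)."""
    value = 0
    for i, sel in enumerate(fields):
        value |= (sel & 0b111) << (SELECTOR_BITS * i)
    return value


def swizzle(src, selectors, dest_subvl):
    """Return exactly dest_subvl elements picked from src, or trap.

    src        -- source elements (length is the source SUBVL)
    selectors  -- runtime value, possibly corrupted
    dest_subvl -- constant from the instruction encoding
    """
    if not 1 <= dest_subvl <= MAX_SUBVL:
        raise SwizzleTrap("DESTSUBVL out of range")
    # Assumed rule: no bits may be set beyond the dest_subvl selector fields.
    if selectors < 0 or selectors >> (SELECTOR_BITS * dest_subvl) != 0:
        raise SwizzleTrap("swizzle value has the wrong output size")
    out = []
    for i in range(dest_subvl):
        sel = (selectors >> (SELECTOR_BITS * i)) & 0b111
        if sel >= len(src):
            raise SwizzleTrap(f"selector {i} reads past the source vector")
        out.append(src[sel])
    return out
```

For example, swizzle([10, 20, 30, 40], pack_selectors([3, 2, 1, 0]), 4)
reverses the vector, while a selector of 7 against a 4-element source raises
SwizzleTrap instead of touching anything outside the destination.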

VL doesn't really have the above problem since VL is the outermost array index
(hence succeeding instructions won't read past the end) and VL will (almost)
always be set using sv.setvl, which has a built-in bounds check to make sure VL
doesn't get set to a value larger than the compiler-allocated size.
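
As a rough model of that bounds check (assuming it clamps rather than traps;
setvl and max_vl are illustrative names here, not the actual instruction
definition):

```python
def setvl(requested, max_vl):
    """Return the VL that a setvl-style instruction would install.

    requested -- runtime element count asked for by the program
    max_vl    -- compile-time maximum matching the allocated registers
    """
    if max_vl < 1:
        raise ValueError("max_vl must be at least 1")
    if requested < 0:
        raise ValueError("requested is treated as an unsigned count")
    # Clamp, so VL can never exceed what the compiler allocated.
    return min(requested, max_vl)
```

Whatever the runtime request is, the installed VL stays within the
compiler-allocated size, which is why VL does not need the same treatment as
DESTSUBVL.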

Ignore/unchanged isn't needed since:

1. [f]swizzle2 suffices to construct values where different elements come from
different input vectors. A 3-input swizzle is overkill and, when needed, can be
constructed from 2 [f]swizzle2 instructions in sequence (and so on for 4 or
more inputs).

2. [f]swizzle2 is much more powerful than [f]swizzle with ignore/unchanged,
since it can reorder/duplicate/ignore elements from both inputs, whereas
[f]swizzle can only do that from one input.

3. Having the same swizzle format between [f]swizzlei (which most likely only
has 12 immediate bits) and [f]swizzle[2] makes the instruction set more
orthogonal and maybe easier to implement.

4. [f]swizzle fits neatly in the encoding space for [f]swizzle2 by using the
case where rs3 == rs1, which wouldn't otherwise be a useful encoding anyway.
(Using rs3 == 0 instead has problems for fswizzle2, since it would not allow
f0 as a register.)

5. [f]swizzle[2] without ignore/unchanged can already produce all possible
combinations of inputs with all possible (1 to 4) DESTSUBVL values, so
ignore/unchanged is redundant.

6. Not needing a popcount (even though it's quite small) simplifies instruction
decode.
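
Points 1, 2 and 4 can be sketched with a small Python model. This is only an
illustration under assumptions: selectors are modelled as a plain list of
indices into the concatenation of both inputs (the real bit encoding is not
shown here), the list length plays the role of DESTSUBVL, and the function
names are made up for the example.

```python
def swizzle2(a, b, selectors):
    """Pick elements from the concatenation of a and b.

    len(selectors) plays the role of DESTSUBVL (1 to 4); each selector may
    reorder, duplicate or ignore elements from either input.
    """
    pool = list(a) + list(b)
    if not 1 <= len(selectors) <= 4:
        raise ValueError("DESTSUBVL must be between 1 and 4")
    for sel in selectors:
        if not 0 <= sel < len(pool):
            raise ValueError("selector out of range")
    return [pool[sel] for sel in selectors]


def swizzle(a, selectors):
    """Single-input swizzle as the rs3 == rs1 case of swizzle2."""
    return swizzle2(a, a, selectors)


def swizzle3_example(x, y, z):
    """Build [x[0], y[1], z[2], z[3]] from two swizzle2 steps."""
    # Step 1: gather the wanted elements of x and y; slots 2 and 3 are
    # placeholders that step 2 never selects.
    t = swizzle2(x, y, [0, 5, 0, 0])
    # Step 2: keep t[0] and t[1], take the rest from z.
    return swizzle2(t, z, [0, 1, 6, 7])
```

With x = [1, 2, 3, 4], y = [5, 6, 7, 8] and z = [9, 10, 11, 12],
swizzle3_example(x, y, z) gives [1, 6, 11, 12], with no ignore/unchanged
selector involved.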
