[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea

Thu Dec 31 01:58:55 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=560

--- Comment #24 from Alexandre Oliva <oliva at libre-soc.org> ---
> now the underlying order *does* matter.

exactly!  that's why we're debating the iteration order.

see, you're talking about an array of uint8_t, so let's take it from there.

uint8_t foo[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };

; r3 points to foo
  ld r4, 0(r3)
  ld r5, 8(r3)
  setvli r0, 16
  svp64 elwidth_src=8-bit mv r6.s, r4.v  

should r6 be 1, or 8, or should it depend on endianness?

if we're to go by the array layout model in memory, it ought to be 1, which
means that, in big-endian mode, the vector iteration order within the register
should go from most to least significant, whereas in little-endian mode, it
should go the opposite way.  this would maintain the array layout equivalence,
and it makes perfect sense when you think of how bytes are laid out in memory
in each endianness: little endian means least significant first, big endian
means most significant first.  iterating in that order is just natural and
expected.

now, if we were to always iterate over sub-register elements from least to most
significant, then in big-endian mode we'd effectively be reversing the order
relative to the expected memory layout.  IOW, we'd be visiting first 8, then 7,
..., then 1, then 16, then 15, ..., then 9.  are you sure this is what you want?
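
to make the visiting order concrete, here is a minimal C sketch that models it.  it is only an illustration, not SV semantics: load64(), elem8() and visit() are hypothetical helpers, the big_endian flag stands for the data endianness used by ld, and lsb_first selects whether byte elements within a register are counted from the least or from the most significant byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "ld": assemble 8 memory bytes into a 64-bit
   register value according to the selected data endianness. */
uint64_t load64(const uint8_t *mem, int big_endian)
{
    uint64_t reg = 0;
    for (size_t i = 0; i < 8; i++) {
        unsigned shift = (unsigned)(big_endian ? 8 * (7 - i) : 8 * i);
        reg |= (uint64_t)mem[i] << shift;
    }
    return reg;
}

/* Byte element idx (0..7) of a register, counted either from the least
   significant byte (lsb_first != 0) or from the most significant one. */
uint8_t elem8(uint64_t reg, size_t idx, int lsb_first)
{
    assert(idx < 8);
    unsigned shift = (unsigned)(lsb_first ? 8 * idx : 8 * (7 - idx));
    return (uint8_t)(reg >> shift);
}

/* Load mem[0..15] into a register pair (standing in for r4, r5) and
   record the order in which the 16 byte elements would be visited. */
void visit(const uint8_t *mem, int big_endian, int lsb_first,
           uint8_t out[16])
{
    uint64_t regs[2] = { load64(mem, big_endian),
                         load64(mem + 8, big_endian) };
    for (size_t i = 0; i < 16; i++)
        out[i] = elem8(regs[i / 8], i % 8, lsb_first);
}
```

calling visit(foo, 1, 1, out) (big-endian loads, least-significant-first iteration) fills out with 8, 7, ..., 1, 16, 15, ..., 9, whereas visit(foo, 1, 0, out) and visit(foo, 0, 1, out) both give 1, 2, ..., 16, i.e. the array order.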

BTW, should the load sequence into r4..r5 above be equivalent to:

  setvli r0, 2
  svp64 elwidth=64-bit, elwidth_src=64-bit ld r4,0(r3)

and should that really get the reversed byte order in the registers that, in
big-endian mode, you say we'd get with:

  setvli r0, 16
  svp64 elwidth=8-bit, elwidth_src=8-bit ld r4,0(r3)

?
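
the two loads can be compared with a similar C sketch.  again this is a model under stated assumptions rather than the actual SV definition: load64() stands for a full-register ld, vload8() is a hypothetical element-wise load at elwidth=8 that places memory byte i in byte element i % 8 of register i / 8, and lsb_first picks the element numbering within each register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a full-register "ld" with the selected data
   endianness. */
uint64_t load64(const uint8_t *mem, int big_endian)
{
    uint64_t reg = 0;
    for (size_t i = 0; i < 8; i++) {
        unsigned shift = (unsigned)(big_endian ? 8 * (7 - i) : 8 * i);
        reg |= (uint64_t)mem[i] << shift;
    }
    return reg;
}

/* Hypothetical element-wise load at elwidth=8: memory byte i lands in
   byte element i % 8 of regs[i / 8].  Single bytes have no endianness,
   so only the element numbering (lsb_first) decides the placement. */
void vload8(const uint8_t *mem, size_t vl, int lsb_first, uint64_t *regs)
{
    for (size_t r = 0; r < (vl + 7) / 8; r++)
        regs[r] = 0;
    for (size_t i = 0; i < vl; i++) {
        size_t idx = i % 8;
        unsigned shift = (unsigned)(lsb_first ? 8 * idx : 8 * (7 - idx));
        regs[i / 8] |= (uint64_t)mem[i] << shift;
    }
}

/* Returns 1 if the 2 x 64-bit load and the 16 x 8-bit load leave the
   same values in the register pair, 0 otherwise. */
int loads_agree(const uint8_t *mem, int big_endian, int lsb_first)
{
    uint64_t by_reg[2] = { load64(mem, big_endian),
                           load64(mem + 8, big_endian) };
    uint64_t by_elem[2];
    vload8(mem, 16, lsb_first, by_elem);
    return by_reg[0] == by_elem[0] && by_reg[1] == by_elem[1];
}
```

in this model, loads_agree(foo, 1, 1) returns 0: the big-endian full-register load gives 0x0102030405060708 in the first register, while the least-significant-first element-wise load gives 0x0807060504030201.  loads_agree(foo, 1, 0) and loads_agree(foo, 0, 1) both return 1, which is the identity between the two loads that the question above is about.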

the point is, we have to make a choice here.  do we choose

a) compatibility with the memory/data endianness selected for the system, and
set the iteration order in sub-register vector elements to match, or 

b) a preferential endianness for such vectors that breaks the expected identity
between the two vector loads above, and that forces either

b.1) memory layout of vector types to be shuffled for big endian to match the
reversed layout of register-wise stores and loads (which would make the order
*wrong* for element-wise stores and loads), or 

b.2) the use of only sub-vector element width even for loads and stores, in
big-endian mode, because full-register loads and stores would get them in the
reverse order within each register?
