[libre-riscv-dev] [Bug 139] Add LD.X and ST.X? Strided

bugzilla-daemon at libre-riscv.org
Fri Oct 4 12:41:29 BST 2019


http://bugs.libre-riscv.org/show_bug.cgi?id=139

--- Comment #24 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #23)

> > more in a mo - twin-SUBVL doesn't sound right.  SUBVL is intended to
> > be applied globally.  the CSRs would need a total redesign to cope.
> 
> I had always intended SUBVL to vary from value to value, even faster than VL
> would vary.

Varying is not a problem at all. With *two* SUBVLs (or even worse, three), one
for src1, one for src2 and another for rd, we're into major redesign territory.

In the example there, the src and dest vectors are all vec4, i.e. SUBVL=4.

However, what is missing is a per-SUBVL-element predicate mask, and if you
recall we specifically designed the SUBVL predication to apply the predicate
bit to the whole group.
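
To illustrate the "predicate bit applies to the whole group" point, here is a
rough Python model, not the actual hardware behaviour or spec pseudocode. The
function name, the flat-list register layout and the integer bitmask are all
assumptions made purely for illustration.

```python
def subvl_predicated_copy(dest, src, vl, subvl, pred):
    """Hypothetical model: copy VL groups of SUBVL subelements.

    Bit i of `pred` gates group i as a whole; there is no way to
    mask an individual subelement inside a group.
    """
    out = list(dest)
    for i in range(vl):
        if not (pred >> i) & 1:
            continue  # predicate bit clear: the entire group is skipped
        base = i * subvl
        out[base:base + subvl] = src[base:base + subvl]
    return out


# Two vec4 groups (VL=2, SUBVL=4); only group 1 is enabled.
src = [1, 2, 3, 4, 5, 6, 7, 8]
dest = [0] * 8
result = subvl_predicated_copy(dest, src, vl=2, subvl=4, pred=0b10)
# result == [0, 0, 0, 0, 5, 6, 7, 8]
```

Note that with this scheme there is no predicate value that copies only .xy of
a group, which is exactly the missing capability.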

Without going into redesigns, the solution would be to ensure a full vector is
copied.

In the example given, it turns out that the first two parts of the colour come
from line 34, and the last two from line 35.

If that is not done, then I would expect that, by way of various passes, the
elements would be copied by non-SUBVL methods (using VL and predicate masking),
followed by a swizzle copy that places the one, two, or three unaltered
elements into the dest.

OR...

This is perhaps what "identity" is for.

If identity is intended to mean that the indexed subelement is unaltered, we
have a way to leave xy alone:

col.zw = srccol.xy

becomes

swizzle col, srccol, {identity, identity, x, y}

Meaning:

* leave col.x untouched
* leave col.y untouched
* set col.z to srccol.x
* set col.w to srccol.y
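
As a minimal sketch of that reading of "identity", assuming it really does
mean "leave the destination subelement alone" (which is the open question
here), the swizzle could be modelled in Python as follows. The names
`IDENTITY`, `LANES` and `swizzle` are made up for this example and are not
taken from any spec.

```python
IDENTITY = "identity"                      # hypothetical "do not touch" selector
LANES = {"x": 0, "y": 1, "z": 2, "w": 3}   # subelement name -> index


def swizzle(dest, src, selectors):
    """Hypothetical model of: swizzle dest, src, {sel0, sel1, sel2, sel3}."""
    out = list(dest)
    for lane, sel in enumerate(selectors):
        if sel == IDENTITY:
            continue  # dest subelement is left unaltered
        out[lane] = src[LANES[sel]]
    return out


col = [0.1, 0.2, 0.3, 0.4]
srccol = [0.9, 0.8, 0.7, 0.6]
col = swizzle(col, srccol, [IDENTITY, IDENTITY, "x", "y"])
# col == [0.1, 0.2, 0.9, 0.8]
```

All four subelements are still processed as one vec4 group, so no
per-subelement predicate mask is needed.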

A separate pass would notice the identity overlap with line 34 and combine
them, but that is a different story.

Bottom line: think it through from the normal SIMD perspective that other GPUs
use. They simply don't have the capability to do mixed vec2, vec3 and vec4
operations: they are all vec2 only, vec3 only or vec4 only.
