[libre-riscv-dev] sv 1d/2d/3d data shaping
lkcl
lkcl at libre-riscv.org
Tue Oct 16 00:11:32 BST 2018
On Mon, Oct 15, 2018 at 10:46 PM lkcl <lkcl at libre-riscv.org> wrote:
>
> On Mon, Oct 15, 2018 at 10:17 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
> >
> > That should work. For a 4x4 matrix (the biggest required) it could be set
> > up to have a 4x4xN layout, where N is the number of shaders running at once.
>
> okaay very cool. i'll write something up.
this flexible algorithm (below) seems to do the job. hypothetically
it can be extended to N dimensions. the "order" may be specified as a
permutation: for a set of 3 elements there are six permutations, so
only 3 bits are needed to express it:
012 021
120 102
201 210
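a minimal sketch of how the six orderings could be numbered in 3 bits. the numbering used here (lexicographic rank) and the names ORDERS, encode_order and decode_order are assumptions for illustration only; nothing in this thread fixes an actual encoding:

```python
from itertools import permutations

# all orderings of the three dimension indices (0=x, 1=y, 2=z),
# in lexicographic order: 012 021 102 120 201 210
ORDERS = list(permutations(range(3)))


def encode_order(order):
    """Return a code 0..5 (fits in 3 bits). Hypothetical numbering."""
    return ORDERS.index(tuple(order))


def decode_order(code):
    """Inverse of encode_order. Codes 6 and 7 name no permutation."""
    if not 0 <= code < len(ORDERS):
        raise ValueError("codes 6 and 7 do not name a permutation")
    return list(ORDERS[code])


# print each 3-bit code next to the ordering it stands for
for code, order in enumerate(ORDERS):
    print(format(code, "03b"), "".join(map(str, order)))
```

two of the eight 3-bit codes are left over, which is the price of not needing a lookup wider than 3 bits.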
with 32 bits to play with, and a hard limit of XLEN, the dimensions
may be expressed in 6 bits, as (x-field + 1), (y-field + 1),
(z-field + 1). of course if any of those fields equals zero, that
dimension equals 1 (linear) so is effectively disabled, and in this
way we get 2D (and 1D obviously).
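a rough sketch of the "field + 1" idea, to show why a zero field disables a dimension. the field width is left as a parameter, and the layout (x in the lowest bits, then y, then z) and the names pack_dims / unpack_dims are assumptions made up for this example, not a proposed register format:

```python
def pack_dims(dims, bits):
    """Store each dimension as (dim - 1) in consecutive `bits`-wide fields.

    Hypothetical layout: dims[0] occupies the lowest bits.
    """
    word = 0
    for pos, dim in enumerate(dims):
        if not 1 <= dim <= (1 << bits):
            raise ValueError("dimension does not fit in the field")
        word |= (dim - 1) << (pos * bits)
    return word


def unpack_dims(word, bits, count=3):
    """Recover the dimensions: each one is its field value plus 1."""
    mask = (1 << bits) - 1
    return [((word >> (pos * bits)) & mask) + 1 for pos in range(count)]


# a 4x4 matrix: the z field is zero, so zdim decodes as 1 (disabled)
print(unpack_dims(pack_dims([4, 4, 1], 6), 6))
```

an all-zero word decodes as 1x1x1, so "no reshaping" needs no special case.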
interestingly if xdim * ydim * zdim is *less* than VL, then the
whole thing wraps round. i can see that actually being extremely
useful, to apply values in a matrix repeatedly to an *array* of
matrices. or, to have a single instruction issue multiple adds of a
larger array to a smaller one, in a cumulative fashion. map-reduce in
other words.
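the wrap-round can be sketched as a generator that keeps producing offsets for VL elements even after every dimension has hit its limit. the function name remap_indices and its arguments are hypothetical, and it assumes x is the fastest-varying dimension in the underlying linear layout:

```python
def remap_indices(dims, order, vl):
    """Yield vl linear offsets, stepping the dimensions in `order`.

    When every dimension overflows, all counters return to zero, so the
    sequence repeats if vl exceeds the product of dims.
    """
    idxs = [0] * len(dims)
    for _ in range(vl):
        # linear offset of the current position, dims[0] fastest-varying
        offset, stride = 0, 1
        for i, lim in zip(idxs, dims):
            offset += i * stride
            stride *= lim
        yield offset
        # odometer-style increment with carry, in the requested order
        for d in order:
            idxs[d] += 1
            if idxs[d] != dims[d]:
                break
            idxs[d] = 0


# a 2x2 shape with VL=6: the last two elements wrap back to the start
print(list(remap_indices([2, 2, 1], [0, 1, 2], 6)))  # [0, 1, 2, 3, 0, 1]
print(list(remap_indices([2, 2, 1], [1, 0, 2], 6)))  # [0, 2, 1, 3, 0, 2]
```

nothing in the loop depends on there being exactly three dimensions, so the same sketch covers the N-dimensional case.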
obviously have to be really really careful there, because if the
wrap is too small it could interfere with pipeline issuing / register
allocation...
still, extremely cool.
----
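xdim = 3
ydim = 4
zdim = 5

# strides (not used below)
xmul = xdim
ymul = ydim * xdim
zmul = 1

lims = [xdim, ydim, zdim]
idxs = [0, 0, 0]   # current x, y, z position
order = [1, 2, 0]  # increment y first, then z, then x

for idx in range(xdim * ydim * zdim):
    # linear offset of the current (x, y, z) position
    new_idx = idxs[0] + idxs[1] * xdim + idxs[2] * xdim * ydim
    print(new_idx, end=" ")
    for i in range(3):
        idxs[order[i]] = idxs[order[i]] + 1
        if idxs[order[i]] != lims[order[i]]:
            break
        # this dimension hit its limit: end the row, reset it to zero
        # and carry into the next dimension in the order
        print()
        idxs[order[i]] = 0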