[libre-riscv-dev] [isa-dev] 3D Matrix-style operations / primitives

lkcl luke.leighton at gmail.com
Wed Sep 18 08:00:56 BST 2019


On Wednesday, September 18, 2019 at 2:28:38 PM UTC+8, Jacob Lifshay wrote:


> the algorithms I've generally used are an unrolled form of Gauss-Jordan elimination or just using the formula from a symbolic math program [1], at which point, like operations can be grouped together.
> 
> 
> [1]:
> type the following into maxima:
> m:apply(matrix, makelist(makelist(concat(v, i, j), j, 0, 3), i, 0, 3)); grind(invert(m))$


https://brilliant.org/wiki/gaussian-elimination/

That looks very much like the equation substitution system I was taught at A Level. Multiply both sides by blah blah, substitute or subtract and eliminate blah blah.

Neat. Of course it applies to 3x3 
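As a rough illustration of the "multiply, subtract, eliminate" procedure applied to matrix inversion, here is a minimal Python sketch of Gauss-Jordan elimination on the augmented matrix [m | I]. The function name gauss_jordan_inverse and the use of exact fractions are choices made for this example only; an unrolled fixed-size version of the kind Jacob describes would look quite different.

```python
from fractions import Fraction

def gauss_jordan_inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination on [m | I]."""
    n = len(m)
    # Build the augmented matrix [m | I] using exact rational arithmetic.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot element becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]
```

The same routine works unchanged for 3x3 and 4x4 inputs, since nothing in it depends on the matrix size.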

> 
> I've not generally had matrix inverse on a fast path, instead passing a matrix and its inverse together if needed and only calculating the inverse when I generate the matrix.
> 
> 
> For most 3D programs, matrices are used much more than they are generated, so matrix inverse shouldn't generally be in the fast path, if the program is designed well.

The idea from SIGGRAPH 2019 by Pixilica, the driving force, is to go into long-tail applications and thus design something that is useful for innovation where mass-volume GPUs have suboptimal performance and legacy issues, precisely because of their profit-driven focus on the highest-volume markets.

Where one of these algorithms (determinant, O(N^3)) is so long, I would be concerned, even despite their potential appeal, unless there is some form of CSR FSM re-entrant state, and even then it makes me nervous.

The compromise is to study in great detail the best algorithms, then target the ISA at parallelisable operations that will reorder the data.

A year ago I did design a REMAP system for SV which allowed up to a 3D permutation of elements.

I am however tempted to suggest just using MV.X (regs[rd] = regs[regs[rs1]]) as a means to arbitrarily reorder data. Precompute a transposition set of indices and let a MV take care of them.
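To make the reordering idea concrete, here is a small Python model of it. It assumes a vectorised form of MV.X in which rd and rs1 both step by one register per element, which goes beyond the single-element definition given above. The helper names mv_x and transpose_indices, and the register numbers used, are hypothetical and exist only for this sketch.

```python
def mv_x(regs, rd, rs1, vl):
    """Model of a vectorised MV.X: regs[rd+i] = regs[regs[rs1+i]] for i < vl."""
    # Read every source before writing, which keeps this model simple
    # when source and destination ranges overlap (an assumption of the sketch).
    gathered = [regs[regs[rs1 + i]] for i in range(vl)]
    regs[rd:rd + vl] = gathered

def transpose_indices(base, rows, cols):
    """Register numbers that read a row-major rows x cols matrix column by column."""
    return [base + r * cols + c for c in range(cols) for r in range(rows)]

# A 3x3 matrix held row-major in regs 16..24, indices in regs 32..40.
regs = [0] * 64
regs[16:25] = [1, 2, 3, 4, 5, 6, 7, 8, 9]
regs[32:41] = transpose_indices(16, 3, 3)   # precomputed once
mv_x(regs, 48, 32, 9)                       # regs 48..56 now hold the transpose
```

The index vector is computed once and can be reused for every matrix of the same shape stored at the same base register.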

Likewise for determinant, MV can be used to extract the elements to reorder them into permuted pairs that would allow them to be multiplied as 2 vectors, followed by the subtractions as a second set.
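One possible reading of that for a 3x3 determinant is sketched below in Python. The four index tables are an assumption of this example: they are just one arrangement that turns the cofactor expansion along the first row into two element-wise vector multiplies followed by one vector subtract. The matrix is a flat row-major list [a, b, c, d, e, f, g, h, i], and gather stands in for the MV-based reordering.

```python
def gather(data, idx):
    """Stand-in for the MV-based reorder: pick elements by precomputed index."""
    return [data[i] for i in idx]

# Precomputed permutations of the lower two rows (d..i are indices 3..8).
POS_A = [4, 5, 3]   # e, f, d
POS_B = [8, 6, 7]   # i, g, h
NEG_A = [5, 3, 4]   # f, d, e
NEG_B = [7, 8, 6]   # h, i, g

def det3(m):
    """Determinant of a 3x3 row-major matrix via gathered vector operations."""
    # First set: two element-wise multiplies on the permuted pairs.
    pos = [x * y for x, y in zip(gather(m, POS_A), gather(m, POS_B))]
    neg = [x * y for x, y in zip(gather(m, NEG_A), gather(m, NEG_B))]
    # Second set: the subtractions, giving the three signed cofactors.
    cof = [p - n for p, n in zip(pos, neg)]
    # Final step: dot product of the first row with the cofactors.
    return sum(a * c for a, c in zip(m[0:3], cof))
```

Using the cyclic ordering (e*i - f*h, f*g - d*i, d*h - e*g) folds the alternating sign of the middle cofactor into the subtraction, so the final step is a plain dot product.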

Likewise for Inverse.  However MV.X is pretty expensive.

I wonder if an answer there might be to permit the rs1 vector elements to be typecast to an elwidth of 8? More to the point, SV already has that capability... or does it...

No, I don't think it does. The assumption is that the element indices are always XLEN wide and it is the data to which elwidth applies in the src.

Have to fix that by adding a fmt field to MV.X. Let me just update the page.
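To show why narrower indices would help, here is a hedged Python sketch of packing indices at an elwidth of 8 into XLEN-wide registers. The function names, the XLEN of 64, and the little-endian lane order are all assumptions made for illustration; none of this describes the actual fmt field encoding.

```python
XLEN = 64  # assumed register width for this sketch

def pack_indices(indices, elwidth=8):
    """Pack small indices into XLEN-wide words, lowest lane first (assumed order)."""
    per_reg = XLEN // elwidth
    mask = (1 << elwidth) - 1
    words = []
    for start in range(0, len(indices), per_reg):
        word = 0
        for lane, idx in enumerate(indices[start:start + per_reg]):
            if not 0 <= idx <= mask:
                raise ValueError("index does not fit in elwidth bits")
            word |= idx << (lane * elwidth)
        words.append(word)
    return words

def unpack_index(words, i, elwidth=8):
    """Extract the i-th packed index from the list of XLEN-wide words."""
    per_reg = XLEN // elwidth
    word = words[i // per_reg]
    return (word >> ((i % per_reg) * elwidth)) & ((1 << elwidth) - 1)

# Nine transposition indices for a 3x3 matrix fit in two 64-bit registers
# instead of the nine needed when every index is XLEN wide.
idx = [16, 19, 22, 17, 20, 23, 18, 21, 24]
packed = pack_indices(idx)
```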

