[libre-riscv-dev] whole stack of vulkan llvm spirv stuff

Fri Sep 13 13:01:47 BST 2019

On Fri, Sep 13, 2019 at 11:58 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Fri, Sep 13, 2019, 03:56 Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> > On Fri, Sep 13, 2019, 03:38 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> > wrote:

> > the issue is that the shaders are still scalar in the sense that SV is
> > scalar if VL (but not SUBVL) were hardwired as 1. so, it supports short
> > vectors (which are used for 2d, 3d, and 4d coordinates rather than for
> > parallelism), but doesn't do whole-function vectorization with all the
> > extra code to handle converting branching into predication. the
> > whole-function vectorization is what I've been building into Kazan's spir-v
> > translator.
> >
>
> in amdgpu, that whole-function vectorization is done in the amdgpu backend
> as part of translating llvm ir to gpu machine code.

and that's exactly where i would expect it to be done - by LLVM, *not*
by a spir-v compiler.

the reason is very simple: if it's taken care of by the llvm ir to cpu
/ gpu machine code translation, then standard clang, or any other
frontend, can be put on top of that exact same code, and the effort of
performing the whole-function vectorisation is *not* "wasted" [i.e. local
and exclusive to kazan].

in addition, from what i can see, it's what everyone else is doing.
ARM adds NEON (and the new vectorisation), AMD adds AMDGPU, intel adds
SIMD and AVX512 (etc.) - they're all collaborating on *LLVM*
vector-assembler translation.

that in turn means that there will be overlap areas where phases will
be created which turn LLVM IR into whole-function vectorised LLVM
IR... *in the LLVM codebase*.
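
to make concrete what such a phase has to do, here is a minimal sketch
(python purely for illustration; shader_scalar and shader_vectorised
are hypothetical names, not anything from kazan or LLVM). it shows a
scalar per-invocation function with a branch, and the same function
after the branch has been converted into a per-lane predicate mask:

```python
# hypothetical scalar "shader": one invocation per call, with a branch
def shader_scalar(x):
    if x < 0.0:
        return -x * 2.0
    else:
        return x + 1.0


# the same function after whole-function vectorisation (illustrative only):
# one call handles every lane, and the branch becomes a predicate mask
def shader_vectorised(xs):
    mask = [x < 0.0 for x in xs]        # branch condition -> per-lane predicate
    then_vals = [-x * 2.0 for x in xs]  # "then" side, computed for all lanes
    else_vals = [x + 1.0 for x in xs]   # "else" side, computed for all lanes
    # select per lane according to the predicate
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]


lanes = [-1.0, 0.0, 2.5]
print(shader_vectorised(lanes))
```

a real implementation would also have to deal with loops, nested and
divergent control flow, and side effects, none of which this sketch
attempts.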

apologies, i did not realise - at all - that you intended to take the
route of putting the vectorisation into kazan's spir-v translator,
otherwise i would have raised it... eight months ago.

so, just to check: is there anything about SV (or SPIR-V) that makes
it "impractical" to follow what every single other CPU / GPU with
SIMD / Vectorisation is doing, which is to put the whole-function
vectorisation into the LLVM-IR to assembler translator?

l.