[libre-riscv-dev] Vulkanizing

Wed Feb 19 05:07:09 GMT 2020

On Tuesday, February 18, 2020, Scheming Pony <scheming-pony at protonmail.com>
wrote:

>
> Thanks, that helps clarify it.   I am still unsure about (1) where on the
> prototype and (2) subsequently on the production ASIC the Mesa driver is
> going to be run

on the hybrid CPU-VPU-GPU.

http://libre-riscv.org/3d_gpu.

there is no separate GPU and separate CPU.

there is only one CPUGPUVPU.

there are no separate pipelines for CPU and GPU.

there are no separate caches.

this means that COS and SIN and ATAN2 etc etc are *actual assembler
instructions*, and they are *on the CPU*, as an *actual* CPU opcode.

thus, basically, the MESA driver, which is written in c++ and compiled to
POWER assembler, will take Vulkan shader programs written in SPIRV and
compile them JIT style *at runtime*...

... *into POWER ASSEMBLER*.

that POWER assembly code will, in Phase 2, happen to have unusual opcodes
such as "completely new ATAN2 opcode" or "completely new YUV2RGB opcode".

for Phase 1 we do not even want that.  we LITERALLY want the MESA driver to
LITERALLY compile the Vulkan SPIRV to native assembler with no efforts made
at any kind of optimisation.

this to be done by handing things over to LLVM JIT and telling it to get on
with it.

for convenience we actually want that working first on x86, because it is
easier to test.
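
to make the Phase 1 / Phase 2 difference concrete, here is a toy sketch. it
is not Mesa, not LLVM and not real SPIRV: the shader format, the `lower`
function and the `fatan2` mnemonic are all made up for illustration, and
treating ATAN2 as a library call in Phase 1 is an assumption about what
"no new opcodes" means in practice. the only point is that the same shader
op lowers to a call in one case and to a single instruction in the other.

```python
# Toy stand-in for a shader: (op, destination, operands). NOT real SPIR-V.
TOY_SHADER = [
    ("mul", "t0", ("x", "x")),
    ("atan2", "t1", ("y", "t0")),
]

# Phase 1 (assumption): only ordinary arithmetic maps to an instruction.
PHASE1_OPCODES = {"add": "fadd", "mul": "fmul"}

# Phase 2 (assumption): the same table plus one new opcode.
# "fatan2" is a hypothetical mnemonic, not an existing instruction.
PHASE2_OPCODES = dict(PHASE1_OPCODES, atan2="fatan2")


def lower(shader, opcode_table):
    """Translate each op one-for-one, with no optimisation at all."""
    asm = []
    for op, dest, args in shader:
        operands = ", ".join(args)
        if op in opcode_table:
            # The op exists as an instruction: emit it directly.
            asm.append(f"{opcode_table[op]} {dest}, {operands}")
        else:
            # No instruction for it: fall back to a software routine.
            asm.append(f"call {op} -> {dest} ({operands})")
    return asm


if __name__ == "__main__":
    print("\n".join(lower(TOY_SHADER, PHASE1_OPCODES)))
    print("\n".join(lower(TOY_SHADER, PHASE2_OPCODES)))
```

in the real thing the lowering would be LLVM's job, not a lookup table.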

> Could someone clarify this?  Sorry, I am just starting out here.
>
> At a high level though, isn't there going to have to be some engine (e.g.
> Godot, scene graph) for application developers (mere mortals)?  Will that
> overhead yield decent performance with your design (assuming Vulkan has
> decent performance)?  Are individual graphics developers really learning
> Vulkan, I have heard *not*.

if they are screaming rabid performance fanatics, as in the game industry,
yes.

for everyone else they will go via one of the compatibility APIs which *we
are not writing*.

> State of the art GPU graphics and general programming is kind of a
> nightmare, IMHO--incompatible drivers, hardware requirements, etc.  It's a
> lot of overhead when trying to solve a problem

>
the reason for that godawful mess is down to the RPC marshalling and
unmarshalling over IPC buses, all of which has to go via kernelspace.

it is ridiculous and quite insane and we are doing none of it.

when we want a cosine result we LITERALLY call the cosine frickin assembly
opcode, right there, right then.

no pissing about marshalling up a cosine RPC function request which goes to
kernelspace, kernel sends over IPC to GPU, GPU pisses about unmarshalling
the RPC call, executes the instruction, then marshalls the result *back*
down the same stupid process.
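
a toy model of the two paths, purely for illustration. the wire format,
the `OP_COS` number and every function name here are hypothetical, and
`gpu_side` is an ordinary function standing in for kernelspace, the IPC
bus and the separate GPU all at once, so this shows the extra steps but
none of the real cost.

```python
import math
import struct

OP_COS = 1  # hypothetical opcode number for this toy protocol


def marshal_request(op, x):
    """Pack an opcode byte and one double operand into a request."""
    return struct.pack("<Bd", op, x)


def gpu_side(request):
    """Stand-in for the remote GPU: unmarshal, execute, marshal the reply."""
    op, x = struct.unpack("<Bd", request)
    if op != OP_COS:
        raise ValueError("unknown op")
    return struct.pack("<d", math.cos(x))


def cos_via_rpc(x):
    """The long way round: marshal, 'send', unmarshal the reply."""
    request = marshal_request(OP_COS, x)
    reply = gpu_side(request)
    (result,) = struct.unpack("<d", reply)
    return result


def cos_direct(x):
    """The hybrid way: just execute the operation where you are."""
    return math.cos(x)


if __name__ == "__main__":
    print(cos_via_rpc(0.5), cos_direct(0.5))
```

both paths return the same number. the difference is everything that
happens in between.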

> As a closing thought, at the modeling (and rendering) level many of us are
> trying to get away from triangles.

then this processor will be a heck of a lot simpler basis to start that
kind of experimentation.

and if you find you need a special instruction in hardware it will be far
simpler to try it out.

> There is a technique called f-rep (which my project Tovero and others
> use) which uses signed distance fields.  Currently, we are using a
> technique to generate an isosurface of triangles in the CPU (dual
> contouring), then pushing the triangles to the GPU.  F-rep has the
> potential to generate "Turing complete" shapes, if that makes sense.  There
> is the concept of rendering them directly on the GPU (e.g. sphere tracing),
> but also doing engineering analysis like FEM on the GPU (or other
> co-processor) using AD.

very cool.
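
for anyone following along who has not met sphere tracing: a minimal
sketch of the general technique, assuming a unit sphere as the signed
distance field. this is not Tovero's code, and the function names,
step limit and tolerances are arbitrary choices for illustration.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere centred on the origin."""
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-6, max_dist=100.0):
    """March along a ray, stepping by the distance the field reports.

    `direction` is assumed to be unit length. Returns the distance along
    the ray at which the surface is hit, or None if nothing is hit.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t  # close enough to the surface: call it a hit
        t += dist  # safe step: nothing can be nearer than dist
        if t > max_dist:
            break
    return None


if __name__ == "__main__":
    print(sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf))
```

the inner loop is nothing but multiply-adds, a square root and a compare,
which is the kind of thing that sits comfortably on a hybrid processor.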

-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68