[libre-riscv-dev] Request for input and technical expertise for Systèmes Libres Amazon Alexa IOT Pitch 10-JUN-2020
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Jun 8 16:27:46 BST 2020
(btw thank you staf for the insights)
On Mon, Jun 8, 2020 at 2:56 PM Hendrik Boom <hendrik at topoi.pooq.com> wrote:
> On Mon, Jun 08, 2020 at 12:13:48PM +0100, Luke Kenneth Casson Leighton wrote:
> > all of these things are at the architectural level. we are not doing
> > anything fancy at the gate level. it is a matter of making
> > *architectural* decisions that reduce power consumption, operating
> > within the exact same gate-level power consumption constraints as
> > every other ASIC out there.
> You point out that the CPU and GPU share cache, being the same processor.
yes. or, more precisely: because the CPU instructions and GPU
instructions are in the same ISA, the GPU *workload* will push CPU
workload(s) out of the (same) L1 Cache.
> But we are designing a four-core chip?
yes. therefore there will be 1x L1 Data and 1x L1 Instruction Cache
for each of those four cores.
> To what extent do the four cores share cache?
L1? not at all - ever.
> And on avoiding data copying between CPU and GPU:
> I believe the OpenGL API involves copying data from CPU buffers to GPU
> buffers, with the understanding that the CPU copies can be discarded
> while the GPU goes on with its copy.
... because the assumption is that the GPU is a completely and utterly
separate processor, the "command" to perform that copy is expected to
involve the excruciatingly-painful process previously mentioned, from
which i excluded the userspace-kernelspace context-switching so as not
to have people run away screaming in terror.
in our case: it would simply be... a memcpy.
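
as a minimal sketch of what "simply a memcpy" means (the struct vertex
layout and the upload_vertices name are hypothetical, made up purely
for illustration, and are not part of any real driver or of this
project's code): when both buffers live in the same address space,
handing data to the graphics side is an ordinary copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical vertex layout, for illustration only. */
struct vertex {
    float x, y, z;
};

/* Hypothetical "upload" on a hybrid CPU/GPU: source and destination are
 * both plain pointers in the same address space, so the transfer is an
 * ordinary memcpy. Returns the number of bytes copied. */
size_t upload_vertices(struct vertex *gpu_side,
                       const struct vertex *cpu_side,
                       size_t count)
{
    assert(gpu_side != NULL && cpu_side != NULL);
    memcpy(gpu_side, cpu_side, count * sizeof *cpu_side);
    return count * sizeof *cpu_side;
}
```

after the call the caller is free to overwrite cpu_side, which is the
behaviour OpenGL-style software expects.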
> Having the same storage for both sets of buffers could obviously obviate
> these copies, except that software that uses this API will likely rely
> on being able to overwrite the CPU-side buffers with impunity. So the
> copy will still have to be done.
sounds reasonable to me. actually now that i think about it, if the
buffer is placed into a shmem segment with copy-on-write semantics,
the up-front memcpy will not be needed: because of the CoW semantics,
the copy would only be done on-demand, when the "overwriting" actually
happens.
this however would be an optimisation.
the *only reason* that we can even remotely consider such an
optimisation is precisely because of the hybrid architecture.
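
a rough sketch of the copy-on-write idea, assuming a POSIX system. the
struct cow_buffer type and the cow_buffer_init / cow_buffer_free names
are hypothetical, and a tmpfile() stands in for a real shmem segment.
the application writes through a MAP_PRIVATE mapping, so pages are only
copied when it actually overwrites them, while the read-only MAP_SHARED
view keeps seeing the original data.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pair of views onto one buffer, for illustration only. */
struct cow_buffer {
    FILE *backing;                 /* temp file standing in for a shmem segment */
    size_t len;
    const unsigned char *gpu_view; /* read-only MAP_SHARED view of the original */
    unsigned char *app_view;       /* MAP_PRIVATE view: writes are copy-on-write */
};

/* Store `data` once, then map it twice. Returns 0 on success, -1 on error. */
int cow_buffer_init(struct cow_buffer *b, const void *data, size_t len)
{
    void *gpu, *app;
    int fd;

    assert(len > 0);
    b->len = len;
    b->backing = tmpfile();
    if (b->backing == NULL)
        return -1;
    fd = fileno(b->backing);

    if (write(fd, data, len) != (ssize_t)len) {
        fclose(b->backing);
        return -1;
    }
    gpu = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (gpu == MAP_FAILED) {
        fclose(b->backing);
        return -1;
    }
    app = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (app == MAP_FAILED) {
        munmap(gpu, len);
        fclose(b->backing);
        return -1;
    }
    b->gpu_view = gpu;
    b->app_view = app;
    return 0;
}

/* Unmap both views and release the backing file. */
void cow_buffer_free(struct cow_buffer *b)
{
    munmap((void *)b->gpu_view, b->len);
    munmap(b->app_view, b->len);
    fclose(b->backing);
}
```

overwriting app_view after cow_buffer_init leaves gpu_view untouched,
and no copy happens at all for pages the application never writes to.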
> Do I misunderstand OpenGL? Is Vulkan different?
> Will users want to bypass these libraries and use the graphics instructions directly?
only if they want to become an assembly-level expert, with all the
inherent implications and performance-complexity tradeoffs that always
come with doing assembly programming.
> Or is there some other subtlety I'm missing?
no idea :)