[libre-riscv-dev] Vulkanizing

Luke Kenneth Casson Leighton lkcl at lkcl.net
Wed Feb 19 06:48:44 GMT 2020


On Wednesday, February 19, 2020, Jacob Lifshay <programmerjake at gmail.com>
wrote:

> On Tue, Feb 18, 2020, 22:10 Immanuel, Yehowshua U <yimmanuel3 at gatech.edu>
> wrote:
>
> > I guess I should say, how many SIMD lanes do we have…
> >
>
> The current plan is to have the main (mul-add) SIMD ALU be 128-bits wide
> per-core.


mmm.. dual issue 64 bit, so effectively 128 bit.  however that is only
because of the twin 64 bit ALUs.

one of those ALUs will be dedicated to odd numbered registers the other to
even.

within those, there will be a 32-bit HI and a 32-bit LO split on the regfile.

this will allow us to use 3R1W SRAM on the regfile instead of the absolutely
mental 12R6W you see on many modern processors.

that many ports is a major power drain, and we avoid it.

it does mean that scalar instructions issued on reg1 then reg3 then reg5, 7,
9, 11 will execute at single issue rates, because they all land on the odd
ALU.

however we are optimising for *vectors*.  so we do not care that scalar
slows down.

therefore Kazan will output assembly that uses sequential elements,
alternating between odd and even registers.

duh.
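
To make the odd/even point concrete, here is a toy model of the issue rate. It is only a sketch: it assumes one ALU per register parity and strictly in-order pairing of adjacent instructions, which is an illustrative simplification and not a description of the actual issue logic. The function name issue_cycles is hypothetical.

```python
def issue_cycles(regs):
    """Count issue cycles for a sequence of destination register numbers.

    Toy assumption (not from the design itself): two ALUs, one serving odd
    registers and one serving even registers, with in-order issue. Two
    adjacent instructions share a cycle only if their registers differ in
    parity.
    """
    cycles = 0
    i = 0
    while i < len(regs):
        has_next = i + 1 < len(regs)
        if has_next and regs[i] % 2 != regs[i + 1] % 2:
            i += 2  # one op goes to the odd ALU, one to the even ALU
        else:
            i += 1  # next op needs the same ALU, so only one issues
        cycles += 1
    return cycles


# Scalar code that only touches odd registers: single issue.
print(issue_cycles([1, 3, 5, 7, 9, 11]))  # 6 cycles for 6 ops

# Vector code walking sequential registers: dual issue.
print(issue_cycles([0, 1, 2, 3, 4, 5]))   # 3 cycles for 6 ops
```

Under those assumptions, sequential elements keep both ALUs busy, while the all-odd scalar sequence runs at half the rate.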



> That is 8 flops/core/cycle of fp32, 16 for fp16, and, depending
> on how we implement it, either 2 or 4 flops/core/cycle of fp64.


2 because ... no, 4 if you count FMAC as 2, and we can do 2 per clock @ 64
bit.

the odd ALU will do 2FMAC FLOPS @ 64 bit, the even likewise.
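
The arithmetic behind those numbers can be checked with a short calculation. This is a sketch that assumes each 64 bit ALU partitions into 64 / element-width SIMD lanes and that every lane completes one FMAC per clock; the constant and function names are made up for illustration.

```python
ALU_WIDTH_BITS = 64   # each ALU is 64 bit wide (from the discussion above)
NUM_ALUS = 2          # one odd ALU, one even ALU
FLOPS_PER_FMAC = 2    # a fused multiply-add counted as two operations


def flops_per_cycle(elem_bits, count_fmac_as_two=True):
    """Peak flops/core/cycle for a given element width.

    Assumption: lanes = NUM_ALUS * (ALU_WIDTH_BITS // elem_bits), with one
    FMAC per lane per clock.
    """
    lanes = NUM_ALUS * (ALU_WIDTH_BITS // elem_bits)
    return lanes * (FLOPS_PER_FMAC if count_fmac_as_two else 1)


print(flops_per_cycle(32))                           # 8  (fp32)
print(flops_per_cycle(16))                           # 16 (fp16)
print(flops_per_cycle(64))                           # 4  (fp64, FMAC as 2)
print(flops_per_cycle(64, count_fmac_as_two=False))  # 2  (fp64, FMAC as 1)
```

So the "2 or 4" for fp64 comes down to whether an FMAC is counted as one operation or two.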


> There are
> also other ALUs for div (int and fp), sqrt, rsqrt, and other special
> functions, so those will help increase performance too.
>
> >
> > And guys, to be honest, I have a good feeling about this.
> >
> > Our design should be SUPER easy to scale to something like server-scale
> > processors.


yyup.

SMP.

done.

> > Provided we solve the interconnect problem that accompanies
> > scaling…


OpenPITON.

half a million cores.

done.



-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

