[libre-riscv-dev] Yehowshua - Interested in open GPU dev

Luke Kenneth Casson Leighton lkcl at lkcl.net
Wed Jan 1 04:10:30 GMT 2020


On Wednesday, January 1, 2020, Immanuel, Yehowshua U <yimmanuel3 at gatech.edu>
wrote:

> Hello guys and gals,
> I’m Yehowshua, a master’s student at Georgia Tech. I have a good amount of
> experience in HDL. I am currently implementing a discrete Machine Learning
> Convolutional Neural Network accelerator for my masters thesis.


nice.

hm, if there is any relevance to robotics there i may be able to introduce
you to someone whom you could collaborate with on an NLNet funding
application.



>
> I also have a MacSE for which I intend to implement an FPGA-GPU that sits
> on the Mac’s NuBus.
>
> I also currently contribute to Migen, nMigen, and LiTex.
>
> Here is my website:
> https://yehowshuaimmanuel.com
>
> Here is my GitHub:
> https://github.com/BracketMaster
>
> I would like to work on the OpenGPU.
> Do we have an architectural layout for the GPU yet


effectively, yes.  more below.


> or a codename?


not really :)

> And lastly, I would like to find out about funding.


ok, the way it works is: you complete a milestone (which can include
helping to define a subtask), and on 100% completion you submit an RFP with
my approval.  it includes the budget amount and your bank details, and NLNet
transfers the money in about 2 weeks.

please do note, there: *i* do not pay you, it is direct from the NLNet
Charitable Foundation.


as for the architectural layout: we chose to do a highly augmented variant
of the CDC 6600, Seymour Cray's supercomputer.  with help from Mitch Alsup,
we know how to do multi-issue, and it is dead simple thanks to a bit-level
(unary) dependency matrix.
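
a deliberately simplified sketch of the idea, not the actual design: each
register is given one bit in a mask (unary encoding), so checking a new
instruction against everything in flight is just AND and OR on bit vectors.
the names reg_mask and dependency_row, and the (reads, writes) tuple layout,
are assumptions made up for this illustration.

```python
def reg_mask(regs):
    """Unary (one bit per register) encoding of a set of register numbers."""
    mask = 0
    for r in regs:
        mask |= 1 << r
    return mask


def dependency_row(new_reads, new_writes, in_flight):
    """Return one matrix row: True where the new instruction must wait.

    in_flight is a list of (read_mask, write_mask) pairs, one per
    instruction that has been issued but has not yet completed.
    """
    row = []
    for old_reads, old_writes in in_flight:
        raw = new_reads & old_writes    # read-after-write
        waw = new_writes & old_writes   # write-after-write
        war = new_writes & old_reads    # write-after-read
        row.append(bool(raw | waw | war))
    return row


# two instructions in flight: r3 <- f(r1, r2) and r5 <- f(r4)
in_flight = [(reg_mask([1, 2]), reg_mask([3])),
             (reg_mask([4]), reg_mask([5]))]

# new instruction r7 <- f(r3, r6) depends only on the first one
print(dependency_row(reg_mask([3, 6]), reg_mask([7]), in_flight))
```

because each row is independent of the others, several new instructions can
in principle be checked side by side in the same cycle, which is the
property that makes multi-issue straightforward in this kind of scheme.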

i put some youtube videos up online about it and also there are some
unpublished updates.
https://git.libre-riscv.org/?p=crowdsupply.git;a=tree;f=updates;

the register file is dynamically subdivided into subparts which are handed
to AUTOMATIC partitioning ALUs.  this saves a massive amount of silicon
area.

the first autopartitioning ALU unit is the int multiplier.  it can do 8x
8-bit MULs, 4x 16-bit, 2x 32-bit or 1x 64-bit, lo-lo, hi-lo, hi-hi and
signed/unsigned.
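
a behavioural sketch of what those partition modes compute, in plain python
rather than HDL.  the function name partitioned_mul and its parameters are
made up for illustration; it models only the choice of low or high half of
each lane's product plus signedness, and says nothing about how the real
multiplier shares its partial-product hardware.

```python
def partitioned_mul(a, b, lane_bits, high=False, signed=False):
    """Multiply two 64-bit words lane by lane and repack the results.

    lane_bits selects 8x8-bit, 4x16-bit, 2x32-bit or 1x64-bit operation.
    high=False keeps the low half of each product, high=True the high half.
    """
    assert lane_bits in (8, 16, 32, 64)
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        x = (a >> shift) & lane_mask
        y = (b >> shift) & lane_mask
        if signed:
            # reinterpret each lane as a two's complement value
            if x >> (lane_bits - 1):
                x -= 1 << lane_bits
            if y >> (lane_bits - 1):
                y -= 1 << lane_bits
        product = x * y
        half = (product >> lane_bits) if high else product
        result |= (half & lane_mask) << shift
    return result


# same operands, two different partitionings
print(hex(partitioned_mul(0x0203, 0x0405, 8)))   # two independent byte MULs
print(hex(partitioned_mul(0x0203, 0x0405, 16)))  # one 16-bit MUL, low half
```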

one major task is to create a full suite of other ALU operations, right
down to even comparators which, when the "gates" between partitions open
up, will do 8x 8-bit greater-than (GT), 4x 16-bit GT, 2x 32-bit GT and
finally 1x 64-bit GT.
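
a rough python model of such a comparator, assuming unsigned operands.  it
computes greater-than and equal per byte, then merges neighbouring bytes
wherever the gate between them is open.  the name partitioned_gt and the
"joined" list (True where byte i and byte i+1 belong to the same lane) are
assumptions for this sketch, not the project's actual interface.

```python
def partitioned_gt(a, b, joined):
    """Unsigned per-lane a > b on 64-bit words.

    joined has 7 entries, one per boundary between adjacent bytes.
    Returns a list of (lane_width_in_bits, lane_a_gt_lane_b) tuples,
    least significant lane first.
    """
    assert len(joined) == 7
    results = []
    gt, width = False, 0            # running state for the current lane
    for i in range(8):
        x = (a >> (8 * i)) & 0xFF
        y = (b >> (8 * i)) & 0xFF
        # a more significant byte decides the result outright;
        # the lower bytes only matter when this byte is equal
        gt = (x > y) or (x == y and gt)
        width += 8
        if i == 7 or not joined[i]:
            results.append((width, gt))
            gt, width = False, 0    # start a fresh lane
    return results


a, b = 0x0100, 0x00FF
print(partitioned_gt(a, b, [False] * 7))           # 8x 8-bit
print(partitioned_gt(a, b, [True] + [False] * 6))  # 1x 16-bit + 6x 8-bit
print(partitioned_gt(a, b, [True] * 7))            # 1x 64-bit
```

the same eight byte comparisons serve every partitioning; only the merge
step changes.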

clearly bitwise operations such as XOR and AND will be trivial, there.

once we have those basic primitives, we will drop them into the FPU
codepath using OO code that has been designed *in advance* to use such
partitioning code, and we will have a full IEEE754 autopartitionable FPU
with very little in the way of extra code and a hell of a lot less gates.

that is something that, if you were interested, could be started straight
away.  it would be nice to see if tobias would like to help on that, too,
what do you think, tobias?

best,

l.



-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

