[libre-riscv-dev] Introduction and Questions

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat May 16 13:11:56 BST 2020

hiya jeremy just coming back to respond to the original message you wrote,
covering things we missed. yes lots of questions, therefore lots of answers

On Friday, May 15, 2020, Jeremy Singher <thejsingher at gmail.com> wrote:

> Hi all,
> My name is Jeremy Singher. I'm a graduate student studying computer
> architecture, interested in open-source hardware and high performance
> microarchitectures. I am looking for an open-source hardware project I can
> contribute to through the course of my graduate studies,

great to hear.  we've established that you know a heck of a lot about
out-of-order design and i could really use some help with the precise
scoreboard system and the FSMs in the Computation Units.

you may have noticed that we have funding from NLNet.  therefore if you
complete a task on the pre-agreed list, you receive a donation from NLNet
for doing so. http://libre-doc.org/nlnet

i.e. we are not expecting you to help "for free", ok? :)

> 1. It seems like you guys are building various components of an
> out-of-order microarchitecture, such as the scoreboard, and load-store
> ordering units. Do you have a complete microarchitecture diagram of the
> core (or a text description)?

it is in pieces, from notes, often drawn as gate-level diagrams.  more get
added as they happen, and they are more of an aide-memoire that we wish to
expand into full technical design documentation.

example, today, i plan to draw out the register port allocations to the
Function Units.  in the 180nm version there will be at least... 10
different types of FUs, each with radically different register port
allocations (!). i need to document those.

some of them are as high as 5 in, 3 out.  fortunately they are across
multiple different regfiles (we treat SPRs, Condition Regs, INT and FP as
completely different and consequently the FU port requirements are not as
deeply scary as they first sound: we do not need to go above 4R1W for any
one actual regfile)
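
to illustrate how a large per-FU port count splits up once the regfiles are
treated separately, here is a minimal sketch.  the Function Unit below and
its port numbers are invented purely for illustration (they are not the real
allocations), and the helper names are hypothetical.  it only looks at what
one FU asks of each regfile, not at how several FUs share the ports.

```python
from collections import defaultdict

# hypothetical Function Unit: these (regfile, direction, count) entries are
# made up for illustration and are NOT the real port allocations.
EXAMPLE_FU = [
    ("INT", "in", 3),
    ("CR", "in", 1),
    ("SPR", "in", 1),
    ("INT", "out", 1),
    ("CR", "out", 1),
    ("SPR", "out", 1),
]


def total_ports(fu):
    """Return (inputs, outputs) summed over every regfile."""
    ins = sum(n for _, d, n in fu if d == "in")
    outs = sum(n for _, d, n in fu if d == "out")
    return ins, outs


def ports_per_regfile(fu):
    """Return {regfile: [reads, writes]} for a single Function Unit."""
    usage = defaultdict(lambda: [0, 0])
    for regfile, direction, count in fu:
        usage[regfile][0 if direction == "in" else 1] += count
    return dict(usage)


def fits(fu, max_read=4, max_write=1):
    """True if no single regfile is asked for more than the given ports."""
    return all(r <= max_read and w <= max_write
               for r, w in ports_per_regfile(fu).values())


if __name__ == "__main__":
    print(total_ports(EXAMPLE_FU))        # totals across all regfiles
    print(ports_per_regfile(EXAMPLE_FU))  # the same ports, split by regfile
    print(fits(EXAMPLE_FU))               # checked against a 4R1W limit
```

the made-up FU is "5 in, 3 out" in total, yet no individual regfile sees
more than 3 reads and 1 write from it.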

also in particularly involved cases i have done youtube videos: see
http://youtube.com/user/lkcl and look at the libresoc playlist.

see http://libre-soc.org/3d_gpu/architecture and drill down the levels

>  I could find bits and pieces on specific
> components,

this is because that is our current focus.

> but I'm interested in details like pipeline width,

 the 180nm ASIC will likely have around... 20 separate and distinct
(parallel) Function Unit entry points into around... 10 to 12 types of

for a 180nm ASIC that is one hell of a lot.

for the Quad Core dual issue version that will jump to around... 28.

> branch-to-branch latency, load-use delay, etc.

we are literally in the process of assembling the pieces, for the 180nm
test chip.  consequently this is becoming apparent as each piece is put

in other words we are right at the point where your help could define some
of those characteristics.

> 2. Is the SoC at a state at which I can evaluate performance on simple
> benchmarks in simulation?

as a python-based project we have a huge number of unit tests: you either
run them directly or you use nosetests3.

we are still constructing building blocks (rapidly) so consequently the
individual tests are the focus of developer attention.

>  Other similar open-source hardware projects have
> make targets to launch verilator simulations, but I could not find an
> equivalent in your repos (although I probably am just bad at looking).

those will be added when compilation of the whole chip is ready to do so.
FPGA targets etc. these will indeed go at the top level.

however as we use a python-based HDL, python practices apply and that
typically means "python3 setup.py test" or "nosetests3" more than it means
"make test", although i think that was added to the soc top level Makefile.
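
for anyone coming from a "make test" world, a minimal sketch of what a
python unit test in this style looks like.  it uses only the standard
library unittest module.  the function under test (add32) and the file name
test_add32.py are hypothetical stand-ins, not anything from the actual
repositories.

```python
import unittest


def add32(a, b):
    """Hypothetical stand-in for a block under test: 32-bit wrapping add."""
    return (a + b) & 0xFFFFFFFF


class TestAdd32(unittest.TestCase):
    def test_simple(self):
        # no overflow: behaves like a plain add
        self.assertEqual(add32(1, 2), 3)

    def test_wraps(self):
        # overflow wraps round to zero at 32 bits
        self.assertEqual(add32(0xFFFFFFFF, 1), 0)
```

saved as test_add32.py, this can be run with "python3 -m unittest
test_add32" or picked up by "nosetests3 test_add32.py".  adding the usual
unittest.main() guard at the bottom of the file also lets it be run
directly as "python3 test_add32.py".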

> 3. What is your target performance in terms of an established benchmark
> like Coremark, Dhrystone, Embench, or SPEC?

really hard to tell right now. the primary focus is more on designing
something that is GPU Capable.  that is such a high performance and data
requirement that i think high SPEC etc figures will inherently drop out.

>  I'm trying to compare the
> merits and progress of various hardware projects out-there.

this one is... the reasons why we are all here are not technical ones,
although it is fascinating but deeply challenging work.

the driving reasons are that there are no SoCs out there at this level of
design capability (quad core dual issue SMP and 3D and VPU, and there is no
reason why we should not try quad or 6 issue, and 1.5 ghz and above, later)
that have a full Libre Software stack, intended right from the start for
100 million and above mass volume end user markets.

however even that masks a simpler way to put it:

we're giving citizens back Sovereign control over their computing devices.

if that sounded "merely" cool about 4 months ago, you will be keenly aware
with a little thought and analysis of the world news that that is now not
just cool, it is deeply *urgent*.

no other Libre project is creating an end-user product with a built in
*Libre* GPU and VPU for use in mass volume products.

* SiFive joined forces with ImgTec to use PowerVR. why on earth they did
this when ImgTec has such a dreadful reputation we will never know.

* MIAOW is a SIMT engine, not an actual GPU. there are no Texture opcodes,
nothing like that: they implemented a very small subset of the AMDGPU ISA,

* Nyuzi is an academic exercise demonstrating why Larrabee was not adequate
as a GPU. it was highly successful in this regard.

* GPLGPU is not actually GPL licensed (its author augmented the GPL, making
it a proprietary license), and it is based on a Number Nine design which was
a fixed function (non-shader) design concept anyway.

therefore, even a Libre *GPU* does not exist that can meet commercial
power-performance expectations of today's markets.

> 4. What is the right way to get started contributing?

next step is probably to read http://libre-soc.org/HDL_workflow to get a
feel for how to get started, what the interaction and development flow is.

>  My experience is with
> Verilog, and I've looked at other languages too, including BSV and Chisel.

just follow the process step by step on the front page.  you did step 1 :)
"join list, say hello".

if you are not immediately familiar with nmigen there are tutorials: the
best one is by robert baruch.

from there we're happy to give you a small (useful) task, like this:

even just reading that, you will see some of the gotchas to expect.

btw, Cole a few months ago knew virtually nothing about software
development, or linux, or hardware, or python. or git.  for a total
beginner and i do mean total, he's doing really well. i'm kinda seriously

> I'm primarily interested in developing microarchitecture for
> performance-critical components, like branch predictors, prefetchers,
> instruction schedulers, and load-store units.

goood.  because those are exactly the things that we need help with, quite

> Ideally, I could contribute
> my work as part of graduate studies to this project.

fantastic.  remember: and receive donations from NLNet for doing so :)


crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
