[libre-riscv-dev] 53000 as ME for libre-riscv - bus interfaces?

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun May 5 04:31:23 BST 2019


(cc'ing off-list discussion to list with samuel's permission)


On Sat, May 4, 2019 at 2:22 AM Samuel Falvo II <sam.falvo at gmail.com> wrote:
>
> On Fri, May 3, 2019 at 2:35 PM Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
>>
>> Do you have some nmigen code kicking around?
>>
>> My biggest concern with TileLink is that it was the Berkeley team that designed it, and they *specifically* did so from the OO perspective of chisel3. Which means, if you do not use OO design techniques, you are screwed.
>>
>>  People who are actual experienced hardware engineers, used to the direct verilog approach, go "holy f***" this is complicated.
>
>
> I have slave-side TL-UL logic implemented for some of my peripherals.  The ROMA and SIA cores both use the TL-UL subset.  See http://chiselapp.com/user/kc5tja/repository/kestrel-3/dir?ci=51b29796b766eee7&name=cores/sia for the SIA core, and http://chiselapp.com/user/kc5tja/repository/kestrel-3/dir?ci=51b29796b766eee7&name=cores/roma for my ROM Adapter (ROMA) core.  I didn't find implementing them to be onerous.  I did find reading the specifications to be rather daunting, but once you work through what you *actually* need to make a functional yet compliant interface, it wasn't *too* much worse than Wishbone in the end.  It does use more wires though.  I suspect AXI4 is comparable.
>
> These cores are implemented in pure Verilog, as I completed them long before I discovered nmigen.

 ok, so that would mean not being able to work within the nmigen
simulation environment: it would be necessary to compile nmigen to
verilog, and use e.g. cocotb or another testing framework.

>> Sounds familiar. Am looking to design a bottle-buffer block to help deal with this. Pair of Shift registers probably, parameterised on input and output.
>
>
> If this will help in any way, check out the Kestrel-2DX project's processor logic.  It is not pretty code, but it is functional.   https://chiselapp.com/user/kc5tja/repository/kestrel-2dx/dir?ci=2d8d4695945237d7&name=rtl/kcp53000 specifically in bottleneck.v and BottleneckSequencer.v.
>
>> Hypothetically, with python OO, the task is a lot easier.
>
>
> This is why I'm most attracted to nmigen right now.  I would love to have a single source tree where I can say,
>
> * for Kestrel-3 on iCE40HX, I want a 3 CPI processor with U+M modes but no paging support.
> * for Kestrel-3 on ECP5K family, maybe I can get away with a 3-stage pipelined processor with software-managed MMU.
> * for Kestrel-3 on Terasic DE-1, I have room to spare for 5-stage pipelined RV64GC with hardware PTW.
>
> I'm daydreaming at this point, but I feel that nmigen at least gives me the tools to be able to daydream in the first place.

 i'd agree.  the fact that it's possible to have *python* make some
decisions (dynamically), completely changing what HDL is generated,
and to be able to do so in an OO fashion, that's... yeah.
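
 as a rough sketch of that idea (plain python only, not actual nmigen;
every class, key and block name below is hypothetical, and mapping "3
CPI" to a single-stage core and "software-managed MMU" to a TLB block
are assumptions), the three targets listed above could drive what gets
instantiated like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical build-time options for one Kestrel-3 target."""
    pipeline_stages: int  # 1 = non-pipelined, multi-cycle core (assumption)
    mmu: str              # "none", "software" or "hardware-ptw"


# The three wish-list targets from the quoted text, keyed by made-up names.
TARGETS = {
    "ice40hx": CoreConfig(pipeline_stages=1, mmu="none"),
    "ecp5": CoreConfig(pipeline_stages=3, mmu="software"),
    "de1": CoreConfig(pipeline_stages=5, mmu="hardware-ptw"),
}


def select_blocks(cfg):
    """Return the names of the blocks that would be instantiated.

    In real nmigen code this decision would live in the elaboration
    code and add submodules; here it only returns illustrative names.
    """
    blocks = ["ifu", "decode", "alu", "lsu"]
    if cfg.pipeline_stages > 1:
        blocks.append("pipeline_regs")
    if cfg.mmu != "none":
        blocks.append("tlb")
    if cfg.mmu == "hardware-ptw":
        blocks.append("page_table_walker")
    return blocks


for name, cfg in TARGETS.items():
    print(name, select_blocks(cfg))
```

 the point is only that ordinary python conditionals pick the
hardware; nothing here is specific to any one bus or core.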

>> Really, the LD/ST should not be tightly coupled to the Bus Architecture anyway. That is just good design.
>
>
> Perhaps later on; however, right now, IFU, LD, and ST are literally the only things that touch the bus in any capacity.  I do not have caches, page table walkers, etc.  So, from a minimum resource consumption POV, this might make the most sense for me right now.

 if you intend for other people to use the design, then their
practical decisions need to be taken into consideration.

 so, if you set TileLink as the sole bus, what are they
going to do when it comes to evaluating the Kestrel designs?  they
will do one of these things:

 * google "TileLink verilog" and, on finding that nothing exists, walk away.

 * google "TileLink verilog" and, if they encounter the
russian-developed RISC-V core or earlier versions of lowRISC, try
*really hard* to use the (unreadable) auto-generated rocketchip
verilog TileLink code that was created from the original chisel3
source

 * google "TileLink" and, on finding that there are *literally* only
chisel3 implementations, anywhere in the world, walk away.

 * google "TileLink" and, on finding that the only implementations
anywhere in the world are in chisel3, evaluate the cost of writing
their own.

this is an extremely fraught and unnecessarily burdensome evaluation
process.  i cannot say if the cost of the chisel3-auto-generated
TileLink code is higher than that of a Wishbone or AXI4 equivalent
implementation.

if however it is written in nmigen, and uses AXI4, there is this:
https://github.com/peteut/migen-axi

if they are an amateur or wish to respect libre licensing etc. etc.,
there are a *lot* of verilog and migen implementations of Wishbone to
choose from.


> But, as the design grows and evolves, then I'd agree, I'd need something very different indeed.  I guess the motivation for my question is recognizing that it's easier to think about this now while the design is still small.

 yes.

>> For AXI4 you simply design some messages that go over the "Control" side to say "hey L2, when you get the data this message is associated with, it's a write-thru uhkaaay?"
>
>
> I'm not an expert at all, but it seems to me that TileLink
> has a similar kind of protocol for its caching support.

 that's what i understand to be the case.  my biggest concern is not
so much whether the implementation(s) are technically superior; it is
better captured by these two questions:

 * how many different teams, across the world, can you find that have
implemented TLB, L1, L2, SMP cache coherence and so on, in whatever
programming language, using AXI4 or Wishbone as the interconnect?

 * how many different teams, across the world, can you find that have
implemented TLB, L1, L2, SMP cache coherence and so on, using
TileLink?

> A single port would need to implement all 5 channels for this to work
> though (right now, since I'm just using TL-UL, I only need to implement
> A and D channels).

*you* could implement them... however what about the users?  what is
the cost *to them*?
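
to make that gap concrete, here is a tiny python sketch.  the channel
letters are the ones named in the TileLink spec as i understand it
(TL-UL uses A and D, the cached TL-C level uses A to E); the table and
helper names are hypothetical:

```python
# Channels each conformance level requires (per the TileLink spec,
# as understood here; treat the table as an assumption to verify).
CHANNELS_BY_LEVEL = {
    "TL-UL": ("A", "D"),
    "TL-C": ("A", "B", "C", "D", "E"),
}


def extra_channels(current, target):
    """List the channels still to be implemented when moving up a level."""
    have = set(CHANNELS_BY_LEVEL[current])
    return [ch for ch in CHANNELS_BY_LEVEL[target] if ch not in have]


print(extra_channels("TL-UL", "TL-C"))  # → ['B', 'C', 'E']
```

that is three more channels for every port that takes part in caching,
and each user of the design inherits the same work.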

i always look beyond the immediate "is this easy", thinking of the
whole picture.  programming, for me, is actually quite challenging.
so i think regularly in terms of "how can i minimise the amount of
programming needed, to complete the *whole* picture?"

this allows me to avoid some extremely costly design and development
decisions, as i cherry-pick from wildly disparate sources and even
take into consideration the creation of cross-language conversion
tools as a way to bridge between two (or more) seemingly utterly
different and otherwise incompatible projects.

l.


