[libre-riscv-dev] [Bug 376] Assess 40/45 nm 2022 target and interfaces

Sat Jun 13 00:11:31 BST 2020

https://bugs.libre-soc.org/show_bug.cgi?id=376

--- Comment #22 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #18)
> (In reply to Luke Kenneth Casson Leighton from comment #14)
> > (In reply to Jacob Lifshay from comment #13)
> > 
> > > > this for the video framebuffer and the video framebuffer only.
> > > 
> > > That doesn't really work, since the GPU will need lots of memory bandwidth
> > > into the framebuffer since that's where it will be rendering to, potentially
> > > drawing over the same pixels several dozen times. To support that, the
> > > memory bandwidth of both the framebuffer and everything else needs to be
> > > spread across all available memory interfaces.
> > 
> > ok let's think it through, internally.  would we have:
> > 
> > * two separate memory interfaces, each dedicated to different address ranges
> > * one L2 (L3?) cache, through which *both* memory interfaces have to go first
> >   (note: the CPU/GPU as a Wishbone Slave, the RGBTTL HDL as a Master)
> 
> I think what would work the best is for the RGBTTL HDL and every core to be
> a (extended) wishbone master to the L2 (L3?) cache,

no: really.  the cores *have* to take a back seat.  it seems odd, given
how important their role is, that they are actually "slaves".  however
the fact is that in a Shared Memory Architecture, I/O absolutely has to
be given top priority.

this is just normal practice: I/O *has* to have guaranteed timing:
interrupts cannot go unserviced, buffers are extremely small, and
consequently I/O has to have absolute top priority.

look at the Shakti E-Class HDL, you'll see that the SMP Cores are given
AXI4 "slave" status on the internal bus architecture.
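
as a rough illustration of the priority rule argued for here, the sketch
below models an arbiter in plain Python (not nmigen HDL) where a single I/O
requester always wins and the cores share whatever is left.  round robin
among the cores is just one option, and the class name, method name and
return values are all made up for this example; nothing here is taken from
the Shakti or Libre-SOC sources.

```python
class PriorityArbiter:
    """Hypothetical model: one I/O requester with fixed top priority,
    round robin among the cores for the remaining cycles."""

    def __init__(self, n_cores):
        self.n_cores = n_cores
        # Index of the core granted most recently; starting at the last
        # core means core 0 is first in line.
        self.last = n_cores - 1

    def grant(self, io_req, core_reqs):
        """Return "io", the index of the granted core, or None."""
        if io_req:
            # I/O has guaranteed timing requirements, so it always wins.
            return "io"
        # Otherwise search the cores, starting just after the last winner.
        for offset in range(1, self.n_cores + 1):
            idx = (self.last + offset) % self.n_cores
            if core_reqs[idx]:
                self.last = idx
                return idx
        return None
```

note that the round-robin pointer does not advance on cycles that the I/O
requester takes, so no core loses its turn because of scan-out traffic.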

> where the cache logic is
> the arbiter and is designed to give the RGBTTL HDL highest priority and
> everything else round robin (or other) priority. Saving power on scan-out
> when the data is already in cache seems like a good idea, also, the memory
> interfaces would be the tightest bottleneck, why require more accesses to go
> through them when that can be avoided?

yes.  ideally (following the scenario through) the RGBTTL HDL Master would
have a completely separate memory bus entirely dedicated to it, reflecting
the fact that it has a completely separate DRAM chip.

however this only starts to be justified if there are, say, two or three
4k LCDs connected up, where the bandwidth of one 3200 Mbyte/sec interface
would pretty much be soaked up entirely by framebuffers.
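
a back-of-envelope check of that figure, as a sketch only: the resolution
(3840x2160), refresh rate (60 Hz) and pixel size (3 bytes, i.e. 24-bit RGB)
are assumptions picked for illustration, not numbers from this thread, and
blanking intervals are ignored.  only the 3200 Mbyte/sec interface figure
comes from the text above.

```python
INTERFACE_BYTES_PER_SEC = 3200 * 10**6  # the 3200 Mbyte/sec figure above


def scanout_bytes_per_sec(width, height, bytes_per_pixel, refresh_hz):
    """Bytes per second read from the framebuffer by continuous scan-out."""
    return width * height * bytes_per_pixel * refresh_hz


# Assumed display parameters (hypothetical, see the note above).
one_4k = scanout_bytes_per_sec(3840, 2160, 3, 60)

for n_displays in (1, 2, 3):
    share = n_displays * one_4k / INTERFACE_BYTES_PER_SEC
    print(f"{n_displays} display(s): {share:.0%} of one interface")
```

under those assumptions two displays take over 90% of one interface for
scan-out alone, and three exceed it.  a lower refresh rate or a different
pixel format shifts the numbers considerably.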

> The memory interfaces would be organized into one larger super-interface
> where each memory interface would be responsible for odd or even
> cache-block-sized address blocks. The idea is that accessing something laid
> out contiguously in physical address space would approximate balancing
> evenly across both memory interfaces.

yes.  if going the shared route, this seems eminently sensible.
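
the odd/even interleave could be sketched as below.  the 64-byte cache
block size is an assumption (the thread does not give one), and the
function names are invented for this example.

```python
BLOCK_SIZE = 64    # bytes per cache block: assumed, not from the thread
N_INTERFACES = 2   # the two memory interfaces under discussion


def interface_for(addr):
    """Pick the memory interface serving a physical address: even-numbered
    cache blocks go to interface 0, odd-numbered ones to interface 1."""
    return (addr // BLOCK_SIZE) % N_INTERFACES


def split_range(start, length):
    """Count how many cache blocks of a contiguous range land on each
    interface."""
    counts = [0] * N_INTERFACES
    first = start // BLOCK_SIZE
    last = (start + length - 1) // BLOCK_SIZE
    for block in range(first, last + 1):
        counts[block % N_INTERFACES] += 1
    return counts


# A contiguous 4 KiB region splits evenly across both interfaces.
print(split_range(0, 4096))
```

any contiguous access differs by at most one block between the two
interfaces, which is the balancing property described in the quoted text.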
