[libre-riscv-dev] cache SRAM organisation

Luke Kenneth Casson Leighton lkcl at lkcl.net
Thu Mar 26 15:15:06 GMT 2020

On Thursday, March 26, 2020, Staf Verhaegen <staf at fibraservi.eu> wrote:
> You seem to be mixing up two different concepts, namely synchronicity and
> write-through.

ah ok.  i didn't realise that SRAMs would have these as separate concepts.
perhaps i would get it immediately if i saw a gate level schematic.

> Synchronous means signals are synced with an edge of a (clock) signal.
> SRAM write-through means that after a write operation you also get on the Q
> output the data you have just written. These two concepts are orthogonal to
> each other.
> The current synchronous SRAM being developed will most likely have
> write-through behavior;

ah good

> will be confirmed before May test chip tape-out.

ah May not March.

>  It will cause delay on the signal though. I need to check if it has
> changed but in the OpenRAM 0.35um test tape-out I did the address and data
> input was latched on rising edge and the Q output was updated on falling
> edge of the clock. So the delay on the Q output is half a clock cycle plus
> the internal delay on the output latch enable signal.

fascinating.  that's pretty much exactly what the CDC6600 did on its
register file.
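
A rough behavioural sketch of the two separate concepts described above, in Python. Assumptions: the class and method names are invented for illustration, the half-cycle timing follows the 0.35um OpenRAM description quoted above, and the "Q holds its old value" behaviour without write-through is a guess at what a non-write-through SRAM would do.

```python
class SyncSRAM:
    """Hypothetical model: inputs latched on rising edge, Q updated on falling edge."""

    def __init__(self, depth, write_through=True):
        self.mem = [0] * depth
        self.write_through = write_through
        self.q = 0              # registered Q output
        self._latched = None    # (addr, we, data) captured on the rising edge

    def rising_edge(self, addr, we=False, data=0):
        # Address and data inputs are latched here; Q does not change yet.
        self._latched = (addr, we, data)

    def falling_edge(self):
        # Half a clock cycle later the array is accessed and Q is updated.
        addr, we, data = self._latched
        if we:
            self.mem[addr] = data
            if self.write_through:
                self.q = data   # write-through: Q shows the value just written
            # assumed: without write-through, Q keeps its previous value
        else:
            self.q = self.mem[addr]
        return self.q


ram = SyncSRAM(depth=16, write_through=True)
ram.rising_edge(addr=3, we=True, data=0xAB)
assert ram.q == 0                  # nothing visible on Q at the rising edge
assert ram.falling_edge() == 0xAB  # written data appears half a cycle later
```

Synchronous-ness is about *when* things happen (the two edge methods), write-through is about *what* appears on Q after a write (the flag). Either can be changed without touching the other.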

> So if timing of the write-through is critical it is still best to
> include MUXes, as said in Jacob's reply, to allow the signal to be
> bypassed. I have seen SRAM that did include an AWT (asynchronous
> write-through), but this just moved the MUXes inside the SRAM block, and
> it also adds them even if you don't need this AWT. So I would like to
> keep these MUXes external, added only if needed.
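
A minimal sketch of the external bypass MUX idea, in Python. The function name and argument names are hypothetical, and this only models the selection logic, not any real SRAM macro interface.

```python
def bypass_mux(read_addr, sram_q, write_en, write_addr, write_data):
    """Pick the forwarded write data when the read hits the address being written."""
    if write_en and write_addr == read_addr:
        return write_data   # bypass: do not wait for the SRAM's Q output
    return sram_q           # no collision: use whatever the SRAM provides


# a read of address 3 in the same cycle as a write to address 3
assert bypass_mux(3, sram_q=0x11, write_en=True, write_addr=3, write_data=0xAB) == 0xAB
```

Keeping this outside the SRAM block means a design that never reads and writes the same address in one cycle pays nothing for it.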


> > mode.  the reason why we need the synchronous mode is because some
> > Function Units will be sitting idle, waiting for their input operands,
> > which have to come from other Function Units as "results".

> I can understand you do this to implement functional units with
> configurable pipeline length but I would strongly discourage pipelining
> register files after each other.

"pipeline register files after each other"? apologies i am not clear what
you mean, here.  do you mean "don't do write-thru on the Regfile"?

this design is not like an InOrder system where it is very risky to change
pipeline lengths or do any kind of fancy work.

the (parallel) OoO design has all data dependencies sorted out, all units
*know exactly* which units they are waiting for, they can have arbitrary
completion time, they can be FSMs, and the Dependency Matrices just don't
care because they record completion *not* completion *time*.
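
To illustrate "completion, not completion time", a toy Python sketch. This is not the actual Dependency Matrix logic: the class, the method names and the set-based bookkeeping are all invented for illustration. The point is only that no cycle counts appear anywhere.

```python
class DependencyTracker:
    """Toy model: each FU records which producer FUs it is still waiting for."""

    def __init__(self):
        self.waiting_on = {}   # consumer FU name -> set of producer FU names

    def issue(self, fu, producers):
        self.waiting_on[fu] = set(producers)

    def complete(self, producer):
        """Record that a producer has finished; return FUs now free to proceed."""
        ready = []
        for fu, deps in self.waiting_on.items():
            if producer in deps:
                deps.discard(producer)
                if not deps:
                    ready.append(fu)
        return ready


t = DependencyTracker()
t.issue("add", ["mul", "div"])
assert t.complete("div") == []        # still waiting for mul
assert t.complete("mul") == ["add"]   # order and latency of producers is irrelevant
```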

therefore the only thing stopping them from proceeding is how fast they can
get a result, so that other FUs can get at it as an input operand as quick
as possible.

that means either: a Register Forwarding Bus, or it means a write to the
Regfile followed by a read.

therefore anything we can do to reduce the number of clock cycles between
that regfile write and subsequent read is a good thing.

if write-thru is available, we can use it to reduce delay by one cycle.
however yes i note that because it is on the falling edge we might not be
able to take full advantage of it.

> If the latter is excluded would you still need an asynchronous RAM block?

i can't tell, yet, apologies.  i believe the answer is no.

Staf, it would help enormously i think if you could let us know which
options to the nmigen Memory module are permitted.  i am assuming that
nmigen Memory actually supports the options correctly!

we can then "wrap" the Memory module (or hook it) and throw asserts if we
ever use options that are not supported.
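
A sketch of what such a check could look like, in plain Python with no nmigen import. Assumptions: the `domain` and `transparent` names are meant to mirror the keyword arguments of nmigen's `Memory.read_port` as i understand them, the function name is hypothetical, and the "supported" set is a pure placeholder until the real list of permitted options is known.

```python
# placeholder: (domain, transparent) pairs the real SRAM is assumed to support
SUPPORTED_READ_PORTS = {("sync", True)}


def check_read_port(domain="sync", transparent=True):
    """Assert that the requested read-port options are supported; return them as kwargs."""
    if (domain, transparent) not in SUPPORTED_READ_PORTS:
        raise AssertionError(
            "unsupported read port options: domain=%r transparent=%r"
            % (domain, transparent))
    return {"domain": domain, "transparent": transparent}


kwargs = check_read_port(domain="sync", transparent=True)
assert kwargs == {"domain": "sync", "transparent": True}
```

The returned dict would then be passed on to the wrapped Memory's read port constructor, so that unsupported combinations fail at elaboration time rather than at tape-out.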


crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
