[Libre-soc-bugs] [Bug 724] Determine required memory compiler developments
bugzilla-daemon at libre-soc.org
Mon Oct 11 23:52:09 BST 2021
https://bugs.libre-soc.org/show_bug.cgi?id=724
--- Comment #6 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Staf Verhaegen from comment #4)
> General principle is that number of ports of a memory reflects the number of
> parallel accesses one needs to do in the same clock cycle. Adding an extra
> port will increase area of the SRAM block.
with 128 64-bit INT and FP regs we may have to do an actual reg cache
for higher clock rates. i would like to avoid that complication in the
180nm / 130nm geometries if practical.
i also do not mind breaking down into a stratified arrangement
of four *separate* regfiles or even 5:
32 regs r0-r31 all accessible
24 regs r32 r36 r40 ... r124
24 regs r33 r37 r41 ... r125
24 regs r34 ... r126
24 regs r35 ... r127
where each of those is 4R1W 64bit with byte-level write-enable.
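as a rough illustration of the stratified numbering listed above, here is
a minimal python sketch of how a register number could map onto a
(bank, index) pair. the function name regfile_bank and the bank numbering
(0 for r0-r31, 1-4 for the strided banks) are assumptions made up for this
example, not anything decided in the design.

```python
def regfile_bank(regnum):
    """Map a register number (0..127) to a hypothetical (bank, index) pair.

    Bank 0 holds r0-r31. Banks 1..4 hold r32-r127, strided by 4:
    bank 1 = r32 r36 ... r124, bank 2 = r33 r37 ... r125, and so on.
    """
    if not 0 <= regnum < 128:
        raise ValueError("register number out of range: %r" % regnum)
    if regnum < 32:
        return (0, regnum)
    # the low 2 bits select the strided bank, the rest selects the entry
    return (1 + regnum % 4, (regnum - 32) // 4)


# the operands of "add r31, r33, r127" land in three different banks
banks = [regfile_bank(r)[0] for r in (31, 33, 127)]
```

with this mapping bank 0 has 32 entries and banks 1-4 have 24 each, which
adds up to the full 128 registers.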
and if an instruction such as "add r31, r33, r127" is ever issued, the
data goes into a cyclic buffer that shuffles along, with appropriate
latency, to match up data from regfile port to ALU, which will
*also* be stratified (5 different separate ALU banks, yes really,
and yes it'll be a Monster but hey).
however, again, a Monster Vector Processor like this is something i
would like to avoid in the first iteration @ 130/180 nm.
so, avoiding all external complications like that, the bare minimum:
for FP and INT i will be happy with 4R1W on a 128x 64-bit
regfile with byte-level write-enable.
this gives single-cycle FMAC and also INT-MAC (3 operand reads each),
with room for an INT predicate read without delay.
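to make "4R1W with byte-level write-enable" concrete, here is a small
behavioural sketch in plain python (not HDL, and not the memory compiler's
interface). the class name RegFile4R1W, the method names and the 8-bit
byte_en mask convention (bit i enables byte i, least significant first)
are all assumptions for illustration only.

```python
MASK64 = (1 << 64) - 1


class RegFile4R1W:
    """Behavioural model: up to 4 reads and 1 byte-masked write per cycle."""

    def __init__(self, depth=128):
        self.regs = [0] * depth

    def read(self, addrs):
        # at most 4 read ports may be used in the same cycle
        if len(addrs) > 4:
            raise ValueError("only 4 read ports available")
        return [self.regs[a] for a in addrs]

    def write(self, addr, data, byte_en=0xFF):
        # expand the 8-bit byte enable into a 64-bit bit mask
        mask = 0
        for i in range(8):
            if byte_en & (1 << i):
                mask |= 0xFF << (8 * i)
        old = self.regs[addr]
        # only the enabled bytes are replaced, the rest keep their value
        self.regs[addr] = (old & ~mask & MASK64) | (data & mask)


rf = RegFile4R1W()
rf.write(5, 0x1122334455667788)
rf.write(5, 0xAABB, byte_en=0b00000011)  # update the low 2 bytes only
# an INT-MAC style access: 3 operand reads plus 1 predicate read
ra, rb, rc, pred = rf.read([5, 6, 7, 3])
```

the last line shows why 4 read ports is the stated minimum here: three
operands for a multiply-accumulate plus one more port for the predicate.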
--
You are receiving this mail because:
You are on the CC list for the bug.