[libre-riscv-dev] [Bug 257] Implement demo Load/Store queueing algorithm

Wed Mar 25 13:07:22 GMT 2020

http://bugs.libre-riscv.org/show_bug.cgi?id=257

--- Comment #17 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
https://libre-riscv.org/3d_gpu/l0_to_l1_n_way_cache_buffer.png

i will add this and a description here, shortly:
https://libre-riscv.org/3d_gpu/architecture/6600scoreboard/

the idea here is to have 6 to 8 LD/ST Function Units, each with
two "ports" (one exclusively for misaligned LDs/STs), which means
that we then have 12 to 16 such "ports" trying to get access to
the L0 Cache/Buffer.

that will require a Multi-Priority-Picker ("and here's one we prepared
earlier....")
https://libre-riscv.org/3d_gpu/architecture/6600scoreboard/800x-multi_priority_picker.png

this can be set up to be 16-in and produce 4x unique exclusive 16-out
"selectors", none of which will overlap.
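
as a rough behavioural sketch (plain python, not the actual hardware), a
16-in, 4-out picker could be modelled as below. the function name and the
"lowest-numbered request wins" ordering are assumptions made purely for
illustration; the diagram linked above is the real reference.

```python
def multi_priority_pick(requests, n_in=16, n_out=4):
    """Model of a multi-priority picker (illustrative only).

    requests: integer bitmask, one bit per requesting port.
    Returns n_out one-hot masks (0 where there is nothing left to pick).
    No two returned masks ever share a set bit.
    """
    remaining = requests & ((1 << n_in) - 1)
    selectors = []
    for _ in range(n_out):
        # isolate the lowest set bit (assumed priority order); 0 if none left
        lowest = remaining & -remaining
        selectors.append(lowest)
        # remove it so that the next selector cannot pick the same port
        remaining &= ~lowest
    return selectors
```

each returned mask can then drive one OR/MUX tree onto one L0 port.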

that in turn allows us to OR/MUX any one of those 16 LD/ST FU "ports" to
any one of the 4 "ports" on the L0 Cache.

from there, we can split into two halves using bit 4 as the "selector"
bit, and use bits 5 and above as the "Addr hit".
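
the address split can be sketched as follows. bit 4 as selector and bits 5
upwards as the "Addr hit" come from the description above; treating bits 0-3
as a byte offset within a 16-byte half is an assumption, as is the function
name.

```python
def split_addr(addr):
    """Split an address into (tag, half, offset), illustrative only."""
    offset = addr & 0xF        # bits 0-3: byte within a half (assumed)
    half = (addr >> 4) & 1     # bit 4: selects which of the two halves
    tag = addr >> 5            # bits 5 and above: compared for the "hit"
    return tag, half, offset
```

two requests "hit" each other when their tag values are equal.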

a priority picker then picks *one* row and broadcasts its address to
all other rows.  if there is a "hit" (same address bits 5 and upwards)
then the data-select bytemask - and data itself - is "merged" into
the output as a "single cache line", to be sent directly to the
L1 cache.
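
a minimal software model of that pick-and-merge step is below. everything
about the row layout (a dict with "valid", "tag", "mask" and "data" fields,
a 16-byte line) is hypothetical, and "later rows win on overlapping bytes"
is an arbitrary choice, not something decided above.

```python
def merge_rows(rows, line_bytes=16):
    """Pick one valid row, then merge every row with the same tag.

    rows: list of dicts with keys "valid", "tag", "mask", "data"
          (mask has one bit per byte; data is a bytes-like of line_bytes).
    Returns (tag, merged_mask, merged_data), or None if no row is valid.
    """
    # priority pick: first valid row (assumed ordering)
    picked = next((r for r in rows if r["valid"]), None)
    if picked is None:
        return None
    merged_mask = 0
    merged_data = bytearray(line_bytes)
    for row in rows:
        # "broadcast" the picked tag and compare against every row
        if row["valid"] and row["tag"] == picked["tag"]:
            for i in range(line_bytes):
                if (row["mask"] >> i) & 1:
                    merged_data[i] = row["data"][i]
            merged_mask |= row["mask"]
    return picked["tag"], merged_mask, bytes(merged_data)
```

rows whose tag does not match are simply left alone for a later pick.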

whew.
