[libre-riscv-dev] load/store execution queue idea
    Jacob Lifshay 
    programmerjake at gmail.com
       
    Fri May  1 03:53:59 BST 2020
    
    
  
I filled out some notes on my load/store execution queue idea here:
https://libre-soc.org/3d_gpu/architecture/alternative-design-idea/
The design should be suitable for the final 28nm SoC and should be
able to execute 4 loads, 4 stores, 4 AMOs, or 4 fences per cycle;
that width is fully adjustable if we want some other number. It
completely replaces the memory dependency matrix. One downside is
that it doesn't support forwarding from stores to later loads without
going through the L1 cache.
There's also a section on generalizing the carry look-ahead networks
to be usable for any associative binary operation (the prefix-sum
section).
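To illustrate the prefix-sum idea, here is a minimal Python sketch, not
taken from the linked notes. It assumes a Kogge-Stone style scan (one
possible network shape among several), and the names prefix_scan,
combine_gp and carries are hypothetical. The same scan computes running
sums with addition or adder carries with the (generate, propagate)
combining rule, because both operations are associative.

```python
from operator import add


def prefix_scan(items, op):
    """Inclusive prefix scan over an associative `op`.

    Uses ceil(log2(n)) combining levels (Kogge-Stone style). `op` does
    not need to be commutative: the lower-index operand is always
    passed on the left.
    """
    out = list(items)
    dist = 1
    while dist < len(out):
        prev = out
        # At each level, element i absorbs the partial result that
        # ends `dist` places before it.
        out = [
            prev[i] if i < dist else op(prev[i - dist], prev[i])
            for i in range(len(prev))
        ]
        dist *= 2
    return out


def combine_gp(lo, hi):
    """Merge (generate, propagate) of a low-order block with the
    adjacent high-order block. This is the carry look-ahead operator."""
    g_lo, p_lo = lo
    g_hi, p_hi = hi
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def carries(a_bits, b_bits):
    """Carry out of each bit position for a + b.

    Bits are given LSB first and the carry-in is assumed to be 0.
    """
    gp = [(a & b, a ^ b) for a, b in zip(a_bits, b_bits)]
    return [g for g, _ in prefix_scan(gp, combine_gp)]


if __name__ == "__main__":
    # Ordinary prefix sum.
    print(prefix_scan([1, 2, 3, 4, 5], add))
    # Carries for 7 + 1 on 4 bits, LSB first.
    print(carries([1, 1, 1, 0], [1, 0, 0, 0]))
```

In hardware each level of the while loop would be one layer of
combining cells, so the depth grows with log2 of the width rather than
linearly as in a ripple chain.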
Jacob
    
    