[libre-riscv-dev] LD/ST address matcher

Luke Kenneth Casson Leighton lkcl at lkcl.net
Wed Jun 5 07:39:23 BST 2019


On Wednesday, June 5, 2019, Jacob Lifshay <programmerjake at gmail.com> wrote:

> Note that texture access is just one of the places that memory is accessed
> in sequence separated by exact multiples of 1 MiB.


Then the question (which has still not been answered) applies to every one
of those places.

To make the question clear: does striping occur in all situations?

i.e. are the data structures LDed into memory by way of vectors with
stride=1, such that the LD of the 1st data structure automatically causes a
stall and, on the 2nd clock, the address difference is then detected?

And that is assuming that the two vectors are LDed in an interleaved
fashion, i.e. that the LDs get interleaved even though their instructions
were issued sequentially; with sequential issue the first LD batch will be
completed before the 2nd in any case.

What *may* result in single issue behavior is if the LDs are *not* done as
(batch) vectors.

i.e. a loop like this:

for i in 0..10000
  for j in 0..100
    access struct data1[i].array32bit[j]      (one scalar LD)
    access struct data2[i+1].array32bit[j]    (one scalar LD)

If however the data indexed by j is a vector, so that for example a batch
of 4 data1 LDs is issued BEFORE the data2 batch of 4 LDs is issued, there
is no problem.
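
To make the two cases concrete, here is a rough Python sketch of the order
in which the LD addresses would be issued. Everything in it is an assumption
for illustration only: the base addresses, the 1 MiB gap between data1 and
data2, the vector length of 4, and the use of "back-to-back LDs exactly a
multiple of 1 MiB apart" as a crude proxy for the problem case. It is not a
model of the actual LD/ST unit.

```python
MIB = 1 << 20  # the 1 MiB separation mentioned in the quoted text

def scalar_issue_order(base1, base2, n_elems, elem_bytes=4):
    """Issue order when every element is a separate scalar LD."""
    order = []
    for j in range(n_elems):
        order.append(base1 + j * elem_bytes)  # data1[i].array32bit[j]
        order.append(base2 + j * elem_bytes)  # data2[i+1].array32bit[j]
    return order

def batched_issue_order(base1, base2, n_elems, vl=4, elem_bytes=4):
    """Issue order when j is covered by vector LDs of (assumed) length vl."""
    order = []
    for j0 in range(0, n_elems, vl):
        js = range(j0, min(j0 + vl, n_elems))
        order.extend(base1 + j * elem_bytes for j in js)  # data1 batch first
        order.extend(base2 + j * elem_bytes for j in js)  # then data2 batch
    return order

def adjacent_mib_pairs(order):
    """Count back-to-back LDs whose addresses differ by a multiple of 1 MiB."""
    return sum(1 for a, b in zip(order, order[1:])
               if a != b and (a - b) % MIB == 0)

base1 = 0x10000000       # hypothetical address of data1[i].array32bit
base2 = base1 + MIB      # hypothetical: data2[i+1] sits exactly 1 MiB away

scalar_hits = adjacent_mib_pairs(scalar_issue_order(base1, base2, 8))
batched_hits = adjacent_mib_pairs(batched_issue_order(base1, base2, 8))
print(scalar_hits, batched_hits)
```

Under these assumptions every scalar data1 LD is immediately followed by a
data2 LD exactly 1 MiB away, whereas in the batched order no two adjacent
LDs are a multiple of 1 MiB apart.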

The reason I am persisting with this question is because using the top bits
for matching is nowhere near trivial and has a detrimental impact on
latency.

We *need* to avoid doing address matching on large powers of 2.
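
As a sketch of what is at stake, the following Python compares a matcher
that looks only at a small window of low address bits with one that also
compares the top bits. The bit positions (4 to 11 for the narrow window, a
48-bit address for the full compare) are purely hypothetical numbers picked
for the example, not a statement of what the design uses.

```python
def window_mask(lo_bit, hi_bit):
    """Mask selecting address bits lo_bit .. hi_bit-1."""
    return ((1 << hi_bit) - 1) & ~((1 << lo_bit) - 1)

def partial_match(addr_a, addr_b, lo_bit=4, hi_bit=12):
    """Narrow matcher: compares only an (assumed) low-order bit window."""
    mask = window_mask(lo_bit, hi_bit)
    return (addr_a & mask) == (addr_b & mask)

def full_match(addr_a, addr_b, lo_bit=4, addr_bits=48):
    """Wide matcher: compares every bit from lo_bit up to the top bit."""
    mask = window_mask(lo_bit, addr_bits)
    return (addr_a & mask) == (addr_b & mask)

a = 0x10000040           # hypothetical LD address
b = a + (1 << 20)        # same low bits, exactly 1 MiB higher

narrow_says_clash = partial_match(a, b)   # low bits identical
wide_says_clash = full_match(a, b)        # top bits differ
narrow_bits = 12 - 4     # comparator width for the narrow window
wide_bits = 48 - 4       # comparator width when top bits are included
```

With these numbers the narrow matcher reports a clash for the two addresses
even though they are 1 MiB apart, and only the wide matcher tells them
apart, at the price of a 44-bit compare instead of an 8-bit one.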

L.



-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

