[libre-riscv-dev] LD/ST address matcher

Jacob Lifshay programmerjake at gmail.com
Mon Jun 3 19:49:58 BST 2019


On Mon, Jun 3, 2019, 07:38 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

>
> http://git.libre-riscv.org/?p=soc.git;a=blob;f=src/scoreboard/addr_match.py;h=657f95b68851b4df7a40eec031694e2246c3ea49;hb=HEAD
>
> Am making progress understanding the LD/ST Dependency Matrix, after staring
> at it for nearly four hours. More on this later.
>
> Another sub-task is to detect overlapping addresses. Or, at least to work
> out if the addresses MIGHT overlap. If the algorithm is overzealous, all it
> does is affect performance as the engine has to drop down near to single
> issue.
>
> Jacob, what sort of LD/ST profile can we expect from Kazan? The reason for
> asking is because the simplest algorithm is to check bits 4 through 15 or
> so and ignore the bottom 4 bits and everything 16 or above.
>
I would expect a general computing profile with the addition of common
alternating loads and stores to addresses that are separated by a multiple
of 1MiB or so, so checking the high bits will be important. We could do
something like compute a simple hash of the higher bits of the addresses
and compare the hashes. we will need to ensure that we handle different
virtual addresses mapping to the same physical address correctly, so either
we need to consider only the low 12-bits or compare physical addresses.

Obviously, the chosen hash function should have a shorter output, though if
we find a hash function with good diffusion, we can just take the lower 16
bits of the output.

A website I found that has some hash functions:
https://homolog.us/blogs/bioinfo/2013/04/28/hardware-based-hash-functions/
If we come up with a decent CRC polynomial, a CRC should be sufficient,
since all we need is something where the outputs are equal for equal inputs
and where the outputs are highly likely to be unequal when the inputs are
unequal (collision resistance). In particular, we don't need other
properties of cryptographic hashes such as diffusion or pre-image
resistance, though pre-image resistance and second pre-image resistance
would be nice. I think we should select a different polynomial than the
most common ones, in order to save gates. From what I know, the complexity
of computing a CRC is proportional to the number of bits set in the CRC
polynomial, for outputs of the same size.

For definitions of different terms, such as diffusion or pre-image
resistance, see:
https://en.wikipedia.org/wiki/Confusion_and_diffusion
https://en.wikipedia.org/wiki/Cryptographic_hash_function#Properties

Jacob

>


More information about the libre-riscv-dev mailing list