[libre-riscv-dev] LD/ST address matcher

Jacob Lifshay programmerjake at gmail.com
Mon Jun 3 20:08:33 BST 2019

One CRC that would probably work quite well is CRC-16-CCITT, I'd estimate
that it needs around 250 xor gates to convert 64 bits to 16 combinatorially.

One way we could use a more expensive but better hash function is to store
the hashed value along with the other data in the load/store queue, that
way we would only need as many hash circuits as the number of load/stores
that get their address computed per clock.

On Mon, Jun 3, 2019, 11:49 Jacob Lifshay <programmerjake at gmail.com> wrote:

> On Mon, Jun 3, 2019, 07:38 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>> http://git.libre-riscv.org/?p=soc.git;a=blob;f=src/scoreboard/addr_match.py;h=657f95b68851b4df7a40eec031694e2246c3ea49;hb=HEAD
>> Am making progress understanding the LD/ST Dependency Matrix, after
>> staring
>> at it for nearly four hours. More on this later.
>> Another sub-task is to detect overlapping addresses. Or, at least to work
>> out if the addresses MIGHT overlap. If the algorithm is overzealous, all
>> it
>> does is affect performance as the engine has to drop down near to single
>> issue.
>> Jacob, what sort of LD/ST profile can we expect from Kazan? The reason for
>> asking is because the simplest algorithm is to check bits 4 through 15 or
>> so and ignore the bottom 4 bits and everything 16 or above.
> I would expect a general computing profile with the addition of common
> alternating loads and stores to addresses that are separated by a multiple
> of 1MiB or so, so checking the high bits will be important. We could do
> something like compute a simple hash of the higher bits of the addresses
> and compare the hashes. we will need to ensure that we handle different
> virtual addresses mapping to the same physical address correctly, so either
> we need to consider only the low 12-bits or compare physical addresses.
> Obviously, the chosen hash function should have a shorter output, though
> if we find a hash function with good diffusion, we can just take the lower
> 16 bits of the output.
> A website I found that has some hash functions:
> https://homolog.us/blogs/bioinfo/2013/04/28/hardware-based-hash-functions/
> If we come up with a decent CRC polynomial, a CRC should be sufficient,
> since all we need is something where the outputs are equal for equal inputs
> and where the outputs are highly likely to be unequal when the inputs are
> unequal (collision resistance). In particular, we don't need other
> properties of cryptographic hashes such as diffusion or pre-image
> resistance, though pre-image resistance and second pre-image resistance
> would be nice. I think we should select a different polynomial than the
> most common ones, in order to save gates. From what I know, the complexity
> of computing a CRC is proportional to the number of bits set in the CRC
> polynomial, for outputs of the same size.
> For definitions of different terms, such as diffusion or pre-image
> resistance, see:
> https://en.wikipedia.org/wiki/Confusion_and_diffusion
> https://en.wikipedia.org/wiki/Cryptographic_hash_function#Properties
> Jacob

More information about the libre-riscv-dev mailing list