[libre-riscv-dev] random testing is insufficient

Fri Jul 5 07:47:17 BST 2019

It turns out that the code I saw was for testing the muxid tracking,
not the FP code.

On Thu, Jul 4, 2019 at 11:37 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> I noticed that the code for testing 32-bit floating-point operations
> is generating 10 pairs of uniformly random 32-bit integers,
> reinterpreting them as 32-bit floats, then verifying that the
> floating-point HW produces the correct values for those 10 pairs. This
> testing is insufficient to catch even obvious errors, such as always
> returning 0 when any input is a NaN (a made-up flaw). Calculating the
> probability of having at least 1 NaN among the 20 random inputs:
> Number-of-NaNs = 2^24-2 = 16777214
> P(number-is-NaN) = Number-of-NaNs/2^32 = 0.3906% (approx.)
> P(number-is-not-NaN) = 1 - P(number-is-NaN) = 99.6094% (approx.)
> P(20-numbers-are-not-NaN) = P(number-is-not-NaN)^20 = 92.4707% (approx.)
> P(1-or-more-NaNs-among-20-inputs) = 1 - P(20-numbers-are-not-NaN) =
> 7.5293% (approx.)
>
> The number of times the tests would need to be run to have a 50%
> chance of seeing at least 1 NaN is about 8.85, i.e. 9 full runs.
>
> For cases where infinities (or zeros) are improperly handled, the test
> suite would need to be run 74.4 million (!!!) times to have a 50%
> chance of hitting even one input being infinity.
>
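The figures above can be reproduced with a short script. This is only a
sketch of the arithmetic, not code from the test suite: the function and
constant names are made up for illustration, and it assumes each run
draws 20 independent, uniformly random 32-bit patterns.

```python
import math

TOTAL_PATTERNS = 2**32     # every possible 32-bit input
NAN_COUNT = 2**24 - 2      # exponent all ones, non-zero mantissa, both signs
INF_COUNT = 2              # +infinity and -infinity (same count as +0/-0)
INPUTS_PER_RUN = 20        # 10 pairs of operands (assumed from the post)


def p_hit_per_run(special_count, inputs=INPUTS_PER_RUN):
    """Probability that one run contains at least one special input."""
    # log1p/expm1 keep precision when special_count / TOTAL_PATTERNS is tiny.
    log_miss_run = inputs * math.log1p(-special_count / TOTAL_PATTERNS)
    return -math.expm1(log_miss_run)


def runs_for_half_chance(special_count, inputs=INPUTS_PER_RUN):
    """Number of runs needed for a 50% chance of at least one hit."""
    log_miss_run = inputs * math.log1p(-special_count / TOTAL_PATTERNS)
    return math.log(0.5) / log_miss_run


if __name__ == "__main__":
    print(f"P(NaN in a run)    = {p_hit_per_run(NAN_COUNT):.4%}")
    print(f"runs for 50% (NaN) = {runs_for_half_chance(NAN_COUNT):.2f}")
    print(f"runs for 50% (inf) = {runs_for_half_chance(INF_COUNT):,.0f}")
```
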
> I'd suggest switching to a deterministic generator to produce the
> test inputs. That way special cases can be added when bugs are found
> (regression testing), and we also won't have to worry about test
> reproducibility, since the testing process would be deterministic
> rather than random.
>
> Additionally, the same sequence can then be trivially used in both the
> input signal generation process and the output checking process.
>
> Jacob
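
A deterministic generator along the lines suggested above could look
roughly like the following. This is a minimal sketch, not the project's
test code: SPECIAL_BITS, bits_to_float and test_pairs are hypothetical
names, and the choice of special values and the fixed seed are
assumptions made for illustration.

```python
import itertools
import random
import struct

# Hand-picked 32-bit patterns that uniform random sampling almost never hits.
# New regression cases can simply be appended to this list.
SPECIAL_BITS = [
    0x00000000,  # +0.0
    0x80000000,  # -0.0
    0x7F800000,  # +infinity
    0xFF800000,  # -infinity
    0x7FC00000,  # quiet NaN
    0x7F800001,  # signalling NaN
    0x00000001,  # smallest positive subnormal
    0x007FFFFF,  # largest positive subnormal
    0x00800000,  # smallest positive normal
    0x7F7FFFFF,  # largest finite value
    0x3F800000,  # 1.0
]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float (for printing/debugging)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def test_pairs(extra_random=10, seed=0):
    """Yield (a_bits, b_bits) operand pairs in a fixed, repeatable order."""
    # Every combination of special values, including NaN/inf/zero pairs.
    yield from itertools.product(SPECIAL_BITS, repeat=2)
    # A fixed seed keeps the pseudo-random tail identical on every run.
    rng = random.Random(seed)
    for _ in range(extra_random):
        yield rng.getrandbits(32), rng.getrandbits(32)
```

Because the sequence is the same on every call, the stimulus side and
the checking side can each iterate over test_pairs() independently and
stay in step.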