[libre-riscv-dev] random testing is insufficient
Jacob Lifshay
programmerjake at gmail.com
Fri Jul 5 07:37:32 BST 2019
I noticed that the code for testing 32-bit floating-point operations
is generating 10 pairs of uniformly random 32-bit integers,
reinterpreting them as 32-bit floats, then verifying that the
floating-point HW produces the correct values for those 10 pairs. This
testing is insufficient to catch even obvious errors, such as always
returning 0 when any input is a NaN (a made-up flaw). Calculating the
probability of having at least 1 NaN among the 20 random inputs:
Number-of-NaN-encodings = 2 * (2^23 - 1) = 2^24 - 2 = 16777214
(either sign bit, exponent all ones, mantissa nonzero)
P(number-is-NaN) = Number-of-NaN-encodings / 2^32 = 0.3906% (approx.)
P(number-is-not-NaN) = 1 - P(number-is-NaN) = 99.6094% (approx.)
P(20-numbers-are-not-NaN) = P(number-is-not-NaN)^20 = 92.4707% (approx.)
P(1-or-more-NaNs-among-20-inputs)
    = 1 - P(20-numbers-are-not-NaN) = 7.5293% (approx.)
The tests would need to be run about 8.85 times
(log(0.5) / log(92.4707%)) to have a 50% chance of seeing at least
one NaN.
For cases where infinities (or zeros) are improperly handled, the odds
are far worse: only 2 of the 2^32 bit patterns encode an infinity (and
likewise for zero), so the test suite would need to be run about 74.4
million (!!!) times to have a 50% chance of even one input being an
infinity.
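
As a sanity check on the arithmetic, here is a minimal Python sketch
(standard library only; the constant names are mine, not project code):

import math

BITS = 2 ** 32      # number of distinct 32-bit patterns
NUM_INPUTS = 20     # 10 pairs of operands per test run

# NaN encodings: either sign, exponent all ones, mantissa nonzero
p_nan = (2 * (2 ** 23 - 1)) / BITS            # ~0.3906%
p_run_no_nan = (1 - p_nan) ** NUM_INPUTS      # ~92.4707%
print(f"P(at least one NaN per run) = {1 - p_run_no_nan:.4%}")
# runs needed for a 50% chance of at least one NaN: ~8.85
print(math.log(0.5) / (NUM_INPUTS * math.log1p(-p_nan)))

# Infinity encodings: only +inf and -inf, i.e. 2 of 2^32 patterns
p_inf = 2 / BITS
# runs needed for a 50% chance of at least one infinity: ~7.44e7
print(math.log(0.5) / (NUM_INPUTS * math.log1p(-p_inf)))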
I'd suggest switching to a deterministic generator for the test
inputs; that way, special cases can be added as bugs are found
(regression testing), and we also won't have to worry about test
reproducibility, since the testing process would be deterministic
rather than random. (A sketch of what such a generator could look
like follows below.)
Additionally, the same sequence can then be trivially used in both the
input signal generation process and the output checking process.
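
One possible shape for such a generator, as a minimal Python sketch
(the function names, the seed handling, and the particular
special-case list are illustrative assumptions, not existing project
code):

import random
import struct

def float32_test_inputs(count=1000, seed=0):
    """Yield 32-bit patterns: special cases first, then a seeded
    pseudo-random tail. Deterministic for a given (count, seed)."""
    # Hand-picked special cases; extend this list whenever a bug
    # is found, so its trigger becomes a permanent regression test.
    specials = [
        0x00000000, 0x80000000,  # +0.0, -0.0
        0x7F800000, 0xFF800000,  # +inf, -inf
        0x7FC00000, 0xFFC00000,  # quiet NaNs
        0x7F800001,              # signalling NaN
        0x00000001, 0x007FFFFF,  # smallest/largest subnormal
        0x00800000,              # smallest normal
        0x7F7FFFFF,              # largest finite
        0x3F800000, 0xBF800000,  # +1.0, -1.0
    ]
    yield from specials
    # Deterministic "random" tail: same seed -> same sequence,
    # so any failure is always reproducible.
    rng = random.Random(seed)
    for _ in range(max(0, count - len(specials))):
        yield rng.getrandbits(32)

def as_float32(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

For binary operations, operand pairs could then be formed by zipping
two such sequences built from different seeds; since both sides of the
test bench can regenerate the identical sequence, the stimulus process
and the checker process need no shared state beyond the seeds.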
Jacob