[libre-riscv-dev] [isa-dev] FP reciprocal sqrt extension proposal

Thu Jul 11 22:19:50 BST 2019

"that's the way it's always been done" is not, by itself, a good
reason to keep doing it that way.

1/sqrt(a) has been done as single-operand because it's an easy,
independent table-lookup operation, followed by iteration to get the
desired precision. it converges nicely.

however, in real software, the function 1/sqrt(a) almost never stands
alone. it is used for normalization, so it is almost always followed
by a multiplication, ie a/sqrt(b), or preceded by an addition, ie
1/sqrt(a+b).

saying it is "subjected to rounding twice" isn't really fair. it is
only subjected to rounding twice if done as separate operations. when
done as a single atomic operation, you can keep the intermediate
result in extended precision and round only once.
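
a rough sketch of the rounding effect, for illustration only. it
emulates binary32 in python by round-tripping through struct, and uses
a binary64 evaluation rounded once as a stand-in for a single-rounding
unit (an assumption: real hardware would carry guard bits internally,
and this stand-in is not guaranteed correctly rounded in every case).
the helper names f32, rsqrt_chained and rsqrt_single are made up for
this example.

```python
import math
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def rsqrt_chained(a):
    # Two separate binary32 operations, each rounded: sqrt, then divide.
    s = f32(math.sqrt(a))
    return f32(1.0 / s)


def rsqrt_single(a):
    # Stand-in for an atomic op: evaluate in binary64, round once to binary32.
    return f32(1.0 / math.sqrt(a))


# Inputs 1 + k/1024 are exactly representable in binary32.
inputs = [f32(1.0 + k / 1024.0) for k in range(1, 1024)]
mismatches = [a for a in inputs if rsqrt_chained(a) != rsqrt_single(a)]
print(len(mismatches), "of", len(inputs), "inputs give different results")
```

the same argument applies to a fused a/sqrt(b): the multiply would
consume the unrounded intermediate rather than a rounded rsqrt result.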

sorry, i'm not trying to overly advocate for this, just trying to
encourage that we keep an open mind.

guy

On Thu, Jul 11, 2019 at 2:12 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Thu, Jul 11, 2019, 13:42 Guy Lemieux <glemieux at vectorblox.com> wrote:
>>
>> 1/sqrt(a) is a single-operand instruction.
>>
>> might there be more performance value in making it dual-operand to make better use of available read ports, eg:
>>
>> a/sqrt(b)
>>   or
>> 1/sqrt(a+b)
>>
>> both are common forms of usage. i suppose these could be formed by chaining, but if that’s the case there’s little need for rsqrt if you have both div and sqrt.
>
> the reason for not just chaining two instructions is that sqrt followed by div gives different results due to rounding twice. div followed by sqrt also rounds twice, returns NaN instead of -Inf for -0 inputs, and overflows to +Inf for very small inputs, whereas rsqrt won't ever overflow or underflow.
>
> 1/sqrt(a+b) additionally gives different results for special cases, assuming the addition is defined the same as fadd, since the sign of the result is different for different rounding modes when adding +0 and -0, whereas (if I recall correctly) IEEE 754-2008 defines rsqrt(+0) to give +Inf and rsqrt(-0) to give -Inf for all rounding modes.
>
> I think having a frsqrt instruction that doesn't need an additional input will be useful since the compiler doesn't have to load the constant 1 (for a/sqrt(b)) or +0 or -0 (for 1/sqrt(a+b)).
>
> Additionally, frsqrt is a commonly implemented operation, so already working and verified single-input frsqrt HW is readily available, whereas the modified versions that have 2 inputs aren't as likely to be available, and it would be quite complex to implement and verify a new FP unit, greatly restricting who can implement the Zfrsqrt extension.
>
> I have no problems modifying the encoding to permit 2-input frsqrt, I just think that should be an additional extension on top of Zfrsqrt.
>
>
> Jacob