[libre-riscv-dev] fp special functions

Sun Aug 4 19:31:47 BST 2019

Excellent, thank you so much, Jacob and Luke. Have you seen these?

https://github.com/freecores/trigonometric_functions_in_double_fpu

https://github.com/freecores/tanhapprox

https://github.com/freecores/pipelined_fixed_point_elementary_functions

Atif

________________________________
From: Jacob Lifshay <programmerjake at gmail.com>
Sent: Sunday, August 4, 2019 2:16:52 PM
To: Libre-RISCV General Development <libre-riscv-dev at lists.libre-riscv.org>; Atif Zafar <atif at pixilica.com>; Grant Jennings <gnarlygreyllc at gmail.com>
Subject: fp special functions

On Sat, Aug 3, 2019, 22:25 Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
> got an idea, transcendentals (scalar) proposal, similar to Zfrsqrt,
> need to find the space for sin, cos, atan, exp, pow, log, and so on.
> on-list first, then isa-dev?

Sounds good to me.

I think our primitive instructions should be correctly rounded since, for everything except sin/cos/tan/sec/cosec/cotan, that doesn't take much more internal precision. I also think we should implement sinpi/cospi and friends, since they avoid the need for an extremely accurate (several-hundred-bit) value of pi.

Note for Atif and Grant: I'm currently working on an algebraic numbers library that can be used to verify the fp implementations for add/sub/mul/div/sqrt/rsqrt/cbrt/hypot.
https://salsa.debian.org/Kazan-team/algebraics

Note that even though sinpi/cospi are theoretically algebraic numbers for rational inputs, the degree of the polynomials is prohibitive for large denominators.

We should avoid the pitfall of Intel's x87 sin/cos implementations:
https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/

The functions I think are worth implementing in addition to F/D:

trig-pi functions (range reduction is trivial: x mod 2.0):
* sinpi
* cospi
* sincospi (non-standard; like sincos)
* atan2pi
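
To illustrate why range reduction is trivial for the *pi functions, here is a minimal Python sketch. The name sinpi and the folding steps are only illustrative, not a proposed hardware algorithm. The reduction itself (fmod plus the subtractions) is exact in binary floating point; the final math.sin(math.pi * r) call is a stand-in for a real kernel and is not correctly rounded.

```python
import math

def sinpi(x: float) -> float:
    """Approximate sin(pi * x) with exact range reduction via x mod 2.0."""
    if math.isnan(x) or math.isinf(x):
        return math.nan
    # fmod is exact: r is in (-2, 2) and carries no rounding error.
    r = math.fmod(x, 2.0)
    # Use the period of 2 to fold into [-1, 1]; these subtractions are exact.
    if r > 1.0:
        r -= 2.0
    elif r < -1.0:
        r += 2.0
    # Use sin(pi * (1 - r)) == sin(pi * r) to fold into [-0.5, 0.5].
    if r > 0.5:
        r = 1.0 - r
    elif r < -0.5:
        r = -1.0 - r
    # Stand-in kernel on the reduced argument (not correctly rounded).
    return math.sin(math.pi * r)

print(sinpi(1e15 + 0.5))                 # reduced argument is exactly 0.5
print(math.sin(math.pi * (1e15 + 0.5)))  # pi * x is rounded before sin sees it
```

No high-precision value of pi is needed for the reduction, which is the point made above; only the small reduced argument is ever multiplied by pi.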

extended trig-pi functions (separate extension; sincospi/atan2pi is sufficient for graphics)
* tanpi
* asinpi
* acospi

non-*pi trig functions (in a separate extension since accurate range reduction is quite difficult, approximating using the *pi functions will work for graphics):
* sin
* cos
* sincos
* tan
* atan2
* asin
* acos

powers:
* cbrt
* hypot (avoids overflow/underflow with extended exponent range for intermediates)
* rsqrt (proposed in Zfrsqrt extension)
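
As a rough software illustration of the hypot overflow/underflow point: hardware can simply carry a wider exponent for the intermediates, whereas the sketch below swaps in a different technique, power-of-two pre-scaling, to get the same effect in plain doubles. The names naive_hypot and hypot_scaled are made up for this example, and the result is not guaranteed to be correctly rounded.

```python
import math

def naive_hypot(x: float, y: float) -> float:
    # x*x + y*y overflows to inf (or underflows to 0) for extreme inputs.
    return math.sqrt(x * x + y * y)

def hypot_scaled(x: float, y: float) -> float:
    x, y = abs(x), abs(y)
    if math.isinf(x) or math.isinf(y):
        return math.inf          # IEEE 754: hypot(inf, anything) is inf
    if math.isnan(x) or math.isnan(y):
        return math.nan
    big = max(x, y)
    if big == 0.0:
        return 0.0
    # Choose a power of two so the larger operand lands in [0.5, 1).
    _, e = math.frexp(big)
    xs = math.ldexp(x, -e)       # exact unless the smaller operand underflows,
    ys = math.ldexp(y, -e)       # in which case it is negligible anyway
    # Note: math.ldexp raises OverflowError if the true result is too large.
    return math.ldexp(math.sqrt(xs * xs + ys * ys), e)

print(naive_hypot(3e200, 4e200), hypot_scaled(3e200, 4e200))
print(naive_hypot(3e-200, 4e-200), hypot_scaled(3e-200, 4e-200))
```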

general powers (as separate extension due to complexity; exp2/log2 plus checking for odd powers/roots is sufficient for graphics):
* pow
* root
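
One possible reading of "exp2/log2 plus checking for odd powers/roots", as a graphics-grade Python sketch. The function names are hypothetical, inputs are assumed finite, and the results are accurate to a few ulps at best rather than correctly rounded; signed-zero and other IEEE 754 special cases are deliberately ignored.

```python
import math

def pow_graphics(x: float, y: float) -> float:
    """x**y via exp2(y * log2(|x|)), assuming finite inputs."""
    if y == 0.0:
        return 1.0
    if x > 0.0:
        return 2.0 ** (y * math.log2(x))
    if x == 0.0:
        return 0.0 if y > 0.0 else math.inf
    # Negative base: only integer exponents have a real result.
    if y != math.floor(y):
        return math.nan
    mag = 2.0 ** (y * math.log2(-x))
    is_odd = math.fmod(y, 2.0) != 0.0
    return -mag if is_odd else mag

def root_graphics(x: float, n: int) -> float:
    """n-th root via exp2(log2(|x|) / n), for integer n >= 1."""
    if x == 0.0:
        return 0.0
    if x > 0.0:
        return 2.0 ** (math.log2(x) / n)
    # Negative radicand: only odd roots have a real result.
    if n % 2 == 0:
        return math.nan
    return -(2.0 ** (math.log2(-x) / n))

print(pow_graphics(-2.0, 3.0), root_graphics(-8.0, 3))
```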

exp/log:
* exp2
* log2
* expm1 (extra precision around 0)
* logp1 (extra precision around 0)
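
A small Python demonstration of the "extra precision around 0" remark. Python (like C) spells logp1 as log1p. The helper names and the choice of x = 1e-10 are just for illustration; the reference values come from the first two Taylor terms, which is more than enough at this magnitude.

```python
import math

def naive_expm1(x: float) -> float:
    # exp(x) is rounded near 1.0, so most low-order bits of x are lost.
    return math.exp(x) - 1.0

def naive_log1p(x: float) -> float:
    # 1.0 + x is rounded before log ever sees it.
    return math.log(1.0 + x)

def rel_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)

x = 1e-10
expm1_ref = x + x * x / 2.0   # exp(x) - 1 = x + x^2/2 + O(x^3)
log1p_ref = x - x * x / 2.0   # log(1 + x) = x - x^2/2 + O(x^3)

print("expm1: naive", rel_error(naive_expm1(x), expm1_ref),
      "dedicated", rel_error(math.expm1(x), expm1_ref))
print("log1p: naive", rel_error(naive_log1p(x), log1p_ref),
      "dedicated", rel_error(math.log1p(x), log1p_ref))
```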

extended exp/log as separate extension (not needed for graphics since exp2/log2 is sufficient):
* exp
* log
* exp10
* log10

hyperbolics as separate extension (not needed for graphics since exp2/log2 is sufficient):
* acosh
* asinh
* atanh
* cosh
* sinh
* tanh (may want to split out in separate extension since sometimes used for machine learning, however fmax(x,x*(1.0/256.0)) is a generally sufficient replacement transfer function)
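
For reference, the fmax(x,x*(1.0/256.0)) replacement mentioned for tanh is what is usually called a leaky ReLU. A minimal Python sketch follows; the function name and the slope parameter are illustrative, and Python's max() does not handle NaN the way C's fmax does.

```python
def leaky_relu(x: float, slope: float = 1.0 / 256.0) -> float:
    # For x >= 0 the first argument wins; for x < 0 the scaled copy wins.
    return max(x, x * slope)

print([leaky_relu(v) for v in (-256.0, -1.0, 0.0, 2.0)])
```

Since 1/256 is a power of two, the multiply is just an exponent adjustment (exact barring underflow), which is part of why this is so much cheaper than a tanh unit.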

the erf/erfc/gamma/bessel/zeta/etc. functions can be left to software implementations.

see also:
https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_Env.html#relative-error-as-ulps
https://www.khronos.org/registry/spir-v/specs/unified1/GLSL.std.450.html
https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap40.html#spirvenv-precision-operation

Jacob Lifshay