[libre-riscv-dev] [Bug 127] Transcendentals needed (SIN/COS/TAN/EXP/LOG/RCP/POW etc.)

Sun Aug 4 22:46:38 BST 2019

http://bugs.libre-riscv.org/show_bug.cgi?id=127

--- Comment #1 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
Jacob Lifshay via lists.libre-riscv.org, to Libre-RISCV, Atif, Grant:

On Sat, Aug 3, 2019, 22:25 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> got an idea, transcendentals (scalar) proposal, similar to Zfrsqrt,
> need to find the space for sin, cos, atan, exp, pow, log, and so on.
> on-list first, then isa-dev?
>
Sounds good to me.

I think our primitive instructions should be correctly rounded since, for
everything except sin/cos/tan/sec/cosec/cotan, that doesn't take much more
precision. I think we should implement sinpi/cospi and friends, since they
avoid the need for an extremely accurate (several-hundred-bit) value of
pi.

Note for Atif and Grant: I'm currently working on an algebraic numbers
library that can be used to verify the fp implementations for
add/sub/mul/div/sqrt/rsqrt/cbrt/hypot.
https://salsa.debian.org/Kazan-team/algebraics

Note that even though sinpi/cospi are theoretically algebraic numbers for
rational inputs, the degree of the polynomials is prohibitive for large
denominators.

We should avoid the pitfall of Intel's x87 sin/cos implementations:
https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/

The functions I think are worth implementing in addition to F/D:

trig-pi functions (range reduction is trivial (x mod 2.0)):
* sinpi
* cospi
* sincospi (non-standard; like sincos)
* atan2pi

extended trig-pi functions (separate extension; sincospi/atan2pi are
sufficient for graphics):
* tanpi
* asinpi
* acospi

non-*pi trig functions (in a separate extension since accurate range
reduction is quite difficult; approximating them using the *pi functions
will work for graphics):
* sin
* cos
* sincos
* tan
* atan2
* asin
* acos

powers:
* cbrt
* hypot (avoids overflow/underflow with extended exponent range for
intermediates)
* rsqrt (proposed in Zfrsqrt extension)
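
As a rough software analogue of the extended exponent range mentioned for
hypot, the sketch below rescales both inputs by a power of two before
squaring. The function name hypot_scaled is hypothetical, the result is
not guaranteed to be correctly rounded, and NaN handling is ignored; it
only shows why naive sqrt(x*x + y*y) is not good enough.

```python
import math

def hypot_scaled(x: float, y: float) -> float:
    """sqrt(x*x + y*y) without intermediate overflow/underflow (sketch)."""
    x, y = abs(x), abs(y)
    big = max(x, y)
    if big == 0.0 or math.isinf(big):
        return big
    # big == m * 2**e with 0.5 <= m < 1, so scaling by 2**-e is exact
    # (apart from a much smaller operand underflowing, which is harmless).
    _, e = math.frexp(big)
    xs = math.ldexp(x, -e)
    ys = math.ldexp(y, -e)
    # The squares are now at most 1, so they cannot overflow.
    # Note: math.ldexp raises OverflowError if the true result overflows.
    return math.ldexp(math.sqrt(xs * xs + ys * ys), e)

print(hypot_scaled(3e200, 4e200))                  # close to 5e200
print(math.sqrt(3e200 * 3e200 + 4e200 * 4e200))    # naive form overflows
```

Hardware can get the same effect without the rescaling by carrying a
wider exponent through the intermediate sum.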

general powers (as separate extension due to complexity; exp2/log2 plus
checking for odd powers/roots is sufficient for graphics):
* pow
* root

exp/log:
* exp2
* log2
* expm1 (extra precision around 0)
* logp1 (extra precision around 0)
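
To make "extra precision around 0" concrete, this small Python check
compares the naive formulas with the library versions for a tiny input.
Python (like C) spells logp1 as log1p; the variable names and the
rel_diff helper are just for illustration.

```python
import math

def rel_diff(a: float, b: float) -> float:
    """Relative difference of a from the reference value b."""
    return abs(a - b) / abs(b)

x = 1e-10

# Naive forms: 1.0 + x (or exp(x)) is rounded to a value near 1.0, which
# discards most of the bits of x before the subtraction or log happens.
naive_expm1 = math.exp(x) - 1.0
naive_logp1 = math.log(1.0 + x)

# Dedicated functions never form the intermediate value near 1.0.
good_expm1 = math.expm1(x)
good_logp1 = math.log1p(x)

print(rel_diff(naive_expm1, good_expm1))
print(rel_diff(naive_logp1, good_logp1))
```

Both relative differences come out many orders of magnitude larger than
double-precision rounding error, which is the case for having these as
separate operations.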

extended exp/log as separate extension (not needed for graphics since
exp2/log2 is sufficient):
* exp
* log
* exp10
* log10
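
For reference, the base changes that make exp2/log2 sufficient for
graphics look like the sketch below. The function names are hypothetical,
2.0 ** t stands in for exp2, and the results are not correctly rounded:
the error from rounding the constant and the product grows with |x| for
the exp variants.

```python
import math

LOG2_E = math.log2(math.e)     # log2(e)
LN_2 = math.log(2.0)           # ln(2)
LOG2_10 = math.log2(10.0)      # log2(10)
LOG10_2 = math.log10(2.0)      # log10(2)

def exp_via_exp2(x: float) -> float:
    return 2.0 ** (x * LOG2_E)          # e**x == 2**(x * log2(e))

def log_via_log2(x: float) -> float:
    return math.log2(x) * LN_2          # ln(x) == log2(x) * ln(2)

def exp10_via_exp2(x: float) -> float:
    return 2.0 ** (x * LOG2_10)         # 10**x == 2**(x * log2(10))

def log10_via_log2(x: float) -> float:
    return math.log2(x) * LOG10_2       # log10(x) == log2(x) * log10(2)

print(exp_via_exp2(1.0), log10_via_log2(1000.0))
```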

hyperbolics as separate extension (not needed for graphics since exp2/log2
is sufficient):
* acosh
* asinh
* atanh
* cosh
* sinh
* tanh (may want to split out in separate extension since sometimes used
for machine learning, however fmax(x,x*(1.0/256.0)) is a generally
sufficient replacement transfer function)

the erf/erfc/gamma/bessel/zeta/etc. functions can be left to software
implementations.

see also:
https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_Env.html#relative-error-as-ulps
https://www.khronos.org/registry/spir-v/specs/unified1/GLSL.std.450.html
https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap40.html#spirvenv-precision-operation
