[libre-riscv-dev] FP transcendentals (trigonometry, root/exp/log) proposal
lkcl
luke.leighton at gmail.com
Thu Aug 8 01:27:07 BST 2019
[some overlap with what jacob wrote, reviewing/removing redundant replies]
On Wednesday, August 7, 2019 at 11:36:17 PM UTC+1, MitchAlsup wrote:
>
> Is this proposal going to <eventually> include::
>
> a) statement on required/delivered numeric accuracy per transcendental ?
>
originally thought it was just this:
https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
jacob makes and emphasises the point that these are intended to be *scalar*
operations, for direct use in libm.
> b) a reserve on the OpCode space for the double precision equivalents ?
>
reservations do not have an IANA-style contact/proposal procedure, even
where the case has been made clearly that not having a reservation will
cause severe, ongoing detrimental impact for the wider RISC-V community.
i've repeatedly requested an official reservation, for this and many
other proposals.
i have not received a response.
Jacob wrote:
> it would probably be a good idea to split the transcendental extensions
> into separate f32, f64, f16, and f128 extensions, since some
> implementations
> may want to only implement them for f32 while still implementing the D
> (f64 arithmetic) extension.
oh, of course. Ztrans.F/Q/S/H is a really good point.
> c) a statement on <approximate> execution time ?
>
what jacob said.
as a Standard, we can't limit the proposal in ways that would restrict or
exclude implementors. accuracy on the other hand *is* important, because
it could potentially cause catastrophic failures if an algorithm is written
to critically rely on a given accuracy.
> You may have more transcendentals than necessary::
> 1) for example all of the inverse hyperbolic can be calculated to GRAPHICs
> numeric quality with short sequences of already existing transcendentals
> ..... ASINH( x ) = ln( x + SQRT(x**2+1) )
>
>
ah, excellent - i'll add that recipe to the document. Zfhyp, separate
extension.
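a minimal sketch of that recipe, in python purely for illustration (the function name is made up, and this says nothing about how an implementation would actually sequence the instructions). it composes asinh from the existing sqrt and ln operations and compares against the library asinh. note that the composed form is only "graphics quality": for small |x| the sum x + sqrt(x**2+1) rounds close to 1.0 and relative accuracy is lost.

```python
import math

def asinh_via_ln_sqrt(x: float) -> float:
    """asinh(x) composed from sqrt and ln, as in the quoted recipe.

    Hypothetical helper for illustration only.
    """
    return math.log(x + math.sqrt(x * x + 1.0))

if __name__ == "__main__":
    for x in (0.5, 1.0, 10.0, 1e-9):
        composed = asinh_via_ln_sqrt(x)
        reference = math.asinh(x)
        rel_err = abs(composed - reference) / reference
        print(f"x={x:g} composed={composed!r} reference={reference!r} rel_err={rel_err:.3g}")
```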
> 2) LOG(x) = LOGP1(x-1.0)
> ... EXP(x) = EXPM1(x) + 1.0
>
> That is:: LOGP1 and EXPM1 provide greater precision (especially when the
> result is near zero) than their sister functions, and the compiler can
> easily add the additional instruction to the instruction stream where
> appropriate.
>
oo that's very interesting. of course. i like it.
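to make the precision point concrete, here is a small python sketch (helper names are made up for illustration). the standard identities relating the pairs are log(x) = log1p(x - 1) and exp(x) = expm1(x) + 1. the second half shows why the P1/M1 forms matter near zero: forming 1.0 + tiny rounds away most of the bits of tiny before log ever sees it.

```python
import math

def log_via_log1p(x: float) -> float:
    """log(x) expressed through log1p: one extra subtract."""
    return math.log1p(x - 1.0)

def exp_via_expm1(x: float) -> float:
    """exp(x) expressed through expm1: one extra add."""
    return math.expm1(x) + 1.0

tiny = 1e-10

# Naive forms: the argument (or result) passes through a value near 1.0,
# so the low-order bits of `tiny` are lost to rounding.
naive_log = math.log(1.0 + tiny)
naive_exp = math.exp(tiny) - 1.0

# Dedicated forms keep full relative precision for small arguments.
accurate_log = math.log1p(tiny)
accurate_exp = math.expm1(tiny)

if __name__ == "__main__":
    print("log :", naive_log, "vs log1p:", accurate_log)
    print("exp :", naive_exp, "vs expm1:", accurate_exp)
```

so the compiler can always get LOG from LOGP1 (and EXP from EXPM1) with one extra add or subtract, but going the other way throws precision away.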
the only thing: as a Standard, some implementors may find it more efficient
to implement LOG than LOGP1 (likewise with exp). in particular, if CORDIC
is used (which i have just recently found, and am absolutely amazed by
- https://en.wikipedia.org/wiki/CORDIC) i cannot find a LOGP1/EXPM1 version
of that.
CORDIC would be the most sensible "efficient" choice of hardware algorithm,
simply because of the sheer overwhelming number of transcendentals that it
covers. if there isn't a way to implement LOGP1 using CORDIC, and one but
not the other is chosen, some implementation options will be limited /
penalised.
this is one of the really tricky things about Standards. if we were doing
a single implementation, not intended in any way to be Standards-compliant,
we could simply pick the best-optimised option for our own
requirements, and to hell with everyone else. take that approach with a
Standard, and it results in... other teams creating their own Standard.
having two near-identical opcodes where one may be implemented in terms of
the other is however rather unfortunately against the principle of RISC.
in this particular case, though, the hardware implementation actually
matters.
does anyone know if CORDIC can be adapted to do LOGP1 as well as LOG? ha,
funny, i found this:
http://dns.uls.cl/~ej/daa_08/Algoritmos/books/book10/9010f/jarvis.asc
unfortunately, the original dr dobbs article, which has "example 4(d)" as a
hyperlink, redirects to a 404 not found.
l.