[libre-riscv-dev] FP transcendentals (trigonometry, root/exp/log) proposal
lkcl
luke.leighton at gmail.com
Thu Aug 8 01:45:23 BST 2019
On Thursday, August 8, 2019 at 1:32:57 AM UTC+1, MitchAlsup wrote:
>
>
>
> On Wednesday, August 7, 2019 at 7:27:08 PM UTC-5, lkcl wrote:
>>
>> [some overlap with what jacob wrote, reviewing/removing redundant replies]
>>
>> On Wednesday, August 7, 2019 at 11:36:17 PM UTC+1, MitchAlsup wrote:
>>>
>>> Is this proposal going to <eventually> include::
>>>
>>> a) statement on required/delivered numeric accuracy per transcendental ?
>>>
>>
>> originally thought it was just this:
>> https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
>>
>> jacob makes and emphasises the point that these are intended to be
>> *scalar* operations, for direct use in libm.
>>
>>> b) a reserve on the OpCode space for the double precision equivalents ?
>>>
>>
>>
>> reservations, even where the case has been made clear that the impact of
>> not having a reservation will cause severe detrimental ongoing impact for
>> the wider RISC-V community, do not have an IANA-style contact/proposal
>> procedure. i've repeatedly requested an official reservation, for this and
>> many other proposals.
>>
>> i have not received a response.
>>
>> Jacob wrote:
>>
>> > it would probably be a good idea to split the transcendental extensions
>> > into separate f32, f64, f16, and f128 extensions, since some
>> > implementations may want to only implement them for f32 while still
>> > implementing the D (f64 arithmetic) extension.
>>
>> oh, of course. Ztrans.F/Q/S/H is a really good point.
>>
>>> c) a statement on <approximate> execution time ?
>>>
>>
>> what jacob said.
>>
>> as a Standard, we can't limit the proposal in ways that would restrict or
>> exclude implementors. accuracy on the other hand *is* important, because
>> it could potentially cause catastrophic failures if an algorithm is written
>> to critically rely on a given accuracy.
>>
>>> You may have more transcendentals than necessary::
>>> 1) for example all of the inverse hyperbolic can be calculated to
>>> GRAPHICs numeric quality with short sequences of already existing
>>> transcendentals
>>> ..... ASINH( x ) = ln( x + SQRT(x**2+1) )
>>>
>>>
>> ah, excellent - i'll add that recipe to the document. Zfhyp, separate
>> extension.
>>
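A minimal sketch of the ASINH recipe quoted above, checked against libm through Python's math module. The function name asinh_via_log is hypothetical, and this is only a numeric illustration of the identity, not a statement about how any implementation does it.

```python
import math

def asinh_via_log(x):
    # asinh(x) = ln(x + sqrt(x**2 + 1)), built only from LOG and SQRT
    return math.log(x + math.sqrt(x * x + 1.0))

for x in (0.25, 0.5, 2.0, 10.0):
    # compare the short sequence with the library's own asinh
    print(x, asinh_via_log(x), math.asinh(x))
```

The direct formula loses accuracy for very small x and for large negative x (cancellation inside the log), which is consistent with the "GRAPHICs numeric quality" caveat rather than full libm accuracy.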
>>> 2) LOG(x) = LOGP1(x-1.0)
>>> ... EXP(x) = EXPM1(x) + 1.0
>>>
>>> That is:: LOGP1 and EXPM1 provide greater precision (especially when the
>>> result is near zero) than their sister functions, and the compiler can
>>> easily add the additional instruction to the instruction stream where
>>> appropriate.
>>>
>>
>> oo that's very interesting. of course. i like it.
>>
>> the only thing: as a Standard, some implementors may find it more
>> efficient to implement LOG than LOGP1 (likewise with exp). in particular,
>> if CORDIC is used (which i have just recently found, and am absolutely
>> amazed by - https://en.wikipedia.org/wiki/CORDIC) i cannot find a
>> LOGP1/EXPM1 version of that.
>>
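A small numeric illustration of the LOGP1/EXPM1 point, assuming the usual libm definitions log1p(x) = ln(1+x) and expm1(x) = exp(x) - 1. With those definitions LOG and EXP cost one extra add or subtract each: log(x) = log1p(x - 1.0) and exp(x) = expm1(x) + 1.0. The helper names below are hypothetical.

```python
import math

def log_from_log1p(x):
    # LOG built from LOGP1: one extra subtract
    return math.log1p(x - 1.0)

def exp_from_expm1(x):
    # EXP built from EXPM1: one extra add
    return math.expm1(x) + 1.0

# near zero, forming 1.0 + x first throws away low-order bits of x,
# so log(1.0 + x) is visibly less accurate than log1p(x)
x = 1e-10
print(math.log(1.0 + x))
print(math.log1p(x))
```

Going the other way (building LOGP1 from LOG) cannot recover the bits lost when 1.0 + x is rounded, which is why the P1/M1 forms are the more useful primitives.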
>
> Both Motorola CORDIC and Intel CORDIC specified the LOGP1 and EXPM1
> instead of LOG and EXP.
>
i think i managed to interpret the paper, below - it tends to suggest that
LOG is possible with the standard hyperbolic CORDIC. the thing is: the add
1 is done *outside* the LOG(a), which tends to suggest that the iterative
algorithm needs modifying...
... unless it's as simple as setting Z0=1
does that look reasonable?
[i really don't like deriving algorithms like this from scratch: someone
somewhere has done this, it's so ubiquitous. i'd be much happier - much
more comfortable - when i can see (and execute) a software algorithm that
shows how it's done.]
---
https://www.researchgate.net/publication/230668515_A_fixed-point_implementation_of_the_natural_logarithm_based_on_a_expanded_hyperbolic_CORDIC_algorithm
Since: ln(a) = 2 * tanh^-1( (a-1) / (a+1) )
The function ln(a) is obtained by multiplying the final result ZN by 2
(Equation (4)), provided that Z0=0, X0=a+1, and Y0=a-1.
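A minimal floating-point sketch of the above, assuming the textbook hyperbolic CORDIC in vectoring mode (iterations starting at i=1, with indices 4, 13, 40 repeated) rather than the paper's expanded fixed-point variant. Function names and the iteration count are hypothetical. It also tries one possible LOGP1 form, which is an assumption and not taken from the paper: since ln(1+a) = 2 * tanh^-1( a / (a+2) ), the "+1" can go into the initial values (X0=a+2, Y0=a) with Z0 left at 0.

```python
import math

def hyperbolic_vectoring(x, y, z=0.0, n=48):
    # drives y towards 0 (for x > 0); returns z + atanh(y0/x0)
    i, repeat = 1, 4
    for _ in range(n):
        d = -1.0 if y >= 0.0 else 1.0
        t = 2.0 ** -i
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)  # a table lookup in hardware
        if i == repeat:
            repeat = 3 * repeat + 1  # run this index a second time
        else:
            i += 1
    return z

def cordic_ln(a):
    # Z0=0, X0=a+1, Y0=a-1, final result multiplied by 2
    return 2.0 * hyperbolic_vectoring(a + 1.0, a - 1.0)

def cordic_log1p(a):
    # assumption: fold the +1 into X0/Y0 instead of changing Z0
    return 2.0 * hyperbolic_vectoring(a + 2.0, a)

for a in (0.5, 1.0, 2.0, 5.0):
    print(a, cordic_ln(a), math.log(a))
print(cordic_log1p(0.5), math.log1p(0.5))
```

Without range extension this only converges for |tanh^-1(Y0/X0)| up to about 1.118, i.e. a roughly between 0.11 and 9.3 for cordic_ln. Being floating-point, the sketch says nothing about how much precision a fixed-point LOGP1 would actually keep near zero.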