[libre-riscv-dev] FP transcendentals (trigonometry, root/exp/log) proposal

Thu Aug 8 01:45:23 BST 2019

On Thursday, August 8, 2019 at 1:32:57 AM UTC+1, MitchAlsup wrote:
>
>
>
> On Wednesday, August 7, 2019 at 7:27:08 PM UTC-5, lkcl wrote:
>>
>> [some overlap with what jacob wrote, reviewing/removing redundant replies]
>>
>> On Wednesday, August 7, 2019 at 11:36:17 PM UTC+1, MitchAlsup wrote:
>>>
>>> Is this proposal going to <eventually> include:: 
>>>
>>> a) statement on required/delivered numeric accuracy per transcendental ?
>>>
>>
>> originally thought it was just this: 
>> https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
>>
>> jacob makes and emphasises the point that these are intended to be 
>> *scalar* operations, for direct use in libm.
>>
>>> b) a reserve on the OpCode space for the double precision equivalents ?
>>>
>>
>>
>> reservations, even where the case has been made clear that the impact of 
>> not having a reservation will cause severe detrimental ongoing impact for 
>> the wider RISC-V community, do not have an IANA-style contact/proposal 
>> procedure.  i've repeatedly requested an official reservation, for this and 
>> many other proposals.
>>
>> i have not received a response.
>>
>> Jacob wrote:
>>
>> > it would probably be a good idea to split the transcendental extensions
>> > into separate f32, f64, f16, and f128 extensions, since some implementations
>> > may want to only implement them for f32 while still implementing the D
>> > (f64 arithmetic) extension.
>>
>> oh, of course. Ztrans.F/Q/S/H is a really good point.
>>
>>> c) a statement on <approximate> execution time ?
>>>
>>
>> what jacob said.
>>
>> as a Standard, we can't limit the proposal in ways that would restrict or 
>> exclude implementors.  accuracy on the other hand *is* important, because 
>> it could potentially cause catastrophic failures if an algorithm is written 
>> to critically rely on a given accuracy.
>>
>>> You may have more transcendentals than necessary::
>>> 1) for example all of the inverse hyperbolic can be calculated to 
>>> GRAPHICs numeric quality with short sequences of already existing 
>>> transcendentals
>>> ..... ASINH( x ) = ln( x + SQRT(x**2+1) )
>>>
>>>
>> ah, excellent - i'll add that recipe to the document.   Zfhyp, separate 
>> extension.
>>
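
The ASINH recipe above can be checked in software. This is only a minimal sketch: the helper name asinh_via_log is hypothetical, and Python's math.log and math.sqrt stand in for whatever LOG and SQRT an implementation provides.

```python
import math

def asinh_via_log(x):
    # ASINH(x) = ln(x + SQRT(x**2 + 1)), built only from sqrt and ln
    return math.log(x + math.sqrt(x * x + 1.0))

# Compare against the library asinh for a few moderate positive inputs.
for x in (0.5, 2.0, 10.0):
    print(x, asinh_via_log(x), math.asinh(x))
```

For large negative x the sum x + SQRT(x**2+1) cancels badly, and for very small |x| the argument to ln is close to 1, so this form is probably only suitable where "graphics quality" accuracy is acceptable.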
>>> 2) LOG(x) = LOGP1(x-1.0)
>>> ... EXP(x) = EXPM1(x) + 1.0
>>>
>>> That is:: LOGP1 and EXPM1 provide greater precision (especially when the 
>>> result is near zero) than their sister functions, and the compiler can 
>>> easily add the additional instruction to the instruction stream where 
>>> appropriate.
>>>
>>
>> oo that's very interesting.   of course.  i like it.
>>
>> the only thing: as a Standard, some implementors may find it more 
>> efficient to implement LOG than LOGP1 (likewise with exp).  in particular, 
>> if CORDIC is used (which i have just recently found, and am absolutely 
>> amazed by - https://en.wikipedia.org/wiki/CORDIC) i cannot find a 
>> LOGP1/EXPM1 version of that.
>>
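
To make the LOGP1/EXPM1 point concrete, here is a small sketch using the usual definitions log1p(x) = ln(1+x) and expm1(x) = exp(x) - 1, from which log(x) = log1p(x - 1) and exp(x) = expm1(x) + 1. The helper names are hypothetical, and Python's math module stands in for the proposed instructions.

```python
import math

def log_via_log1p(x):
    # LOG built from LOGP1 plus one extra subtract
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # EXP built from EXPM1 plus one extra add
    return math.expm1(x) + 1.0

tiny = 1e-10

# Going the other way loses digits when the result is near zero:
naive_log = math.log(1.0 + tiny)     # 1.0 + tiny is rounded before the log
accurate_log = math.log1p(tiny)
naive_exp = math.exp(tiny) - 1.0     # exp(tiny) is rounded before the subtract
accurate_exp = math.expm1(tiny)

print(naive_log, accurate_log)
print(naive_exp, accurate_exp)
```

The extra add or subtract only costs accuracy in the direction where LOG/EXP are used to emulate LOGP1/EXPM1, which is presumably why the "P1"/"M1" forms are the more useful primitives.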
>
> Both Motorola CORDIC and Intel CORDIC specified LOGP1 and EXPM1 
> instead of LOG and EXP. 
>

i think i managed to interpret the paper, below - it tends to suggest that 
LOG is possible with the standard hyperbolic CORDIC.  the thing is: the add 
1 is done *outside* the LOG(a), which tends to suggest that the iterative 
algorithm needs modifying...

... unless it's as simple as setting Z0=1

does that look reasonable?

[i really don't like deriving algorithms like this from scratch: someone 
somewhere has done this, it's so ubiquitous.  i'd be much happier - much 
more comfortable - when i can see (and execute) a software algorithm that 
shows how it's done.]

---

https://www.researchgate.net/publication/230668515_A_fixed-point_implementation_of_the_natural_logarithm_based_on_a_expanded_hyperbolic_CORDIC_algorithm

Since: ln(a) = 2 * atanh( (a-1) / (a+1) )

The function ln(a) is obtained by multiplying the final result ZN 
(Equation (4)) by 2, provided that Z0=0, X0=a+1, and Y0=a-1.
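
A rough software sketch of the recipe above, for experimenting. The assumptions are all mine and not taken from the paper: it uses floating point rather than fixed point, math.atanh stands in for the lookup table a hardware implementation would use, the function names are hypothetical, and it is the plain (not "expanded") hyperbolic CORDIC in vectoring mode, so it only converges for roughly 0.11 < a < 9.3.

```python
import math

def hyperbolic_vectoring(x, y, z, n_iter=32):
    """Drive y toward 0; the returned z approximates z0 + atanh(y0/x0)."""
    repeat = 4  # iterations 4, 13, 40, ... are run twice for convergence
    i = 1
    while i <= n_iter:
        for _ in range(2 if i == repeat else 1):
            d = -1.0 if y >= 0.0 else 1.0
            t = 2.0 ** -i  # a right-shift by i in fixed point
            # Tuple assignment: x and y are updated from the old values.
            x, y, z = x + d * y * t, y + d * x * t, z - d * math.atanh(t)
        if i == repeat:
            repeat = 3 * repeat + 1
        i += 1
    return z

def cordic_ln(a):
    # Z0=0, X0=a+1, Y0=a-1, result is 2*ZN
    return 2.0 * hyperbolic_vectoring(a + 1.0, a - 1.0, 0.0)

def cordic_log1p(a):
    # ln(1+a) = 2*atanh(a / (a+2)): only X0 and Y0 change, Z0 stays 0
    return 2.0 * hyperbolic_vectoring(a + 2.0, a, 0.0)

for a in (0.25, 0.5, 2.0, 5.0):
    print(a, cordic_ln(a), math.log(a))
```

In this sketch Z only accumulates, so ZN = Z0 + atanh(Y0/X0): a nonzero Z0 would simply be added to the result rather than moving the "+1" inside the logarithm. Substituting 1+a for a in the identity suggests that a LOGP1 variant needs X0=a+2, Y0=a with Z0 still 0, and the iteration itself unchanged.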