[libre-riscv-dev] FP transcendentals (trigonometry, root/exp/log) proposal

Thu Aug 8 01:32:57 BST 2019

On Wednesday, August 7, 2019 at 7:27:08 PM UTC-5, lkcl wrote:
>
> [some overlap with what jacob wrote, reviewing/removing redundant replies]
>
> On Wednesday, August 7, 2019 at 11:36:17 PM UTC+1, MitchAlsup wrote:
>>
>> Is this proposal going to <eventually> include:: 
>>
>> a) statement on required/delivered numeric accuracy per transcendental ?
>>
>
> originally thought it was just this: 
> https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html
>
> jacob makes and emphasises the point that these are intended to be 
> *scalar* operations, for direct use in libm.
>
>> b) a reserve on the OpCode space for the double precision equivalents ?
>>
>
>
> reservations, even where it has been made clear that the lack of one will 
> have a severe, ongoing detrimental impact on the wider RISC-V community, do 
> not have an IANA-style contact/proposal procedure.  i've repeatedly 
> requested an official reservation, for this and many other proposals.
>
> i have not received a response.
>
> Jacob wrote:
>
> > it would probably be a good idea to split the transcendental extensions
> > into separate f32, f64, f16, and f128 extensions, since some
> > implementations may want to only implement them for f32 while still
> > implementing the D (f64 arithmetic) extension.
>
> oh, of course. Ztrans.F/Q/S/H is a really good point.
>
>> c) a statement on <approximate> execution time ?
>>
>
> what jacob said.
>
> as a Standard, we can't limit the proposal in ways that would restrict or 
> exclude implementors.  accuracy on the other hand *is* important, because 
> it could potentially cause catastrophic failures if an algorithm is written 
> to critically rely on a given accuracy.
>
>> You may have more transcendentals than necessary::
>> 1) for example all of the inverse hyperbolic can be calculated to 
>> GRAPHICs numeric quality with short sequences of already existing 
>> transcendentals
>> ..... ASINH( x ) = ln( x + SQRT(x**2+1) )
>>
>>
> ah, excellent - i'll add that recipe to the document.   Zfhyp, separate 
> extension.
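
The ASINH recipe quoted above can be checked numerically. The following is a minimal Python sketch, not part of the proposal; the function name `asinh_from_primitives` is made up for illustration, and the standard `math` module stands in for the hardware LOG and SQRT operations.

```python
import math

def asinh_from_primitives(x: float) -> float:
    """asinh(x) = ln(x + sqrt(x**2 + 1)), built only from log and sqrt.

    Hypothetical helper: math.log and math.sqrt stand in for the
    LOG and SQRT instructions discussed in the thread.
    """
    return math.log(x + math.sqrt(x * x + 1.0))

if __name__ == "__main__":
    # Compare against the library asinh for a few moderate inputs.
    for x in (0.5, 1.0, 10.0):
        approx = asinh_from_primitives(x)
        exact = math.asinh(x)
        print(f"x={x}: recipe={approx!r} libm={exact!r} diff={approx - exact:.3e}")
```

For very small |x| the sum x + sqrt(x**2 + 1) rounds to a value close to 1.0 and the result loses relative accuracy, which is consistent with the "GRAPHICs numeric quality" qualifier on the recipe.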
>
>> 2) LOG(x) = LOGP1(x-1.0)
>> ... EXP(x) = EXPM1(x) + 1.0
>>
>> That is:: LOGP1 and EXPM1 provide greater precision (especially when the 
>> result is near zero) than their sister functions, and the compiler can 
>> easily add the additional instruction to the instruction stream where 
>> appropriate.
>>
>
> oo that's very interesting.   of course.  i like it.
>
> the only thing: as a Standard, some implementors may find it more 
> efficient to implement LOG than LOGP1 (likewise with exp).  in particular, 
> if CORDIC is used (which i have just recently found, and am absolutely 
> amazed by - https://en.wikipedia.org/wiki/CORDIC) i cannot find a 
> LOGP1/EXPM1 version of that.
>

Both Motorola CORDIC and Intel CORDIC specified LOGP1 and EXPM1 instead 
of LOG and EXP. 
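
To illustrate why LOGP1 and EXPM1 are the more useful primitives, here is a small Python sketch using the standard identities log(x) = log1p(x - 1) and exp(x) = expm1(x) + 1. It is only a software illustration: the helper names are invented for this example, and `math.log1p` / `math.expm1` stand in for the hardware operations.

```python
import math

def log_via_log1p(v: float) -> float:
    """LOG built from LOGP1: log(v) = log1p(v - 1)."""
    return math.log1p(v - 1.0)

def exp_via_expm1(v: float) -> float:
    """EXP built from EXPM1: exp(v) = expm1(v) + 1."""
    return math.expm1(v) + 1.0

def naive_log1p(v: float) -> float:
    """LOGP1 built from LOG: the addition 1 + v discards low bits of v."""
    return math.log(1.0 + v)

def naive_expm1(v: float) -> float:
    """EXPM1 built from EXP: the subtraction cancels most significant bits."""
    return math.exp(v) - 1.0

if __name__ == "__main__":
    x = 1e-10
    # Near zero the dedicated functions keep full precision,
    # while the compositions from LOG/EXP do not.
    print("log1p :", math.log1p(x), "vs naive", naive_log1p(x))
    print("expm1 :", math.expm1(x), "vs naive", naive_expm1(x))
    # Going the other way costs one extra add or subtract and loses nothing
    # of note for ordinary arguments.
    print("log(2):", log_via_log1p(2.0), "vs", math.log(2.0))
    print("exp(1):", exp_via_expm1(1.0), "vs", math.exp(1.0))
```

The asymmetry is the point: LOG and EXP can be recovered from LOGP1 and EXPM1 with one extra instruction, but not the other way round without losing accuracy near zero.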

>
> CORDIC would be the most sensible "efficient" choice of hardware 
> algorithm, simply because of the sheer overwhelming number of 
> transcendentals that it covers.  if there isn't a way to implement LOGP1 
> using CORDIC, and one but not the other is chosen, some implementation 
> options will be limited / penalised.
>
> this is one of the really tricky things about Standards.  if we were doing 
> a single implementation, not intended in any way to be Standards-compliant, 
> we could make the decision, best optimised option, according to our 
> requirements, and to hell with everyone else.  take that approach with a 
> Standard, and it results in... other teams creating their own Standard.
>
> having two near-identical opcodes where one may be implemented in terms of 
> the other is however rather unfortunately against the principle of RISC.  
> in this particular case, though, the hardware implementation actually 
> matters.
>
> does anyone know if CORDIC can be adapted to do LOGP1 as well as LOG?  ha, 
> funny, i found this:
> http://dns.uls.cl/~ej/daa_08/Algoritmos/books/book10/9010f/jarvis.asc
>
> unfortunately, the original dr dobbs article, which has "example 4(d)" as 
> a hyperlink, redirects to a 404 not found.
>
> l.
>
>