[libre-riscv-dev] fp special functions

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Aug 6 01:12:19 BST 2019


On Tuesday, August 6, 2019, Jacob Lifshay <programmerjake at gmail.com> wrote:

> On Mon, Aug 5, 2019, 16:26 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> > On Monday, August 5, 2019, Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> > wrote:
> >
> > >
> > >>>
> > >>> note that atan(x) and atanpi(x) are just atan2(x, 1.0) and atan2pi(x,
> > >>> 1.0),
> > >>> so the atan and atanpi instructions are not needed
> > >>
> > >>
> > >> Ok great, will move them to pseudo op aliases.
> > >>
> > >>
> > > Hang on... there's no immed for loading 1.0 into an FP reg, it's one of
> > > the downsides of RISCV, a FLD is a hard requirement.
> > >
> > > Hmmm....
> > >
> >
> > "Hmm" means, depending on what an implementor chooses to do, cospi may be
> > more efficient than cos, or vice-versa.
> >
> > As a standard, we don't know which, therefore, to not impose that on
> > implementors, we need both (mandatory)
> >
> I think it's a pretty bad idea to have non-*pi versions of sin, cos, and
> tan be required, since the modular reduction that's required as the first
> step for any implementation method that I've heard of needs a very accurate
> version of pi in order to produce correctly rounded answers.


Could you take a look at the CORDIC paper / implementation? Does it work
over a range of 0 to pi, or 0 to 1?

My concern is if someone does come up with an implementation that is not
limited by pi: if they have to then *divide* by pi because the ISA only has
the cospi/sinpi/tanpi opcodes, they get penalised the *other* way.

If we were doing an exclusive optimised design not intended to be a GPU
Standard, there would be no question: cospi etc only.

However as a standard we cannot make optimisation decisions as we cannot
predict whether implementors will be penalised by accident.




> if you think that the *pi instructions should not be preferred, maybe we
> can split out the trig functions into several extensions that are not
> dependent on each other:
>
> Ztrigpi: trig. *-pi
> sinpi
> cospi
> tanpi
>
> Ztrignpi: trig non-*pi
> sin
> cos
> tan
>
> Zarctrigpi: arc-trig. *pi
> atan2pi
> asinpi
> acospi
>
> Zarctrignpi: arc-trig. non-*pi
> atan2
> asin
> acos


Ok.  Happier with that.  Multilib can take care of the option proliferation.



>
> I think that the Ztrignpi extension is totally impractical to implement for
> our GPU due to f64 needing the remainder of dividing by the 1000+ bit
> approximation of 2*pi.


Coool! I really want to include that just *because* it's so ridiculous :)
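
To make the range-reduction point concrete, here is a minimal Python sketch (not from the thread) of reducing a large f64 argument modulo 2*pi. The function names and the choice of x = 1e22 are illustrative assumptions, and the 50-digit constant merely stands in for the much longer approximation a full-range f64 implementation would need.

```python
import math
from fractions import Fraction

# pi to 50 decimal places: far more than the ~16 digits a double holds.
# (Assumption for illustration; covering every f64 input needs far more bits.)
PI_50 = Fraction("3.14159265358979323846264338327950288419716939937510")

def reduce_naive(x):
    """Reduce x modulo 2*pi using only the double-precision value of pi."""
    return math.fmod(x, 2.0 * math.pi)

def reduce_precise(x):
    """Reduce x modulo 2*pi with the 50-digit value, in exact rational arithmetic."""
    return float(Fraction(x) % (2 * PI_50))

x = 1e22  # exactly representable as a double
naive = reduce_naive(x)
precise = reduce_precise(x)

# The tiny error in the double-precision 2*pi is multiplied by roughly
# x / (2*pi) periods, so the naive remainder is not trustworthy here.
print("naive  :", naive, math.sin(naive))
print("precise:", precise, math.sin(precise))
print("libm   :", math.sin(x))
```

The precise remainder should agree with what a well-behaved libm computes for sin(x) directly; the naive one generally will not.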




>
> All of Ztrigpi, Zarctrignpi, and Zarctrigpi are practical to implement.
>
> note that using the Ztrigpi instructions to implement sin, cos, and tan
> like sin(x) = sinpi(x * (1.0 / pi)) is good enough to meet the accuracy
> requirements for Vulkan and OpenCL.


Interesting.
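
A rough Python sketch of the sin(x) = sinpi(x * (1.0 / pi)) identity quoted above. The sinpi function here is a hypothetical software model, not the proposed instruction; it only illustrates that reducing modulo 2.0 is exact in floating point, so no long approximation of pi is needed for the reduction step.

```python
import math

def sinpi(t):
    """Hypothetical software model of sinpi: returns sin(pi * t)."""
    # fmod is exact, so reducing modulo the period 2.0 adds no rounding error.
    r = math.fmod(t, 2.0)
    return math.sin(math.pi * r)

INV_PI = 1.0 / math.pi  # computed once, as a compiler would hoist it

def sin_via_sinpi(x):
    """sin(x) built on top of sinpi, as suggested in the quoted text."""
    return sinpi(x * INV_PI)

for x in (0.5, 1.0, 3.0, 10.0):
    print(x, math.sin(x), sin_via_sinpi(x))
```

The multiplication by 1.0 / pi does introduce a small rounding error, which is why the quoted claim is "good enough" for Vulkan and OpenCL rather than correctly rounded.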


>
> regarding frcp, fatan, and fatanpi, I think we should just use
> fdiv/fatan2/fatan2pi and not even have a pseudo-instruction since the
> compiler can optimize loading 1.0 into a fp register by moving the
> flw/fcvt/fmv/etc. out of loops and stuff.


Ok good point.
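
For reference, the substitutions being agreed to can be sketched in Python as below. The helper names are made up for illustration, and math.atan2 divided by pi is only a stand-in for the proposed fatan2pi.

```python
import math

ONE = 1.0  # loaded once, outside any loop, as the compiler would arrange

def atan_via_atan2(x):
    """atan(x) expressed as atan2(x, 1.0)."""
    return math.atan2(x, ONE)

def atanpi_via_atan2pi(x):
    """atanpi(x) expressed as atan2pi(x, 1.0); atan2pi is modelled as atan2 / pi."""
    return math.atan2(x, ONE) / math.pi

def rcp_via_div(x):
    """Reciprocal expressed as an ordinary division by x."""
    return ONE / x

for x in (-2.0, 0.25, 1.0, 7.5):
    print(x, atan_via_atan2(x), atanpi_via_atan2pi(x), rcp_via_div(x))
```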




-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

