I think it's a pretty bad idea to require the non-*pi versions of sin, cos,
and tan, since every implementation method that I've heard of starts with a
modular reduction that needs a very accurate value of pi in order to produce
correctly rounded answers.

If you think that the *pi instructions should not be preferred, maybe we
can split out the trig functions into several extensions that are not
dependent on each other:

Ztrigpi: trig. *pi

Ztrignpi: trig non-*pi

Zarctrigpi: arc-trig. *pi

Zarctrignpi: arc-trig. non-*pi

I think that the Ztrignpi extension is totally impractical to implement for
our GPU, because range reduction for f64 needs the remainder of dividing by
a 1000+ bit approximation of 2*pi.
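
To illustrate why the constant has to be so wide, here is a rough sketch (not
a description of any real implementation) that estimates how far off x mod
2*pi ends up when 2*pi is only stored as an f64. The helper name
reduction_error and the hard-coded 50-digit pi are my own assumptions for the
example. The error grows linearly with the number of periods removed, so
inputs near the top of the f64 exponent range need on the order of a
thousand extra bits.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60  # plenty of digits for this illustration

# pi to 50 decimal places, hard-coded as the "exact" reference for the sketch.
PI_EXACT = Decimal("3.14159265358979323846264338327950288419716939937510")

def reduction_error(x: float) -> Decimal:
    """Estimated error (radians) of x mod 2*pi when 2*pi is only an f64."""
    two_pi_f64 = Decimal(2.0 * math.pi)   # exact value of the f64 constant
    two_pi = 2 * PI_EXACT
    k = int(Decimal(x) / two_pi)          # whole periods removed from x
    # Each removed period contributes the constant's error once.
    return k * abs(two_pi_f64 - two_pi)

for x in (1e3, 1e10, 1e22):
    print(f"x = {x:g}: reduction error ~ {float(reduction_error(x)):.3e} rad")
```

For x = 1e22 the estimated error is far larger than one full period, so the
reduced argument carries no information about the true angle.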

All of Ztrigpi, Zarctrignpi, and Zarctrigpi are practical to implement.

Note that using the Ztrigpi instructions to implement sin, cos, and tan,
like sin(x) = sinpi(x * (1.0 / pi)), is good enough to meet the accuracy
requirements for Vulkan and OpenCL.
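
A minimal software model of that identity, assuming a hypothetical sinpi
helper that stands in for the instruction (the function names and the folding
steps are mine, not part of any proposal). The point it shows is that the
range reduction for sinpi is an exact fmod by 2.0, so pi only appears as an
ordinary f64 constant:

```python
import math

def sinpi(x: float) -> float:
    """Hypothetical model of a sinpi instruction: computes sin(pi * x)."""
    # fmod by 2.0 is exact in floating point, so the reduction needs
    # no high-precision value of pi.
    r = math.fmod(x, 2.0)        # r in (-2, 2), same sign as x
    # Shift into [-1, 1]; these subtractions are exact.
    if r > 1.0:
        r -= 2.0
    elif r < -1.0:
        r += 2.0
    # Fold into [-0.5, 0.5] using sin(pi * (1 - r)) == sin(pi * r).
    if r > 0.5:
        r = 1.0 - r
    elif r < -0.5:
        r = -1.0 - r
    return math.sin(math.pi * r)

def sin_via_sinpi(x: float) -> float:
    """sin(x) built on sinpi, as in sin(x) = sinpi(x * (1.0 / pi))."""
    return sinpi(x * (1.0 / math.pi))

for x in (0.1, 1.0, 2.5, -3.0, 100.0):
    print(x, sin_via_sinpi(x), math.sin(x))
```

The multiply by 1.0 / pi introduces a small relative error in the argument,
so this is not correctly rounded for large x, which is the trade-off being
accepted here.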

Regarding frcp, fatan, and fatanpi, I think we should just use
fdiv/fatan2/fatan2pi and not even have a pseudo-instruction, since the
compiler can hoist the flw/fcvt/fmv/etc. that loads 1.0 into an fp register
out of loops.
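
As a sketch of the substitutions I have in mind (Python stand-ins only; the
function names mirror the proposed mnemonics and fatan2pi is modelled here as
atan2 divided by pi, which is an assumption of this example):

```python
import math

def fatan2pi(y: float, x: float) -> float:
    """Stand-in for a fatan2pi instruction: atan2(y, x) / pi."""
    return math.atan2(y, x) / math.pi

def process(values):
    one = 1.0  # loaded once, outside the loop, as a compiler would hoist it
    out = []
    for v in values:
        rcp = one / v                  # frcp(v)    -> fdiv(1.0, v)
        atan = math.atan2(v, one)      # fatan(v)   -> fatan2(v, 1.0)
        atanpi = fatan2pi(v, one)      # fatanpi(v) -> fatan2pi(v, 1.0)
        out.append((rcp, atan, atanpi))
    return out

print(process([0.5, 1.0, 4.0]))
```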


