[libre-riscv-dev] [isa-dev] Re: FP transcendentals (trigonometry, root/exp/log) proposal
programmerjake at gmail.com
Thu Aug 8 12:47:22 BST 2019
On Thu, Aug 8, 2019, 04:25 lkcl <luke.leighton at gmail.com> wrote:
> On Thursday, August 8, 2019 at 11:09:28 AM UTC+1, Jacob Lifshay wrote:
>> maybe a solution would be to add an extra field to the fp control csr (or
>> isamux?) to allow selecting one of several accurate or fast modes:
> *thinks*... *puzzled*... hardware can't be changed, so you'd need to
> pre-allocate the gates to cope with e.g. UNIX Platform spec (libm
> interoperability), so why would you need a CSR to switch "modes"?
> ah, ok, i think i got it, and it's [potentially] down to the way we're
> designing the ALU, to enter "recycling" of data through the pipeline to
> give better accuracy.
> are you suggesting that implementors be permitted to *dynamically* alter
> the accuracy of the results that their hardware produces, in order to
> comply with *more than one* of the [four so far] proposed Platform Specs,
> *at runtime*?
also, having explicit mode bits allows emulating more accurate operations
when the HW doesn't actually implement the extra gates needed.
This allows greater software portability (allows converting a libm call
into a single instruction without requiring hw that implements the required
accuracy).
> thus, for example, our hardware would (purely as an example) be optimised
> to produce OpenCL-compliant results during "3D GPU Platform mode", and as
> such would need less gates to do so. HOWEVER, for when that exact same
> hardware was used in the GNU libm library, it would set "UNIX Platform FP
> hardware mode", and consequently produce results that were accurate to UNIX
> Platform requirements (whatever was decided - IEEE754, 0.5 ULP precision,
> etc. etc. whatever it was).
> in this "more accurate" mode, the latency would be increased... *and we
> wouldn't care* [other implementors might], because it's not
> performance-critical: the switch is just to get "compliance".
> that would allow us to remain price-performance-watt competitive with
> other GPUs, yet also meet UNIX Platform requirements.
> something like that?
I do think that there should be an exact-rounding mode even if the UNIX
platform doesn't require that much accuracy, otherwise, HPC implementations
(or others who need exact rounding) will run into the same dilemma of
needing more instruction encodings again.
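
As a rough illustration of the mode-field idea discussed above, here is a
minimal Python sketch of a toy hart. Everything in it is an assumption made
up for illustration and not part of any proposal: the FpAccuracy encoding,
the Hart class and the FSIN operation are hypothetical names, and math.sin
only stands in for an accurate routine (it is not guaranteed to be
correctly rounded). The point is only that one instruction encoding can be
served by fast gates in one mode and by trap-and-emulate in another.

```python
import math
from enum import IntEnum


class FpAccuracy(IntEnum):
    """Hypothetical encoding of an accuracy field in the FP control CSR."""
    FAST = 0    # relaxed accuracy, e.g. a 3D GPU profile
    UNIX = 1    # libm-interoperable accuracy
    EXACT = 2   # exactly rounded results


class Hart:
    """Toy model of one hart executing a hypothetical FSIN instruction."""

    def __init__(self, hw_modes):
        self.hw_modes = frozenset(hw_modes)  # modes implemented in gates
        self.mode = FpAccuracy.FAST          # current value of the CSR field
        self.emulated = 0                    # number of trap-and-emulate calls

    def fsin(self, x):
        if self.mode in self.hw_modes:
            return self._hw_fsin(x)
        # Mode not implemented in hardware: trap and emulate in software.
        self.emulated += 1
        return math.sin(x)  # stand-in for an accurate software routine

    def _hw_fsin(self, x):
        if self.mode == FpAccuracy.FAST:
            # Short Taylor polynomial: cheap, only reasonable for small |x|.
            return x - x**3 / 6 + x**5 / 120
        return math.sin(x)  # stand-in for accurate hardware


hart = Hart(hw_modes={FpAccuracy.FAST})  # only the fast mode has gates
fast = hart.fsin(0.5)                    # served by the fast hardware path
hart.mode = FpAccuracy.EXACT             # software switches the mode field
exact = hart.fsin(0.5)                   # traps and is emulated in software
```

The calling code is the same in both modes; only the mode field changes,
which is the portability argument made earlier in this message.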