[libre-riscv-dev] [isa-dev] Re: FP transcendentals (trigonometry, root/exp/log) proposal
MitchAlsup
MitchAlsup at aol.com
Tue Aug 13 02:05:23 BST 2019
On Monday, August 12, 2019 at 7:11:21 PM UTC-5, lkcl wrote:
>
> On Tuesday, August 13, 2019 at 1:52:16 AM UTC+8, MitchAlsup wrote:
> > On Sunday, August 11, 2019 at 10:20:28 PM UTC-5, lkcl wrote:
> > https://libre-riscv.org/ztrans_proposal/#khronos_equiv
> >
> >
> > I would like to point out that the general implementations of ATAN2 do a
> > bunch of special case checks and then simply call ATAN.
>
> Appreciated. I recorded these insights on the page (to move offpage, to
> discussion, at a later point).
>
> > The bottom line is that I think you are choosing to make too many of
> > these into OpCodes, making the hardware function/calculation unit
> > (and sequencer) more complicated than necessary.
>
> We do have to be careful to ensure that multiple disparate Platform
> implementors are happy, and that tends to suggest that the extension
> remains close to a RISCV ISA paradigm.
>
> >
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> > I might suggest that if there were a way for a calculation to be
> > performed and the result of that calculation chained to a subsequent
> > calculation such that the precision of the result-becomes-operand is
> > wider than what will fit in a register, then you can dramatically
> > reduce the count of instructions in this category while retaining
> > acceptable accuracy:
> >
> >
> > z = x / y
> > can be calculated as::
> > z = x × (1/y)
> >
> >
> > Where 1/y has about 26-to-32 bits of fraction. No, it's not IEEE
> > 754-2008 accurate, but GPUs want speed and 1/y is fully pipelined
> > (F32) while x/y cannot be (at reasonable area).
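
The accuracy trade-off in z = x × (1/y) can be illustrated with a minimal
sketch. It is not taken from the proposal: the helper names (f32,
div_exact, div_via_recip) and the extra_bits flag are hypothetical, and a
Python binary64 float merely stands in for a reciprocal carried with more
fraction bits than an F32 register holds.

```python
import struct


def f32(v: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def div_exact(x: float, y: float) -> float:
    """Reference: divide two binary32 operands, round the quotient to binary32."""
    return f32(f32(x) / f32(y))


def div_via_recip(x: float, y: float, extra_bits: bool = True) -> float:
    """Compute x * (1/y) instead of x / y.

    extra_bits=True  keeps 1/y in binary64 (assumption: models a
                     wider-than-register intermediate result).
    extra_bits=False rounds 1/y to binary32 before the multiply, as if the
                     reciprocal had to pass through an F32 register.
    """
    x, y = f32(x), f32(y)
    r = 1.0 / y
    if not extra_bits:
        r = f32(r)
    return f32(x * r)


if __name__ == "__main__":
    x, y = 12582914.0, 3.0
    print("x / y              :", div_exact(x, y))
    print("x * (1/y), wide    :", div_via_recip(x, y, extra_bits=True))
    print("x * (1/y), F32 only:", div_via_recip(x, y, extra_bits=False))
```

With x = 12582914 and y = 3, the F32-only path rounds to 4194305.0, whereas
the true divide and the wide-intermediate path both give 4194304.5.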
>
> Sigh yehhh this is... ok let me put it this way. If we were doing a from
> scratch dedicated GPU ISA (along the lines of proprietary GPUs, with
> associated software RPC / IPC Marshalling system between the completely
> disparate ISAs) I would in absolutely no way start from a RISC-V base.
>
> That's not because the RISCV Foundation is a pain to deal with, it's
> *technical* reasons, namely that it is a retrofit into an ISA that was
> designed for a completely different market than 3D.
>
> > Given that one has the ability to carry (and process) more fraction
> > bits, one can then do high precision multiplies of π or other
> > transcendental radixes.
> >
> >
> > And GPUs have been doing this almost since the dawn of 3D.
>
> Appreciated. Background, first. Can skip if short of time
>
> ---
>
> Basically what you are recommending is a microcode ISA.
The alternative is to designate a few OpCodes in a sequence as a single
result producer, with the intermediate result kept larger than register
width and fed back to the next instruction in the sequence (preserving
accuracy).
> This is something that is on the table as an option (an idea floated by
> Atif from Pixilica), and one that we are sort-of looking to put into the
> hardware of the Libre RISCV ALUs, by having a long "opcode" that activates
> *parts* of the pipeline (pre and post FP normalisation and special cases)
> so that it can be shared between INT and FP.
>
> Also, 64 bit will be performed by "recycling" intermediary results back
> through the pipeline, again under the control of that microcode-like long
> "opcode". It's a FSM with automatic operand forwarding in other words.
>
> What you describe - the special cases that turn ATAN2 into ATAN - could be
> performed conveniently within the "recycling" paradigm by carrying out the
> special cases as one "cycle", the DIV as another (or the mul and the 1/x as
> two) and finally the FSM hands the intermediate over to ATAN.
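
The sequence described above (special cases, then the divide, then ATAN)
can be sketched in software. This is only an illustrative sketch, not the
Libre RISCV pipeline and not any particular libm: the function name
atan2_via_atan is hypothetical, and signed-zero results are deliberately
simplified.

```python
import math


def atan2_via_atan(y: float, x: float) -> float:
    """Sketch of ATAN2 built from special-case checks, one divide and ATAN."""
    # Stage 1: special cases.
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0:
        if y == 0.0:
            return 0.0  # simplification: signed zeros are not distinguished
        return math.copysign(math.pi / 2, y)
    if math.isinf(x) and math.isinf(y):
        # inf/inf would be NaN; reduce to unit operands with the same signs.
        x, y = math.copysign(1.0, x), math.copysign(1.0, y)

    # Stage 2: the divide (or a multiply by 1/x).
    t = y / x

    # Stage 3: hand the intermediate to ATAN, then fix up the quadrant.
    a = math.atan(t)
    if x > 0.0:
        return a
    return a + math.pi if y >= 0.0 else a - math.pi


if __name__ == "__main__":
    for y, x in [(1.0, 1.0), (1.0, -1.0), (-1.0, -1.0), (1.0, 0.0)]:
        print(y, x, atan2_via_atan(y, x), math.atan2(y, x))
```

In a hardware version the quotient t would be the intermediate that is
kept wider than register width between stages; here it is an ordinary
Python float.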
>
> The nice thing about this microarchitecture is that the intermediate data
> can be of any width, as well as contain any number of intermediate
> operands.
>
> My feeling is - and this is not ruling out the possibility - that
> microcode ops, exposed to the actual ISA level - would not only need a lot
> of thought, they'd need special attention to be paid to the register file
> (no longer 32 bits, it would be 36 or some other arbitrary width sufficient
> to store the intermediary results, efficiently), and more, as well.
>
> Complicated, and there is also concern about deviating significantly
> from RISCV's ISA. It might even *increase* the number of opcodes, due to
> fragmentation of specialist micro operations (such as ATAN2 special cases).
>
> If those special cases were done as RISCV operations, that's a *lot* of
> instructions to trade off against simply having ATAN2.
>
> Overall then I think what I am talking myself into is support for the
> pseudo-microcode-like FSM engine within our design, with associated
> "feedback" back to the beginning of the pipeline(s). It is not a full
> blown microcode design, yet has a similar effect, just without needing to
> expose microcode details to the actual ISA.
>
> Other implementors may choose to do things differently, particularly those
> that stick to the UNIX Platform Accuracy profile.
>
> So that is background.
>
> ---
>
> We therefore I think have a case for bringing back ATAN and including
> ATAN2.
>
> The reason is that whilst a microcode-like GPU-centric platform would do
> ATAN2 in terms of ATAN, a UNIX-centric platform would do it the other way
> round.
>
> (that is the hypothesis, to be evaluated for correctness. feedback
> requested).
>
> This is because we cannot compromise or prioritise one platform's
> speed/accuracy over another. That is not reasonable or desirable, to
> penalise one implementor over another.
>
> Thus, all implementors, to keep interoperability, must have both
> opcodes and may choose, at the architectural and routing level, which one
> to implement in terms of the other.
>
> Allowing implementors to choose to add either opcode and let traps sort it
> out leaves an uncertainty in the software developer's mind: they cannot
> trust the hardware, available from many vendors, to be performant right
> across the board.
>
> Standards are a pig.
>
> L.
>