[libre-riscv-dev] [Bug 208] implement CORDIC in a general way sufficient to do transcendentals

Wed Apr 15 20:14:07 BST 2020

https://bugs.libre-soc.org/show_bug.cgi?id=208

--- Comment #7 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #6)
> (In reply to Luke Kenneth Casson Leighton from comment #5)
> > ok so would you like to do the next step? which is, yep you guessed it:
> > IEEE754 FP sin and cos :)
> 
> I'm not really sure how to do this though. I know you did a similar thing on
> the divider, did you handle it as a float the entire way through or convert
> it to integer... somehow?

integer, and kept things at the same exponent.  i.e. we:

* adjusted the relative range of both numbers a and b so that they
  had the same exponent (subtract 1 from exp and shift the mantissa to
  match, repeating until exp(a)==exp(b))
* kept enough bits (3x the IEEE754 length in some places) at all times
  so that there would be no loss of precision
* from that point on treated the mantissa as an integer, *completely ignoring*
  the exponent, until
* finally, at the end, "re-normalised" the result.

> Doing it as a float means that the adders/subtractors need to be full
> floating point adders, like those in fpadd/, right? 

actually fpadd simply does the same trick above as well.  here's the original
jon dawson verilog code that we started from:

  https://github.com/dawsonjon/fpu/blob/master/adder/adder.v#L174

you can see there are no "actual" FPadds: it's literally "line up the
exponents, do an integer add, then adjust/normalise the result".
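
as a rough illustration of that "line up, integer add, normalise" sequence,
here is a minimal python sketch.  it is not the jon dawson verilog or our
fpadd code: the function names (fp32_fields, normalise, add_via_integers)
and the guard parameter are made up for this example, it only handles
normal numbers, and it truncates rather than rounds.

```python
import struct

def fp32_fields(f):
    """Split a value (rounded to FP32) into sign, unbiased exponent, mantissa."""
    bits = struct.unpack(">I", struct.pack(">f", f))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    # normal numbers only in this sketch: restore the hidden leading 1
    return sign, exp - 127, frac | (1 << 23)

def normalise(total, exp, width):
    """Shift until the leading 1 sits at bit `width`, adjusting exp to match."""
    if total == 0:
        return 0.0
    negative = total < 0
    m = abs(total)
    while m >= (1 << (width + 1)):
        m >>= 1
        exp += 1
    while m < (1 << width):
        m <<= 1
        exp -= 1
    value = m * 2.0 ** (exp - width)
    return -value if negative else value

def add_via_integers(a, b, guard=3):
    """Add two FP32 values using only integer operations on the mantissas."""
    sa, ea, ma = fp32_fields(a)
    sb, eb, mb = fp32_fields(b)
    # assumption: a few guard bits below the mantissa (no rounding done here)
    ma <<= guard
    mb <<= guard
    # line up the exponents: the smaller-exponent mantissa moves right
    if ea < eb:
        ma >>= (eb - ea)
        exp = eb
    else:
        mb >>= (ea - eb)
        exp = ea
    # from here on the exponent is ignored: this is a plain integer add
    total = (-ma if sa else ma) + (-mb if sb else mb)
    # adjust/normalise the result at the very end
    return normalise(total, exp, 23 + guard)

print(add_via_integers(1.5, 2.25))   # 3.75
print(add_via_integers(1.0, -0.5))   # 0.5
```

the only place the exponent matters is before the add (alignment) and after
it (normalisation); everything in between is integer arithmetic.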

> For converting the inputs to integers, it seems like there would be issues
> as the input angle approached 0 (because floats give more resolution there).

it's ok to use double (or even triple) the number of bits for the
fixed-point integer computation.  so where the mantissa for FP32 is 23 bits,
it's ok to use 23*3 = 69 bits (plus a few more for rounding).
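
to put a number on that, here is a small python check of how far down a
69-bit fixed-point value can follow an FP32 input before mantissa bits fall
off the bottom.  the helper names (to_fixed, is_exact) are invented for this
example, and putting all 69 bits below the binary point is an assumption
made purely for illustration.

```python
import struct
from fractions import Fraction

def to_fixed(x, frac_bits):
    """FP32-rounded x as an integer scaled by 2**frac_bits (truncating)."""
    x32 = struct.unpack(">f", struct.pack(">f", x))[0]
    return int(Fraction(x32) * (1 << frac_bits))

def is_exact(x, frac_bits):
    """True if no mantissa bits were lost in the conversion."""
    x32 = struct.unpack(">f", struct.pack(">f", x))[0]
    return Fraction(to_fixed(x, frac_bits), 1 << frac_bits) == Fraction(x32)

# a small angle with both the top and bottom mantissa bits set
small = (1 + 2**-23) * 2**-40

print(is_exact(small, 23))      # False
print(is_exact(small, 23 * 3))  # True
```

by this count an input with an exponent down to about -46 still keeps its
whole FP32 mantissa (lowest bit at 2^-69); below that, bits start to drop off.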

> Cole emailed me a paper (which I linked to in the resources page) that
> contained a hybrid cordic that switched to a taylor series approximation
> when the input became small enough. This seems like it might work reasonably
> well for handling the small input case. 

unfortunately, as we will (later) be using the CORDIC block for other purposes,
not just SIN/COS but as an actual opcode, if it doesn't do "actual CORDIC"
we're stuffed, there.

i suspect that it might be possible, as the ATAN2 constants (the per-stage
atan(2^-i) values) get smaller, to "shift" the result up slightly.

this we can test out right now, long before trying to move to FP.
each CordicStage can take an extra argument which specifies that dx, dy and
dz should be shifted by N extra bits.

the transfer of input to output needs to take into consideration that the
next stage is going to be shifted by N.

in this way, smaller numbers don't end up losing quite so many bits.
the only thing we have to make sure of is that, throughout the whole
pipeline, there are enough extra bits such that the shifts do *not* end up
hitting the ceiling of the maximum range of x (or y).
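
one possible reading of the per-stage shift idea, as a plain-python integer
model rather than the actual CordicStage code.  everything here is
hypothetical: the function name, the extra_shifts argument, and the
simplification that only z (and its atan table entry) is shifted up, not
dx and dy.  the assert is the "don't hit the ceiling" check, applied to z
in this sketch.

```python
import math

def cordic_sincos(angle, frac_bits=24, stages=24, extra_shifts=None):
    """Rotation-mode CORDIC on integers, for |angle| <= pi/2.

    extra_shifts[i] (hypothetical parameter) is how many bits z is shifted
    up when it is handed from stage i to stage i+1.
    """
    if extra_shifts is None:
        extra_shifts = [0] * stages
    one = 1 << frac_bits
    ceiling = 1 << (frac_bits + 1)   # assumed range of z: one integer bit
    # pre-scale x by 1/gain so the outputs come out at unit magnitude
    gain = 1.0
    for i in range(stages):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x = round(one / gain)
    y = 0
    z = round(angle * one)
    z_shift = 0                      # extra fractional bits carried by z
    for i in range(stages):
        # atan table entry generated at z's current (shifted) scale
        atan_i = round(math.atan(2.0 ** -i) * (1 << (frac_bits + z_shift)))
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - atan_i
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + atan_i
        # the next stage expects z shifted up by this stage's extra bits
        z <<= extra_shifts[i]
        z_shift += extra_shifts[i]
        assert abs(z) < ceiling, "shift hit the ceiling of z's range"
    return y / one, x / one

# no shift after stage 0, then one extra bit per stage
shifts = [0] + [1] * 23
s, c = cordic_sincos(0.5, extra_shifts=shifts)
print(abs(s - math.sin(0.5)) < 1e-4, abs(c - math.cos(0.5)) < 1e-4)
```

with one extra bit per stage the late atan entries keep roughly as many
significant bits as the early ones, instead of shrinking towards a couple of
bits.  shifting dx and dy as well would need x and y to carry spare low
bits, which this sketch does not attempt.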
