[libre-riscv-dev] [Bug 208] implement CORDIC in a general way sufficient to do transcendentals

Wed Apr 15 22:30:33 BST 2020

https://bugs.libre-soc.org/show_bug.cgi?id=208

--- Comment #13 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #11)
> (In reply to Luke Kenneth Casson Leighton from comment #7)

> > * from that point on treated the mantissa as an integer, *completely
> > ignoring*
> >   the exponent until
> > * finally at the end "re-normalising" the result.
> 
> Ah ok, this makes more sense. 

FP is odd: internally it can only be done in fixed-point.  however it is
actually relatively straightforward.  the only really tricky bits are the
rounding rules and their interactions, which can be seriously obtuse.
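
a minimal sketch of the "treat the mantissa as an integer, ignore the exponent,
re-normalise at the end" idea, in plain Python.  the helper names (to_fixed,
from_fixed, fp_mul_via_fixed) are made up for illustration and are not from the
project's code; multiplication stands in for whatever integer pipeline actually
sits in the middle.

```python
import math

def to_fixed(value, frac_bits=53):
    """Split a float into (integer mantissa, exponent): value == mant * 2**exp."""
    m, e = math.frexp(value)            # value == m * 2**e, with 0.5 <= |m| < 1
    mant = int(m * (1 << frac_bits))    # exact: m has at most 53 significant bits
    return mant, e - frac_bits

def from_fixed(mant, exp):
    """Re-normalise: rebuild a float from an integer mantissa and an exponent."""
    # int -> float conversion rounds to nearest, then ldexp applies the exponent
    return math.ldexp(mant, exp)

def fp_mul_via_fixed(a, b):
    """Multiply two floats using only integer work on the mantissas."""
    ma, ea = to_fixed(a)
    mb, eb = to_fixed(b)
    # the mantissas are multiplied as plain integers; the exponents are only
    # added together and are otherwise ignored until the final step
    return from_fixed(ma * mb, ea + eb)

print(fp_mul_via_fixed(1.5, 2.25))
```

the rounding happens in exactly one place here (the final conversion), which is
where the obtuse rules would have to be implemented in hardware.  overflow,
subnormals, NaN and infinity are deliberately not handled.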

> 
> > unfortunately, as we will (later) be using the CORDIC block for other
> > purposes,
> > not just SIN/COS but as an actual opcode, if it doesn't do "actual CORDIC"
> > we're stuffed, there.
> 
> Ah I see. So at some point we'll be wanting to use this for general
> rotations, hyperbolic sin/cos, and such? 

yes.  a general purpose CORDIC engine with all possible modes (i think there
are at least 3: linear, circular (rotate) and hyperbolic) and to make an
instruction that can access aaall those modes...

...yet for sin, cos, log etc use that exact same engine by setting some of
those parameters manually.
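
as a rough illustration of "one engine, several modes", here is a sketch of the
textbook unified CORDIC iteration in Python.  it uses floats rather than
fixed-point, covers rotation only (no vectoring), and the names (MODES,
shift_sequence, cordic_rotate) are assumptions for the example, not the
project's API.  the hyperbolic shift sequence starts at 1 and repeats shifts
4, 13, 40, ... which is the usual requirement for convergence.

```python
import math

# coordinate-system selector m: +1 circular, 0 linear, -1 hyperbolic
MODES = {"circular": 1, "linear": 0, "hyperbolic": -1}

def shift_sequence(mode, n):
    """Shift amounts per iteration; hyperbolic repeats 4, 13, 40, ..."""
    if mode != "hyperbolic":
        return list(range(n))
    shifts, i, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(i)
        if i == repeat:
            shifts.append(i)          # repeated iteration
            repeat = 3 * repeat + 1
        i += 1
    return shifts[:n]

def cordic_rotate(x, y, z, mode="circular", iterations=40):
    """Drive z towards zero; returns (x, y, z, gain). Divide x, y by gain."""
    m = MODES[mode]
    gain = 1.0
    for i in shift_sequence(mode, iterations):
        t = 2.0 ** -i
        if m == 1:
            alpha = math.atan(t)      # circular angle table entry
        elif m == 0:
            alpha = t                 # linear "angle" is the shift itself
        else:
            alpha = math.atanh(t)     # hyperbolic angle table entry
        d = 1.0 if z >= 0 else -1.0
        x, y, z = x - m * d * y * t, y + d * x * t, z - d * alpha
        gain *= math.sqrt(1.0 + m * t * t)
    return x, y, z, gain

x, y, _, k = cordic_rotate(1.0, 0.0, 0.5, "circular")
print(x / k, y / k)                   # approximately cos(0.5), sin(0.5)
```

with the same loop, "linear" with y=0 gives y = x*z (a multiply), and
"hyperbolic" gives cosh/sinh, from which exp and log can be derived.  the
inputs have to stay inside each mode's convergence range (roughly |z| < 1.74
circular, |z| < 2 linear as written, |z| < 1.11 hyperbolic).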

> > 
> > i suspect that it might be possible, as the size of the ATAN2 constants
> > gets smaller, to "shift" the result up slightly.
> > 
> > this we can do - test out - right now long before trying to move to FP.
> > each CordicStage can take an extra argument which specifies that dx, dy and
> > dz should be shifted by N extra bits.
> 
> I think I sort of understand - you want to scale up dx, dy (and presumably x
> and y) when they're small so that they have more bits of precision. 

correct.

however we cannot lose sight of how *many* shifts occurred (in each stage and
cumulatively) because each shift would be a doubling (halving?) of the ultimate
answer.

i honestly do not know if it is a good idea to have dynamic shifting or not. it
would mean testing both x and y, but the problem is that x might get small
whilst y gets large.

keeping track of how much shifting has occurred: that is actually an exponent!
you double the mantissa (shift x/y up) however you need to *subtract* one from
the exponent when doing so.

and because x can get large while y gets small and vice versa now you would
need *two* exponents. worse than that, you need 2 copies of z because one z
needs likewise shifting down and the other up.  too complicated.
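
a small sketch of why the shift count is an exponent, and why x and y end up
needing separate ones.  normalise() is a made-up helper for illustration, and
the 16-bit width is an arbitrary assumption.

```python
def normalise(mant, exp, width=16):
    """Shift mant up until it fills a signed `width`-bit word.

    Every left shift doubles mant, so exp is decremented to compensate:
    mant * 2**exp is unchanged by the call.
    """
    if mant == 0:
        return mant, exp
    top = 1 << (width - 2)            # highest magnitude bit of a signed word
    while abs(mant) < top:
        mant <<= 1
        exp -= 1
    return mant, exp

# x has become small, y is still large: they need different shift counts
x_mant, x_exp = normalise(3, 0)
y_mant, y_exp = normalise(12000, 0)
print(x_mant, x_exp)
print(y_mant, y_exp)
```

once x and y carry different exponents, the cross terms in each CORDIC stage
(x uses shifted y, y uses shifted x) have to be re-aligned before they can be
added, which is the complication described above.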

hm.

i'm therefore tempted to suggest dropping one of x or y and setting a mode to
decide whether to compute a sin or cos result (but not both).

see how it goes.

> One thing that occurred to me - since power (and I assume riscv) don't
> include any sin/cos opcodes but instead use a function in libc to calculate
> them, the cordic would only be used by the GPU, right?

well, if we propose opcodes for fpsin and fpcos and they are accepted by the
OpenPOWER Foundation for a future version of POWER then yes, some hardware
would implement the opcodes whilst others would emulate them using libc.  or
libm.

> If that's the case,
> do sin/cos necessarily need to be as accurate as the result from libc?

we need to be Vulkan compliant, however i think that as long as it does not
use massive amounts of power we can do IEEE754 compliance.

if you mean testing *against* libc (libm), i honestly do not know if libm is
fully IEEE754 compliant and that's why i suggested testing against python
bigfloat for now.
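
a sketch of what "testing against a high-precision reference" could look like.
to stay self-contained it does NOT use bigfloat: an exact-rational Taylor
series from the standard library's fractions module stands in as the reference,
and that substitution is an assumption of this example.  the function names are
hypothetical, and math.ulp needs Python 3.9 or later.

```python
import math
from fractions import Fraction

def sin_reference(x, terms=30):
    """sin(x) as an exact rational via Taylor series (x taken as exact)."""
    xr = Fraction(x)
    term, total = xr, xr
    for k in range(1, terms):
        # next term: multiply by -x^2 / ((2k)(2k+1))
        term *= -xr * xr / ((2 * k) * (2 * k + 1))
        total += term
    return total

def ulp_error(candidate, x):
    """Distance of candidate from sin(x), measured in units in the last place."""
    ref = sin_reference(x)
    return abs(Fraction(candidate) - ref) / Fraction(math.ulp(float(ref)))

# how far is the platform libm from the reference at one sample point?
print(float(ulp_error(math.sin(0.5), 0.5)))
```

a hardware result would be passed in as `candidate` in place of math.sin.  the
series is only sensible for small |x|; a real test would want argument
reduction and many more sample points.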

later we will definitely need to use jacob's algorithmic library, to make sure
we have full compliance at the bit level.
