[libre-riscv-dev] [Bug 208] implement CORDIC in a general way sufficient to do transcendentals

bugzilla-daemon at libre-soc.org
Tue May 5 17:03:05 BST 2020


--- Comment #49 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #39)
> (In reply to Luke Kenneth Casson Leighton from comment #37)
> > ah excellent!  you did a FSM-based version as well, that's superb.
> > if the pipelined version turns out to be insanely gate-hungry, or
> > just not that important, we can use it.
> Ooof, yeah. Running synthesis on the 32 bit pipelined version (4 cordic
> rounds per pipeline stage) gives somewhere around 77k gates.

uh-huhn :)

so you see why i thought it might be a teeeny tiny bit of a good idea
to make it as general-purpose as possible?  that way the gates get used
for a whole batch of operations, not just SIN/COS.
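
a rough sketch of what "general-purpose" could mean here, in plain
floating-point python rather than nmigen: the textbook unified CORDIC
iteration, where one mode parameter picks circular, linear or hyperbolic
and one flag picks rotation or vectoring.  this is not Michael's code:
the name `cordic`, its parameters and the default of 32 rounds are all
made up for illustration, and real hardware would use fixed-point shifts
and a precomputed table instead of calling `math.atan` per round.

```python
import math


def cordic(x, y, z, m=1, vectoring=False, rounds=32):
    """Unified CORDIC sketch (hypothetical helper, floating point).

    m = 1: circular, m = 0: linear, m = -1: hyperbolic.
    Rotation mode drives z to zero; vectoring mode drives y to zero
    (vectoring assumes x > 0).  Returns (x, y, z, gain), where gain is
    the accumulated scale factor the caller must divide out.
    """
    if m == -1:
        # Hyperbolic mode starts at shift 1 and must repeat shifts
        # 4, 13, 40, ... to guarantee convergence.
        shifts, i, repeat = [], 1, 4
        while len(shifts) < rounds:
            shifts.append(i)
            if i == repeat and len(shifts) < rounds:
                shifts.append(i)
                repeat = 3 * repeat + 1
            i += 1
    else:
        shifts = list(range(rounds))

    gain = 1.0
    for i in shifts:
        t = 2.0 ** -i  # a right-shift by i in fixed-point hardware
        if m == 1:
            e = math.atan(t)
        elif m == -1:
            e = math.atanh(t)
        else:
            e = t
        if vectoring:
            d = -1.0 if y >= 0 else 1.0  # push y towards zero
        else:
            d = 1.0 if z >= 0 else -1.0  # push z towards zero
        x, y, z = x - m * d * y * t, y + d * x * t, z - d * e
        gain *= math.sqrt(1.0 + m * t * t)
    return x, y, z, gain


# circular rotation: cos and sin of 0.5 radians
cx, cy, _, cg = cordic(1.0, 0.0, 0.5)
cos_val, sin_val = cx / cg, cy / cg

# hyperbolic rotation: cosh + sinh gives exp(0.5)
hx, hy, _, hg = cordic(1.0, 0.0, 0.5, m=-1)
exp_val = (hx + hy) / hg

# linear rotation: y accumulates x * z, i.e. 3.0 * 0.75
_, prod_val, _, _ = cordic(3.0, 0.0, 0.75, m=0)

# circular vectoring: z accumulates atan(y / x)
_, _, atan_val, _ = cordic(1.0, 1.0, 0.0, vectoring=True)
```

with this shape, sin/cos come from circular rotation, atan and magnitude
from circular vectoring, multiply from linear rotation, and sinh/cosh
(hence exp) from hyperbolic rotation, all through the same add/shift
datapath.  the input ranges are limited (roughly |z| < 1.74 for circular
and |z| < 1.11 for hyperbolic rotation), so range reduction would still
be needed around it.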

and why most GPUs have a massive concerted effort to shrink transcendentals
as much as possible.

Mitch's patents (bought by Samsung) actually use... what is it...
they use cubic equations, which is extremely efficient and, more than
that, accurate to better than 1.0 ULP (unit in last place) for the
majority of FP32 operations.

he gets about a quarter of the gate count of other SIN/COS hard macros.

if the gate count is too massive for FP64, we may *have* to use the
FSM-based version for FP64.  the priority as a GPU is on FP32.
