[libre-riscv-dev] [isa-dev] Re: FP transcendentals (trigonometry, root/exp/log) proposal
programmerjake at gmail.com
Fri Sep 13 21:55:51 BST 2019
I think it may be worthwhile to have separate Ztrans extension names to
indicate the levels of accuracy that are implemented, allowing a
low-precision implementation of all instructions outside of F and D while
having full-precision implementations of F and D for code compatibility.
Note that Vulkan requires full IEEE 754 precision for all F/D instructions
except for fdiv and fsqrt.
fdiv and fsqrt are easy enough to implement in full precision using the
iterative shift-add/shift-sub algorithms, which take up about as much space
as a few adders and shift registers and can be shared with the integer
divider, that I think it may be better to just require full precision for
F/D. There can be a separate slow iterative div/sqrt unit if faster
low-precision fdiv/fsqrt are wanted in the main ALUs. The iterative
div/sqrt HW would take up much less space than even a multiplier (unless
multiplication is also iterative, in which case it can also share HW with
the div/sqrt unit).
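
To make the "few adders and shift registers" point concrete, here is a rough software sketch of the restoring shift-sub loops for unsigned integer division and square root. The function names and the `bits` parameter are hypothetical, chosen only for illustration; this is not taken from any existing divider design, and real fdiv/fsqrt hardware would run the same kind of loop on the mantissas with extra bits for rounding.

```python
from typing import Tuple


def shift_sub_div(dividend: int, divisor: int, bits: int = 32) -> Tuple[int, int]:
    """Restoring division: one shift, one compare/subtract per quotient bit.

    Assumes 0 <= dividend < 2**bits and divisor > 0 (illustrative only).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Subtract only if it does not go negative ("restoring").
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


def shift_sub_sqrt(value: int, bits: int = 32) -> Tuple[int, int]:
    """Digit-by-digit square root: one result bit per two input bits.

    Assumes 0 <= value < 2**bits and that bits is even (illustrative only).
    Returns (root, remainder) with root*root + remainder == value.
    """
    remainder = 0
    root = 0
    for i in range(bits - 2, -1, -2):
        # Bring down the next two bits of the input.
        remainder = (remainder << 2) | ((value >> i) & 3)
        # Trial subtrahend is 4*root + 1, built from shifts only.
        trial = (root << 2) | 1
        root <<= 1
        if remainder >= trial:
            remainder -= trial
            root |= 1
    return root, remainder
```

Both loops need only a shifter, a comparator/subtractor and a couple of registers per step, which is why the two operations can plausibly share one unit.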
I would expect there to be a fast HW multiplier even on micropower GPUs,
because a large proportion of the operations need multiplication, so you
could get an overall speedup of several hundred percent over an iterative
OpenCL's accuracy requirements are similar to Vulkan's -- full precision
for neg/abs/add/sub/mul/muladd and reduced requirements for everything else.