[libre-riscv-dev] [isa-dev] Re: FP transcendentals (trigonometry, root/exp/log) proposal
luke.leighton at gmail.com
Mon Sep 9 00:45:45 BST 2019
On Monday, September 9, 2019 at 1:56:09 AM UTC+8, Allen Baum wrote:
> It doesn’t matter whether someone else implements them or not. What matters is their cost (area, power, design time, validation) and their benefit (primarily their effect on performance).
Indeed. I do appreciate that. I did a search of the academic literature for DIV units a couple of months back and am now being bombarded every few days with different implementations, each with their own unique take on optimisations.
> E.g. tan can be replaced by sin/cos, which is cheap (if you already have sin and cos) and possibly only a little slower. The benefit may be totally trivial, and the cost (delay in getting the product out, extra architects, validation, implementors) may be substantial.
Yes. You can see from the differences in those commercial GPUs, they each made a different call. That's a clue.
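The tan-as-sin/cos point can be sketched in a few lines. This is purely illustrative (Python rather than RTL; a hardware implementation would fuse the divide), but it shows both the saving and the catch:

```python
import math

def tan_via_sin_cos(x):
    """Compute tan(x) as sin(x)/cos(x), reusing existing sin/cos
    hardware rather than a dedicated tan unit.

    The catch: near the poles, where cos(x) approaches zero, the
    division amplifies the rounding error already present in the
    sin and cos results -- so the result is not correctly rounded
    even if sin and cos individually are.
    """
    return math.sin(x) / math.cos(x)
```

For most of the domain the two agree to within a few ULPs, which is exactly the kind of trade-off each vendor weighed differently.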
Here's the thing: the requirements in each market are so radically different that it is impossible to call it one way or the other.
For example, the basis of the Pixilica SIGGRAPH 2019 BoF was that commercial GPUs are focussing so heavily on mass-volume appeal (profit being the driver) that specialist long-tail markets are left underserved, interestingly leaving a substantial business opportunity for something "different".
At the meetup a few days ago, several very experienced engineers endorsed this alternative approach.
Think Silicon specialise in ultra-low-power GPUs, typically for smartwatches, where idle power has to be in microwatts and GPU usage is measured in milliwatts. Accuracy here is nearly irrelevant.
Then we also have HPC, where power consumption is less of a priority, and accuracy is paramount.
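To make that accuracy gap concrete, here is a minimal sketch (my own illustration, not any vendor's actual code) of the kind of cheap polynomial a low-power GPU might use in place of a correctly-rounded library sin:

```python
import math

def sin_cheap(x):
    """Degree-5 Taylor approximation of sin on [-pi/2, pi/2],
    evaluated in Horner form: x - x^3/6 + x^5/120.

    Worst-case error is roughly 4.5e-3 at the interval ends --
    plenty for a smartwatch display pipeline, nowhere near the
    IEEE 754 accuracy an HPC workload would demand.
    """
    x2 = x * x
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0))
```

The hardware for the two markets diverges in exactly this way: a few multiply-adds versus full argument reduction and a correctly-rounded core.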
Performing any kind of quantitative analysis to cover all of these markets is not just pointless: it should be blindingly obvious that it is dangerously misleading.
Any analysis claiming that what works for *one* market, for *one* stakeholder, is The Best Way Forward automatically and inherently penalises the others.
Not to mention, as Mitch's patent shows, algorithms can in fact be developed that cover a huge range of transcendentals with the same hardware.
The problem being: that very same hardware, if targeted specifically at the less-accurate GPU market, is no longer suited to IEEE 754 high-accuracy markets.
The best that quantitative analysis can do is help work out which subsets of the transcendentals are likely to suit which common sets of markets.
And even there, with the continuing advancement of RTL and algorithms, and the sheer overwhelming diversity of existing ones, it is still both a massive task and near-pointless at the same time.
In other areas of computer science, such as BitManip, where there is huge diversity and a massive range of algorithms, it absolutely makes sense.
Transcendentals are used for 3D and numerical computation, and... errr... that's the lot.
The requirements are to allow these multiple disparate vendors from four possible market areas (platforms: embedded, 3D embedded, UNIX, 3D UNIX), with radically different foci, to save money and development time through collaboration and shared APIs in both hardware and software.
Ultimately, with a bit of logical deductive reasoning based on the requirements, we can shortcut the entire process and conclude that the only way to satisfy the requirements is to include pretty much everything, and break it down into subsets in a similar fashion to BitManip.
I can only apologise that I am extremely fast at being able to make such time-saving assessments, and not being very good at explaining how they were arrived at. Which I appreciate does not help when it comes to a formal standard, where everything has to be fully justified in some way.
When someone actually responds with the requirements that I asked for over a month ago on what the official criteria are for standards, this will go a lot more smoothly.