[libre-riscv-dev] system call (sc) LEV "reserved field"

Luke Kenneth Casson Leighton lkcl at lkcl.net
Mon Jul 27 09:00:54 BST 2020


---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Mon, Jul 27, 2020 at 5:30 AM Benjamin Herrenschmidt
<benh at kernel.crashing.org> wrote:

> > although i have to ask why, for Embedded, they did not just recompile
> > the source code, customised for that enduser application.
>
> Because the distinction between "embedded" and "unix" is very blurry

benjamin: this is not good!  specifications should *never* be blurry!
i believe you may mean something different here, which warrants
investigation.  i think i know why, given that the typical markets for
embedded PowerISA cores are specialist areas (aerospace and so on).
QorIQ, for example, is marketed at routers and definitely crosses
over, yet is still termed "embedded".

> and in many areas obsolete.

ok let's look at the v3.1B document, page viii "Compliancy Subsets".

nowhere in there - even in the scalar fixed-point section - am i
seeing mention of what would be expected for "really resource
constrained, i.e. *truly* embedded" markets: being able to drop the
(very expensive, gate-wise) decode logic for catching illegal
instructions, illegal SPR access, and so on.

in SFS the drop in gate count from not needing an FPU is massive: an
FPU typically dwarfs the main core in size.  once the FPU is dropped,
then, relatively speaking, further savings such as cutting the gates
needed for illegal-instruction catching become significant (a 5% or
10% saving), and in mass-volume markets that's absolutely massive.

let us therefore define "resource-constrained embedded markets" as
"truly" embedded, rather than the blurry definition inspired by the
meme that goes by the moniker "IoT", which now includes 8+ watt
raspberry pi 4 devices that run so hot they need fan cooling.

whilst it comes with the "burden" of needing to snapshot and maintain
the full toolchain (which for customisation and custom extensions is a
hard requirement anyway), one of the key areas in which RISC-V has
been successful is the mass-volume *truly* embedded market.

the canonical example is Western Digital, who up until RISC-V had
their own "truly" embedded custom ISA, which has been hyper-efficient
for them, for use in SSDs, HDDs and USB Flash devices.  the reason
they wanted to go with RISC-V is to save on maintenance of their own
custom toolchain (bear in mind that with a custom ISA you need not
only a custom toolchain, you need a custom OS and custom applications
as well!  the cost to them of maintaining this would be enormous!)

here, if the binary size is large, it cuts into the usage allocation
of the on-board NAND and on-board RAM (because they don't have NAND
separate from the customer-allocated area, or separate RAM for the
caching of customers' data).

if the NAND and RAM allocation are reduced, that has a very real
detrimental impact on their sales and profitability, especially in
such a highly competitive market!

remember that for WD we are talking sales of billions of units, here.
"a few gates" multiplied by a billion sales can make the difference
between profit and loss.
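the arithmetic here is stark.  as a purely illustrative sketch (all
figures below are invented assumptions, *not* WD's actual numbers),
even a tiny per-unit die-cost saving compounds at that volume:

```python
# Purely illustrative: the gate count, per-gate cost and volume are
# invented assumptions, not Western Digital's actual figures.
gates_saved = 50_000        # gates dropped by omitting illegal-instruction decode
dollars_per_gate = 2e-8     # assumed silicon cost per gate, per unit
units = 1_000_000_000       # a billion units shipped

per_unit_saving = gates_saved * dollars_per_gate   # ~$0.001 per unit
total_saving = per_unit_saving * units
print(f"${total_saving:,.0f}")  # a seven-figure saving from "a few gates"
```

a tenth of a cent per unit looks like nothing in isolation; multiplied
across the production run it is the kind of margin that decides
profit or loss.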

so when converting to RISC-V from their own internal ISA they found
that binary size increased by (iirc) 20%.  they absolutely had to do
something about this and so set about analysing static instruction
allocation, and (i am projecting here, as this was 2016) they would by
now likely have used that research to create custom instruction
modifications similar to the VLE book, except targeted specifically
at their use-case.
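the kind of static-allocation analysis described above can be
sketched in a few lines.  this is a hypothetical illustration (the
disassembly sample and the objdump-style format are made up): count
opcode mnemonics in a listing, and the most frequent ones are the
best candidates for compressed or custom encodings.

```python
# Sketch of static instruction-frequency analysis over an
# objdump-style disassembly listing (the sample below is invented).
import re
from collections import Counter

# Matches lines like: "  100:\t94 21 ff f0 \tstwu    r1,-16(r1)"
LINE = re.compile(r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s+)+(\S+)")

def opcode_histogram(disassembly: str) -> Counter:
    counts = Counter()
    for line in disassembly.splitlines():
        m = LINE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = (
    "   100:\t94 21 ff f0 \tstwu    r1,-16(r1)\n"
    "   104:\t38 60 00 2a \tli      r3,42\n"
    "   108:\t38 60 00 00 \tli      r3,0\n"
)
print(opcode_histogram(sample).most_common(2))  # [('li', 2), ('stwu', 1)]
```

run over a real firmware image, the top of that histogram tells you
exactly which instructions repay a shorter custom encoding.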

afaik none of these custom ISA modifications, which result in
code-size reductions important to their profitability in a very real
way, have been made public (i have not checked or heard any news, so
this could be wrong).

this was all built on the starting premise of *not* having to have the
gates for illegal-instruction catching etc.: they had to go "custom
firmware, custom toolchain" anyway, and the RISC-V initiative provided
them with a huge cost saving on the toolchain and much more.

so they got to have their cake and eat it:

* cost savings on toolchain, kernel, os and bootloader maintenance (by
being able to ride off the back of RISC-V "official")
* cost savings from the "Embedded RISC-V Platform" already having made
decisions that significantly reduced gate count
* cost savings from further reductions by adding custom instructions
and customising the toolchain to match them.

*this* is *truly* embedded.



> One can compile a single image that is meant to run on a wide variety
> of system and even the "embedded" world wants that capability.

if we are talking about the existing (core) OpenPOWER Foundation
members subset embedded markets, served by their current product
offerings: yes.

if we are talking about the *full* (world-wide) definition of
"embedded" which includes "truly" embedded: absolutely not.

and i have to point out that if OpenPOWER's direction only takes into
consideration the former, it is *guaranteed* that Power will never
extend into, or see wide adoption in, the latter.  the cost-benefit
analysis comes up so short that no mass-volume product manufacturer
will consider it, and quite rightly so, as things stand.

this then is the challenge for OpenPOWER if it wishes to have a wider
reach: to adapt beyond the needs of the current members, whilst also -
very importantly - respecting the long-standing relationship,
contribution and needs *of* those current members at the same time.

it's a delicate balance to achieve.


> Either
> because they end up running some kind of "upstream" OS image, or
> because they don't want to maintain completely different SW images for
> all their products, etc...

indeed.  and this is perfectly reasonable, for what we've colloquially
termed the "blurry" embedded markets.  the cost savings of not having
to recompile or maintain custom packages etc. - these are enormous
savings, absolutely worthwhile pursuing.

unfortunately, if that expectation then propagates throughout the
entire OpenPOWER community as a "hard expectation" (even down to the
SFS Compliancy subset), it *automatically and inherently* excludes any
possibility of "truly" embedded vendors considering POWER, because
they are *prohibited* by the Compliancy Requirements from dropping the
gates that would make the product profitable!


> > however this also actually illustrates precisely why i mentioned that
> > for best results, a spec has to have different platform behaviour for
> > Embedded as completely separate and distinct from UNIX.
>
> This is not really true anymore.

thanks to the "IoT" meme and so on, the word "embedded" in general
circulation has become meaningless, yes.  i am not using the term
"Embedded" in that meaningless sense; i am using it in the original
sense used to describe the 8 and 16 bit processor markets (uprated to
32 and sometimes 64 bit).

Embedded in the sense of Arduinos, PICs, the ATMEL ATSAM3 series, the
ST Micro STM32F series, and so on: ARM Cortex M0 / M3 class parts.

not the "take the latest 64 bit high performance 2.5 ghz quad-core 8+
watt processor, slap it into an SBC form-factor and call it
'embedded'" definition of Embedded.

> For example, do you consider your cell
> phone or your TV "embedded" or "unix" ?

UNIX, without a shadow of a doubt.  Android is a UNIX Platform.  the
Android kernel *is* the linux kernel, and, if end-users are prepared
to put in some effort, all devices running Android - if there isn't
Treacherous DRM built into the boot sequence - can have their OS
entirely replaced by any GNU/Linux distro that's compatible with the
processor.

so no - those are *definitely* UNIX platform devices.

> > this on the basis that Embedded Markets are typically one-off
> > deployment, where the toolchain and all build source is "snapshotted"
> > at product release time, used internally and the source code and
> > toolchain almost never see the light of day.  mainline upgrades are
> > exceptionally rare.
>
> There are quite a few counter examples even in the "embedded" market.
> Especially when it comes to storage appliances. Some of these things
> can even run CentOS.

cool!

however, again: are we referring to "general morphed (blurry)"
definition of Embedded, or "truly" embedded?

> > so in that context, i am slightly confused to hear that for Freescale
> > *Embedded* processors that there is even a binary incompatibility
> > problem with lwsync *at all*.
> >
> > if you have time i'd love to hear more, what's the story there?
>
> I forgot the details, but a specific core variant from FSL screwed up
> the decode table and would trap on lwsync instead of ignoring the bit.

and, given the expectation to have binary interoperability even across
"general" (aka colloquially blurry) Embedded, that would matter (a
lot).

> > indeed... in this very special case that we have established, by it
> > effectively being a "hint", falls completely outside of what i am
> > concerned about
>
> In *most* cases trap + emulation leads to unusable performances though.

this is another sub-topic entirely.  the question one needs to ask is:
*why* is performance unusable?  *why* is the cost so high?

RISC-V (again: there are many good bits to RISC-V) provides
hardware-level partial decode of instruction sub-fields.  RA, RB, RS,
RT (or their RISC-V equivalents) and other sub-fields are all made
available via individual SPRs (CSRs, in RISC-V terminology) in the
"illegal instruction" trap, saving huge numbers of mask/shift
operations: this makes trap-and-emulate much faster and requires far
fewer instructions.
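to illustrate the mask/shift burden being saved, here is a sketch (in
Python, purely illustrative) of the field extraction a software trap
handler must perform on a Power ISA D-form instruction when the
hardware does *not* pre-decode the sub-fields.  with the RISC-V-style
assistance described above, these values would instead simply be read
out of trap-context registers.

```python
# Manual mask/shift decode of a Power ISA D-form instruction word
# (opcode | RT | RA | 16-bit immediate), using the Power ISA's
# big-endian bit numbering.  Every one of these shifts and masks is
# work a trap handler must do per emulated instruction when no
# hardware pre-decode is available.
def decode_d_form(insn: int) -> dict:
    return {
        "opcode": (insn >> 26) & 0x3F,   # bits 0-5
        "RT":     (insn >> 21) & 0x1F,   # bits 6-10
        "RA":     (insn >> 16) & 0x1F,   # bits 11-15
        "imm":    insn & 0xFFFF,         # bits 16-31
    }

# addi r3, r1, 16 encodes as 0x38610010
print(decode_d_form(0x38610010))
# {'opcode': 14, 'RT': 3, 'RA': 1, 'imm': 16}
```

multiply those few operations by every sub-field of every instruction
form, on every trap, and the performance gap between assisted and
unassisted trap-and-emulate becomes obvious.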

without such hardware-level assistance yes i can see that trap and
emulate would be considered completely unacceptable.

we've diverged quite a lot from the original topic :)  but i believe
we have also gained some insights into the very different needs of
different markets, which pull in completely polar-opposite directions
on the same ISA.

the summary is: the Power Spec is a valued and valuable reflection of
the needs of its community.  the question becomes: does that community
want to see OpenPOWER extend further?  if so, it needs to learn from
the analysis done by the RISC-V founders, who analysed *30 years* of
RISC processor architectures and amalgamated the best bits that they
thought would make a modern ISA *and eco-system* supporting a hugely
diverse range of markets, from tiny ("truly") embedded all the way up
to supercomputers.

l.


