[libre-riscv-dev] [OpenPOWER-HDL-Cores] system call (sc) LEV "reserved field"

Benjamin Herrenschmidt benh at kernel.crashing.org
Thu Jul 23 01:52:35 BST 2020


On Thu, 2020-07-23 at 00:41 +0100, Luke Kenneth Casson Leighton wrote:
> 
> because i did not expect that behaviour, because doing so (ignoring
> them) makes it impossible to trap and emulate.  (it becomes necessary
> to use JIT analysis)
> 
> so, when some bit is added in the future, an older processor (and the
> device it is in) basically has to be thrown into landfill.

Depends, it goes both ways. For example, ignoring reserved bits is what
allowed the addition of lwsync without throwing older processors into
the landfill: on processors that don't implement it, it automatically
"escalates" to a full sync (or should have... FSL did screw that up on
some cores).
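
To make the lwsync case concrete, here is a rough sketch (Python, purely
illustrative). The two instruction words are the usual "sync L"
encodings; the two decoder functions are hypothetical models I made up
for the example, not any real core, and the newer decoder lumps every
L value other than 1 together as a simplification.

```python
# Illustrative model only. SYNC/LWSYNC are the usual X-form "sync L"
# encodings; old_core_decode/new_core_decode are hypothetical helpers.
SYNC = 0x7C0004AC    # sync   (L = 0)
LWSYNC = 0x7C2004AC  # lwsync (L = 1), same opcode and extended opcode

L_SHIFT = 21
L_MASK = 0b11 << L_SHIFT               # the 2-bit L field
NOT_L_MASK = 0xFFFFFFFF & ~L_MASK      # everything except the L field


def old_core_decode(insn):
    """Older core: the L field is reserved and simply ignored."""
    if (insn & NOT_L_MASK) == SYNC:
        return "full sync"
    return "other"


def new_core_decode(insn):
    """Newer core: L = 1 selects the lighter barrier.

    Simplification: every other L value is treated as a full sync.
    """
    if (insn & NOT_L_MASK) == SYNC:
        l_field = (insn & L_MASK) >> L_SHIFT
        return "lwsync" if l_field == 1 else "full sync"
    return "other"
```

The older decoder gives lwsync a stronger barrier than it asked for,
which is slower but still correct, so no trap and no emulation is
needed.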
 
> if however reserved bits being set cause an exception, the "old"
> processor stands a chance of emulating the new behaviour (in
> software, even if that's slow), giving it a chance of keeping out of
> landfill for slightly longer.

Which is why powerpc tends not to "add bits" to instructions unless
ignoring them is a safe fallback.

> however it is not appropriate for all systems to raise exceptions on
> reserved bits: the cost of having the detection hardware (a full
> POWER9 decoder and also illegal/unsupported/reserved SPR detection)
> can be very high especially for resource and power constrained
> silicon or FPGAs.
> 
> (example: i know someone - yea, you Sam - who implemented RV64 to
> comply with the UNIX RISCV spec rather than the Embedded RISCV spec:
> the "CSR detection" just to support all the zeros and illegal CSRs
> took a whopping 15% of an ICE40 FPGA!)
> 
> in RISC-V they get this right, by having two separate Platforms:
> 
> * Embedded which is permitted to ignore reserved bits entirely
> 
> * UNIX, which definitely is not.
> 
> for Embedded, the vendor customises the firmware entirely, and binary
> interoperability as well as legacy software support is completely
> unimportant.
> 
> for UNIXen, interoperability and longterm stability we know very well
> is critical.
> 
> bottom line if it is correct that on the PowerISA UNIX Platform
> reserved bits can be ignored that is cause for some concern, where
> for Embedded it would be the other way round: cause for concern if
> the reserved bits could *not* be ignored.

Do you have more specific concerns here? I.e. actual examples where
this has been a cause of breakage in the past?

> > > however this is so unclear (because of the referral from one
> > > section to another) that i am seeking confirmation.  should we
> > > raise an "illegal instruction" when "LEV > 1" on sc?
> > 
> > Section 1.8.2 (Book I) says "any attempt to execute an invalid form
> > of an instruction will either cause the system illegal instruction
> > handler to be invoked or yield boundedly undefined results".
> > Putting LEV=1 in sc would be an example of an invalid form (on an
> > implementation without hypervisor mode).
> 
> ok that helps clarify what that means, thank you.
>  
> >   A boundedly undefined result
> > is one which could be obtained by a sequence of valid instructions,
> > so in the case of sc 1, making it do what sc 0 does meets the
> > boundedly undefined results requirement.
> 
> ok so that... if i am understanding correctly, means, "you can in
> fact do something different and OS software has to detect it and sort
> it out to yield expected behaviour"
>  
> which, if i am being honest, makes me nervous :)
> 
> > > secondly, we note that "LEV=1" is for invocation of the
> > > hypervisor.  what's not clear to us is - given that we are not
> > > implementing hypervisor - should this be *also* treated as an
> > > illegal instruction?  or, should we just leave it to fall through
> > > to trap @ addr 0x0c00, and expect the trap *there* to notice and
> > > deal with the situation?
> > 
> > That is what I would do.
> 
> ok.  we can do that.
>  
> > There is one of the variants of KVM on PPC, called KVM-PR, which
> > runs the guest entirely in user mode and traps and emulates all
> > privileged instructions (thus it doesn't need hypervisor mode and
> > can run inside a guest of another hypervisor).  If you are running
> > a KVM guest inside that environment and the guest does sc 1, KVM-PR
> > expects that to end up at the kernel's 0xc00 handler.  So that is
> > one reason to treat sc 1 as sc 0.
> 
> ahh.  i did wonder :)
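
A minimal sketch of the "treat sc 1 as sc 0" behaviour discussed above,
for a core without hypervisor mode. The field positions follow the
usual SC-form layout (primary opcode 17, 7-bit LEV field, bit 30 set);
the function names and the returned tuple are hypothetical, chosen only
for this example.

```python
# Hypothetical model of sc handling on a core with no hypervisor mode.
SC_LEV0 = 0x44000002     # "sc" / "sc 0"
SC_LEV1 = 0x44000022     # "sc 1"

LEV_SHIFT = 5            # LEV sits 5 bits above the least significant bit
LEV_MASK = 0x7F          # 7-bit field
SYSCALL_VECTOR = 0xC00   # system call interrupt vector


def is_sc(insn):
    """True for any sc encoding, whatever LEV happens to be."""
    return (insn >> 26) == 17 and (insn & 0x2) != 0


def sc_lev(insn):
    """Extract the LEV field."""
    return (insn >> LEV_SHIFT) & LEV_MASK


def take_sc(insn):
    """Return (vector, lev): always the 0xc00 vector, never an illegal
    instruction trap. LEV is only reported so that the handler at 0xc00
    could look at it if it wanted to."""
    if not is_sc(insn):
        raise ValueError("not an sc instruction")
    return SYSCALL_VECTOR, sc_lev(insn)
```

With this model a guest doing sc 1 under KVM-PR lands at the kernel's
0xc00 handler, the same as sc 0.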
>  
> > > also: if we set the HV bit in MSR (when LEV=1) section 6.5.14
> > > p1077 which refers us back to figure 65 on p1064, will this
> > > "break" things?
> > 
> > Probably not.  Linux does check whether HV=1 at boot time, but I'm
> > pretty sure that's only on certain processors which it knows to be
> > HV-capable (either by looking at PVR or the device tree).
> 
> ok.  thank you.
>  
> > > also: in microwatt, i'm not seeing the remaining bits which
> > > appear [to need to] be set.
> > > 
> > > https://github.com/antonblanchard/microwatt/blob/master/execute1.vhdl#L479
> > >             ctrl_tmp.msr(MSR_SF) <= '1';
> > >             ctrl_tmp.msr(MSR_EE) <= '0';
> > >             ctrl_tmp.msr(MSR_PR) <= '0';
> > >             ctrl_tmp.msr(MSR_IR) <= '0';
> > >             ctrl_tmp.msr(MSR_DR) <= '0';
> > >             ctrl_tmp.msr(MSR_RI) <= '0';
> > >             ctrl_tmp.msr(MSR_LE) <= '1';
> > > 
> > > these appear to be correct as defined according to figure 65
> > > (p1063)
> > > 
> > > however the remaining actions do not seem to be implemented
> > > (p1064):
> > > 
> > >      Bits bit 5, TM, VEC, VSX, PR, FP, and PMM are set to 0.
> > >      The TE field is set to 0b00.
> > >      TM, FP, VEC, VSX, and bit 5 are set to 0.
> > 
> > Right.  We have a to-do list for architecture compliance.  (We
> > haven't implemented 32-bit mode or BE mode, for instance.)
> 
> yeahh although we have 32 bit op modes (using microwatt 
> decode1.vhdl, turned into CSV) we have yet to support the MSR 32bit
> global mode.
> 
> LE/BE amazingly seems to work on LibreSOC, it was quite funny having
> the trap jump into 0x700 when testing against qemu (running
> singlestep under gdb), only to find that qemu traps change the LE bit
> and of course in qemu once that's changed gdb can't read registers
> correctly. sigh.

It can if you manually change the endianness in gdb, no?
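
(gdb has "set endian big" / "set endian little" for this.) A toy
illustration of why the register values look wrong when the debugger's
idea of byte order disagrees with the target: the helper below is
hypothetical and has nothing to do with gdb's actual internals, it only
shows the byte-swap effect on a 64-bit value.

```python
import struct


def read_reg(raw, endian):
    """Interpret 8 raw register bytes using the given byte order.

    Hypothetical helper: 'endian' is "little" or "big".
    """
    fmt = "<Q" if endian == "little" else ">Q"
    return struct.unpack(fmt, raw)[0]


# The target stored 0x700 as a little-endian 64-bit value.
raw = struct.pack("<Q", 0x700)

right = read_reg(raw, "little")  # debugger agrees with the target
wrong = read_reg(raw, "big")     # debugger still assumes big-endian
```

Here "right" is 0x700 while "wrong" comes out as 0x0007000000000000,
which is the sort of nonsense you see until the setting is flipped.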

> > > question: what effect would it have - bear in mind that we are
> > > following microwatt - if we implemented these changes to MSR?
> > > bear in mind that we ignore most of them at the moment (MSR.LE
> > > being one notable exception), so the question is, in effect: does
> > > the Linux kernel *also* ignore them?
> > 
> > The Linux kernel clearly needs PR to be set to zero and it also
> > expects FP, VEC, VSX, TM to be cleared.  Setting TE to 0 is
> > necessary once you implement the trace interrupt, otherwise you
> > could get a trace interrupt inside your first-level interrupt
> > handlers, which would be bad.
> 
> ah :)
>  
> >  Similarly if you have floating-point and you don't set
> > FE0 and FE1 to 0 on an interrupt, there is the chance of taking a
> > floating-point program interrupt inside a first-level handler.
> 
> whoops.  ok appreciate the warning.
>  
> > I'm not sure that all this counts as the Linux kernel "ignoring"
> > the bits, but in general if you do what the architecture says, the
> > kernel will be happier than if you don't.
> 
> ha, that makes sense.
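
Pulling the MSR changes from this part of the thread into one place, as
a rough sketch only. The set of bits comes from the microwatt snippet
and the text quoted above, plus FE0/FE1 from Paul's remark. The bit
positions are written from memory with bit 0 as the least significant
bit, so treat them as assumptions and check them against the ISA.
Setting LE and SF unconditionally mirrors the microwatt code quoted
above, not necessarily what a full implementation should do.

```python
# Assumed positions, counted from the least significant bit (bit 0).
MSR_BIT = {
    "SF": 63, "TM": 32, "VEC": 25, "VSX": 23, "EE": 15, "PR": 14,
    "FP": 13, "FE0": 11, "FE1": 8, "IR": 5, "DR": 4, "PMM": 2,
    "RI": 1, "LE": 0,
}
TE_MASK = 0b11 << 9     # two-bit TE field (assumed position)
BIT5_MASK = 1 << 58     # "bit 5" in the ISA's MSB-first numbering

CLEARED = ("TM", "VEC", "VSX", "EE", "PR", "FP", "FE0", "FE1",
           "IR", "DR", "PMM", "RI")
SET = ("SF", "LE")      # as in the microwatt snippet
MASK64 = (1 << 64) - 1


def msr_on_interrupt(msr):
    """Return the MSR value after taking an interrupt (sketch)."""
    for name in CLEARED:
        msr &= ~(1 << MSR_BIT[name])
    msr &= ~TE_MASK
    msr &= ~BIT5_MASK
    for name in SET:
        msr |= 1 << MSR_BIT[name]
    return msr & MASK64
```

Bits not listed (ME, for example) pass through unchanged in this
sketch.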
> 
> i generally found this out when network reverse-engineering, despite
> not understanding at all what i was sending to the client or server
> :)
> 
> thank you Paul
> 
> l.