[libre-riscv-dev] KCP53000B micro-architecture thoughts

Thu May 30 02:13:43 BST 2019

On Wed, May 29, 2019 at 3:48 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>  sure.  so, i read it all, and i believe you may have missed something
> quite subtle about the augmented 6600 scoreboard.

To be fair, I wasn't considering the 6600 design.  I was designing
from first principles, as that's the only way I can keep things
straight in my head.  Regrettably, I still cannot figure out Mitch's
or Thornton/Cray's designs.  I've pretty much given up trying at this
point, content in the knowledge that they're just too complex for my
mind to grasp.

>  exceptions are handled very very simply: by hooking (and preventing)
> commits, called "shadowing".  the diagram is on p55 of section
> 11.5.1.1 and i have a modified version here:
> https://libre-riscv.org/3d_gpu/6600scoreboard/

This is the job of ABORT in my design.  Most instructions execute
under the speculative assumption that no previously issued
instruction will generate an exception (which is the normal case).
Only when an exceptional condition is detected will ABORT trigger,
causing all other FUs to reset to their quiet states.

>  so the whole concept of stalling, or of having "history" and having
> to "restore" things, or inhibit things, is all kinda turned around.

My view when thinking about this architecture was to just discard
everything in the FUs, since the exception handling code will result
in it all being re-run anyway.

>  the normal way to do interrupts is to simply stop issuing
> instructions, wait for all FUs to write their results, and once
> everything is quietened down (all FUs no longer active), you begin the
> exception/interrupt handler.

Interrupts still require transitioning the machine state to an
elevated privilege level though.  I figured having a dedicated
interrupt unit would be overkill; but it is at least an interesting
exercise to be able to represent interrupts as something that can be
handled by an FU in this architecture.

The CDC had the advantage that the I/O processors dealt with system
calls and the like (IIRC, the CDC 6600 operating system did not run on
the main processor, but on I/O processor 0), which took a lot of the
state transition and protection burdens off the main processor.

> the Priority Picker is, in nmigen terminology, basically a
> back-to-back PriorityEncoder and Decoder.  unary in, unary out.
> multiple bits of the unary vector can be set, however ONLY ONE OUTPUT
> BIT WILL BE SET.

Yes; it's the same basic logic that drives any other bus arbiter.  I
was also planning on exploring arbiters with guaranteed fairness, to
see how that affects performance.  (Just a curiosity of mine.)
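
A minimal sketch of the two arbiter styles discussed here, in plain
Python rather than nmigen.  The function names, the choice of the
lowest set bit as highest priority, and round-robin as the "fair"
policy are all assumptions made for illustration; neither message
specifies them.

```python
def priority_pick(requests: int) -> int:
    """Fixed-priority picker: unary in, one-hot out.

    Assumption: bit 0 has the highest priority.  Returns 0 when no
    request bit is set, otherwise a mask with exactly one bit set.
    """
    return requests & -requests


def round_robin_pick(requests: int, last_grant: int, width: int) -> int:
    """Fair picker: search starts just after the last granted index.

    `last_grant` is a bit index (not a mask); `width` is the number of
    request lines.  Both parameters are hypothetical.
    """
    mask = (1 << width) - 1
    requests &= mask
    start = (last_grant + 1) % width
    # Rotate right so that bit `start` lands at position 0.
    rotated = ((requests >> start) | (requests << (width - start))) & mask
    picked = rotated & -rotated
    # Rotate the one-hot result back to its original position.
    return ((picked << start) | (picked >> (width - start))) & mask
```

With requests 0b0110 the fixed-priority picker always grants bit 1,
whereas the round-robin picker alternates between bits 1 and 2
depending on which one was granted last.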

> mstatus and other CSRs, i am giving serious consideration to having a
> special Dependency Matrix dedicated just to them.  i.e. to treat CSRs
> *as another register file*.

I'm planning on representing the entire CSR space as a single
dependency.  Since an instruction can touch only one CSR at a time,
this seems reasonable.  Other operations on CSRs (e.g., when traps
cause mstatus, mepc, mcause, et al. all to be updated at once) seem
to happen outside the normal instruction execution flow (e.g., either
as an exceptional condition in an instruction, or in between
instructions as an interrupt), and so I was going to represent those
as 1-hot event triggers, which would cause bulk state transitions in
the subsequent clock cycle.  Any time a bulk update like that happens,
it is always accompanied by a privilege escalation, so I just ABORT
all pending FUs (which is why ABORT is just the logical OR of all the
event triggers from all the FUs).
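
The ABORT rule described above can be sketched as a small behavioural
model in plain Python.  This is only an illustration under stated
assumptions: the FunctionUnit class, its fields, and the clock()
helper are hypothetical names, the sketch resets every FU including
the one that raised the event, and the bulk CSR update in the
following cycle is not modelled.

```python
from dataclasses import dataclass


@dataclass
class FunctionUnit:
    """Hypothetical stand-in for one FU."""
    name: str
    busy: bool = False
    events: int = 0  # 1-hot event triggers raised this cycle (0 = none)

    def reset(self) -> None:
        """Return to the quiet state, discarding in-flight work."""
        self.busy = False
        self.events = 0


def abort(fus) -> bool:
    """ABORT is the logical OR of all event triggers from all FUs."""
    return any(fu.events != 0 for fu in fus)


def clock(fus) -> bool:
    """Evaluate one cycle: if ABORT is raised, quiesce every FU."""
    aborted = abort(fus)
    if aborted:
        for fu in fus:
            fu.reset()
    return aborted
```

As long as no FU raises an event, clock() leaves all in-flight work
alone; a single trigger bit from any one FU is enough to clear them
all.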

-- 
Samuel A. Falvo II