[libre-riscv-dev] microwatt decoder tables: M-Form and X-Form switched RS and RB

Tue Jun 2 01:01:53 BST 2020

On Tuesday, June 2, 2020, Paul Mackerras <paulus at ozlabs.org> wrote:

>
> I'm not quite following, sorry.  When you talk about the second or
> third position or field, are you talking about where they are in the
> instruction word (and if so, are you counting left to right, or right
> to left),

there are the field names RA, RB, RC, RS, RT

there is the order (left to right) in which these occur in an X-Form,
M-Form etc.

then, in microwatt, there are 3 input regs, a/b/c.

these do not match up consistently, neither across Forms, nor in microwatt
input reg ordering when it comes to RA, RS and reg a and reg c.

* OP_EXTS for example places RS into microwatt reg c

> or are you talking about the input_reg_a/b/c entries in
> microwatt's decode tables?
>
> If you're talking about positions in the instruction word, then either
> I'm misunderstanding your paragraph about X-form above, or it is
> incorrect, because instructions such as "and" and "or" take their
> inputs from RS and RB, so op1 is not RA for them, though it is for the
> arithmetic instructions.

yet OP_EXTS, which is also arithmetic, takes its input from RS but places it
into the *third* microwatt register, c, despite microwatt positions 1 and 2
(reg a and reg b) being empty.

in the intervening time (good morning to you, btw) i worked out a couple of
things:

* i was wrong about fixedshift operations: the way that microwatt allocates
them to internal reg a/b/c is very sensible.

  these cannot be reordered.

* anything from fixedlogical (and, or, xor), fixedarith (specifically exts)
and mtcrf, these all use RS which goes into microwatt reg c, *all* whilst
also having microwatt reg a NONE.

  consequently these *can* be swapped.  wherever in decode1.vhdl there is
"NONE, NONE, RS" this can be replaced with "RS, NONE, NONE" and reg a used
instead of reg c.

  whilst this has no significance at the moment, because everything is
primarily global in nature (in execute1.vhdl), if all fixedarith ALU
functions were moved to a separate module, that module would require
*three* 64 bit paths (reg a b c) where only 2 of those are ever active at
any one time.

by making the change above, the hypothetical fixedarith.vhdl module needs
only 2 input operands, not 3.
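
a rough sketch of that swap, in python rather than vhdl.  the table layout
and the rows other than OP_EXTS are illustrative assumptions, not copied
from decode1.vhdl: each row is simply (op, reg a, reg b, reg c).

```python
# hypothetical, simplified model of decode rows: (op, reg_a, reg_b, reg_c).
# only OP_EXTS is named in the discussion; the other rows are made up
# for illustration and are not taken from decode1.vhdl.
ROWS = [
    ("OP_EXTS", "NONE", "NONE", "RS"),
    ("OP_AND",  "NONE", "RB",   "RS"),
    ("OP_ADD",  "RA",   "RB",   "NONE"),
]

def move_rs_to_a(row):
    """If reg a is unused and reg c carries RS, move RS into reg a."""
    op, a, b, c = row
    if a == "NONE" and c == "RS":
        return (op, "RS", b, "NONE")
    return row

def uses_reg_c(rows):
    """True if any row still needs the third input path."""
    return any(c != "NONE" for (_op, _a, _b, c) in rows)

SWAPPED = [move_rs_to_a(r) for r in ROWS]
```

with these example rows, uses_reg_c(ROWS) is true and uses_reg_c(SWAPPED) is
false, which is the point: after the swap a module handling only these ops
would need two input paths, not three.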

> (I asked some of the original Power architects why there was this
> nonuniformity in the Power ISA where rotate and logical instructions
> use RS and RB as their inputs, whereas arithmetic instructions use RA
> and RB as inputs.  The answer was that in the Power1 implementation,
> they were very short of gates, so they made the logic for rotating and
> masking data for store instructions also serve as the logic for doing
> the rotate and mask instructions.

ooo efficient microcoding.  niiice.  and ah now i see why the masking
exists: it would have been used to perform the insertion of (or extraction
of) offset bytes into or out of a LD/ST operand.

i love it.

> Since stores take their data from
> RS, that meant that the rotate and shift instructions also took the
> data to be rotated or shifted from RS.)

i like it.  the paths would be straight: no crossover MUXes to get RS to 2
different ports.
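
to illustrate the rotate-then-mask idea, here is a minimal python sketch
modelled on a 32-bit rotate-left-and-AND-with-mask operation (in the style of
rlwinm), using Power's MSB-0 bit numbering.  the function names are
hypothetical, and this says nothing about how the Power1 hardware was
actually built: it only shows how one rotate+mask path can pull a byte out
of a word.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, shift):
    """Rotate a 32-bit value left by shift (taken modulo 32)."""
    shift &= 31
    value &= MASK32
    return ((value << shift) | (value >> (32 - shift))) & MASK32

def mask32(mb, me):
    """Ones from bit mb to bit me inclusive, MSB-0 numbering.

    If mb > me the mask wraps around through bit 31 back to bit 0.
    """
    m = 0
    i = mb
    while True:
        m |= 1 << (31 - i)   # bit 0 is the most significant bit
        if i == me:
            break
        i = (i + 1) & 31
    return m

def rotate_and_mask(rs, sh, mb, me):
    """Rotate rs left by sh, then keep only bits mb..me."""
    return rotl32(rs, sh) & mask32(mb, me)

# extract the most significant byte of a word into the low 8 bits
top_byte = rotate_and_mask(0x12345678, 8, 24, 31)
```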

i seriously considered making the LD/ST Function Unit also capable of ADD
operations.  the add is there, you still need an immediate, so why not.

the reason why not: ADD in POWER9, by being broken into carry, negation
etc, would have added quite a lot of logic to LD/ST.  add the need to
read/write carry and CR0, and it gets too much.

i may however still get it to do "straight" (non carry, non invert) ADD and
ADDI.

l.

-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68