[libre-riscv-dev] microwatt decoder tables: M-Form and X-Form switched RS and RB
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Tue Jun 2 01:01:53 BST 2020
On Tuesday, June 2, 2020, Paul Mackerras <paulus at ozlabs.org> wrote:
> I'm not quite following, sorry. When you talk about the second or
> third position or field, are you talking about where they are in the
> instruction word (and if so, are you counting left to right, or right
> to left),
there are the field names RA, RB, RC, RS, RT.
there is the order (left to right) in which these occur in an X-Form.
then, in microwatt, there are 3 input regs, a/b/c.
these do not match up consistently, neither across Forms, nor in microwatt
input reg ordering when it comes to RA, RS and reg a and reg c.
* OP_EXTS for example places RS into microwatt reg c
> or are you talking about the input_reg_a/b/c entries in
> microwatt's decode tables?
> If you're talking about positions in the instruction word, then either
> I'm misunderstanding your paragraph about X-form above, or it is
> incorrect, because instructions such as "and" and "or" take their
> inputs from RS and RB, so op1 is not RA for them, though it is for the
> arithmetic instructions.
yet OP_EXTS, which is also arithmetic, takes its input from RS but places it
into the *third* microwatt register, c, despite microwatt positions 1 and 2
(reg a and reg b) being empty.
in the intervening time (good morning to you, btw) i worked out a couple of
things:
* i was wrong about fixedshift operations: the way that microwatt allocates
them to internal reg a/b/c is very sensible.
these cannot be reordered.
* anything from fixedlogical (and, or, xor), fixedarith (specifically exts)
and mtcrf, these all use RS which goes into microwatt reg c, *all* whilst
also having microwatt reg a NONE.
consequently these *can* be swapped. wherever in decode1.vhdl there is
"NONE, NONE, RS" this can be replaced with "RS, NONE, NONE" and reg a used
instead of reg c.
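
as a rough illustration of that substitution, here is a small python sketch.
the rows are hypothetical (op, reg a, reg b, reg c) tuples made up for the
example, not copied from decode1.vhdl, and leaving the reg b column untouched
is an assumption:

```python
# hypothetical decode rows: (op, input_reg_a, input_reg_b, input_reg_c).
# illustrative only, these are NOT the real decode1.vhdl entries.
ROWS = [
    ("OP_EXTS", "NONE", "NONE", "RS"),
    ("OP_ADD", "RA", "RB", "NONE"),
]


def move_rs_to_a(row):
    """if reg a is unused and RS sits in reg c, move RS into reg a."""
    op, a, b, c = row
    if a == "NONE" and c == "RS":
        # assumption: reg b is left exactly as it was
        return (op, "RS", b, "NONE")
    return row


NEW_ROWS = [move_rs_to_a(r) for r in ROWS]
```

rows that already use reg a (the OP_ADD row here) pass through unchanged.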
whilst this has no significance at the moment, because everything is
primarily global in nature (in execute1.vhdl), if all fixedarith ALU
functions were moved to a separate module, that module would require
*three* 64-bit paths (reg a b c) where only 2 of those are ever active at
any one time.
by making the change above, the hypothetical fixedarith.vhdl module needs
only 2 input operands, not 3.
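
the operand-count argument can be checked with a small python sketch. the
two row lists below are hypothetical stand-ins for "before" and "after" the
change (they are not the real decode tables), and ports_needed is a made-up
helper name:

```python
# hypothetical (op, reg a, reg b, reg c) rows, before and after moving RS
# from reg c to reg a. illustrative only, not the real decode1.vhdl tables.
BEFORE = [
    ("OP_ADD", "RA", "RB", "NONE"),
    ("OP_EXTS", "NONE", "NONE", "RS"),
]
AFTER = [
    ("OP_ADD", "RA", "RB", "NONE"),
    ("OP_EXTS", "RS", "NONE", "NONE"),
]


def ports_needed(rows):
    """return the sorted list of input regs (a/b/c) that any row uses."""
    used = set()
    for _op, a, b, c in rows:
        for name, src in (("a", a), ("b", b), ("c", c)):
            if src != "NONE":
                used.add(name)
    return sorted(used)
```

with these example rows, ports_needed(BEFORE) reports all three of a, b and c,
whereas ports_needed(AFTER) reports only a and b.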
> (I asked some of the original Power architects why there was this
> nonuniformity in the Power ISA where rotate and logical instructions
> use RS and RB as their inputs, whereas arithmetic instructions use RA
> and RB as inputs. The answer was that in the Power1 implementation,
> they were very short of gates, so they made the logic for rotating and
> masking data for store instructions also serve as the logic for doing
> the rotate and mask instructions.
ooo efficient microcoding. niiice. and ah now i see why the masking
exists: it would have been used to perform the insertion of (or extraction
of) offset bytes into or out of a LD/ST operand.
i love it.
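
for reference, the rotate-then-mask step can be sketched in a few lines of
python. this is a 32-bit model in the style of rlwinm, using Power bit
numbering (bit 0 is the most significant bit). the function names are made up
for illustration, and it says nothing about how Power1 or microwatt actually
wire this up:

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """rotate a 32-bit value left by n bits."""
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def mask32(mb, me):
    """ones from bit mb through bit me inclusive (bit 0 = MSB).

    if mb > me the run of ones wraps around from bit 31 back to bit 0.
    """
    m = 0
    i = mb
    while True:
        m |= 1 << (31 - i)
        if i == me:
            break
        i = (i + 1) % 32
    return m


def rotate_and_mask(rs, sh, mb, me):
    """rotate rs left by sh, then keep only bits mb..me."""
    return rotl32(rs, sh) & mask32(mb, me)
```

for example, rotate_and_mask(0x12345678, 8, 24, 31) rotates the top byte down
to the bottom and masks off everything else, which is the kind of byte
extraction mentioned above.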
> Since stores take their data from
> RS, that meant that the rotate and shift instructions also took the
> data to be rotated or shifted from RS.)
i like it. the paths would be straight: no crossover MUXes to get RS to 2
i seriously considered making the LD/ST Function Unit also capable of ADD
operations. the add is there, you still need an immediate, so why not.
the reason why not: ADD in POWER9, by being broken into carry, negation
etc, would have added quite a lot of logic to LD/ST. it would also need to
read/write carry and CR0, and it gets too much.
i may however still get it to do "straight" (non carry, non invert) ADD and
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68