[libre-riscv-dev] PPC on Talos and Playstation 3

Luke Kenneth Casson Leighton lkcl at lkcl.net
Mon Mar 30 20:20:15 BST 2020

On Mon, Mar 30, 2020 at 8:01 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
> On Mon, Mar 30, 2020, 11:36 Immanuel, Yehowshua U <yimmanuel3 at gatech.edu>
> wrote:
> > Before I try my hand at an FPU,
> >
> > Some questions:
> >  - Do we have any FPU attempt in nMigen, Verilog, or some other language
> > yet that I should be aware of?
> >
> yes, https://git.libre-riscv.org/?p=ieee754fpu.git;a=summary

sorry, missed this question.  yehowshua: we have a *95% completed*
IEEE754 FPU already.  we don't need another one.  we _do_ however need
a formal proof of the one that we already have, plus adding POWER
rounding modes to the one that we already have.

> I was assuming we'd just implement them both as fully IEEE 754 compliant,
> unless the noncompliant mode mandates something incompatible with IEEE 754
> such as flushing denormals to zero.

have to check.  i wasn't aware the mode existed until 5 minutes ago.  3.0B p126

bit 61:

Floating-Point Non-IEEE Mode (NI)
Floating-point non-IEEE mode is optional. If floating-point non-IEEE
mode is not implemented, this bit is treated as reserved, and the
remainder of the definition of this bit does not apply.
If floating-point non-IEEE mode is implemented, this bit has the
following meaning.
0 The processor is not in floating-point non-IEEE mode (i.e., all
  floating-point operations conform to the IEEE standard).
1 The processor is in floating-point non-IEEE mode.

so it's optional, thank goodness.
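for anyone not used to POWER bit numbering, here is a rough sketch of what "bit 61" means in practice. it assumes the FPSCR is treated as a 64-bit register numbered MSB-first (bit 0 is the most significant), which is how the Power ISA numbers bits. the names msb0_mask, FPSCR_NI and non_ieee_mode are made up for illustration and are not from any existing codebase.

```python
FPSCR_WIDTH = 64  # assumption: FPSCR viewed as a 64-bit register


def msb0_mask(bit, width=FPSCR_WIDTH):
    """Mask for a bit given in MSB-0 numbering (bit 0 = most significant)."""
    return 1 << (width - 1 - bit)


# NI is bit 61 in MSB-0 numbering, i.e. bit 2 counting from the LSB.
FPSCR_NI = msb0_mask(61)


def non_ieee_mode(fpscr, ni_implemented):
    """True only if NI mode is implemented AND the NI bit is set.

    If the mode is not implemented the bit is reserved, so it is ignored.
    """
    if not ni_implemented:
        return False
    return bool(fpscr & FPSCR_NI)
```

an implementation that leaves the mode out would simply pass ni_implemented=False and stay fully IEEE 754 compliant.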

> 4x16-bit, 2x32-bit, 64-bit. We're planning on having 2 64-bit units. The
> FPU should be designed to reuse the reconfigurable integer multiplier since
> that's most of the logic in fpmul or fma.

yes.  this means:

* splitting the FPU pipeline into pre- and post-processing pipelines
* feeding the (partial) results from the FP pre-processing pipeline *back*
  into (side-channel, non-Dependency-tracked) Function Units
* those request an InternalOp "MUL" (integer MUL) and mark the results
  as belonging to an FP operation
* such that when the INT MUL finishes, the results get pushed back
  to the Function Units again, now into the FINAL stage..
* which is to perform the FPU post-processing pipeline

so the data goes THREE TIMES through the Function Units, going through
THREE separate pipelines.
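a purely behavioural sketch of the three passes, in plain python rather than nMigen. it is NOT the ieee754fpu code: the function names (fp_pre, int_mul, fp_post, fp32_mul) and the "tag" dictionary are hypothetical, it only handles FP32 with normal inputs and a normal result (no zero, denormal, inf, NaN or overflow handling), and it hard-codes round-to-nearest-even.

```python
import struct


def fp32_bits(x):
    """Python float -> FP32 bit pattern (rounded to single precision)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_fp32(b):
    """FP32 bit pattern -> Python float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def fp_pre(a_bits, b_bits):
    """Pass 1: FP pre-processing. Unpack, and produce integer operands."""
    sign = ((a_bits ^ b_bits) >> 31) & 1
    exp = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - 127
    man_a = (a_bits & 0x7FFFFF) | 0x800000  # restore hidden bit
    man_b = (b_bits & 0x7FFFFF) | 0x800000
    # the tag marks the integer multiply as belonging to an FP operation
    return man_a, man_b, {"sign": sign, "exp": exp}


def int_mul(a, b):
    """Pass 2: the shared integer multiplier (24x24 -> 48 bits here)."""
    return a * b


def fp_post(product, tag):
    """Pass 3: FP post-processing. Normalise, round, repack."""
    exp = tag["exp"]
    shift = 23
    if product >> 47:  # product in [2.0, 4.0): shift one more place
        shift = 24
        exp += 1
    man = product >> shift
    rem = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (man & 1)):  # nearest, ties to even
        man += 1
        if man >> 24:  # rounding carried out of the mantissa
            man >>= 1
            exp += 1
    assert 0 < exp < 255, "sketch only covers normal results"
    return (tag["sign"] << 31) | (exp << 23) | (man & 0x7FFFFF)


def fp32_mul(a_bits, b_bits):
    man_a, man_b, tag = fp_pre(a_bits, b_bits)  # first trip
    product = int_mul(man_a, man_b)             # second trip
    return fp_post(product, tag)                # third trip
```

e.g. bits_fp32(fp32_mul(fp32_bits(1.5), fp32_bits(2.5))) goes through all three stages. in hardware each stage would be its own pipeline, with the tag travelling alongside the integer multiply so the result can be routed back to the post-processing stage.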

> Also, the FP divider should also be shared with integer div/rem.

... and therefore the exact same trick as above will be needed.

interestingly this means that, hypothetically, we can do integer SQRT
and integer RSQRT.  the hardware exists: it's just that there are no
opcodes in POWER to get access to them.
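to illustrate why integer SQRT comes almost for free from divider-style hardware, here is a sketch of a radix-2 restoring (digit-recurrence) integer square root. it uses the same shift / compare / subtract loop as a restoring divider. this is an assumption about the general technique, not a description of what the shared div pipeline actually implements, and the function name is made up. RSQRT is not covered.

```python
def isqrt_digit_recurrence(n, width=64):
    """Integer square root, one result bit per iteration.

    Returns (root, remainder) with root*root + remainder == n.
    `width` must be even; n must fit in `width` bits.
    """
    assert width % 2 == 0
    assert 0 <= n < (1 << width)
    root = 0
    rem = 0
    for i in range(width // 2 - 1, -1, -1):
        # bring down the next two bits of n, as in long division
        rem = (rem << 2) | ((n >> (2 * i)) & 3)
        trial = (root << 2) | 1  # value to subtract if the next bit is 1
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return root, rem
```

each iteration is one shift, one compare and one conditional subtract, which is why the logic overlaps so heavily with a divider.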

