[libre-riscv-dev] PPC on Talos and Playstation 3
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Mar 30 20:20:15 BST 2020
On Mon, Mar 30, 2020 at 8:01 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Mon, Mar 30, 2020, 11:36 Immanuel, Yehowshua U <yimmanuel3 at gatech.edu>
> wrote:
>
> > Before I try my hand at an FPU,
> >
> > Some questions:
> > - Do we have any FPU attempt in nMigen, Verilog, or some other language
> > yet that I should be aware of?
> >
>
> yes, https://git.libre-riscv.org/?p=ieee754fpu.git;a=summary
sorry, missed this question. yehowshua: we have a *95% completed*
IEEE754 FPU already. we don't need another one. we _do_ however need
a formal proof of the one that we already have, plus adding POWER
rounding modes to the one that we already have.
> I was assuming we'd just implement them both as fully IEEE 754 compliant,
> unless the noncompliant mode mandates something incompatible with IEEE 754
> such as flushing denormals to zero.
have to check. i wasn't aware the mode existed until 5 minutes ago.
Power ISA 3.0B p126, bit 61:
Floating-Point Non-IEEE Mode (NI)
Floating-point non-IEEE mode is optional. If floating-point non-IEEE
mode is not implemented, this bit is treated as reserved, and the
remainder of the definition of this bit does not apply.
If floating-point non-IEEE mode is implemented, this bit has the
following meaning.
0  The processor is not in floating-point non-IEEE mode (i.e., all
   floating-point operations conform to the IEEE standard).
1  The processor is in floating-point non-IEEE mode.
so it's optional, thank goodness.
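
for anyone wanting to poke at that bit in a simulator, here is a minimal
sketch of pulling it out of a 64-bit register value. it assumes the
register is 64 bits wide and relies on the Power ISA convention of
numbering bits from the most significant end (bit 0 is the MSB). the
helper names msb0_bit and ni_mode are made up for illustration and are
not from any existing codebase.

```python
def msb0_bit(value, bit, width=64):
    """Return one bit of `value`, using MSB-0 numbering (bit 0 = MSB)."""
    # Convert the MSB-0 index into an ordinary right-shift amount.
    return (value >> (width - 1 - bit)) & 1


def ni_mode(reg):
    """Hypothetical helper: read bit 61 (NI) of a 64-bit register value."""
    return msb0_bit(reg, 61)
```

with MSB-0 numbering on 64 bits, bit 61 is the third bit from the least
significant end, so ni_mode(0x4) gives 1 and ni_mode(0x3) gives 0.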
> 4x16-bit, 2x32-bit, 64-bit. We're planning on having 2 64-bit units. The
> FPU should be designed to reuse the reconfigurable integer multiplier since
> that's most of the logic in fpmul or fma.
yes. this means:
* splitting the FPU pipeline into pre- and post-processing pipelines
* feeding the (partial) results from the FP pre-processing pipeline *back*
  into (side-channel, non-Dependency-tracked) Function Units
* those request an InternalOp "MUL" (integer MUL) and mark the results
  as belonging to an FP operation
* such that when the INT MUL finishes, the results get pushed back
  to the Function Units again, now into the FINAL stage...
* which is to perform the FPU post-processing pipeline
so the data goes THREE TIMES through the Function Units, going through
THREE separate pipelines.
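
the three passes described above can be sketched in software. this is
only an illustration of the data flow, not the ieee754fpu code: it
handles single-precision normal numbers only (no zeros, denormals,
infinities, NaNs, or exponent overflow/underflow), rounds to nearest
even, and every function name here (fp_pre, int_mul, fp_post, fp_mul)
is hypothetical.

```python
import struct


def f32_bits(x):
    """Pack a Python float into its IEEE 754 single-precision bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """Unpack a single-precision bit pattern into a Python float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def fp_pre(a_bits, b_bits):
    """Pass 1 (FP pre-processing): split operands, add exponents."""
    sign = ((a_bits ^ b_bits) >> 31) & 1
    exp = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - 127
    ma = (a_bits & 0x7FFFFF) | 0x800000  # restore the hidden leading 1
    mb = (b_bits & 0x7FFFFF) | 0x800000
    return sign, exp, ma, mb


def int_mul(ma, mb):
    """Pass 2: the shared integer multiplier, unaware it is serving FP."""
    return ma * mb


def fp_post(sign, exp, product):
    """Pass 3 (FP post-processing): normalise, round, repack."""
    # The 24x24-bit product lies in [2**46, 2**48).
    shift = 23
    if product >> 47:
        shift = 24
        exp += 1
    mant = product >> shift
    rem = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant >> 24:  # rounding carried out of the mantissa
            mant >>= 1
            exp += 1
    return (sign << 31) | (exp << 23) | (mant & 0x7FFFFF)


def fp_mul(a_bits, b_bits):
    """Chain the three passes, as the Function Units would."""
    sign, exp, ma, mb = fp_pre(a_bits, b_bits)
    product = int_mul(ma, mb)  # tagged as belonging to an FP op
    return fp_post(sign, exp, product)
```

for example, bits_f32(fp_mul(f32_bits(1.5), f32_bits(2.5))) gives 3.75.
the point is that int_mul contains no FP knowledge at all, which is
what allows it to be the same hardware as the integer multiplier.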
> Also, the FP divider should also be shared with integer div/rem.
... and therefore the exact same trick as above will be needed.
interestingly this means that, hypothetically, we can do integer SQRT
and integer RSQRT. the hardware exists: it's just that there are no
opcodes in POWER to get access to it.
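
to illustrate what an integer SQRT from a shared div/sqrt unit would
compute, here is a sketch of a classic bit-at-a-time (restoring)
square root. this is a textbook algorithm chosen for illustration; it
is an assumption that the shared hardware would work along these
lines, and the function name isqrt_restoring is hypothetical.

```python
def isqrt_restoring(n, width=64):
    """Return (root, remainder) with root*root + remainder == n.

    Processes the operand two bits at a time, from the top down,
    producing one result bit per step (like long division).
    """
    root = 0
    rem = 0
    for i in range(width // 2 - 1, -1, -1):
        # Bring down the next two bits of the operand.
        rem = (rem << 2) | ((n >> (2 * i)) & 3)
        # Trial subtrahend for a result bit of 1: 4*root + 1.
        trial = (root << 2) | 1
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return root, rem
```

isqrt_restoring(10) gives (3, 1), since 3*3 + 1 == 10.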
l.