[libre-riscv-dev] daily kan-ban update 07jul2020

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Jul 7 18:37:59 BST 2020


---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Tue, Jul 7, 2020 at 6:06 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Tue, Jul 7, 2020, 04:30 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> >
> > https://bugs.libre-soc.org/showdependencytree.cgi?maxdepth=1&id=383&hide_resolved=1
> >
> > yesterday:
> >
> > * implemented a (very basic) MUL pipeline, started creating unit tests
> >
>
> The partitioned multiplier I had written should implement all the ops you
> need other than madd*, though you'd need to either revert or finish off all
> the half-completed refactorings.

i really want to use it - particularly if it turns out that it'll
reduce the number of gates.  a 128-bit-wide MUL is *MASSIVE*.  20,000
gates taking up 1.6 mm^2 all on its own.

however we're so pressed for time that i'm cutting back anything other
than the absolute essentials.  if someone else came forward and helped
out it would be possible to do this.


> for the madd* instructions, you would need to sum the third register and
> mul.intermediate_output using a 128-bit adder with the register sign or
> zero extended as appropriate, then take the lower or upper 64-bits as
> indicated by the specific instruction.

ahh i'd forgotten about madd and friends.
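
for reference, a minimal python sketch of the madd datapath as described
above: extend the operands, multiply, add the third register at 128 bits,
then pick the upper or lower 64 bits.  the helper names (to_signed64, madd)
and the signed/high flags are made up for illustration and are not taken
from the soc source.  the mapping to maddhd / maddhdu / maddld is only
noted in the comments.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def to_signed64(x):
    """interpret a 64-bit register value as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x & (1 << 63) else x


def madd(ra, rb, rc, signed, high):
    """hypothetical helper: ra*rb + rc at 128 bits, then select a 64-bit half.

    signed=True,  high=True   roughly corresponds to maddhd
    signed=False, high=True   roughly corresponds to maddhdu
    high=False                roughly corresponds to maddld
    """
    if signed:
        # sign-extend all three operands (python ints are unbounded)
        ra, rb, rc = to_signed64(ra), to_signed64(rb), to_signed64(rc)
    else:
        # zero-extend: just treat the register values as non-negative
        ra, rb, rc = ra & MASK64, rb & MASK64, rc & MASK64
    # 128-bit sum of the full product and the extended third register
    total = (ra * rb + rc) & MASK128
    return (total >> 64) & MASK64 if high else total & MASK64
```

the low 64 bits come out the same whichever extension is used, which is
why only the "high" variants need a signed/unsigned distinction.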

i've got the pre- and post- analysis phases done:
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/mul/pre_stage.py;hb=HEAD
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/mul/post_stage.py;hb=HEAD

these are basically "microwatt" (again) and it works well.  except,
microwatt doesn't implement madd.

> a 64*64+64->128-bit madd won't ever overflow the 128 bits for both signed
> and unsigned multiplication.

yehyeh, that makes sense.  can you recall if the overflow detection is
on the top 64 bits for madd?  for other 64-bit muls it is.
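
the no-overflow claim quoted above can be sanity-checked at the extremes.
a rough sketch, assuming plain 64-bit operands: a*b + c is linear in each
operand separately, so its min and max sit at the corners of the operand
ranges.  madd_range is a made-up helper name.

```python
def madd_range(signed):
    """hypothetical helper: (min, max) of a*b + c over all 64-bit operands."""
    if signed:
        lo, hi = -(1 << 63), (1 << 63) - 1
    else:
        lo, hi = 0, (1 << 64) - 1
    # evaluate only the 8 corner cases, where the extremes must occur
    corners = [a * b + c
               for a in (lo, hi)
               for b in (lo, hi)
               for c in (lo, hi)]
    return min(corners), max(corners)


# unsigned: result must fit in [0, 2**128 - 1]
umin, umax = madd_range(signed=False)
assert 0 <= umin and umax < (1 << 128)

# signed: result must fit in [-2**127, 2**127 - 1]
smin, smax = madd_range(signed=True)
assert -(1 << 127) <= smin and smax < (1 << 127)
```

the unsigned worst case is (2^64-1)^2 + (2^64-1) = 2^128 - 2^64, and the
signed range is -2^126 to 2^126 + 2^63 - 1, both inside 128 bits.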

> I'll be adding the mul* and madd* instructions to
> power-instruction-analyzer shortly.

ah fantastic, because i could do with seeing what works and what
doesn't, confirmed against POWER9.  all the other source code i've
looked at... it's just ridiculously obtuse or just not up-to-date.
the sail correctness proof is incorrect; in qemu it took me 30 minutes to
*find* the mullw implementation (and even then it wasn't an actual
multiply, it was a call to a JIT engine), and pearpc and dolphin are
both too old to trust.

sigh :)

l.


