[libre-riscv-dev] multiplier 8x8 products

Luke Kenneth Casson Leighton lkcl at lkcl.net
Fri Aug 23 22:11:14 BST 2019

Hi Jacob,

The partitioned multiplier performs all 64 of its 8x8 multiplies on every
cycle, which is massive. Can you think of a way to reduce that?
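
To show where the 64 comes from, here is a minimal Python sketch (not the actual multiplier code; the function names are made up for illustration). It splits two 64 bit operands into 8 bytes each, forms every byte-by-byte product, and sums them at the right shifts:

```python
MASK8 = 0xFF

def byte_products(a, b, nbytes=8):
    """Return {(i, j): a_i * b_j} for every pair of bytes of a and b."""
    prods = {}
    for i in range(nbytes):
        for j in range(nbytes):
            ai = (a >> (8 * i)) & MASK8
            bj = (b >> (8 * j)) & MASK8
            prods[(i, j)] = ai * bj  # one 8x8 multiply, 16 bit result
    return prods

def mul_from_byte_products(a, b, nbytes=8):
    """Rebuild the full product by shifting and summing the byte products."""
    prods = byte_products(a, b, nbytes)
    return sum(p << (8 * (i + j)) for (i, j), p in prods.items())

a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
assert len(byte_products(a, b)) == 64          # 8 x 8 byte pairs
assert mul_from_byte_products(a, b) == a * b   # matches a plain multiply
```

Plain integer addition stands in here for whatever adder tree the hardware would use.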

Also, I have been looking at the Dadda Tree algorithm, it is amazingly

However, adapting it to be early-out capable has me slightly puzzled.

Here is what I am thinking: have a suite of 8x8 multipliers (as straight
DSP blocks) whose results go directly out (early) for 8 bit values. For 16
bit values, those 8x8 products would go into a Dadda tree that produces the
16 bit multiply results, again with early out.

Then those would *again* be added to yet more of the 8x8 products in
another suite of Dadda Trees, this time to create a 32 bit mul.

Finally the 64 bit phase.
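
A rough Python sketch of that staging, under some assumptions: plain integer addition stands in for each Dadda tree plus its final adder, and the name staged_lane_products is hypothetical. Each stage takes the two half-width lane results from the previous stage and adds in only the cross-term 8x8 products that the previous stage did not use:

```python
def staged_lane_products(a, b, nbytes=8):
    """Return {lane_bits: [lane products]} for 8, 16, 32 and 64 bit lanes."""
    def byte(x, i):
        return (x >> (8 * i)) & 0xFF

    # All 8x8 products, computed once.
    p = {(i, j): byte(a, i) * byte(b, j)
         for i in range(nbytes) for j in range(nbytes)}

    # 8 bit phase: only the diagonal products are needed (early out).
    stages = {8: [p[(k, k)] for k in range(nbytes)]}

    w = 1  # current lane width in bytes
    while w < nbytes:
        prev = stages[8 * w]
        cur = []
        for k in range(nbytes // (2 * w)):
            base = 2 * w * k              # first byte of this wider lane
            lo, hi = prev[2 * k], prev[2 * k + 1]
            acc = lo + (hi << (16 * w))   # reuse the previous stage's results
            # Add the cross terms: low half of one operand times high half
            # of the other. These are the "yet more" 8x8 products.
            for i in range(w):
                for j in range(w, 2 * w):
                    cross = p[(base + i, base + j)] + p[(base + j, base + i)]
                    acc += cross << (8 * (i + j))
            cur.append(acc)
        w *= 2
        stages[8 * w] = cur
    return stages

a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
s = staged_lane_products(a, b)
assert s[64] == [a * b]
assert s[16][0] == (a & 0xFFFF) * (b & 0xFFFF)
```

In this sketch the 8 bit phase uses 8 of the products, the 16 bit phase 16, the 32 bit phase 32, and only the 64 bit phase uses all 64.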

It would not be as efficient as a dedicated Dadda 64 bit Mul, because a
full 128 bit adder is needed at each of the 16, 32 and 64 bit phases.

A straight 64x64 Dadda would only need the one full 128 bit adder.
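
For reference, the number of reduction stages a Dadda tree goes through before that single final adder follows Dadda's height sequence (2, 3, 4, 6, 9, 13, ... with each term floor(1.5 x previous)). The sketch below assumes the tree starts from n rows of bit-level partial products. With 8x8 DSP blocks feeding the tree, the starting heights would differ. The function name is illustrative only:

```python
def dadda_stage_heights(n):
    """Target column heights, largest first, for reducing n rows to 2."""
    if n <= 2:
        return []  # already at most two rows: go straight to the adder
    seq = [2]
    while True:
        nxt = (3 * seq[-1]) // 2  # next term of Dadda's sequence
        if nxt >= n:
            break
        seq.append(nxt)
    return seq[::-1]

for n in (8, 16, 32, 64):
    print(n, len(dadda_stage_heights(n)))
```

This gives 4 stages for 8 rows, 6 for 16, 8 for 32 and 10 for 64.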



crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
