[libre-riscv-dev] multiplier 8x8 products
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Fri Aug 23 22:11:14 BST 2019
The partitioned multiplier performs a massive 64 8x8 multiplies on every
cycle. Can you think of a way to reduce that?
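
For reference, a minimal Python sketch of where the 64 comes from, assuming the multiplier splits each 64 bit operand into 8 bytes and forms every byte-by-byte product. The function name mul64_from_8x8 is made up for illustration and is not taken from the libre-riscv code.

```python
MASK64 = (1 << 64) - 1


def mul64_from_8x8(a, b):
    """Multiply two unsigned 64 bit values using only 8x8 products.

    Returns (product, number_of_8x8_multiplies).
    Hypothetical helper for illustration only.
    """
    a &= MASK64
    b &= MASK64
    # Split each operand into 8 bytes, least significant first.
    a_bytes = [(a >> (8 * i)) & 0xFF for i in range(8)]
    b_bytes = [(b >> (8 * j)) & 0xFF for j in range(8)]

    total = 0
    count = 0
    for i, a_byte in enumerate(a_bytes):
        for j, b_byte in enumerate(b_bytes):
            # Each 8x8 product is at most 16 bits wide and is weighted
            # by the combined byte position of its two inputs.
            total += (a_byte * b_byte) << (8 * (i + j))
            count += 1
    return total, count
```

Every byte of one operand meets every byte of the other, hence 8 * 8 = 64 products for a full 64x64 multiply.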
Also, I have been looking at the Dadda Tree algorithm, it is amazingly
However, adapting it to be early-out capable has me slightly puzzled.
Here is what I am thinking: having a suite of 8x8 multiplies (as straight
DSP blocks) that would go directly out (early), but for 16 bit values those
8x8 products would go into a Dadda tree that produced 16 bit outputs, again
Then those would *again* be added to yet more of the 8x8 products, in another
suite of Dadda Trees, this time to create a 32 bit mul.
Finally the 64 bit phase.
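
The staging above can be sketched in plain Python as a behavioural model. This is not HDL, and it does not model the Dadda trees themselves: ordinary integer addition stands in for each tree plus its adder. The name staged_products and the dict layout are made up for illustration.

```python
def staged_products(a, b):
    """Build 8, 16, 32 and 64 bit lane products in stages.

    Returns a dict mapping lane width to a list of lane products,
    lowest lane first. Each wider stage reuses the two half-width
    results and adds only the 8x8 cross products not used so far.
    Hypothetical model for illustration only.
    """
    a_bytes = [(a >> (8 * i)) & 0xFF for i in range(8)]
    b_bytes = [(b >> (8 * j)) & 0xFF for j in range(8)]
    # All 64 8x8 products (the "suite of 8x8 multiplies").
    p = [[a_bytes[i] * b_bytes[j] for j in range(8)] for i in range(8)]

    # 8 bit phase: the diagonal products go straight out (early).
    results = {8: [p[i][i] for i in range(8)]}
    prev = results[8]

    for width in (16, 32, 64):
        lane_bytes = width // 8
        half = lane_bytes // 2
        lanes = []
        for lane in range(8 // lane_bytes):
            base = lane * lane_bytes
            # Reuse the two half-width products from the previous phase.
            acc = prev[2 * lane] + (prev[2 * lane + 1] << width)
            # Add the cross terms: one byte from the low half of the
            # lane and the other from the high half.
            for i in range(lane_bytes):
                for j in range(lane_bytes):
                    if (i < half) != (j < half):
                        acc += p[base + i][base + j] << (8 * (i + j))
            lanes.append(acc)
        results[width] = lanes
        prev = lanes
    return results
```

In this model the 8 bit phase uses 8 of the products, and the 16, 32 and 64 bit phases bring in 8, 16 and 32 more respectively, so all 64 are only needed for the full 64 bit result.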
It would not be as efficient as a dedicated Dadda 64 bit Mul, because a full
128 bit adder is needed at each of the 16, 32 and 64 bit phases.
A straight 64x64 Dadda would only need the one full 128 bit adder.
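
To illustrate the "one full adder" point, here is a rough Python sketch that reduces the 64 weighted 8x8 products to two rows using 3:2 carry-save compressors and then does a single carry-propagating add. Note this is a simple word-level, Wallace-style reduction, not Dadda's column-by-column scheduling, and the function names are made up for illustration.

```python
MASK128 = (1 << 128) - 1


def carry_save_reduce(rows, mask=MASK128):
    """Reduce a list of rows to at most two using 3:2 compressors.

    No carry propagates along a row here: each compressor emits a
    sum row and a carry row. Hypothetical helper for illustration.
    """
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        nxt = []
        full = len(rows) - len(rows) % 3
        for k in range(0, full, 3):
            x, y, z = rows[k:k + 3]
            nxt.append(x ^ y ^ z)                                 # sum bits
            nxt.append((((x & y) | (x & z) | (y & z)) << 1) & mask)  # carry bits
        nxt.extend(rows[full:])  # leftover rows pass through unchanged
        rows = nxt
    return rows


def mul64_single_adder(a, b):
    """64x64 unsigned multiply with exactly one carry-propagating add."""
    a_bytes = [(a >> (8 * i)) & 0xFF for i in range(8)]
    b_bytes = [(b >> (8 * j)) & 0xFF for j in range(8)]
    rows = [(a_bytes[i] * b_bytes[j]) << (8 * (i + j))
            for i in range(8) for j in range(8)]
    r0, r1 = carry_save_reduce(rows)
    return (r0 + r1) & MASK128  # the one full 128 bit add
```

The staged scheme needs an add like the last line at every phase that produces an output, whereas here it happens only once, at the very end.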
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68