[libre-riscv-dev] [Bug 173] dynamic partitioned "shift"
bugzilla-daemon at libre-riscv.org
Wed Feb 12 17:22:17 GMT 2020
http://bugs.libre-riscv.org/show_bug.cgi?id=173
--- Comment #5 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #3)
> I've got good news and bad news. The good news is I have a dynamic shifter
> working in part_shift/part_shift_dynamic.py.
superb.
> The bad news is it works out to
> 8500 gates to do a 64 bit, 8 partition shifter (ouch!).
w00t! :)
> I currently have a gigantic switch statement to select which row of the
> matrix is needed for each element (similar to the old part_cmp), so
> optimizing this would probably help greatly with the number of gates needed.
the yosys graph is wonderful: like a dog munched your spaghetti bolognaise
dinner then threw it up when you held the dog up at waist-level and did a
nice pirouette :)
what would reduce the number of gates significantly is getting the size
of b down, in the matrices. see "question: b1" here:
https://libre-riscv.org/3d_gpu/architecture/dynamic_simd/shift/
if you look just at the last... err... column (where you are creating
the outputs that you know are going to go into the MSB-partition
(MSP? Most Significant Partition?)): in that case, if that partition is
(let's say) 8 bits wide, you *know* that anything that shifts by 8 or
more is always, always, always going to result in that "contribution"
from the shift-matrix being zero.
therefore, what you can do is *truncate* the B-input to 3 bits!
but... butbutbut, that's not quite the full story, because actually
what you want is
MIN(B[partitionbits], 8).
NOT, repeat, NOT:
B[partitionbits] & (0b111) # (8-1 in binary == 0b111 mask)
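
to see why the mask is wrong and the MIN is right, here is a minimal
plain-python sketch (not the nmigen code in part_shift_dynamic.py; the
function names and the 8-bit WIDTH are assumptions made up for this
illustration). it models one 8-bit partition and compares a full shift
against the clamped and the masked shift amount:

```python
WIDTH = 8  # assumed partition width, matching the 8-bit example above


def shl_reference(a, b, width=WIDTH):
    """Shift by the full amount b, then keep only the low `width` bits."""
    return (a << b) & ((1 << width) - 1)


def shl_clamped(a, b, width=WIDTH):
    """Shift by MIN(b, width): amounts >= width all give zero."""
    return (a << min(b, width)) & ((1 << width) - 1)


def shl_masked(a, b, width=WIDTH):
    """Shift by b & (width - 1): the amount wraps around, which is wrong."""
    return (a << (b & (width - 1))) & ((1 << width) - 1)


# clamping agrees with the full shift for every input in this range
for a in range(1 << WIDTH):
    for b in range(64):
        assert shl_clamped(a, b) == shl_reference(a, b)

# masking does not: a shift by 9 is treated as a shift by 1
print(shl_reference(0x01, 9), shl_clamped(0x01, 9), shl_masked(0x01, 9))
```

note that MIN(B, 8) takes values 0..8, so the clamped amount needs 4 bits
rather than 3.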
even when you are dealing with the LSB-partition (LSP? Least
Significant Partition), you definitely do not want "A << B",
you want "A << B[log2(len(output))-1:0]"
"A<<B" will try to generate 64-64 shifters (64-bit data, 64-bit shift amount).
A<<B[trunc] will generate (correctly) 64-7 shifters, because any shift amount
greater than 0b1000000 (64) is *equivalent* to 0b1000000.
some shifters take 6 bits (not 7) and assume that 0b000000 actually means
"0b1000000", others will add 1 to the B operand...
... let's not worry about that for now, though, and focus on getting
the B inputs truncated somewhat?
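
as a rough check of the 64-bit case, the following plain-python sketch
(again not nmigen; shl_full, shl_clamped7 and the random test ranges are
assumptions for illustration only) shows that a shift amount clamped to
0..64, which fits in 7 bits, gives the same 64-bit result as the
unclamped shift:

```python
import random

WIDTH = 64
MASK = (1 << WIDTH) - 1


def shl_full(a, b):
    """Reference: shift by the full amount b, keep the low 64 bits."""
    return (a << b) & MASK


def shl_clamped7(a, b):
    """Shift by MIN(b, 64); the clamped amount always fits in 7 bits."""
    b7 = min(b, WIDTH)
    assert b7 < (1 << 7)
    return (a << b7) & MASK


random.seed(0)
for _ in range(1000):
    a = random.getrandbits(WIDTH)
    # shift amounts up to 1023 keep the reference shift cheap to compute
    b = random.getrandbits(10)
    assert shl_full(a, b) == shl_clamped7(a, b)
```

the 6-bit encodings mentioned above (0b000000 meaning 64, or adding 1 to
B) are deliberately not modelled here.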