[libre-riscv-dev] [Bug 173] dynamic partitioned "shift"

bugzilla-daemon at libre-riscv.org
Wed Feb 12 17:22:17 GMT 2020


http://bugs.libre-riscv.org/show_bug.cgi?id=173

--- Comment #5 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Michael Nolan from comment #3)
> I've got good news and bad news. The good news is I have a dynamic shifter
> working in part_shift/part_shift_dynamic.py. 

superb.

> The bad news is it works out to
> 8500 gates to do a 64-bit, 8-partition shifter (ouch!).

w00t! :)

> I currently have a gigantic switch statement to select which row of the
> matrix is needed for each element (similar to the old part_cmp), so
> optimizing this would probably help greatly with the number of gates needed.

the yosys graph is wonderful: like a dog munched your spaghetti bolognaise
dinner then threw it up when you held the dog up at waist-level and did a
nice pirouette :)

what would reduce the number of gates significantly is getting the size
of b down, in the matrices.  see "question: b1" here:
https://libre-riscv.org/3d_gpu/architecture/dynamic_simd/shift/

if you look just at the last... err... column (where you are creating
the outputs that you know are going to go into the MSB-partition. MSP?
Most Significant Partition?):

in that case, if that partition is (let's say) 8 bits wide, you *know*
that anything that shifts by greater than 8 is always, always, always
going to result in that "contribution" from the shift-matrix being zero.

therefore, what you can do is *truncate* the B-input to 3 bits!

but... butbutbut, that's not quite the full story, because actually
what you want is

    MIN(B[partitionbits], 8).

NOT, repeat, NOT:

    B[partitionbits] & (0b111) # (8-1 in binary == 0b111 mask)
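
to make the difference concrete, here is a minimal sketch in plain python
(it is not the nmigen code from part_shift_dynamic.py).  the names
shift_in_partition, clamped_shift and masked_shift are made up for
illustration, and an 8-bit partition is assumed:

```python
WIDTH = 8  # assumed partition width in bits (illustrative)

def shift_in_partition(a, b, width=WIDTH):
    """Reference: shift a left by b, keeping only the low `width` bits."""
    return (a << b) & ((1 << width) - 1)

def clamped_shift(a, b, width=WIDTH):
    """MIN(B, width): any over-wide shift saturates to a full shift-out."""
    return shift_in_partition(a, min(b, width), width)

def masked_shift(a, b, width=WIDTH):
    """B & (width-1): WRONG, a shift by 8 wraps round to a shift by 0."""
    return shift_in_partition(a, b & (width - 1), width)

a = 0b10110101
for b in (3, 8, 9, 16):
    print(b,
          bin(shift_in_partition(a, b)),  # what we actually want
          bin(clamped_shift(a, b)),       # matches the reference
          bin(masked_shift(a, b)))        # differs once b >= 8
```

one consequence: the clamped value ranges over 0..8 inclusive, so it
needs 4 bits to represent, not 3.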

even when you are dealing with the LSB-partition (LSP? Least
Significant Partition), you definitely do not want "A << B",
you want "A << B[log2(len(output))-1:0]"

"A << B" will try to generate 64-64 shifters (64-bit data, 64-bit shift
amount).

A << B[trunc] will generate (correctly) 64-7 shifters, because any shift
amount greater than 0b1000000 is *equivalent* to 0b1000000.
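
the equivalence can be checked with a small plain-python model.  this is
only a sketch under stated assumptions: a 64-bit output, and the names
shl_full and shl_clamped are hypothetical, not taken from the nmigen
source:

```python
WIDTH = 64                 # assumed output width in bits
MASK = (1 << WIDTH) - 1    # keeps only the low 64 bits of a result

def shl_full(a, b):
    """Shift by the whole of B, however large it is."""
    return (a << b) & MASK

def shl_clamped(a, b):
    """Shift by MIN(B, 64): the shift amount now fits in 7 bits."""
    return (a << min(b, WIDTH)) & MASK

a = 0xDEADBEEFCAFEF00D
# every shift amount of 64 or more gives the same (all-zero) result,
# so clamping B to 64 does not change the output
same = all(shl_full(a, b) == shl_clamped(a, b) for b in range(0, 4096))
print(same, WIDTH.bit_length())
```

WIDTH.bit_length() here is the number of bits needed to hold the clamped
shift amount, which is where the 7 comes from.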

some shifters take 6 bits (not 7) and assume that 0b000000 actually means
"0b1000000", others will add 1 to the B operand...

... let's not worry about that for now, though, and focus on getting
the B inputs truncated somewhat?
