[Libre-soc-dev] v3.1B prefix

Luke Kenneth Casson Leighton lkcl at lkcl.net
Mon Dec 14 07:18:13 GMT 2020


On 12/14/20, Alexandre Oliva <oliva at gnu.org> wrote:

> 32-bit uncompressed instructions: 109169
> 16-bit compressed instructions: 7732
> 16-imm compressed-mode instructions: 14509
> 10-bit compressed instructions: 4057
>
> 10-bit mode-switching nops: 3398
> 10-bit mode-switching nops for imm-16: 10821
> 16-bit mode-switching nops after imm-16: 1876
>
> 10-bit nop+16-imm pairs above, backtracked to 32-bit: 9171
>
> Compressed size estimate: 521462
> Original size: 541868
> Compressed/original ratio: 0.962341

drat.  ah btw was that with the better reg nums i found?  i used
insn-histogram.py and it gave about a 3% improvement.  just committed
it, although i just realised that the 4 reg versions need to be based
on the best 4 *relevant* regs, not the global top 4 regs.
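
A minimal sketch of the difference between the two selections, for
illustration only: the `uses` list of (opcode, register) pairs and both
function names are hypothetical and are not taken from insn-histogram.py.
It counts register usage once globally and once per opcode, then picks
the top N in each case.

```python
from collections import Counter, defaultdict

def top_regs_global(uses, n=4):
    """Top-n registers counted across every instruction."""
    counts = Counter(reg for _, reg in uses)
    return [reg for reg, _ in counts.most_common(n)]

def top_regs_per_op(uses, n=4):
    """Top-n registers counted separately for each opcode."""
    per_op = defaultdict(Counter)
    for op, reg in uses:
        per_op[op][reg] += 1
    return {op: [reg for reg, _ in c.most_common(n)]
            for op, c in per_op.items()}

# made-up data: r1 dominates overall, but addi mostly uses r3
uses = [("addi", 3)] * 3 + [("lwz", 1)] * 4 + [("addi", 1)]
print(top_regs_global(uses, n=1))
print(top_regs_per_op(uses, n=1))
```

With this made-up data the global pick and the per-opcode pick for addi
disagree, which is the case the 4 reg versions would need to handle.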

> Nearly 20% of the instructions can be compressed,

interesting.  that would be great if it could be met all the time.

> lost because of the need for explicit mode switching.  We just don't
> have enough compressible insn density to get much better than
> break-even.

that was always the concern with the versions that used the 10bit to
do a "countdown" (next N ops are 16bit).

what are the statistics on the number of consecutive Compressed opportunities?

if the number of consecutive C ops averages above 4 then it's worth
wasting one 16bit slot just to set 16bit mode for up to N instructions.
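
One way to gather that statistic, as a rough sketch: given a
per-instruction flag saying whether each op is compressible (a
hypothetical input, not something the existing tooling is known to
emit), collect the length of every maximal run of consecutive
compressible ops and compare the average against the threshold of 4
mentioned above.

```python
from itertools import groupby

def compressible_runs(flags):
    """Lengths of each maximal run of consecutive True flags."""
    return [sum(1 for _ in group) for key, group in groupby(flags) if key]

def run_stats(flags, threshold=4):
    """Summarise runs: how many, their mean length, how many exceed threshold."""
    runs = compressible_runs(flags)
    if not runs:
        return {"runs": 0, "mean": 0.0, "above_threshold": 0}
    return {
        "runs": len(runs),
        "mean": sum(runs) / len(runs),
        "above_threshold": sum(1 for r in runs if r > threshold),
    }

# made-up instruction stream: True means "could be encoded as 16bit"
flags = [True, True, False, True, True, True, True, True,
         False, False, True]
print(compressible_runs(flags))
print(run_stats(flags))
```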



> But then again, there are too many uncompressible insns among them to
> make the mode switching worth it.

damn damn damn.  i suspect this is why VLE invented an entirely new encoding.


> My hunch is that we'd have to more than double the percentage of
> compressible insns to make compressed mode shine.

this sounds about right.

i have an idea though.

what if we said that there were 4 bits per reg (16 possibles), and
that the transition to 16bit was a "countdown" of up to 8 ops?

or, because there are 10 bits available, how about 3 3 3 so:

* up to 8 16bit ops
* up to 8 32bit ops
* up to 8 16bit ops, then finally drop out

the 2 bits currently taken with N and M can be assumed to be allocated
to increasing 3bit reg#s to 4 bit for now.
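
The 3 3 3 split could be sketched as below.  The field order, the
choice to store each count minus one so that 3 bits cover 1..8, and
leaving the tenth bit unused are all assumptions made for illustration,
not a settled encoding.

```python
def pack_333(n16_first, n32, n16_last):
    """Pack three counts (each 1..8) into a 9-bit value, stored as count-1."""
    for n in (n16_first, n32, n16_last):
        if not 1 <= n <= 8:
            raise ValueError("each count must be in 1..8")
    return ((n16_first - 1) << 6) | ((n32 - 1) << 3) | (n16_last - 1)

def unpack_333(field):
    """Recover the three counts from the packed 9-bit value."""
    if not 0 <= field < (1 << 9):
        raise ValueError("field must fit in 9 bits")
    return (((field >> 6) & 7) + 1,
            ((field >> 3) & 7) + 1,
            (field & 7) + 1)

def schedule(field):
    """Expand the field into the instruction widths that would follow it."""
    first, middle, last = unpack_333(field)
    return [16] * first + [32] * middle + [16] * last

field = pack_333(2, 1, 1)
print(unpack_333(field))
print(schedule(field))
```

Here `schedule` just lists the widths of the ops that follow the
mode-switching op, after which decoding would drop back out to 32bit.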

yes it is not perfect: some ops are src1 src2 dest, so they can't all be 4 bit.

l.


