[libre-riscv-dev] Some general instruction ideas
whygee at f-cpu.org
Fri Jan 24 15:03:12 GMT 2020
On 2020-01-24 14:44, Lauri Kasanen wrote:
> Hi,
>
> I wanted to float some general instruction ideas, now that things seem
> to be picking up. I've mentioned some to Luke previously. These have
> use in video, but also in general.
>
> - altivec's vec_perm. It's a byte shuffle with three input regs and one
> output. It's exceedingly useful, more powerful than any of x86's
> shuffles, and I believe it should be copied as-is.
very powerful indeed !
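For readers who have not used it, here is a rough C model of what such a three-input byte permute does. This is only a sketch of the semantics as I understand them from AltiVec (16-byte vectors, low 5 bits of each control byte used as the index); the name perm_bytes and the array-based interface are made up for illustration, not a real intrinsic or a proposed encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16  /* assumed vector width, as in AltiVec */

/* Hypothetical helper, not a real intrinsic: models a three-input byte
 * permute. Each control byte selects one byte from the 32-byte
 * concatenation a:b; only its low 5 bits are used. */
static void perm_bytes(uint8_t out[VEC_BYTES],
                       const uint8_t a[VEC_BYTES],
                       const uint8_t b[VEC_BYTES],
                       const uint8_t ctl[VEC_BYTES])
{
    uint8_t tmp[VEC_BYTES];  /* lets out alias one of the inputs safely */

    assert(out != NULL && a != NULL && b != NULL && ctl != NULL);
    for (int i = 0; i < VEC_BYTES; i++) {
        uint8_t sel = ctl[i] & 0x1F;  /* index 0..31 into a:b */
        tmp[i] = (sel < VEC_BYTES) ? a[sel] : b[sel - VEC_BYTES];
    }
    for (int i = 0; i < VEC_BYTES; i++)
        out[i] = tmp[i];
}
```

With a control vector of 31, 30, ..., 16 this reverses b; with arbitrary indices it interleaves, broadcasts or extracts bytes from both inputs in one operation.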
> - saturated versions of add/sub/mul/narrow/etc. Saves those manual
> checks. Beyond video, often used in image processing.
and sound processing, and many other fields...
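To show the "manual checks" that a saturating instruction would replace, here is a minimal scalar sketch in C. The function names and the chosen widths (8-bit unsigned, 16-bit signed, 32-to-16 narrowing) are my own picks for illustration, not a proposed instruction list.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 8-bit saturating add: clamps at 255 instead of wrapping. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Signed 16-bit saturating add: clamps to [-32768, 32767]. */
static int16_t add_sat_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + b;  /* cannot overflow in 32 bits */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Saturating narrow from 32 to 16 bits, typical after a multiply. */
static int16_t narrow_sat_s32(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Small usage example: each call would wrap around without the clamp. */
static void check_saturation(void)
{
    assert(add_sat_u8(200, 100) == 255);
    assert(add_sat_s16(30000, 10000) == 32767);
    assert(narrow_sat_s32(-100000) == -32768);
}
```

In software every sample pays for the widening and the compares; a saturating instruction folds them into the add itself.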
> - memcpy. I remember a Linus quote on what instructions he'd like to
> see, and he said memcpy and memset. I know it's not very RISC, but it's
> highly useful, and a hardware loop is always faster than sw.
I'm not sure...
Is it something like the STOS(B|W|D|Q) / LODS(B|W|D|Q) instructions of x86?
So basically a store or load with the index incremented/decremented in
parallel?
It IS possible in RISC because "canonical MIPS" performs the ADD update in
series with the load/store. Since F-CPU, my architectures work like that as
well: you have to have the destination/source memory address in a register,
and the instruction computes the address for the NEXT access in /parallel/.
So I'd say yay.
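A rough C model of what I mean, as a sketch only: the access uses the register value unchanged as the address, and the update of that register does not depend on the access, so hardware can do both at the same time. The names load_postinc, store_postinc and copy_postinc are invented for this example and do not correspond to actual F-CPU or RISC-V instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load through the address register as is, then step the register.
 * The two actions are independent, which is what allows them to run
 * in parallel in hardware. */
static uint8_t load_postinc(const uint8_t **addr_reg, ptrdiff_t step)
{
    uint8_t value = **addr_reg;  /* address needs no prior ADD */
    *addr_reg += step;           /* address for the NEXT access */
    return value;
}

/* Store counterpart; a negative step gives the decrementing form. */
static void store_postinc(uint8_t **addr_reg, ptrdiff_t step, uint8_t value)
{
    **addr_reg = value;
    *addr_reg += step;
}

/* A byte copy loop then needs no separate pointer arithmetic. */
static void copy_postinc(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        store_postinc(&dst, 1, load_postinc(&src, 1));
}
```

The loop body is one load and one store; the two pointer updates come for free with the accesses.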
> One cpu I'm familiar with, 65816, has two such memcpy instructions
> (forward and backward, byte units). They have no hidden state, and are
> interruptable between every byte. By giving them overlapping areas,
> they work as memset, with an arbitrary-sized source pattern, not just
> limited to a byte or int.
>
> Such would have immediate use in decompression, and widely in general.
compression/decompression (and others) need specific instructions for
bitstream insertion/extraction, which is quite a delicate subject
(I'm preparing/writing an article about this right now for Linux
Magazine France)
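Coming back to the overlapping-copy trick Lauri describes, here is how I read it, as a small C sketch: a strictly forward, byte-by-byte copy whose destination starts pat_len bytes after its source keeps re-reading bytes it has just written, so the first pat_len bytes get replicated over the rest of the buffer. The function names are made up, and this only models the behaviour; it is not 65816 code. Note that memcpy() itself must not be used this way, since overlapping areas are undefined for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Strictly forward byte copy with no hidden state: after each byte the
 * whole progress is in (dst, src, remaining count), so it could be
 * interrupted and resumed between any two bytes. */
static void copy_forward_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Fill buf with repetitions of its own first pat_len bytes by letting
 * source and destination overlap on purpose. */
static void fill_pattern(uint8_t *buf, size_t buf_len, size_t pat_len)
{
    assert(buf != NULL && pat_len > 0 && pat_len <= buf_len);
    copy_forward_bytes(buf + pat_len, buf, buf_len - pat_len);
}
```

For instance, an 8-byte buffer that starts with "abc" ends up as "abcabcab" after fill_pattern(buf, 8, 3). LZ-style decompressors rely on exactly this behaviour when a match overlaps the bytes being produced.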
> It would also give a clear answer for "what is the fastest memcpy",
> heh.
it will always be optimised microcoded versions, at least for medium to
large blocks.
> - Lauri
yg