[libre-riscv-dev] SV auto-width
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Wed Jan 9 04:34:08 GMT 2019
On Wednesday, January 9, 2019, Jacob Lifshay <programmerjake at gmail.com>
wrote:
> We will still need a way to convert between vectors with different element
> sizes that's faster than ld/st.
Ah good catch.
I wonder if that means leaving elwidth adjustment in for the twin-predication
cases (MV, FCVT, LD, ST etc.), as the engine behaves differently there anyway.
It is the single-predication scenario on standard arithmetic ops that is
problematic, as the auto-width conversion rules are heuristic, and break
down for special ops such as a 17-bit add followed by a 1-bit shift, typically
used in audio processing to avoid loss of precision when adding two 16-bit
numbers.
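
To make the 17-bit case concrete, here is a minimal Python sketch. It assumes
unsigned 16-bit samples, and the function names are purely illustrative (they
are not part of SV). It compares an add that is truncated to the 16-bit element
width before the shift with one that keeps a 17-bit intermediate:

```python
MASK16 = 0xFFFF    # 16-bit element width
MASK17 = 0x1FFFF   # 17-bit intermediate: one extra bit for the carry


def avg_u16_narrow(a, b):
    """Add at 16-bit width, then shift: the carry out of bit 15 is lost."""
    return ((a + b) & MASK16) >> 1


def avg_u16_wide(a, b):
    """Add at 17-bit width, then shift: the carry survives into bit 15."""
    return ((a + b) & MASK17) >> 1


a, b = 0xC000, 0x8000
print(hex(avg_u16_narrow(a, b)))  # 0x2000 (wrong: carry dropped)
print(hex(avg_u16_wide(a, b)))    # 0xa000 (correct average)
```

A width rule that only ever picks 16 or 32 bits cannot express the 17-bit
intermediate directly, which is where the heuristics stop helping.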
The twin-predication case with elwidth override will be challenging to
parallelise. It covers things like VINSERT, VSCATTER and VGATHER: the pattern
(algorithm) in instruction issue applies not just to MV, which generates the
vector ops mentioned, but also to FCVT, LD and ST.
I believe it may be ok to slow vector op issue down to sequential ops in
the case where elwidth is different for target and src. It is unusual
enough as it is.
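
As a rough illustration of what "sequential" means here, below is a Python
sketch of a twin-predicated MV with a narrower destination element width,
issuing one element per loop iteration. It is a simplified model rather than
the actual issue engine: the function name, the list-based register model and
the truncate-on-narrowing behaviour are all assumptions made for the example.

```python
def twin_pred_mv(src, src_pred, dest, dest_pred, vl, dest_bits):
    """Copy predicated src elements into predicated dest slots, one at a time.

    src_pred and dest_pred are bitmasks: bit n enables element n.
    Values are truncated to dest_bits (assumed narrowing behaviour).
    Returns the number of element operations issued.
    """
    mask = (1 << dest_bits) - 1
    i = j = issued = 0
    while True:
        # Skip source elements whose predicate bit is clear.
        while i < vl and ((src_pred >> i) & 1) == 0:
            i += 1
        # Independently skip destination slots whose predicate bit is clear.
        while j < vl and ((dest_pred >> j) & 1) == 0:
            j += 1
        if i >= vl or j >= vl:
            break
        dest[j] = src[i] & mask  # width conversion happens per element
        i += 1
        j += 1
        issued += 1
    return issued


src = [0x11111, 0x22222, 0x33333, 0x44444]
dest = [0, 0, 0, 0]
n = twin_pred_mv(src, 0b1010, dest, 0b0011, vl=4, dest_bits=16)
print(n, [hex(x) for x in dest])  # 2 ['0x2222', '0x4444', '0x0', '0x0']
```

With the source predicate selecting elements 1 and 3 and the destination
predicate selecting slots 0 and 1, this behaves like a gather; swapping the
two masks gives a scatter. The source and destination indices advance
independently, which is part of what makes a parallel version awkward.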
Jacob, are there any scenarios where single-issue-only nondefault-elwidth
vector conversion would become a serious bottleneck?
Across all 4 cores, do we need 12 billion 32-bit to 16-bit FCVTs per second,
for example? Would 3.2 GOPS be sufficient (800MHz per core, single issue)?
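
For reference, the arithmetic behind those two figures, as a small Python
sketch. The helper name is made up for illustration, and it assumes one
element conversion per issued op at peak rate:

```python
def peak_ops_per_second(cores, clock_hz, issue_width):
    """Peak rate, assuming every issue slot is filled on every cycle."""
    return cores * clock_hz * issue_width


CORES = 4
CLOCK_HZ = 800_000_000  # 800MHz per core

single_issue = peak_ops_per_second(CORES, CLOCK_HZ, 1)
print(single_issue)  # 3200000000, i.e. 3.2 GOPS

# Issue width per core that 12 billion conversions per second would need.
needed_width = 12_000_000_000 / (CORES * CLOCK_HZ)
print(needed_width)  # 3.75
```

So the 12 billion figure corresponds to just under 4 conversions per clock per
core, against 1 per clock for the sequential case.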
L.
--
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68