[Libre-soc-bugs] [Bug 864] implement parallel prefix reduction in simulator
bugzilla-daemon at libre-soc.org
Sun Jun 26 09:30:20 BST 2022
https://bugs.libre-soc.org/show_bug.cgi?id=864
--- Comment #4 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #3)
> (In reply to Jacob Lifshay from comment #2)
> > try testing with a predicate of [0,0,0,0,1,1,1,1], you'll see that element 0
> > (the expected output by the compiler) never gets written to,
>
> vec = [1, 2, 3, 4, 9, 5, 6, 8]
> prd = [0, 0, 0, 0, 1, 1, 1, 1]
>
> output: [1, 2, 3, 4, 28, 5, 14, 8]
>
> that's expected behaviour. the first four are masked-out, and
> do not get altered. the sum ends up in the first non-masked-out element
> (element 4) 9+5+6+8=28
the result not being in a statically predictable location renders predicated
reduction *all but useless* from a compiler perspective -- the compiler needs
to know which *fixed scalar* register to read the result from, the register
can't jump around when the predicate changes.
(unpredicated reduction is unaffected)
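
To make the point concrete, here is a minimal Python sketch of a predicated tree reduction with index remapping. It is written only to reproduce the vec/prd/output numbers quoted above and is not taken from the simulator; the names `predicated_tree_reduce` and `ix` are my own for illustration. It returns the vector together with the index of the element that ends up holding the result, so the dependence on the predicate is visible.

```python
import operator

def predicated_tree_reduce(vec, pred, op):
    """Tree-reduce the unmasked elements of a copy of vec.

    Hypothetical sketch: returns (new_vec, result_index), where
    result_index is the element that ends up holding the reduction.
    """
    vec = list(vec)
    vl = len(vec)
    ix = list(range(vl))  # ix[i]: element currently standing in for slot i
    step = 1
    while step < vl:
        step *= 2
        for i in range(0, vl, step):
            other = i + step // 2
            if other >= vl:
                continue
            ci, oi = ix[i], ix[other]
            if pred[ci] and pred[oi]:
                # both sides active: combine into the left-hand element
                vec[ci] = op(vec[ci], vec[oi])
            elif pred[oi]:
                # only the right side is active: it now stands in for slot i
                ix[i] = oi
    return vec, ix[0]

vec = [1, 2, 3, 4, 9, 5, 6, 8]

out, where = predicated_tree_reduce(vec, [0, 0, 0, 0, 1, 1, 1, 1], operator.add)
print(out, where)    # [1, 2, 3, 4, 28, 5, 14, 8] 4

out2, where2 = predicated_tree_reduce(vec, [0, 0, 1, 1, 1, 1, 1, 1], operator.add)
print(out2[where2], where2)    # 35 2

out3, where3 = predicated_tree_reduce(vec, [1] * 8, operator.add)
print(out3[where3], where3)    # 38 0
```

With the same input vector, the result lands in element 4, 2, or 0 depending only on the runtime predicate, which is exactly what a compiler cannot encode as a fixed scalar register.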
since you obviously don't like my proposed solution (moving is imho the best
fully correct solution -- not all operations have an identity, so substituting
masked-off lanes with the identity element doesn't really work), please propose
some yourself.
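
As a rough illustration of the identity-substitution concern, the sketch below fills masked-off lanes with a constant and then runs an unpredicated tree reduction, so the result always lands in element 0. The helper names are hypothetical, and subtraction is my own choice of an operation with no two-sided identity; the comment above does not name a specific one.

```python
import operator

def tree_reduce(vec, op):
    """Unpredicated tree reduction on a copy; result ends up in element 0."""
    vec = list(vec)
    step = 1
    while step < len(vec):
        step *= 2
        for i in range(0, len(vec), step):
            other = i + step // 2
            if other < len(vec):
                vec[i] = op(vec[i], vec[other])
    return vec[0]

def reduce_substituting(vec, pred, op, fill):
    """Replace masked-off lanes with `fill`, then reduce without a predicate."""
    filled = [v if p else fill for v, p in zip(vec, pred)]
    return tree_reduce(filled, op)

vec = [1, 2, 3, 4, 9, 5, 6, 8]
prd = [0, 0, 0, 0, 1, 1, 1, 1]

# 0 is a two-sided identity for add, so substitution gives the active-lane sum.
add_result = reduce_substituting(vec, prd, operator.add, 0)
print(add_result)    # 28

# 0 is only a right identity for sub (a - 0 == a, but 0 - a != a), so the
# substituted lanes change the answer: the active lanes alone reduce to
# (9 - 5) - (6 - 8) == 6.
sub_result = reduce_substituting(vec, prd, operator.sub, 0)
print(sub_result)    # -6
```

For add the substitution is harmless, but for an operation like this one no fill value makes the masked-off lanes disappear from the result.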