[libre-riscv-dev] pipeline stages controlling delays

Mon Apr 8 02:27:41 BST 2019

jacob, i managed to get BreakReadyChainStage down to six lines:

        self.m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
        self.m.d.comb += self.p._o_ready.eq(~buf_full)
        self.m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)

        odata = Mux(buf_full, buf, self.stage.process(self.p.i_data))
        self.m.d.comb += eq(self.n.o_data, odata)
        self.m.d.sync += eq(buf, self.n.o_data)

here's the original:

        m.d.sync += self.buffer_full.eq(~self.succ.ready_in
                                        & (self.pred.valid_in
                                           | self.buffer_full))
        m.d.comb += self.succ.valid_out.eq(self.buffer_full
                                           | self.pred.valid_in)
        m.d.comb += self.pred.ready_out.eq(~self.buffer_full)

        def visitor(name: str,
                    pred_tuple: Tuple[Signal, Direction],
                    succ_tuple: Tuple[Signal, Direction],
                    buffer_tuple: Tuple[Signal, Direction]) -> None:
            pred = pred_tuple[0]
            succ = succ_tuple[0]
            buffer = buffer_tuple[0]
            m.d.comb += succ.eq(Mux(self.buffer_full, buffer, pred))
            m.d.sync += buffer.eq(succ)

        visit_records(visitor,
                      [self.pred.data_in,
                       self.succ.data_out,
                       self.buffer])

odata is just a python temporary variable, and eq replaces the use of
visitor.  setting valid_out to (buffer_full | valid_in) is the exact
same logic used in the sync block to set buffer_full, so rather than
do that logic twice, i just passed in valid_out in its place.
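
as a quick sanity check of that substitution, here is a minimal
plain-python sketch (no nmigen involved) that compares the two
next-state expressions for buffer_full over all eight input
combinations.  the function names are made up for illustration and
do not exist in the codebase:

```python
from itertools import product


def next_full_original(buffer_full, valid_in, ready_in):
    # original form: ~ready_in & (valid_in | buffer_full)
    return (not ready_in) and (valid_in or buffer_full)


def next_full_short(buffer_full, valid_in, ready_in):
    # shortened form: compute valid_out once, then reuse it
    valid_out = buffer_full or valid_in
    return (not ready_in) and valid_out


def forms_agree():
    # exhaustive check over every combination of the three 1-bit inputs
    return all(next_full_original(*bits) == next_full_short(*bits)
               for bits in product([False, True], repeat=3))
```

forms_agree() returning True only shows that the two boolean
expressions match; it says nothing about how the synthesised logic
ends up being laid out.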

i particularly like how UnbufferedPipeline2 (BreakReadyChainStage)
does the ready/valid computation in *parallel* with the data creation.
if you look at the graphviz of UnbufferedPipeline, it's clearly a
dependency (pv)
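
to make the handshake easier to follow, here is a rough cycle-level
model of the six-line version in plain python.  it is only a sketch:
BreakReadyChainModel and run are hypothetical names, the producer and
consumer behaviour is an assumption, and none of it is the nmigen
code itself.  note that valid_out and ready_out are computed without
looking at the data at all:

```python
class BreakReadyChainModel:
    """hypothetical pure-python stand-in for the six-line stage"""

    def __init__(self, process=lambda x: x):
        self.process = process   # plays the role of stage.process
        self.buf_full = False    # registered: is the buffer occupied?
        self.buf = None          # registered: buffered output data

    def comb(self, valid_in, data_in):
        # combinatorial outputs; ready_out depends only on buf_full
        valid_out = self.buf_full or valid_in
        ready_out = not self.buf_full
        data_out = self.buf if self.buf_full else self.process(data_in)
        return valid_out, ready_out, data_out

    def clock(self, valid_in, data_in, ready_in):
        # one clock edge: sample the outputs, then update the registers
        valid_out, ready_out, data_out = self.comb(valid_in, data_in)
        self.buf_full = (not ready_in) and valid_out
        self.buf = data_out
        return valid_out, ready_out, data_out


def run(model, items, ready_pattern):
    """feed items in, one cycle per ready_pattern entry, and return
    the data the downstream side accepted"""
    pending = list(items)
    received = []
    for ready_in in ready_pattern:
        valid_in = bool(pending)
        data_in = pending[0] if pending else 0   # dummy data when idle
        valid_out, ready_out, data_out = model.clock(valid_in, data_in,
                                                     ready_in)
        if valid_in and ready_out:
            pending.pop(0)                # stage accepted the input
        if valid_out and ready_in:
            received.append(data_out)     # downstream accepted output
    return received
```

for example, stalling the consumer for two cycles in the middle of a
three-item stream should still deliver every item exactly once and in
order, with the stalled item held in buf:

    run(BreakReadyChainModel(lambda x: x * 10), [1, 2, 3],
        [True, False, False, True, True, True, True])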

even more fascinating is that with the algorithm i took last night
from zipcpu hbdeword (which is the algorithm deployed in Wishbone and
AXI4) to create BufferedPipeline2, there are *no loops at all* except
on the latches for r_busy and the result.

i see no reason to keep UnbufferedPipeline around, what do you think?

we also need way, way better names for these *Pipeline classes, as
they're not actual pipelines: they're a component *of* a Pipeline (a
Control handler).  a pipeline only actually results from combining a
stage (Data Producer) *with* one of these things (a Control handler).

l.
-------------- next part --------------
Attachments (image/png, scrubbed from the archive):

2019-04-08_02-01.png (87255 bytes)
URL: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/attachments/20190408/b9637846/attachment-0003.png>

2019-04-08_02-07.png (80349 bytes)
URL: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/attachments/20190408/b9637846/attachment-0004.png>

2019-04-08_02-14.png (81311 bytes)
URL: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/attachments/20190408/b9637846/attachment-0005.png>