[libre-riscv-dev] buffered pipeline

Fri Apr 5 09:41:21 BST 2019

On Fri, Apr 5, 2019 at 7:12 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> So, I think we should make a shared git repo that has all the pipeline
> utility code separated out so that it can be used as a submodule for our
> other code. This allows us to easily synchronize our code between different
> repos. Alternatively, we should merge all our nmigen repos into a monorepo.
> In either case, the ieee754fpu repo is the wrong spot for it.

 yehyeh, it is.  i've just been throwing code in there continuously as
a convenience, trying to get (and keep) things working.

 any suggestions on a repo name and a top-level python module name?
personally i like _short_ python module names, to avoid the mess that
happens with multi-level imports.  "iomgr" as the top-level python
name, that sort of thing.  iomgr because, really, the pipeline API is
actually managing IO (or, data).

> I think we should have it (or at least an automatically updated mirror) on
> debian salsa because of the nicer interface for viewing code and because of
> the CI capabilities, so we could set it up to automatically run the unit
> tests each time we push code. Note that debian salsa currently doesn't have
> very much test runner infrastructure, so we shouldn't be running long
> tests, so that's why I haven't set up Kazan to automatically run unit tests
> since building LLVM takes quite a while.

 ok that's a good idea.  i prefer a mirror, and running the unit tests
automatically on salsa is good, too.

> Also, we need to decide on a subset of the pipeline API that is stable
> enough that we can start building significant quantities of code on it
> since an API that's constantly changing in incompatible ways doesn't help.

 i don't believe singlepipe.py and multipipe.py need significant
changes, or if they are changed it has a huge knock-on effect on the
unit tests (of which there are quite a lot) and on the FPADD code.

 pipeline.py is nowhere near ready to be declared usable: i haven't
even got unit tests in place for it yet.

> I think we should proceed with the explicit Stage classes since they are
> much easier to compose using StageGroup (like the *Pipeline classes that
> luke wrote, except it has the Stage interface and doesn't insert registers
> between stages) and since they are much less complicated than getting the
> user-friendly __getattr__/__setattr__ implementation to work.

 agreed.

> I'm fine with switching to overridable functions that define the input and
> output shapes and using the eq function that luke wrote but I think we need
> ready/valid to be passed through Stage instances to support stages that are
> entire pipelines or FSMs.

 take a look at the FPADD code.  trace it through from
test_fpadd_pipe.py.  it's *entirely* a multi-stage pipeline
arrangement (including with fan-in and fan-out) and *already passes
through ready/valid*.  it does so underneath the Stages (Data Handlers
/ Managers), without the Stages even knowing that that's what's going
on.
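
 to illustrate the point about ready/valid being handled underneath
the Stages, here is a minimal sketch in plain python (a toy
cycle-by-cycle model, not nmigen and not the actual singlepipe.py
code).  the names Square, PipeReg and simulate are made up for this
example, and the single-register handshake shown is only an assumption
about one simple way of doing it.  the thing to notice is that the
stage only has a process() function and never sees ready or valid.

```python
class Square:
    """Hypothetical stage: pure data processing, no handshake signals."""
    def process(self, i):
        return i * i


class PipeReg:
    """Toy one-register pipeline wrapper that owns the ready/valid handshake."""
    def __init__(self, stage):
        self.stage = stage
        self.valid = False  # True while the output register holds a result
        self.data = None

    def ready_in(self, ready_out):
        # Input can be accepted if the register is empty or is drained this cycle.
        return (not self.valid) or ready_out

    def tick(self, valid_in, data_in, ready_out):
        """Advance one clock edge."""
        if valid_in and self.ready_in(ready_out):
            self.data = self.stage.process(data_in)
            self.valid = True
        elif ready_out:
            self.valid = False  # result consumed, nothing new arrived
        # Otherwise downstream is stalled: hold the current result.


def simulate(pipe, inputs, stall_pattern):
    """Feed inputs through pipe; stall_pattern[i] True means downstream not ready."""
    assert not all(stall_pattern), "downstream must be ready at least sometimes"
    pending = list(inputs)
    received = []
    cycle = 0
    while len(received) < len(inputs):
        ready_out = not stall_pattern[cycle % len(stall_pattern)]
        # The consumer samples the output before the clock edge.
        if pipe.valid and ready_out:
            received.append(pipe.data)
        valid_in = bool(pending)
        data_in = pending[0] if pending else None
        # The producer only drops its item once the pipe has accepted it.
        if valid_in and pipe.ready_in(ready_out):
            pending.pop(0)
        pipe.tick(valid_in, data_in, ready_out)
        cycle += 1
    return received


results = simulate(PipeReg(Square()), [1, 2, 3], [True, False])
```

 even with the downstream side stalling every other cycle, no data is
lost or duplicated, and Square itself contains no handshake logic at
all.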

 supporting FSMs needs a bit more thought.  i believe it will be
possible to deploy the multi-fan-in and multi-fan-out pipeline muxers
to actually emulate FSM data routing (whilst, incredibly, still
maintaining data integrity!) however it'll be complicated as hell
underneath so i haven't fully investigated it yet.

> We can use CombStage (or similar) for boilerplate
> code for simple combinatorial stages.

 done already, jacob.  grep -r "StageChain" in the fpu code and you'll
see that it's already in use, and already works... and *does not need
ready/valid to be exposed*.

 the main glue logic module is fpadd/pipeline.py which pulls
everything together.  that's where the 3 stage pipeline (actual
pipeline, not combinatorial chain) is created, and it is 3-stage even
though there is a fan-in and fan-out in there to make it a Reservation
Station / FunctionUnit, because the fan-in and fan-out are both
combinatorial.

 StageChains are used in FPNormToPack, FPAddSpecialCasesDeNorm and
FPAddAlignSingleAdd to combinatorially-chain together what _used_ to
be FSM states in the original jon dawson code.

 because StageChain presents the exact same Stage API as the Stages it
combinatorially-chains together, the StageChain may be dropped into
*either* a Buffered *or* Unbuffered Pipeline instance... and the
result is: a pipeline stage.
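
 a rough sketch of that idea in plain python, for illustration only.
this is not the real nmigen StageChain: the method names ispec, ospec
and process, and the example stages AddOne and Double, are assumptions
made for this example.  it shows a chain exposing the same interface
as the stages it contains, so that anything which accepts a stage also
accepts a chain (including another chain).

```python
class AddOne:
    """Hypothetical combinatorial stage: adds one to its input."""
    def ispec(self):
        return int  # stand-in for an input data specification

    def ospec(self):
        return int  # stand-in for an output data specification

    def process(self, i):
        return i + 1


class Double:
    """Hypothetical combinatorial stage: doubles its input."""
    def ispec(self):
        return int

    def ospec(self):
        return int

    def process(self, i):
        return i * 2


class StageChain:
    """Toy chain: presents the same interface as the stages it contains."""
    def __init__(self, chain):
        assert len(chain) > 0, "a chain needs at least one stage"
        self.chain = list(chain)

    def ispec(self):
        # The chain's input format is that of its first stage.
        return self.chain[0].ispec()

    def ospec(self):
        # The chain's output format is that of its last stage.
        return self.chain[-1].ospec()

    def process(self, i):
        # Feed each stage's output straight into the next, with no registers.
        for stage in self.chain:
            i = stage.process(i)
        return i


inner = StageChain([AddOne(), Double()])   # (i + 1) * 2
outer = StageChain([inner, AddOne()])      # a chain used as a stage
```

 because outer only ever calls ispec, ospec and process on its
members, it cannot tell that inner is itself a chain, which is the
property that lets a chain be handed to whatever pipeline wrapper
would otherwise take a single stage.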

 the FP unit is much further along in infrastructure terms than you
may be expecting, jacob.  i'd like the fmul and fdiv FSMs to be
morphed ever so slightly so that they can be dropped into a "Stage"
(or, 3 Stages: one for get_ops, one for data handling, one for put_z).
that's why the data valid/ready is needed, because by isolating the
multi-clock FSM code behind the "Stage" API....

l.