[libre-riscv-dev] Divider pipeline structure

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat Feb 2 03:19:39 GMT 2019


---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sat, Feb 2, 2019 at 1:19 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> I propose having a radix-4 pipelined div/rem/sqrt/rsqrt unit with 16 stages
> (plus a few for fp rounding and misc stuff) that is 64 bits wide and can be
> partitioned into 2x32, 4x16, and 8x8

 like it.  the partitioning is going to need to be a general
requirement for all ALUs (the 8/16 ALUs are for non-radix-aligned
"finishing of vectors").
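
 As a rough behavioural illustration of what "partitioned" means for a
64-bit ALU, here is a minimal Python sketch of an adder whose carry
chain is cut at lane boundaries.  The name partitioned_add and its
arguments are hypothetical, not taken from any existing code; real
hardware would gate the carries rather than loop over lanes.

```python
MASK64 = (1 << 64) - 1

def partitioned_add(a, b, lane_bits):
    """Add two 64-bit words as independent lanes of lane_bits each.

    Hypothetical behavioural model: a carry out of one lane is dropped
    instead of propagating into the next lane.
    """
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane_bits must be 8, 16, 32 or 64")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        # keep only the low lane_bits: the carry out of the lane is discarded
        result |= (lane_sum & lane_mask) << shift
    return result & MASK64
```

 With 8-bit lanes, 0xFF + 0x01 wraps to 0x00 inside its lane; with
16-bit lanes the same inputs give 0x0100, because the carry is allowed
to cross the byte boundary.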

> with the plan being that it can be
> shared between 2 or 4 cores

 we can't share between cores: resource contention == spectre.  *sigh*.

> and would support 64-bit operations by passing
> through the pipeline twice?

 funny, that's how MIPS advocated a special variant of DIV that would
do 12-bit accuracy (good enough for 3D), and could be "finished off"
with a 2nd instruction to better accuracy.
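
 The "twice through the pipeline" figure can be sanity-checked with a
small sketch.  It assumes (this is not stated in the proposal) that
each stage retires log2(radix) result bits, and it ignores guard and
rounding bits and the extra fp stages.  passes_needed is a hypothetical
helper name.

```python
def passes_needed(result_bits, radix=4, stages=16):
    """Number of trips through the pipeline to produce result_bits.

    Assumption: radix is a power of two and every stage retires
    log2(radix) bits of the result.
    """
    bits_per_stage = radix.bit_length() - 1   # log2 of a power of two
    bits_per_pass = bits_per_stage * stages   # 32 for radix-4, 16 stages
    return -(-result_bits // bits_per_pass)   # ceiling division

print(passes_needed(32))  # one pass for a 32-bit result
print(passes_needed(64))  # two passes for a 64-bit result
```

 Calling the same function with radix=2 halves the bits per pass to 16,
so wider results need proportionally more trips.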

> It would implement:
> - div/rem for i8, i16, i32, and i64
> - fdiv/fsqrt/frsqrt for f16, f32, and f64
> - maybe fmod/frem for f16, f32, and f64
>
> If needed, we could have sqrt/rsqrt be radix-2 and take 2 trips through the
> pipeline for fp32, 1 for fp16, and 4 for fp64.
>
> If shared between 4 cores, it would still have a 32-bit throughput of 1/2
> operation per clock per core, which is sufficient.
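
 The 1/2 figure follows from simple arithmetic, sketched below under
the assumptions that the pipeline accepts one new operation per lane
per clock, never stalls, and is shared evenly.  per_core_throughput and
its parameters are hypothetical names.

```python
from fractions import Fraction

def per_core_throughput(pipe_bits=64, op_bits=32, cores=4, passes=1):
    """Operations per clock per core for a shared, partitioned pipeline.

    Assumptions: one new op per lane per clock, no stalls, and the
    lanes are divided evenly between the cores.
    """
    lanes = pipe_bits // op_bits          # 2 lanes of 32 bits in a 64-bit pipe
    return Fraction(lanes, cores * passes)

print(per_core_throughput())          # 1/2 for 32-bit ops shared by 4 cores
print(per_core_throughput(cores=2))   # 1 for 32-bit ops shared by 2 cores
```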

 given that routing tends to be more costly (gates and power-wise)
than the ALUs that data is routed to, and given that spectre is based
around resource contention, i'm inclined towards not having shared
ALUs.

 mitch alsup pointed out that, in the processors he designed, he always
made sure that every ALU had an adder, for example, and thus every FU
would have an adder, such that it did not matter which FU the data
went to (as far as ADD is concerned).  pages 15-18 of the 2nd chapter
of his book cover the analysis of the percentage of instructions.

 basically, FU to ALU-function (ADD, MUL etc.) is not a one-to-one
relationship, it's a many-to-many relationship with duplicated ADDs,
duplicated MULs, and in this way (a) you don't get resource
bottlenecks and (b) the amount of data routing (which is extremely
costly) is reduced.
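
 A minimal sketch of the many-to-many idea, with made-up unit names and
a made-up assignment of functions to units (none of this comes from an
actual design): every FU carries an ADD, so an ADD can be issued to
whichever FU is free, while scarcer functions exist on only some FUs.

```python
# Hypothetical FU -> supported-functions table; ADD is duplicated everywhere.
FUNCTION_UNITS = {
    "fu0": {"ADD", "MUL"},
    "fu1": {"ADD", "DIV"},
    "fu2": {"ADD", "MUL"},
}

def pick_unit(op, busy):
    """Return the first free FU that implements op, or None if all are busy."""
    for name, ops in FUNCTION_UNITS.items():
        if op in ops and name not in busy:
            return name
    return None

print(pick_unit("ADD", busy={"fu0"}))  # another FU can still take the ADD
print(pick_unit("DIV", busy={"fu1"}))  # the only DIV-capable FU is busy
```

 The second call returns None, which is the bottleneck case that
duplicating a function across FUs is meant to avoid.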

l.


