[libre-riscv-dev] [Bug 154] Cell for Dependency Matrices is needed

Tue Jan 14 01:53:23 GMT 2020

On Tuesday, January 14, 2020, <whygee at f-cpu.org> wrote:

> BTW it is detailed there:
> https://hackaday.io/project/8449-hackaday-ttlers/log/148234-direct-coupled-transistor-logic

nice.

>>> and asked Fairchild
>>> to make a specific transistor, still sold today : the gold-doped 2N2369
>>>
>> ooo so if we reeeally wanted to, we could actually reconstruct a 6600 from
>> its original transistors.
>>
>
> which would be a desperate move since the Living Computer Museum @ Seattle
> restores a couple of units (lower-performance but still Cray-zy 6400 and
> 6500)
> https://www.youtube.com/watch?v=4Zt03YsMyW4

niiice.  ok so not necessary then.

>
> But I don't think you'll want to deal with the water chiller on the roof
> and the 400Hz motor-generator.

/me silent for several seconds
*screams*
/me silent again

> But it's true that the strong restrictions on what was possible back then
> made these guys push in every direction and forced them to make the
> smartest design they could.

exactly.  the gatecount on the DMs (although one of them is called Q Table)
is just ridiculously low.

and yet it contains everything needed for OoO, register renaming, everything
(which *nobody noticed, including David Patterson*) that makes up a full-blown
superscalar OoO architecture.

> Mitch gave you a bit of insight

a bit? he did way more than a bit.  i am humbled and amazed at the insights
he provided.

he spent *six* months teaching me how to do multi-issue OoO architectures.

but, bear in mind, his experience does not just cover the 6600: he started
from that, then designed the Motorola 68000, then the 88100, then went to
work for AMD and designed the K5 and the Opteron Series (from around 2002),
then worked as an advisor on AMDGPU (i think) and finally and more
recently, designed Samsung's GPU Texturisation opcodes and their
Transcendental FP unit.

> and I'm not surprised that many people
> reinvent the wheel (I did).

or use Tomasulo, with its associated awful power consumption due to the use
of CAMs.

> Old designs are a well of inspiration and design lessons

tell me about it.

> I have difficulty "getting" what you all write about in detail, so I hope
> someday somebody will make sketches, diagrams, plans etc. to "show" what
> you have in mind :-)

i did... they're on comp.arch and in the crowdsupply git repo on
git.libre-riscv.org.  i also did some youtube videos.

however most of these are me "learning", they're not a "summary", you know
what i mean.

one of the things that is slightly irksome is:

* mitch's book chapters on the 6600 and associated improvements: we have to
respect his wishes on the distribution conditions of his copyrighted
material. he's not charging money, he simply requests acknowledgement. i am
absolutely fine with that, i just wish i could put them online.

* the improvements are done *after* translating from DCTL to IEEE logic
gates, and we found, after four months of back and forth, that actually one
of them has bugs.

yes, whoops.

* i was unable to fully and precisely replicate Thornton's or Mitch's work
in a modern context.  i had to, in some places, use combinatorial SRLatches
and in others use *register*-based (sync) SRLatches, i.e. put a DFF after
the SRLatch.

the reason for this was that i really did not entirely grasp the
paradoxical complexity-simplicity of DMs, so i experimented radically and
*literally* randomly (repeat until success) and, by the sheer bloody-minded
persistence of *literally* going through all possible boolean logic
combinations, eventually hit nmigen statements that gave the right answer
without locking up in a combinatorial loop.
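
to illustrate the difference between the two kinds of SRLatch mentioned
above, here is a minimal behavioural sketch in plain python (not nmigen, and
not the actual project code). the class name SRLatchModel, the "sync" flag
and the choice that set wins over reset are all assumptions made purely for
illustration. the point it shows: the combinatorial variant reflects s/r in
the same cycle, while the registered variant (DFF after the latch) only shows
the new value after a clock edge, which is what breaks a combinatorial loop.

```python
class SRLatchModel:
    """Hypothetical behavioural model of a 1-bit SR latch.

    sync=False: output follows s/r immediately (combinatorial).
    sync=True:  output is the stored bit only (a DFF after the latch).
    Assumption: when s and r are both 1, set wins.
    """

    def __init__(self, sync):
        self.sync = sync
        self.q_int = 0  # stored (registered) state

    def _next(self, s, r):
        # keep the old value unless reset, then OR in set
        return (self.q_int & ~r & 1) | s

    def q(self, s, r):
        # value seen by downstream logic during the current cycle
        return self.q_int if self.sync else self._next(s, r)

    def clock(self, s, r):
        # clock edge: capture the next state
        self.q_int = self._next(s, r)


comb = SRLatchModel(sync=False)
reg = SRLatchModel(sync=True)
print(comb.q(1, 0), reg.q(1, 0))  # set asserted, before any clock edge
reg.clock(1, 0)
print(reg.q(0, 0))                # registered output, one clock later
```
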

so... asking me to explain that? well, i _was_ able to do so, to Sam, six
months ago.  now? it will take me 2 months to get back into the code, and i
can then have a go.

oh btw one amazing thing that came out of working with Mitch: re-reading
the DCTL logic diagrams, he noticed for the first time in 40 years that the
double precision FPU is a 2 stage pipeline.

previous readings led everyone to believe that the 6600 has no pipelines:
only a revolving door of 3 SRNOR latches between GOREAD, EXECUTE, and
GOWRITE (only 2 of those may be true at any one time), which is
fundamentally why it is logically and categorically impossible for the
6600 DMs to get into that "unstable" (unknown) state.

however this was a new discovery: the existence of register latches half
way through the DP FPU circuits, indicating that it is indeed a genuine 2
stage pipeline.

the rest of the circuits use "timing" latches - circuits that transition at
the same rate as the computation being executed combinatorially.  thus when
the timer latch goes "ping" the data must be ready.
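
a rough python sketch of that "timing latch" idea, under stated assumptions:
the class TimedUnit, its "delay" parameter and the tick-based notion of time
are hypothetical and only there to show the principle (a timer matched to the
worst-case computation delay, with no handshake from the datapath itself).
it is not a model of the actual 6600 circuits.

```python
class TimedUnit:
    """Hypothetical function unit paired with a matched-delay timer.

    The result settles at some point within `delay` ticks; nothing in the
    datapath signals completion. The timer alone says when to read it.
    Assumption: delay >= 1.
    """

    def __init__(self, fn, delay):
        self.fn = fn
        self.delay = delay
        self.remaining = 0
        self.busy = False
        self.result = None

    def start(self, *operands):
        # the "combinatorial" computation begins; the timer starts with it
        self.result = self.fn(*operands)
        self.remaining = self.delay
        self.busy = True

    def tick(self):
        """Advance one time step; return True on the tick the timer pings."""
        if not self.busy:
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.busy = False
            return True  # "ping": the data must be ready by now
        return False


adder = TimedUnit(lambda a, b: a + b, delay=3)
adder.start(2, 3)
pings = [adder.tick() for _ in range(4)]
print(pings, adder.result)
```
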

thank you yg.
>>
> sorry for the distraction ;-)

it's not.  always welcome these insights and discussions.

l.

-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68