[libre-riscv-dev] Design for test methodologies
whygee at f-cpu.org
Sun Jan 12 20:32:53 GMT 2020
Hello Staf, Jean-Paul & list,
On 2020-01-11 13:44, Staf Verhaegen wrote:
> When doing DFT for ASIC current tools rely on having flip-flops with
> scan chain support. That feature is then used to load test patterns in
> the design and test the chip.
Yes, I have studied this method at ASIME/LIP6 in 2001.
See the next part of that message for the criticism and proposed
solution to this approach.
> What I do think you propose is to not
> only use flip-flops with scan chain support but extend the
> functionality of the standard cells for DFT. Unfortunately I don't see
> how you can do this without introducing unwanted area overhead for
> ASICs.
You can't see how it could be done efficiently because, indeed, it
obviously can't be, and, being the speed freak I am, it wouldn't even
make sense :-)
I do not propose to modify the ASIC gates; on the contrary, I start
from there (which explains why I want access to more gate libraries
than the one for the Actel designs). They will be implemented "as is"
in the final mask.
However, during test and simulation I substitute the gates with code
that has more features, enabling coverage checks and verification
(among other things).
This lets me check that each gate has been fully exercised by a given
set of test patterns, or (hopefully, one day) generate those test
patterns.
> Given the timeline for the NLNet project DFT was not included and
> testing is planned to be done in the old way by just running test
> programs and see if they have the right output.
Fine, but how can you be sure that your program covers all the gates
and your fault model ?
This is one of the problems that my library solves :
you run the program on the *simulated* circuit (with mapped and
substituted gates) and you get a histogram of how the gates are
exercised. This lets you tune the test program to cover a particular
circuit.
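The substituted-gate idea can be sketched in a few lines of plain
Python. This is only an illustration of the principle, not the actual
library : the `Gate` wrapper and the half-adder netlist are made up for
the example.

```python
from collections import Counter

class Gate:
    """Stand-in for a substituted gate: same logic function as the real
    cell, but it also records how it is exercised during simulation."""
    histogram = Counter()              # gate name -> number of evaluations

    def __init__(self, name, func, arity=2):
        self.name = name
        self.func = func
        self.arity = arity
        self.seen = set()              # distinct input patterns observed

    def __call__(self, *inputs):
        Gate.histogram[self.name] += 1
        self.seen.add(inputs)
        return self.func(*inputs)

    def coverage(self):
        """Fraction of all possible input patterns seen so far."""
        return len(self.seen) / (2 ** self.arity)

# A tiny "netlist": a half-adder built from two wrapped gates.
XOR1 = Gate("xor1", lambda a, b: a ^ b)
AND1 = Gate("and1", lambda a, b: a & b)

def half_adder(a, b):
    return XOR1(a, b), AND1(a, b)      # (sum, carry)

# Run a "test program" and look at the histogram and the coverage.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    half_adder(a, b)

print(dict(Gate.histogram))              # how often each gate fired
print(XOR1.coverage(), AND1.coverage())  # 1.0 means fully exercised
```

A gate with coverage below 1.0 tells you exactly which input patterns
your test program never produced, which is what you need in order to
tune it.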
---oO0Oo---
Note on DFT methodology and tools :
My designs don't use "internal boundary scan" as LIP6 does, because it
creates more problems than it solves, particularly if I want to test or
try the design in a FPGA : increased stress on the clock network,
altered timings, a larger design, and a scan chain that can't run at
the full clock speed of the design (unless you pay a high price in
extra circuitry), among many other issues I have found.
My approach integrates the test circuits from the start of the design,
so I can test the tester along with the whole system; I can even create
the FPGA circuit that will test the finished design, mapped to another
FPGA. This ensures that I have confidence in the test rig when the
chips arrive from the fab.
Part of the problem with internal boundary scan is the clock and latch
enable signals, with their huge fanout and very tight timing
constraints, hence physical requirements that compete with the rest of
the circuit (and might decrease its overall performance).
My solution is to cut the problem in half and avoid high-constraint
signals.
Look at the end of this article, at part 6 :
https://connect.ed-diamond.com/GNU-Linux-Magazine/GLMF-218/Quelques-applications-des-Arbres-Binaires-a-Commande-Equilibree
https://connect.ed-diamond.com/sites/default/files/articles/gnu-linux-magazine/glmf-218/83759/img13_CircuitDebugSPI_large.jpg
- A first shift register sends data to key points of the circuit.
It can be made as a clock-only, low-frequency system with a rippling
clock, to avoid stressing the main clock network; the few serial pins
already limit the bandwidth anyway. The cells are simple DFFs, and
MUXes can be added where needed, controlled by nearby DFFs of the
chain.
- Data out is serialised by a large balanced multiplexer I have
designed. My "balanced control binary tree" makes this system scalable
beyond 64 bits of input.
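The serialising multiplexer can be pictured as a binary tree of 2:1
muxes; here is a minimal behavioural model in Python. This is a generic
mux tree, not the exact "balanced control binary tree" of the article,
whose control scheme is more elaborate.

```python
def mux_tree(select, inputs):
    """Route one of len(inputs) capture points to the serial output
    through a binary tree of 2:1 muxes. `select` lists the control
    bits, MSB first; len(select) == log2(len(inputs))."""
    if len(inputs) == 1:
        return inputs[0]
    half = len(inputs) // 2
    branch = inputs[half:] if select[0] else inputs[:half]
    return mux_tree(select[1:], branch)

# 8 internal capture points; sweeping the select word serialises the
# captured values, one per "clock". Depth grows as log2(N), which is
# why the scheme scales beyond 64 inputs.
taps = [0, 1, 1, 0, 1, 0, 0, 1]
word = [mux_tree([(i >> 2) & 1, (i >> 1) & 1, i & 1], taps)
        for i in range(8)]
print(word)   # the captured taps, read out in order
```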
Marie-Minerve has been made aware of this last year.
From the outside, the whole system appears as a modified, half-duplex
SPI port that I can test in pure VHDL and on FPGA. It can be driven by
a microcontroller, a RPi, or anything with a few reasonably fast GPIOs.
And my gates library lets me verify the effect of any given test vector
and the expected output.
The key to the design is where to place these elements, but it's
obvious for the architectures I design : hijacking the instruction
register and/or the memory ports gives full access to most of the
critical resources, because I can inject specifically crafted, made-up
instructions, on top of controlling the internal state machine and
other minimal circuitry behind the curtain. All with negligible impact
on the DUT's size and speed.
---oO0Oo---
Here is another circuit test approach :
Due to capacitive effects, faults can appear at full speed yet stay
hidden at low speed; and/or you want to "bin" your chips based on their
maximum speed and/or voltage and/or temperature and/or ...
So you want to find the speed at which the circuit starts to
malfunction.
You feed in a clock signal whose frequency is controlled by a test rig,
and then... you end up with a crazily huge database of test vectors to
shove into the chip at full speed. It won't work easily.
Instead, you can generate the test vectors on-chip with a LFSR, and all
the outputs get "mixed" into another, "disturbed" LFSR to generate a
serial signature. The output of that LFSR is driven at full speed onto
one pin of the chip and compared with the expected result. A high-speed
reference bitstream, probably coming out of a 25Q128 SPI Flash chip,
can easily be XORed against it, and you can run the test for 128
million cycles before the SPI chip loops back.
The properties of a LFSR ensure that if even ONE perturbation bit
(coming from the circuits under test) is NOT as expected, the rest of
the stream will be highly uncorrelated with the expected bitstream
(half of the bits will be wrong on average).
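This decorrelation property is easy to demonstrate in software. The
sketch below uses a 16-bit maximal-length Fibonacci LFSR (polynomial
x^16+x^14+x^13+x^11+1, chosen for the example) and flips a single
"perturbation" bit in one of two otherwise identical runs :

```python
def step(state, perturb=0):
    """One cycle of a 16-bit Fibonacci LFSR; `perturb` is a bit coming
    from the circuit under test, XORed into the feedback."""
    fb = ((state >> 15) ^ (state >> 13) ^ (state >> 12) ^ (state >> 10)
          ^ perturb) & 1
    return ((state << 1) | fb) & 0xFFFF

def signature(seed, n, flip_at=None):
    """Serial signature bitstream; optionally inject one wrong bit."""
    s, bits = seed, []
    for i in range(n):
        s = step(s, perturb=1 if i == flip_at else 0)
        bits.append(s & 1)
    return bits

good = signature(0xACE1, 1000)                # fault-free reference
bad  = signature(0xACE1, 1000, flip_at=100)   # one bad input bit
diff = sum(a != b for a, b in zip(good, bad))
print(diff)   # roughly half of the 900 bits after the fault differ
```

Because the LFSR is linear, the single-bit error propagates through the
feedback forever instead of fading out, so a plain XOR against the
reference stream catches it no matter where it occurred.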
It's cheap, fast, quite efficient, unintrusive, low-tech, rather
scalable, automatable, and uses few physical resources (a LFSR adds
negligible overhead to the chip). My library helps with designing the
LFSR by checking the coverage for a given test duration : for example,
you can tune the width of the LFSR, its polynomial and its reset value,
and run scripts overnight to compare coverage over a given design
space. Give it a try, GHDL is free and you can run as many instances as
you like, on as many computers as you like, at a time !
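Such an overnight sweep could look like the Python sketch below, using
"distinct LFSR states visited" as a toy stand-in for real gate
coverage. The candidate polynomials, seeds and figure of merit are made
up for the illustration; the real job would launch GHDL simulations
instead of this in-process model.

```python
from itertools import product

def states_visited(width, taps, seed, cycles):
    """Count distinct states a Fibonacci LFSR reaches: a crude proxy
    for how much of the design a given configuration can exercise."""
    mask = (1 << width) - 1
    s, seen = seed & mask, set()
    for _ in range(cycles):
        fb = 0
        for t in taps:                 # taps given as bit numbers, 1-based
            fb ^= (s >> (t - 1)) & 1
        s = ((s << 1) | fb) & mask
        seen.add(s)
    return len(seen)

# Tiny design space: two 8-bit polynomials x two reset values.
polys = [(8, 6, 5, 4),                 # maximal-length for 8 bits
         (8, 6, 4, 3)]                 # example alternative polynomial
seeds = [0x01, 0x5A]
results = {(p, s): states_visited(8, p, s, 255)
           for p, s in product(polys, seeds)}
print(results)   # a maximal LFSR visits all 255 nonzero states
```

Ranking the `results` dictionary then tells you which (polynomial,
reset value) pair gives the best coverage for the test duration you can
afford.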
The only thing it fails to do is pinpoint the cause of a fault, but
that's not needed for go/no-go chip tests while the dies are still on
the wafer.
Diagnostics require more sophisticated tests, which then help to
increase yield.
---oO0Oo---
I hope it helps !
- These methods have been published in the past. I know of no prior art.
- Before joining ASIME/LIP6 in 2001, I was a test engineer at
Mentor/Meta Systems and helped troubleshoot Celaro emulators.
I might not be the brightest of them all, but I have had 20 years to
think this through since then :-)
> greets,
> Staf.
All the best !
yg