Title: Tracker Timing Margin MRB
- R.P. Johnson
- U.C. Santa Cruz
- October 15, 2004
NCRs
- 107: failure of the strip data readout in the
burn-in system at 60 °C.
- 114: same as 107.
- 118: same as 107.
- 139: same as 107, but some failures even at room
temperature.
- 193: about 10% of the MCMs have poor timing
margins with respect to clock duty cycle.
1. The Issues
- A few MCMs sent back incorrect data on every
other event when operated at 60 °C (NCR 107).
- Even more fail at 85 °C.
- When investigating that, we found that the same
error symptoms could be provoked in any MCM in
the burn-in system by
- raising the frequency enough,
- lowering the voltage enough,
- raising the clock duty cycle enough,
- or some combination of these and temperature.
- Some MCMs have very low duty-cycle margins at 20
MHz when operated with burn-in cables or flight
cables. NCR 193 lists several that fail below
55% duty cycle (from an external clock source).
Duty Cycle Margins
Results from testing 36 MCMs in the burn-in
station. No significant improvements were seen
using flight cables without connector savers. No
obvious pattern repeats from CC to CC or from one
set of 36 MCMs to another.
Other Facts
- The errors always occur in every other event, and
are always associated with the same 1 of 2 GTRC
registers (or 2 of 4 GTFE registers).
- The errors are never associated with bad parity
detections.
- The errors are never associated with trigger tag
mismatches.
- In a given set of conditions the problem
generally occurs on only one side of the readout
(left or right).
- The error generally does not occur in the MCM
test station, except at much higher frequency.
- Therefore:
- The problem cannot be internal to the GTFE.
- It cannot be in the transmission from GTFE to
GTRC.
- It cannot be in the transmission from GTRC to
TEM.
- It must be internal to the GTRC, probably in the
reading and/or writing of one of the two internal
memory buffers.
- It is also related somehow to the cables.
2. Additional Testing?
- Recently we started measuring the margins with
respect to duty cycle in each burn-in setup (36
MCMs at a time).
- Up to now there is no documented requirement on
this margin, but I started placing into NCR 193
all MCMs that show any problems below 55% duty
cycle.
- It would probably be a good idea to add this test
to the LAT-TD-02367 procedure (environmental test
and burn-in).
- (However, I recommend that we first reduce the
clock-bus termination resistance in the burn-in
cables from 100 ohms to 75 ohms.)
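The margin bookkeeping described above can be sketched as follows. This is only an illustration, not the burn-in station software: the function names, the data layout (a mapping from external-clock duty cycle in percent to pass/fail), and the MCM labels in the usage note are all assumptions. The 55% threshold is the one used for NCR 193.

```python
# Hypothetical sketch; names and data layout are assumptions.
NCR_193_THRESHOLD_PCT = 55.0  # MCMs with problems below this go into NCR 193


def first_failure(scan):
    """Return the lowest duty cycle (%) at which readout errors were seen.

    `scan` maps external-clock duty cycle (%) to True (readout OK) or
    False (readout errors). Returns None if no setting failed.
    """
    failing = [duty for duty, ok in scan.items() if not ok]
    return min(failing) if failing else None


def flag_low_margin(scans, threshold=NCR_193_THRESHOLD_PCT):
    """Return the sorted IDs of MCMs that fail below `threshold`."""
    flagged = []
    for mcm_id, scan in scans.items():
        failure = first_failure(scan)
        if failure is not None and failure < threshold:
            flagged.append(mcm_id)
    return sorted(flagged)
```

For example, with made-up scan results, an MCM that first fails at 60% is not flagged, while one that first fails at 50% is.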
3. Hypothesis
- Suspected root cause
- The errors occur when a timing margin is exceeded
in the internal GTRC communication with one of
its two memory buffers.
- The problem is exacerbated by the relatively poor
quality of the clock on the flex-circuit cables,
compared with the clean clock supplied over a
very short cable in the MCM test stand.
- We can test this by
- Studying the clock signal on the cables.
- See later slides
- Studying the timing margin while varying
characteristics of the cables.
- Our first attempt was to see if replacing the
burn-in cables with flight cables and removing the
connector savers would help. This did not
significantly improve the situation.
- Second, we tried varying the termination resistor
on the cable. This had a significant effect!
[Figure: duty-cycle margin plots for termination
resistances of 150 Ω, 100 Ω, 75 Ω, and 50 Ω]
The timing margin with respect to duty cycle
steadily improves with lowered termination
resistance. These measurements were made using a
flight cable (which is now nonflight) and no
connector savers. Just going from 100 ohms down
to 75 ohms will make all of our MCMs function up
to at least 55% duty cycle.
Scope Traces
[Figure: scope traces at 1 MHz with 50 Ω and
100 Ω termination]
Clock on the cable, measured at the connector of
the MCM closest to the TEM.
20 MHz
[Figure: scope traces with 50 Ω, 75 Ω, 100 Ω, and
150 Ω termination]
Clock measurements made at position 0 on the
cable (the MCM connector nearest to the TEM). At
20 MHz the observed amplitude hardly depends on
the resistance.
20 MHz
[Figure: scope traces with 50 Ω, 75 Ω, 100 Ω, and
150 Ω termination]
Clock measurements made at position 4 on the
cable.
20 MHz
[Figure: scope traces with 50 Ω, 75 Ω, 100 Ω, and
150 Ω termination]
Clock measurements made at position 8 on the
cable (nearest to the termination).
20 MHz, Cable Position 8
[Figure: scope traces with 50 Ω and 100 Ω
termination at duty-cycle settings of 50%, 55%,
and 60%]
The duty cycle that we see on the cable is always
5 or 6% greater than what we supply to the EGSE
system from the external LeCroy pulser.
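The effect of this offset on the clock phases can be worked out with simple arithmetic, sketched below. This is an illustration only: the function names are hypothetical, and the stretch is assumed to be a fixed 5% added to the external duty cycle (the measurement above gives 5 or 6%).

```python
# Hypothetical sketch; a fixed additive stretch is an assumption.
EGSE_STRETCH_PCT = 5.0  # approximate stretch quoted on the slides


def cable_duty_pct(external_duty_pct, stretch_pct=EGSE_STRETCH_PCT):
    """Estimate the duty cycle (%) seen on the Tracker cable."""
    return external_duty_pct + stretch_pct


def clock_phases_ns(freq_mhz, duty_pct):
    """Return (high_ns, low_ns) for a clock of the given frequency and duty cycle."""
    period_ns = 1000.0 / freq_mhz
    high_ns = period_ns * duty_pct / 100.0
    return high_ns, period_ns - high_ns


# External setting of 55% at 20 MHz (50 ns period):
high_ns, low_ns = clock_phases_ns(20.0, cable_duty_pct(55.0))
print(high_ns, low_ns)  # 30.0 20.0
```

Under these assumptions, a 55% external setting corresponds to about 60% on the cable, or roughly 30 ns high and 20 ns low at 20 MHz.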
Clock on the MCM
[Figure: scope traces at 55% duty cycle, measured
at cable RC 8 and at MCM 8, each with 50 Ω and
100 Ω cable termination]
The clock output from the GTRC can be seen at the
termination resistor on the internal MCM clock
bus. See the two plots on the right above. The
100-ohm cable termination appears to give a
slightly longer duty cycle on the MCM bus.
Conclusion
- The EGSE stretches the duty cycle by about 5%,
comparing what we see coming out of the CC versus
what we input into the VME crate. This eats into
the Tracker timing margin.
- The first point of failure on the MCM when the
duty cycle or frequency gets too large is always
one of the two memory buffers in the GTRC.
- There is a surprisingly large variation among the
GTRC chips in their sensitivity to this.
- The problem is far worse when using the TEM plus
flex-circuit cables versus using the
test-interface-board of the MCM test stand.
- Although it is hard to understand exactly why, an
unmistakable improvement in margin can be
obtained for all MCMs by lowering the clock
termination resistance on the cables.
4. Impact to Inventory
- If we solve this problem by simply discarding all
MCMs with low margin, we will lose 10 to 15% of
our inventory from this alone. That would
probably require augmenting the production at
Teledyne.
5. Suggested Corrective Action
- Build Tower-A with cables as-is, but screen the
MCMs (already done) to remove those with low
margin.
- Rework all cables already assembled to replace
the 100-ohm clock-bus termination resistor with a
75-ohm resistor (we already have this part in the
MCM production parts stock).
- Introduce an ECO into the Parlex production ASAP,
so that cables get built with the 75-ohm
termination.
- Make the same resistor change on the burn-in
system cables.
- Add the duty-cycle screening to the burn-in
procedure (LAT-TD-02367). Retest all MCMs that
have already gone through burn-in without this
screening.
- Reject MCMs that do not function over the full
range of 35% to 55% in duty cycle of the external
clock source (this is more like 40% to 60% duty
cycle on the Tracker cable).
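The proposed rejection criterion could be expressed roughly as below. This is a sketch under stated assumptions, not the LAT-TD-02367 procedure: the function name and data layout are hypothetical, the duty cycles are external-clock settings in percent, and an MCM is only accepted if both ends of the 35% to 55% range were actually tested.

```python
# Hypothetical sketch; names, data layout, and the endpoint rule are assumptions.
SCREEN_LO_PCT = 35.0  # lower end of the external duty-cycle range
SCREEN_HI_PCT = 55.0  # upper end of the external duty-cycle range


def passes_screen(results, lo=SCREEN_LO_PCT, hi=SCREEN_HI_PCT):
    """Return True if the MCM worked at every tested point in [lo, hi].

    `results` maps external-clock duty cycle (%) to True (readout OK) or
    False (readout errors). Points outside [lo, hi] are ignored.
    """
    in_range = {duty: ok for duty, ok in results.items() if lo <= duty <= hi}
    # Require that both ends of the range were measured (an assumption).
    covers_range = lo in in_range and hi in in_range
    return covers_range and all(in_range.values())
```

A failure at 60% external duty cycle would not cause rejection under this rule, since it lies outside the screened range.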
Impacts to Other Systems
- Request that the electronics group verify that in
the flight instrument the Tracker clock duty
cycle will fall well within our MCM test range.
- Add this to the Tracker ICD.
Cost-Schedule Impact
- Resistor cost is negligible.
- Cost of the rework and Parlex ECO is unknown,
probably in the low four figures.
- The cable rework for at least 1 set needs to be
done by end of November for Tower-B.
- The proposed new testing at SLAC can be
accomplished with existing labor and negligible
impact to the delivery schedule.
6. Effectiveness of the Corrective Action
- There is no change in the design performance
capability.
- Based on our available data, we expect that this
change will recover all, or nearly all, of the
MCMs impacted by the subject NCRs.
- Reliability of the system will be enhanced. The
margins will be improved on all MCMs,
significantly decreasing the likelihood of
finding timing problems in the integrated system.
7. Recommended Disposition
- A few of the subject MCMs have already been
dispositioned for EGSE work.
- Retest all of the remaining subject MCMs after
the burn-in system and procedure have been
upgraded. Use for flight those that pass.