Title: Level2 Interface Board Status
1 - Level-2 Interface Board Status
- David Saltzberg for L2 Group
- Level-Two Trigger Review
- December 7, 2001
2 - Overview
- Phase 1: L1 interface, Clist, XTRPlist, SVTlist
- Phase 2: ISOlist, RECES
- Phase 3: Muon board (not in this talk)
- (Phase 1 and Phase 2 have been done in parallel)
3 - Responsible Physicists
- L1 interface: Greg Feild
- Clist: Monica Tecchio, Heather Ray
- XTRPlist, SVTlist: Matt Worcester, Jane Nachtman, D.S.
- RECES: Masa Tanaka, Karen Byrum
- ISOlist: Steve Kuhlmann, Bob Blair
- lives within 50 miles of Fermilab
- ANL engineers (L1, RECES, ISOlist): John Dawson, Bill Haberichter
- Special operatives: Stephen Miller, Ted Liu, Peter Wittich
4 - Theory of Operation - I
- Input data from Clients
- L1 interface, RECES
- one word/event, no handshake
- Clist, XTRPlist, SVTlist, ISOlist
- variable-length data, buffered by FIFOs
- terminated by EE word
- Some info transferred about BC, L2B, or event count for sync checking
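The variable-length input path can be illustrated with a small sketch. It assumes the EE word is recognizable by a single flag bit; the name `EE_FLAG`, its bit position, and the function `split_events` are hypothetical and are not taken from the board specifications.

```python
from typing import Iterable, List, Tuple

# Hypothetical marker: assume the EE word is flagged by one bit.
# The real word layout is not given in the slides.
EE_FLAG = 1 << 23


def split_events(words: Iterable[int]) -> Tuple[List[List[int]], List[int]]:
    """Split a FIFO word stream into events at each EE word.

    Returns (complete_events, leftover). leftover holds words that were
    never closed by an EE word, which a receiver would have to treat
    as an error condition.
    """
    events: List[List[int]] = []
    current: List[int] = []
    for word in words:
        if word & EE_FLAG:
            events.append(current)  # EE word closes the event (may be empty)
            current = []
        else:
            current.append(word)
    return events, current


stream = [0x000011, 0x000022, EE_FLAG, EE_FLAG, 0x000033]
events, leftover = split_events(stream)
print(events)    # [[17, 34], []]
print(leftover)  # [51]
```

An event with no data words still produces an EE word, so an empty list is a valid event here, while a non-empty `leftover` corresponds to a missing terminator.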
5 - Theory of Operation - II
- Output: Control and Data signals via Magic Bus
- Master mode (currently all boards except RECES)
- L2P issues STARTLOAD
- When ready, Interface board requests Boss
- Board is granted Boss from upstream
- Board drives block-mode data transfer on Bus
- Boss is released by interface board, and MOD_DONE asserted
- When all MOD_DONE bits set, L2P begins processing
- Slave mode
- Board is addressed over Magic Bus and read in single-word transfers
- Alternate Output (TRKlist boards)
- VME readout
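The master-mode sequence can be sketched as a toy timing model. Everything numeric below is hypothetical (ready times, transfer durations), and the arbitration rule "most upstream waiting board goes first" is an assumption made for illustration; only the order of steps (STARTLOAD, request Boss, grant, block transfer, release, MOD_DONE) and the 600 µsec L2P timeout come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List

TIMEOUT_US = 600.0  # L2P timeout for all MOD_DONE signals (from the slides)


@dataclass
class Board:
    name: str
    ready_us: float     # hypothetical: time after STARTLOAD when data is ready
    transfer_us: float  # hypothetical: duration of the block-mode transfer


def run_startload(boards: List[Board]) -> Dict[str, float]:
    """Return the time (µs after STARTLOAD) at which each board asserts MOD_DONE.

    Boards are listed in upstream-to-downstream order. Among boards that
    are waiting, the most upstream one is granted Boss first (assumption).
    Only one board holds Boss at a time.
    """
    done: Dict[str, float] = {}
    bus_free = 0.0
    pending = list(boards)
    while pending:
        waiting = [b for b in pending if b.ready_us <= bus_free]
        # If nobody is waiting, the bus idles until the earliest board is ready.
        nxt = waiting[0] if waiting else min(pending, key=lambda b: b.ready_us)
        start = max(bus_free, nxt.ready_us)  # request Boss, wait for grant
        bus_free = start + nxt.transfer_us   # drive block transfer, release Boss
        done[nxt.name] = bus_free            # assert MOD_DONE
        pending.remove(nxt)
    return done


boards = [Board("L1", 1.0, 0.5), Board("Clist", 3.0, 4.0), Board("SVTlist", 2.0, 3.0)]
done = run_startload(boards)
l2p_start = max(done.values())  # L2P begins once all MOD_DONE bits are set
print(done)
print(l2p_start, l2p_start <= TIMEOUT_US)
```

In this model the L2P start time is simply the latest MOD_DONE, so a single slow or stuck board is what would trip the timeout.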
6 - General Error Detection & Handling
- In L2P (every event, tens of kHz)
- L2P has 600 µsec timeout for all MOD_DONE signals
- BC, L2B, or counters checked where possible, event by event
- Checks for exactly 1 Magic Bus word from L1 board
- If error, pull CDF_ERROR (or equivalent) and ask for automatic Halt-Recover-Run to resynch FIFOs.
- In TrigMon (2 Hz)
- Check number of words transferred for each board
- Check BC across system
- Exact bit-for-bit comparison of data vs. emulation and/or alternate source
- Offline
- Run select parts of TrigMon & Monica's validation code on look area, stream-g, stream-b, l2-torture runs (1M events lately)
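A bit-for-bit comparison of the kind listed above can be sketched as follows. This is a minimal illustration, not the TrigMon code; the function name `compare_words` and the word lists are hypothetical.

```python
from typing import List, Tuple


def compare_words(hardware: List[int], emulation: List[int]) -> List[Tuple[int, List[int]]]:
    """Return (word_index, differing_bit_positions) for every mismatched word.

    A word-count mismatch raises ValueError, mirroring the separate
    "number of words transferred" check.
    """
    if len(hardware) != len(emulation):
        raise ValueError(f"word count differs: {len(hardware)} vs {len(emulation)}")
    mismatches = []
    for i, (hw, em) in enumerate(zip(hardware, emulation)):
        diff = hw ^ em  # XOR leaves a 1 in every bit position that disagrees
        if diff:
            bits = [b for b in range(diff.bit_length()) if (diff >> b) & 1]
            mismatches.append((i, bits))
    return mismatches


hw = [0b1010, 0b0110, 0b1111]
em = [0b1010, 0b0010, 0b1111]
result = compare_words(hw, em)
print(result)  # [(1, [2])]
```

Reporting the bit positions rather than a pass/fail flag makes a stuck bit show up as the same position recurring across events.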
7 - Testing Performance of System
- Without Beam
- L2 torture nominally runs at 20 kHz
- Occasionally have run system at 40 kHz
- Runs system with high L2B occupancy
- Test patterns in for COT tracks, SVT tracks, emulate clusters
- 9 interface boards, up to 3 alphas connected
- With Beam
- Same config.
- Get real XFT tracks but often have to run SVT test patterns (no SVX)
- Have found other problems (sometimes systemwide) that tests w/o beam do not show. (Real-world stuff that no teststand will anticipate)
- Extensive tests before Oct. shutdown, preliminary Dec. tests.
8 - Current Boss Arb. Kludge
- Glitch on BOSSGROUT (PECL) when taking Boss can lead to two boards taking Boss. Since it is in hardware (not firmware), cannot make simple glitch protection
- Solution
- Reduce collision rate by putting different delays in each board's receiving of STARTLOAD (limits deadtimeless L1A rate at 20 kHz--we should have such problems.)
- Handle remaining collisions with L2P error handling
- New Backplane
- In a pinch, could it be fixed with TTL?
9 - Overview Plot of L2 crate
10 - Board-by-Board Status (follows...)
- Status of best board
- Highest rate tested, error rate
- Limit on (or measurement of) bit error rate
- Cooperation with other boards
- Plans for further work
- Status of spares
- Number and status of spares
- known problems?
- Status of Documentation
- Debugging tools, here and elsewhere
- Plans
- Other comments
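For the "limit on bit error rate" item, a zero-error test translates into an upper limit as sketched below. The function name `ber_upper_limit` and the 95% confidence level are illustrative choices, not taken from the talk. A "trial" can be an event or a checked bit, depending on what is counted.

```python
import math


def ber_upper_limit(n_trials: int, cl: float = 0.95) -> float:
    """Upper limit on the per-trial error probability after n_trials with zero errors.

    Solves (1 - p)**n_trials = 1 - cl for p (exact binomial, zero observed).
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    # expm1/log keep precision when the limit is tiny
    return -math.expm1(math.log(1.0 - cl) / n_trials)


limit = ber_upper_limit(1_000_000)
print(f"{limit:.2e}")  # 3.00e-06
```

With one trial per event, 1M error-free events give a limit of about 3 x 10^-6 per event. If each event carries many checked bits, the per-bit limit is correspondingly tighter.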
11 - L1 Interface Board
- L2 torture tests
- tested at 20-40 kHz: no problems
- tested 1M events: no errors, tested offline
- no collisions with other boards (by construction)
- Known problems
- noisier than others, but protected in time
- still have to connect ground shield see
- Solving noise here may solve it elsewhere
12 - L1 Interface Board Plots
No errors in bit-for-bit comparison
13 - L1 Interface Spares & Debugging tools
- Spares
- S/N 1: OK
- S/N 2: OK (in crate)
- S/N 3: 3/4 stuffed
- Debugging tools
- Bit-for-bit check available offline
- If more or less than one word is sent, L2P pulls error
- (Pretty simple board, no need for complex diagnostics)
- Teststand: Can set bit patterns, check in realtime or later
- data source: FRED
- data sink: MB to emulator board
14 - L1 Interface Documentation/Plans
- DOCS
- CDFNOTE 4971
- Webpage
- http://hepwww.physics.yale.edu/www_info/yale_cdf/l1crate.html
- Schematics: have control room hardcopy
- PDF files recently sent to Greg-- will put on web and in trigger room
- Plans
- Keep running
- Finish stuffing board 3 (2nd spare) and test
- Look into noise problem, not urgent. Wait until after new MB installed
15 - CList Board
- Responsibles: Monica Tecchio, Heather Ray
- Gets data by fiber from each Locos board
- L2 torture tests
- works at 20-40 kHz: no errors
- no errors found in 1M events offline
- Known problems
- crate 04 has bit 02 stuck low (probably trivial)
16 - Clist board plots
- No errors in bit-for-bit comparisons
17 - L2 cutting on Jets
18 - Clist Debugging tools
- Bit-for-bit comparisons done in online/offline monitoring
- If L2 buffer number disagrees, L2P pulls error
- Clusters can be set
- pulling cable in DCAS crate makes a known cluster
- in principle software exists to make arbitrary cluster pattern at B0 (need to verify)
- Michigan teststand capabilities
- Standalone board tests using VME
- Data source: Locos
- Data sink: MB L2P
- Test full clustering chain DCAS --> L2P via MB w/ tracer generating multiple L1As
19 - Clist Spares/Documentation/Plans
- Spares
- S/N 1: OK (in system)
- S/N 2: flaky VME, otherwise works.
- S/N 3: being stuffed
- Documentation
- webpage for aces, experts & non-experts
- http://www-cdf.fnal.gov/internal/cdfoperations/trigger/level2/my.html
- will become general L2 webpage (need more disk space)
- schematics online in Michigan
- hardcopies in trigger room
- Plans
- Keep running stably with board 1, monitor robustness
- Fix flaky VME on board 2
- Make board 3 a second hot spare
20 - SVTlist Board Tests
- Responsibles: Jane Nachtman, Matt Worcester, D. Saltzberg
- L2 Torture Testing
- 20-40 kHz L1A: no errors (SVX off, running SVT test pattern)
- Tested with 1M events: no bit errors
- Special run with checks inside alpha: BER < 10^-6
- No collisions with other boards
- Problems
- Gets confused if no EE word from SVT; L2P pulls error.
- Due to SVX not sending info to SVT
- Known problems in SVX have been fixed, others?
- Bill A. thinking about an SVT timeout to pull error
- Only happens with beam. Checked (painfully) before shutdown that it worked (could even have taken special Oct. SVT runs with it.)
- No firmware changes to TRACKlist boards in last 2 months!
21 - Some SVTlist Plots
- No errors in bit-for-bit comparisons
22 - L2 SVT Cutting (before shutdown)
23 - XTRPlist Board Tests
- Responsibles: Jane Nachtman, Matt Worcester, D. Saltzberg
- L2 Torture Testing
- 20-40 kHz L1A: no errors
- Tested with 1M events: no detectable errors
- XTRD bank has known errors that cause Ntracks mismatch
- Correct at L2, wrong in readout
- No errors when cut on Ntrack agreement
- Handscan of other events looks okay
- No collisions with other boards
- Problems
- Illinois to fix XTRD bank filling errors
- One bad pT bit from one XTRP board
24 - XTRPlist plots
No errors in bit-for-bit comparisons when number of tracks agrees.
25 - Spares for TRACKlist
- SVTlist & XTRPlist are both instances of one board: TRACKlist
- CPLD change with JTAG connector
- one jumper change
- Six production TRACKlist boards
- Currently 2 in L2P crate--permanent
- Currently 2 in SVT crate--1 or both temporary?
- one makes nominal SVTD bank. Convenient for booking SVT crate for test runs
- having separate boards effectively makes a cable check
- another board in SVT crate makes XTRP list---could be removed soon?
- Six production boards, at least 2 required in system, maybe 3. Right now using 4.
26 - TRACKlist spares
- S/N 1 & 2: (Prototypes, no longer used.)
- S/N 3: XTRPlist OK (in L2P crate)
- S/N 4: SVTlist OK -- used for SVTD bank
- S/N 5: XTRPlist OK -- hot spare
- S/N 6: SVTlist, MB not working, bad connection
- S/N 7: SVTlist, stuck chisq bit for MB -- used for SVTD bank
- S/N 8: SVTlist OK (in L2P crate)
- All boards work for VME readout
27 - TRACKlist debugging tools
- Can send arbitrary pattern from SVT easily
- Can send arbitrary pattern from XTRP (more difficult)
- Bit-by-bit checking in TrigMon
- Can test BC from XTRP & SVT on every event
- UCLA teststand
- data source: merger board
- data sink: MB and emulator board and/or VME
28 - TRACKlist plans
- Keep running stably
- Fix one SVT spare (bad connection makes MB error)
- Fix one bad bit on another SVT spare
- Wean SVT off of second SVT board
- Make sure all six boards are hot spares
- Print hardcopies of schematics & firmware
29 - TRACKlist Documentation
- Web-pages
- Specs
- http://buggs.physics.ucla.edu/nachtman/board/specifications_v1.ps
- TIB instructions
- http://www-b0.fnal.gov:8000/level2/tib/tib_main.html
- TIB database
- http://www-b0.fnal.gov:8000/level2/tib/tib_status.html
- TIB schematics etc
- http://buggs.physics.ucla.edu/nachtman/tib.html
- Schematics on web in .eps format
- Need updated hardcopies printed out
30 - ISOlist status
- Responsibles: Steve Kuhlmann, Bob Blair
- Calculates 5 isolation sums
- DCAS --> ISOpick --> ISOlist
- Clique --> ISOclique --> ISOlist
- L2 Torture tests (or cosmics)
- need to require eta-phi match (1-3% failure)
- perfect at 20-40 kHz in all 5 sums
- Problems
- with collisions see eta-phi match (still 1-3% failure), but L2P can check and pass the event
- In 0.5% of events also scatter of expected vs. seen in all 5 sums (less than analog jitter in Run 1). N.B. the whole scatter comes from crate 1, eta17.
31 - ISOlist plots
32 - ISOlist spares
- In DCAS crates
- Need 1 ISOclique (have 2)
- Need 6 ISOpicks (have 8, 1 with stuck bit)
- In L2P crate
- Need 1 ISOlist (have 2)
- All spares are hot spares except for 1 ISOpick with stuck bit.
33 - ISOlist Debugging Tools
- Standard running
- ISOpick times out if DCAS does not send data
- Standalone code
- writes a seed to ISOclique (only board with VME)
- tell it to read out fixed values to ISOlation system
- can load different values for different buffer numbers
- with a switch, can read energies from DCAS. Essentially this factors the problem.
- TrigMon & Offline Code
- Incorporated isolation variables into Monica's code
- Need to debug some boundary values against the hardware
- Teststand at ANL
- data source: ISOpick
- data sink: MB to emulator board
34 - ISOlist Documentation/Plans
- DOCS
- CDFnote 5788
- Schematics in hardcopy in binders at ANL but will come to trigger room
- PDF files of schematics (firmware & hardware) are available, will be placed on web by Heather
- Plans
- Continue running & monitor robustness
- Go after eta/phi mismatch (needs coordination between ANL and Michigan)
- Find & fix flaky bit in DCAS crate
35 - RECES status
- Responsibles: Masa Tanaka, Karen Byrum
- Four boards in L2P crate receive information from SMXR by fiber
- During L2 Torture tests (36 kHz)
- In crate, on backplane, but not used by default table
- No negative interactions
- Special L2 executable (TEST_RECES table)
- L1 input is crossing trigger and 4 GeV elec, 8 GeV photon
- runs at 20 kHz L1 input, 100 Hz L2A
- Maybe small bit errors -- few thousand events
- All SMXR to RECES is okay (at end of shutdown)
- Problems
- Accidental collisions on Alpha readout
- Solns: Arnd's special retry readout code. Stephen will modify FPGA
- possible bit errors (10^-3)
36 - Reces Plots
37 - RECES Spares/Docs/Plans
- Need 4 Reces boards in system
- 4 in top crate OK
- 2 spare boards OK
- Docs
- CDF 5132
- Need to put schematics on web & hardcopies in trigger room.
- Plans
- Keep RECES on backplane during default running
- Fix readout problem
- Search for BER < 10^-4 in standard datataking & fix
38 - Reces Debugging tools
- Special standalone code
- VME based. Set trigger threshold, load SMXRs
- Send bit patterns to RECES board, Alpha reads through VME
- Check bit-for-bit (checks all bits)
- 10 Hz (tens of thousands of events OK)
- ANL teststand
- Not needed any more
- TrigMon plots
- temperature plots
- checks bit-for-bit errors
39 - Interface Board status by run (documented for collaboration)
40 - Interface Boards: The Bottom Line
- L2 crate with Clist, XTRPlist, SVTlist, L1 interface, ISOlist all work at up to full speed, 20 kHz, as-is.
- Their bit-error rates are measured < 10^-6 (RECES not tested to this level yet.)
- Essentially all documentation exists. Some tweaks in progress
- There is at least one working spare for every board.
- Every board has a real expert living close by
- Work in progress fixing up extra boards, bad bits, etc.
- In current configuration we can fulfill the charge of running jets, electrons and SVT at 5e31 right now, as-is (assuming all clients are working)---backups will only distract.
41 - Goals of Sept. workshop (for interface boards)
- sync errors < 10^-6: DONE
- cut on jets / reliable Clist: DONE
- reliable L1 board: DONE
- automated HRR: DONE
- solve XTRP problem: DONE (don't remember what it was, but it works)
- reliable SVTlist: DONE
- SVT kludge path: DONE
- alpha code for cutting on SVT: Simple code DONE, complete cdf4718-lite underway
- Solve clist eta/phi errors for electrons: DONE for electrons (iso needs work)
- alpha electron code: Debugging
- prepare firmware without delays for MB testing: DONE
- test boards on new MB: NOT DONE
- test isolist and reces: DONE
- improve documentation: DONE -- more to do, as always
42 - Suggestions - I
- Spares should not be kept in lower crate unless being used. Otherwise water leak (it has happened before!) will destroy all boards. Currently squatting on other spare space...could use space allocated specifically for L2 spares
- Need more disk space for L2 webpages on B0 machine.
- SVT group should use XTRP list in TL2D and free up spare TRACKlist board
- Clients should be kept in stable configuration
- D-sized plotter in B0 for printing updated Firmware schematics (.eps or .pdf)
43 - Suggestions - II
- Need more of the good jumpers (white)
- Make MagicBus document a CDFNOTE
- File cabinet for all L2 docs. Can be different sized schematics and also text documents, so folders would work better than one binder.
- Web clearing house for all L2 web documents. Good documentation exists for all boards, just need a list of links (Heather is working on this.) I think we should not over-structure this at this point...leave the microstructure to the individual groups
- When given choice of testing kludge path vs. real path, try real first
44 - Suggestions - III
- In next 3-6 months, experts (and their supervisors) should think about training their successors.
- Need to implement bit-for-bit emulation SIXD --> TL2D into TrigMon
- Need someone to write/implement XFLD --> XTRD emulation
- A MB display module would be a critical debugging tool (LEDs on each line) much like the old Fastbus display module