Title: Some Thoughts on L1 Pixel Trigger
1Some Thoughts on L1 Pixel Trigger
- Wu, Jinyuan
- Fermilab
- April 2006
2Introduction
- Preference on detector layout and related pattern recognition issues.
- Experience from Fermilab BTeV.
- Triplet.
- Triplet finding.
- Tiny Triplet Finder.
- Data output rate of the readout chip.
- Triggering on low PT features.
3Preference on Detector Layout
4Preference on Detector Layout
- If N layers of pixel detector planes are affordable (in terms of material, cost, data volume, power, cooling, etc.), a normally spaced configuration like (b) is preferable.
- Pattern recognition for (b) is more difficult.
- From the BTeV work, pattern recognition for (b) is not as hard as we thought several years ago.
[Figure: the two candidate detector layouts, (a) and (b)]
5BTeV and CMS-SLHC Pattern Recognition
6Simulated B Event in BTeV Silicon Pixel Detector
7BTeV Level 1 Vertex Trigger -- Finding Triplets and Then
30-station pixel detector
8 Triplets
- Triplet
  - A data item with 2 free parameters.
  - (Number of measurements) - (number of constraints) = 2.
- A triplet is not necessarily a straight track segment.
- A triplet may have more than 3 measurements.
- A circular track with a known interaction point is a triplet, since it has 2 free parameters. (Otherwise it has 3 parameters.)
9Triplet Finding
- Three layers of nested loops are needed if the process is implemented in software.
- A total of n^3 combinations must be checked (e.g. 5 x 5 x 5 = 125).
- In an FPGA, unrolling 2 layers of loops may need a large silicon resource, O(N^2), without careful planning.
Plane A
Plane B
Plane C
for (i = 0; i < N_A; i++)
  for (j = 0; j < N_B; j++)
    for (k = 0; k < N_C; k++)
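The three nested loops can be sketched in Python as follows. This is only an illustration, not the BTeV code: it assumes three equally spaced planes with one coordinate per hit, and the tolerance `tol` and the sample hit lists are made up for the example.

```python
def find_triplets_brute_force(hits_a, hits_b, hits_c, tol=0.5):
    """Check every (A, B, C) hit combination: O(n^3) for n hits per plane.

    Assumes equally spaced planes, so a straight track satisfies
    x_a + x_c = 2 * x_b. `tol` is a hypothetical matching tolerance.
    """
    triplets = []
    checked = 0
    for xa in hits_a:
        for xb in hits_b:
            for xc in hits_c:
                checked += 1
                if abs(xa + xc - 2 * xb) <= tol:
                    triplets.append((xa, xb, xc))
    return triplets, checked


# Five made-up hits per plane: 5 x 5 x 5 combinations are checked.
hits_a = [1.0, 4.0, 9.0, 15.0, 22.0]
hits_b = [2.0, 6.0, 10.0, 13.0, 30.0]
hits_c = [3.0, 8.0, 11.0, 40.0, 55.0]
triplets, checked = find_triplets_brute_force(hits_a, hits_b, hits_c)
print(checked, triplets)
```

The combination count grows as the cube of the number of hits per plane, which is what motivates moving the search into firmware.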
10Triplet Finding
- Triplet finding can be done in software or in firmware.
- Tiny Triplet Finder (TTF) is a firmware implementation developed in Fermilab BTeV.
- Tiny: small silicon usage.
- For more info on TTF, see handout.
Triplet Finding
- Software processes: O(n^3)
- FPGA firmware functions: O(n)
  - O(N^2) implementations: Hough transform, etc.
  - O(N log N) implementation: Tiny Triplet Finder
11Circular Tracks from Collision Point on Cylindrical Detectors
[Figure: coincidence map with axes labeled (φ2-φ3)64 and (φ1-φ3)64]
- For a given hit on layer 3, a coincidence between a layer 2 hit and a layer 1 hit that satisfies the coincidence map signifies a valid circular track.
- A track segment has 2 free parameters, i.e., it is a triplet.
- The coincidence map is invariant under rotation.
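The rotation invariance can be checked numerically. The sketch below assumes ideal circles through the origin, for which the hit azimuth at radius r is φ0 + asin(κ r) with κ = q / (2 R_track). The layer radii, the 64-bin granularity and the curvature range are hypothetical values chosen for the example.

```python
import math

R1, R2, R3 = 4.0, 8.0, 12.0   # layer radii in cm (hypothetical values)
N_BINS = 64                   # phi bins per layer (assumed)
BIN_WIDTH = 2 * math.pi / N_BINS


def hit_phi(phi0, kappa, r):
    """Azimuth where a circle through the origin crosses radius r.

    phi0 is the initial angle; kappa = q / (2 * R_track) is a signed curvature.
    """
    return phi0 + math.asin(kappa * r)


def phi_differences(phi0, kappa):
    """Return (phi1 - phi3, phi2 - phi3) for one track."""
    phi1, phi2, phi3 = (hit_phi(phi0, kappa, r) for r in (R1, R2, R3))
    return phi1 - phi3, phi2 - phi3


def coincidence_map(kappa_max=0.08, steps=2001):
    """Scan the curvature and collect the allowed pairs of bin differences."""
    allowed = set()
    for i in range(steps):
        kappa = -kappa_max + 2 * kappa_max * i / (steps - 1)
        d1, d2 = phi_differences(0.0, kappa)
        allowed.add((round(d1 / BIN_WIDTH), round(d2 / BIN_WIDTH)))
    return allowed


# The differences depend on the curvature only, not on the initial angle.
print(phi_differences(0.0, 0.03))
print(phi_differences(2.5, 0.03))
print(sorted(coincidence_map()))
```

Because φ0 cancels in both differences, one map computed at φ0 = 0 serves every hit on layer 3.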
12Tiny Triplet Finder: Reuse Coincidence Logic via Shifting Hit Patterns
C3
C2
C1
One set of coincidence logic is implemented.
For an arbitrary hit on C3, rotate (i.e., shift) the hit patterns for C1 and C2 to search for a coincidence.
13Tiny Triplet Finder for Circular Tracks
[Figure: Tiny Triplet Finder block diagram: two bit arrays, two shifters (labels R1/R3 and R2/R3), bit-wise coincidence logic, and the triplet map output to the decoder]
- Fill the C1 and C2 bit arrays. (n1 clock cycles)
- Loop over C3 hits, shift the bit arrays and check for coincidence. (n3 clock cycles)
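The fill-then-shift procedure can be modelled in software. This is a behavioural sketch under stated assumptions, not the firmware: Python integers stand in for 64-bin bit arrays, and the `offsets` table (expected C1 - C3 and C2 - C3 bin differences per curvature bin) is a hypothetical coincidence map.

```python
N = 64                 # bins per bit array (assumed)
MASK = (1 << N) - 1


def fill_bit_array(hit_bins):
    """Set one bit per hit; in hardware this takes one clock cycle per hit."""
    word = 0
    for b in hit_bins:
        word |= 1 << (b % N)
    return word


def rotate_right(word, shift):
    """Circular shift, so that the hit at bin `shift` moves to bin 0."""
    shift %= N
    return ((word >> shift) | (word << (N - shift))) & MASK


def tiny_triplet_finder(c1_bins, c2_bins, c3_bins, offsets):
    """Return (C3 bin, curvature bin) pairs that satisfy the coincidence map."""
    c1 = fill_bit_array(c1_bins)
    c2 = fill_bit_array(c2_bins)
    found = []
    for b3 in c3_bins:                 # one C3 hit per clock cycle
        s1 = rotate_right(c1, b3)      # align both arrays to the C3 hit
        s2 = rotate_right(c2, b3)
        # In hardware all curvature bins are tested in parallel by AND gates.
        for k, (d1, d2) in enumerate(offsets):
            if (s1 >> (d1 % N)) & 1 and (s2 >> (d2 % N)) & 1:
                found.append((b3, k))
    return found


# Hypothetical map with five curvature bins, and two tracks plus noise hits.
offsets = [(-4, -2), (-2, -1), (0, 0), (2, 1), (4, 2)]
found = tiny_triplet_finder([12, 0, 30], [11, 62, 40], [10, 60], offsets)
print(found)
```

Only one copy of the coincidence logic exists. Shifting the hit patterns is what lets every C3 hit reuse it, including hits whose partners wrap around the array boundary.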
14Readout Chip Issues
15Data Rate of Readout Chips
- SLHC
  - 80 MHz
  - 4 hits/(1.28 cm)^2 (Hits or clusters? A cluster is about 2.5 hits in BTeV, so 4 hits is about 1.5 clusters.)
  - 16 bits/hit
  - 3.125 Gb/cm^2/s; with 8b/10b etc., about 5 Gb/cm^2/s
- FPIX2 readout chip at Fermilab BTeV
  - Area: 128 x 0.05 mm x 22 x 0.4 mm = 0.56 cm^2.
  - Output: 6 x 140 Mb/s = 840 Mb/s.
  - 1.5 Gb/cm^2/s.
- If we redesign FPIX
  - 8 x 320 Mb/s = 2.56 Gb/s
  - 4.57 Gb/cm^2/s
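The figures on this slide follow from simple arithmetic, reproduced below as a check. All inputs are taken from the slide; the variable names are only for illustration.

```python
def rate_gbps_per_cm2(bits_per_second, area_cm2):
    """Data rate density in Gb/cm^2/s."""
    return bits_per_second / area_cm2 / 1e9


# SLHC requirement: 80 MHz crossings, 4 hits per (1.28 cm)^2, 16 bits per hit.
slhc = rate_gbps_per_cm2(80e6 * 4 * 16, 1.28 ** 2)

# FPIX2: 128 rows of 0.05 mm by 22 columns of 0.4 mm, converted to cm^2.
fpix2_area_cm2 = (128 * 0.05) * (22 * 0.4) / 100.0
fpix2 = rate_gbps_per_cm2(6 * 140e6, fpix2_area_cm2)

# Redesigned FPIX with 8 outputs at 320 Mb/s on the same area.
redesigned = rate_gbps_per_cm2(8 * 320e6, fpix2_area_cm2)

print(f"SLHC requirement: {slhc:.3f} Gb/cm^2/s")
print(f"FPIX2 area:       {fpix2_area_cm2:.4f} cm^2")
print(f"FPIX2 output:     {fpix2:.2f} Gb/cm^2/s")
print(f"Redesigned FPIX:  {redesigned:.2f} Gb/cm^2/s")
```

With the unrounded area of 0.5632 cm^2 the redesigned chip comes out near 4.55 Gb/cm^2/s. The 4.57 on the slide corresponds to the rounded area of 0.56 cm^2.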
16FPIX2 Readout Chip
- Pixel size: 50 μm x 400 μm
- Columns: 22
- Rows: 128
- Outputs: 1, 2, 4 or 6 Cu pairs at 140 MHz
17Core Organization
- Column-based architecture
- Three mutually dependent parts
  - Core Logic
  - End-of-column Logic
  - Pixel Cells
- Readout order
  - Hit cell by hit cell in a column.
  - Column by column.
  - Not time ordered.
18Pixel Cell
19Output of FPIX2 for BTeV
[Figure: layout of the 24-bit hit word (Hit24), bits b00 to b23, with fields Row, Column, BCO(7:0) and ADC, plus a field labeled 1]
- A hit is output using 24 bits, at 140 Mb/s per Cu pair.
- A user protocol is used as shown (not 8B/10B).
- The BCO field takes 8 bits. (16 bits -> 24 bits)
- To eliminate or reduce the number of bits taken by the BCO, the chip has to be redesigned to output time-ordered data. Doable or not? It is possible, but not obvious now. FPIX1 was designed with time-ordered data, but was slow. Study is needed.
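To make the 24-bit budget concrete, here is a sketch that packs and unpacks a hit word. Only the total of 24 bits and the 8-bit BCO come from the slide. The 7-bit row and 5-bit column are the minimum widths for 128 rows and 22 columns, the 3-bit ADC is simply what remains, and the field order is an assumption, not the actual FPIX2 layout.

```python
# (name, width) pairs from least to most significant bit. The order is assumed.
FIELDS = (("flag", 1), ("adc", 3), ("bco", 8), ("column", 5), ("row", 7))


def pack_hit(row, column, bco, adc, flag=1):
    """Pack one hit into a 24-bit integer using the assumed layout."""
    values = {"row": row, "column": column, "bco": bco, "adc": adc, "flag": flag}
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_hit(word):
    """Inverse of pack_hit: return a dict of field values."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


word = pack_hit(row=100, column=21, bco=200, adc=5)
print(f"{word:024b}", unpack_hit(word))
```

Dropping the `bco` entry from `FIELDS` leaves 16 bits per hit, which is the saving that time-ordered output would allow.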
20Trigger/DAQ System Model
21A Model of Trigger/DAQ System
[Figure: readout chips connect over 10 m Cu links to the Correlation Logic Module (outside of steel?), which connects over 100 m fiber to L1 and HLT/DAQ. Triplet data go to L1, L1 trigger commands come back, and full data go to HLT/DAQ.]
- The readout chips send hit data via copper links to the correlation logic module (CLM; Correlator/OptoTX, J. Jones) just outside the detector.
- The CLM finds triplets and sends the initial angle and momentum of each triplet to L1.
- The L1 system issues trigger commands back.
- The readout chips send the full data of the selected BX to HLT/DAQ via the CLM.
22Output of the Readout Chip
[Figure: four readout chips connected to the Correlation Logic Module over 10 m Cu links]
- Data volume from the readout chips is large. (Full rate: 3.125 Gb/cm^2/s)
- Optionally, partial data can be sent to reduce the bandwidth (to about (1/5) x 3.125 Gb/cm^2/s), since the CLM needs only:
  - The φ coordinate, with lower resolution (1/2).
  - One entry per hit cluster (1/2.5).
- Study on readout chip redesign is needed.
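The 1/5 reduction is the product of the two factors quoted above, as this short check shows. The variable names are illustrative only.

```python
full_rate = 3.125             # Gb/cm^2/s, full hit data
coordinate_factor = 1 / 2     # phi coordinate only, at lower resolution
cluster_factor = 1 / 2.5      # one entry per cluster of about 2.5 hits

reduction = coordinate_factor * cluster_factor
partial_rate = full_rate * reduction

print(f"reduction factor: {reduction:.2f}")
print(f"partial rate:     {partial_rate:.3f} Gb/cm^2/s")
```

The resulting rate of roughly 0.6 Gb/cm^2/s is below the 1.5 Gb/cm^2/s that the existing FPIX2 already delivers.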
[Figure: layout of the 24-bit hit word (Hit24), repeated from slide 19]
23L1 Trigger Commands
[Figure: Trigger/DAQ system model, repeated from slide 21]
- The CLM finds triplets and sends the initial angle and momentum of each triplet to L1, and the L1 system issues multi-bit trigger commands back.
- The readout chips send the full data of the selected BX to HLT/DAQ via the CLM.
- The data volume of the selected BX is relatively small.
- Optionally, the correlation logic module can run one or a few longer algorithms (L1.5?) while the full data flow through. The HLT uses the results when making L2 decisions.
- So the multi-bit trigger command = BX + L1.5 algorithm ID (e.g. "Dump data in 1234 and apply algorithm ABCD").
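One possible encoding of such a multi-bit command is sketched below. The slide specifies only the two fields, so the 12-bit BX field and 4-bit algorithm ID used here are hypothetical widths, and the class name is made up for the example.

```python
from typing import NamedTuple

BX_BITS = 12     # hypothetical width of the BX field
ALGO_BITS = 4    # hypothetical width of the L1.5 algorithm ID


class TriggerCommand(NamedTuple):
    bx: int
    algorithm_id: int

    def encode(self):
        """Pack the command as (bx << ALGO_BITS) | algorithm_id."""
        if not 0 <= self.bx < (1 << BX_BITS):
            raise ValueError("bx out of range")
        if not 0 <= self.algorithm_id < (1 << ALGO_BITS):
            raise ValueError("algorithm_id out of range")
        return (self.bx << ALGO_BITS) | self.algorithm_id

    @classmethod
    def decode(cls, word):
        """Recover the command from its packed form."""
        return cls(word >> ALGO_BITS, word & ((1 << ALGO_BITS) - 1))


command = TriggerCommand(bx=1234, algorithm_id=3)
word = command.encode()
print(hex(word), TriggerCommand.decode(word))
```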
24More Readout Chip Issues, Latency etc.
25Tiny Triplet Finder for Circular Tracks
[Figure: Tiny Triplet Finder block diagram, repeated from slide 13]
- Fill the C1 and C2 bit arrays. (n1 clock cycles)
- Loop over C3 hits, shift the bit arrays and check for coincidence. (n3 clock cycles)
26Latency Budget Usage for Triplet Finding Process
- CMS L1 decision time is 6.4 μs; 2 x 0.5 μs of it will be in cable delay.
- Filling the C1 and C2 bit arrays takes n1 clock cycles.
- Looping over C3 hits, shifting bit arrays and checking for coincidence takes n3 clock cycles + pipeline stages (about 10).
- Assume n1, n3 = 64: latency usage = 64 + 64 + 10 = 138 clock cycles.
- At 160 MHz (FPGA or ASIC) clock frequency, 138 clock cycles = 138/(160 x 6.4) = 13% (of the 6.4 μs CMS L1 decision time). This is only an example, but it looks OK.
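The latency estimate can be reproduced with a few lines. The inputs are the ones assumed on the slide; the function name is illustrative.

```python
def latency_fraction(n1, n3, pipeline_stages, clock_mhz, budget_us):
    """Return (clock cycles, latency in microseconds, fraction of the budget)."""
    cycles = n1 + n3 + pipeline_stages
    latency_us = cycles / clock_mhz      # cycles divided by MHz gives microseconds
    return cycles, latency_us, latency_us / budget_us


cycles, latency_us, fraction = latency_fraction(64, 64, 10, 160.0, 6.4)
print(f"{cycles} cycles = {latency_us:.4f} us = {fraction:.1%} of the L1 budget")
```

Doubling n1 and n3 to 128 would bring the usage to about a quarter of the budget, so the estimate is sensitive to the maximum number of hits accepted per crossing.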
27A Closer View of Latency Budget
[Figure: system block diagram (readout chips, Correlation Logic Module, L1; triplet data and L1 trigger commands) with a timing chart of the stages Output Hits, Triplet Finding (1), Triplet Finding (2), Triplet Data Out, Cable (1), L1 Processes and Cable (2)]
- The readout chips send out hit data to the Correlation Logic Module.
- Triplet finding starts after the data of the first hit are received.
- After all hits are transmitted, phase 2 of triplet finding (looping over C3 hits, shifting bit arrays and checking for coincidence) runs.
- Triplet data are sent out after the first triplet is found.
- After the cable delay, L1 starts its processes upon receiving the first triplet data.
- After all triplet data are received, the L1 command is issued.
- The L1 command is sent back and executed.
28Max of Hits/BX
[Figure: timing chart of the readout and trigger stages, repeated from slide 27]
- 4 hits/(1.28 cm)^2/BX is an average; in some BX, the number of hits may be many times larger.
- The readout chip should drop some hits if the number of hits/BX is too big, or the time to output the hits will be too long. (Note the 6.4 μs L1 latency.)
- Consider 64, 128, 256 hits/(1.28 cm)^2/BX, i.e., x16, x32, x64 of the average: outputting the hits takes 16, 32, 64 BX on a link whose throughput matches the average data rate. The output time = 0.2 μs, 0.4 μs, 0.8 μs -- should be OK.
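The output times quoted above follow from the 12.5 ns crossing interval at 80 MHz. A short check, with illustrative names:

```python
BX_NS = 12.5                 # crossing interval at 80 MHz
AVERAGE_HITS_PER_BX = 4      # link throughput is matched to this average


def output_time(hits_in_bx):
    """Return (crossings needed, time in microseconds) to ship one busy BX."""
    bx_needed = hits_in_bx / AVERAGE_HITS_PER_BX
    return bx_needed, bx_needed * BX_NS / 1000.0


for hits in (64, 128, 256):
    bx_needed, time_us = output_time(hits)
    print(f"{hits:3d} hits: {bx_needed:.0f} BX, {time_us:.1f} us")
```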
29Readout Chip Spec
- Should read out time-ordered data.
- Should drop data gracefully if the number of hits/BX is too big.
- Should drop data gracefully if the number of hits over several BX (a short-term average) is too big.
- Should be able to output both brief data for the trigger and full data for readout.
- Should store data on chip for 6.4 μs.
- ?
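The two "drop gracefully" requirements could be modelled as a pair of caps, one per BX and one over a sliding window of recent BX. The sketch below is a behavioural illustration only: the class name, the limits and the policy of keeping the first hits are all assumptions, not a chip design.

```python
from collections import deque


class HitLimiter:
    """Cap the hits kept per BX and over a short window of recent BX."""

    def __init__(self, max_per_bx=64, max_per_window=128, window_bx=4):
        self.max_per_bx = max_per_bx
        self.max_per_window = max_per_window
        # Kept-hit counts of the previous (window_bx - 1) crossings.
        self.recent = deque(maxlen=window_bx - 1)

    def accept(self, hits):
        """Return the hits kept for this BX; the remainder is dropped."""
        budget = min(self.max_per_bx, self.max_per_window - sum(self.recent))
        kept = hits[:max(budget, 0)]
        self.recent.append(len(kept))
        return kept


limiter = HitLimiter(max_per_bx=64, max_per_window=100, window_bx=4)
kept_counts = [len(limiter.accept(list(range(n)))) for n in (80, 50, 10, 10, 10)]
print(kept_counts)
```

In this toy run a burst of 80 hits is cut to 64, the next crossings are throttled by the window limit, and normal output resumes once the burst leaves the window.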
30Extra Possibility: Triggering on Low PT Features
31Triggering on Low PT Features?
- Many tracking algorithms degrade rapidly when the momentum of the track is low.
- Circular track triplet finding does not need a high-PT assumption, so it does not degrade as rapidly.
- The trigger system discussed is especially suitable if one needs to trigger on low-PT features of the event.
- In the CMS 4 T B field, all tracks look like low-PT tracks.
32Example: Finding Soft Jets
- A simulated event with 200 tracks.
  - Flat distributions.
  - Min. R: 55 cm
- 88 soft tracks are added.
  - They are grouped in 2 small initial angle regions, i.e., 2 soft jets.
Can you see the soft jets?
Can you see the soft jets now?
Track Initial Angle Distributions
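The effect can be imitated with a toy histogram of initial angles. This is not the simulation behind the slide: it only draws 200 flat angles plus 88 angles confined to two narrow regions, and the bin count, jet positions and seed are arbitrary choices.

```python
import random


def initial_angle_histogram(n_flat=200, n_soft=88, jet_bins=(10, 40),
                            n_bins=64, seed=1):
    """Histogram initial angles (degrees) of flat tracks plus two soft jets."""
    rng = random.Random(seed)
    bin_width = 360.0 / n_bins
    angles = [rng.uniform(0.0, 360.0) for _ in range(n_flat)]
    for jet_bin in jet_bins:
        center = (jet_bin + 0.5) * bin_width
        # Soft tracks are kept inside one bin so that the peak is unambiguous.
        angles += [rng.uniform(center - 0.4 * bin_width, center + 0.4 * bin_width)
                   for _ in range(n_soft // len(jet_bins))]
    counts = [0] * n_bins
    for angle in angles:
        counts[int((angle % 360.0) / bin_width) % n_bins] += 1
    return counts


counts = initial_angle_histogram()
top_two = sorted(range(len(counts)), key=lambda i: counts[i])[-2:]
print(sorted(top_two), [counts[i] for i in sorted(top_two)])
```

In the hit display the extra tracks are hard to pick out, but once each triplet is reduced to its initial angle the two regions stand far above the flat background.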
33Summary
34Summary
- With the experience from Fermilab BTeV on triplet finding, pattern recognition is not a problem. One should feel free to choose the preferred detector layout.
- Data output rates of the current readout chips are close enough to the SLHC requirement, but studies are needed.
- Triggering on low-PT features is possible, but studies are needed.
35The End. Thanks.