Title: HLT architecture
1 HLT architecture
2 TPC FEE
FEC (Front End Card) - 128 CHANNELS (CLOSE TO THE READOUT PLANE)
Power consumption < 40 mW / channel
L1: 5 µs, 200 Hz
L2: < 100 µs, 200 Hz
DETECTOR
- gating grid
- anode wire
- pad plane
- drift region: 88 µs
- 570132 PADS
PASA
- 8 CHIPS x 16 CH / CHIP
- CUSTOM IC (CMOS 0.35 µm)
- CSA, SEMI-GAUSS. SHAPER
- 1 MIP = 4.8 fC
- S/N = 30:1
- DYNAMIC = 30 MIP
- GAIN = 12 mV / fC
- FWHM = 190 ns
ADC / RAM
- 8 CHIPS x 16 CH / CHIP
- CUSTOM IC (CMOS 0.25 µm)
- ADC: 10 BIT, < 10 MHz
- BASELINE CORR.
- TAIL CANCELL.
- ZERO SUPPR.
- RAM: MULTI-EVENT MEMORY
DDL (4096 CH / DDL)
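The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted on the slide; it assumes one readout channel per pad and treats the 40 mW per channel as an upper limit, so the total power is a bound, not a measurement. All variable names are illustrative.

```python
# Figures quoted on the TPC FEE slide.
PADS = 570132
CHIPS_PER_FEC = 8
CHANNELS_PER_CHIP = 16
MIP_CHARGE_FC = 4.8              # charge of 1 MIP
GAIN_MV_PER_FC = 12.0            # conversion gain
DYNAMIC_RANGE_MIP = 30
MAX_POWER_MW_PER_CHANNEL = 40.0  # upper limit quoted on the slide

# 8 chips x 16 channels gives the 128 channels per front-end card.
channels_per_fec = CHIPS_PER_FEC * CHANNELS_PER_CHIP

# Output amplitude for one MIP and for the full dynamic range.
mip_amplitude_mv = MIP_CHARGE_FC * GAIN_MV_PER_FC
full_scale_mv = DYNAMIC_RANGE_MIP * mip_amplitude_mv

# Upper bound on front-end power, assuming one channel per pad.
total_power_kw = PADS * MAX_POWER_MW_PER_CHANNEL / 1e6

print(f"channels per FEC: {channels_per_fec}")
print(f"1 MIP amplitude:  {mip_amplitude_mv:.1f} mV")
print(f"full scale:       {full_scale_mv:.0f} mV")
print(f"power bound:      {total_power_kw:.1f} kW")
```

Under these assumptions a 30 MIP signal corresponds to roughly 1.7 V at the shaper output, and the whole front end stays below about 23 kW.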
4 TPC electronics: ALICE TPCE READOUT CHIP (ALTRO)
Processing chain:
- ADC: 10-bit, 20 MSPS
- Adaptive Baseline Correct. I: 11-bit CA2 arithmetic
- Tail Cancel.: 18-bit CA2 arithmetic
- Adaptive Baseline Correct. II: 11-bit arithmetic
- Data Format.: 40-bit format
- Multi-Event Memory: 40-bit format
SAMPLING CLOCK: 20 MHz
READOUT CLOCK: 40 MHz
0.25 µm (ST), area: 64 mm²
power: 29 mW / ch
SEU protection
[Plot: DIGITAL TAIL CANCELLATION PERFORMANCE; ADC counts vs. time samples (170 ns)]
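The slide quotes 10-bit samples and a 40-bit data format, which suggests four samples per 40-bit word. A minimal sketch of such packing follows; the slot order (first sample in the lowest bits) and the function names `pack_40bit` and `unpack_40bit` are assumptions made for illustration, not the documented ALTRO layout.

```python
def pack_40bit(samples):
    """Pack 10-bit samples into 40-bit words, four per word.

    Assumption: the first sample of each group occupies the lowest 10 bits.
    A final incomplete group is padded with zeros in the upper slots.
    """
    words = []
    for start in range(0, len(samples), 4):
        word = 0
        for slot, sample in enumerate(samples[start:start + 4]):
            if not 0 <= sample < 1024:
                raise ValueError(f"sample {sample} does not fit in 10 bits")
            word |= sample << (10 * slot)
        words.append(word)
    return words


def unpack_40bit(words, count):
    """Recover `count` 10-bit samples from 40-bit words."""
    samples = []
    for word in words:
        for slot in range(4):
            samples.append((word >> (10 * slot)) & 0x3FF)
    return samples[:count]


samples = [1, 2, 3, 4, 1023, 0, 512]
words = pack_40bit(samples)
restored = unpack_40bit(words, len(samples))
```

Seven samples occupy two 40-bit words here; the sample count has to be carried separately because the padding is indistinguishable from zero-valued samples.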
5 Data compression: Entropy coder
Probability distribution of 8-bit TPC data
Results: NA49 compressed event size 72%, ALICE 65% (Arne Wiebalck, diploma thesis, Heidelberg)
- Variable Length Coding
- short codes for frequent values, long codes for infrequent values
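The variable-length coding idea on this slide can be illustrated with a small Huffman coder. This is a minimal software sketch, not the pipelined hardware encoder; the sample distribution is made up and only mimics data dominated by small values.

```python
import heapq
from collections import Counter


def huffman_code(samples):
    """Build a prefix code: frequent values get short codes, rare ones long codes."""
    freq = Counter(samples)
    if len(freq) == 1:
        # A single distinct value still needs one bit per sample.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)  # two least frequent subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(samples, code):
    return "".join(code[s] for s in samples)


def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix property: first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


# Made-up 8-bit samples, dominated by small values.
samples = [0] * 60 + [1] * 20 + [2] * 10 + [3] * 5 + [200] * 3 + [255] * 2
code = huffman_code(samples)
bits = encode(samples, code)
ratio = len(bits) / (8 * len(samples))  # compressed size relative to 8 bits/sample
```

For this artificial distribution the most frequent value gets a 1-bit code and the encoded stream is well under a quarter of the raw size; real detector data is less skewed, so it compresses less.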
6 TPC - RCU
7 RCU design - control flow
TTCrx
SIU controller
FEE bus controller
DDL command decoder
FEE SC
RCU resource priority manager
Huffman encoder
Slow control
Watch dog health agent
Debugger
PCI core
8 RCU design - data flow
TTC controller
TTCrx registers
FEE bus controller
Event memory
SIU controller fifo
FEE bus controller
Event fragment pointer list
SIU
Huffman encoder
FEE bus controller
Configuration memory
Slow control
9 Data compression: TPC - RCU
- TPC front-end electronics system architecture and readout controller unit.
- Pipelined Huffman Encoding Unit, implemented in a Xilinx Virtex 50 chip
T. Jahnke, S. Schoessel and K. Sulimma, EDA group, Department of Computer Science, University of Frankfurt
10 RCU prototypes
- Prototype I
  - Commercial OEM-PCI board
  - FEE-board test (ALTRO FEE bus)
  - SIU integration
  - Qtr 3, 2001 to Qtr 2, 2002
- Prototype II
  - Custom design
  - All functional blocks
  - PCB: Qtr 2, 2002
  - Implementation of basic functionality (FEE-board -> SIU): Qtr 2, 2002
  - Implementation of essential functionality: Qtr 4, 2002
- Prototype III
  - SRAM FPGA -> masked version or Antifuse FPGA (if needed)
- RCU production
  - Qtr 2, 2003
11 RCU prototype I
- Commercial OEM-PCI board
- ALTERA FPGA APEX EP20K400
- SRAM 4 x 32k x 16bits
- PMC I/O connectors (178 pins)
- Buffered I/O (72 pins)
12 RCU prototype I
FEE boards
trigger
- Implementation of basic test functionality
- FEE-board test (ALTRO FEE bus)
- SIU integration
FEE-bus daughter board
PMC
PCI bus
FPGA APEX20k400
PCI core
I/O
SIU card
internal SRAM
4 x 32k x 16
FLASH EEPROM
onboard SRAM
13 RCU prototype II
- Implementation of essential functionality
- Custom design
- All functional blocks
SC
TTC
FEE-bus
PCI bus
SIU-CMC interface
PCI core
FPGA
SIU
internal SRAM
> 2 MB
FLASH EEPROM
Memory D32
14 RCU prototype II - schematics
[Schematic blocks: CIA miscellaneous; JN1, JN2, JN2A, JN3, JN4, JN5; Flash (x3); Power (1.8V Gen.); APEX; Connectors]
15 RCU prototype II - RCU mezzanine
16 RCU prototype II - schematics
SIU / DIU mezzanine card (1/2 CMC)
[Schematic blocks: CIA miscellaneous; JN1, JN2, JN2A, JN3, JN4, JN5; Flash (x3); Power (1.8V Gen.); APEX; Connectors]
17 Programming model
- Development version, status December 2001
PCI-tools
PC LINUX RH7.1 (2.4.2)
RCU-API
device driver
PCI core mailbox memory
PLDA board
FEE bus controller
SIU controller
ALTRO emulator
FEE bus
SIU
ALTRO emulator
DDL
18 SIU-RORC integration
RCU prototype I
LINUX/NT, PLDA/PCI-tools, RCU-API, device driver
SIU
FPGA
interface
SIU controller
PCI core
SIU
SRAM
PCI bus
DDL
pRORC
LINUX, DDL/PCI-tools, pRORC-API, device driver
DIU
PCI bridge
Glue logic
interface
DIU
PCI bus
19 SIU-RORC integration
- PC1: write memory block to FPGA internal SRAM
- PC2: allocate bigphys area, init link pRORC
- SIU controller: wait for READY-TO-RECEIVE
- PC2: send DDL-FEE command READY-TO-RECEIVE
- SIU controller: strobe data into SIU
- pRORC: copy data into bigphys area via DMA
Data path: PC1 memory block -> RCU internal SRAM -> SIU -> DDL -> DIU -> PC2 bigphys memory area
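The transfer sequence on this slide can be mimicked in software to make the ordering explicit. The sketch below is a toy model only: the classes `RcuSide` and `ProrcSide`, their methods, and the plain Python list standing in for the DDL are all hypothetical and do not correspond to the RCU-API or pRORC-API.

```python
class RcuSide:
    """Stand-in for RCU prototype I: FPGA internal SRAM plus SIU controller."""

    def __init__(self):
        self.sram = []
        self.ready_to_receive = False

    def write_block(self, block):
        # PC1 writes the memory block into the FPGA internal SRAM.
        self.sram = list(block)

    def on_ddl_command(self, command):
        # The SIU controller waits for this command before sending.
        if command == "READY-TO-RECEIVE":
            self.ready_to_receive = True

    def strobe_into_siu(self, link):
        # Data is only strobed into the SIU once the receiver is ready.
        if not self.ready_to_receive:
            return False
        link.extend(self.sram)
        return True


class ProrcSide:
    """Stand-in for the pRORC and the bigphys area on PC2."""

    def __init__(self):
        self.bigphys = None

    def allocate(self, n_words):
        self.bigphys = [0] * n_words

    def dma_copy(self, link):
        # Models the DMA copy from the link into the bigphys area.
        n = len(link)
        if self.bigphys is None or n > len(self.bigphys):
            raise MemoryError("bigphys area too small")
        self.bigphys[:n] = link
        link.clear()
        return n


block = [0x1A, 0x2B, 0x3C, 0x4D]
rcu, prorc, ddl = RcuSide(), ProrcSide(), []

rcu.write_block(block)                  # PC1: write memory block
prorc.allocate(len(block))              # PC2: allocate bigphys area
sent_early = rcu.strobe_into_siu(ddl)   # nothing is sent before the handshake
rcu.on_ddl_command("READY-TO-RECEIVE")  # PC2: send READY-TO-RECEIVE
sent = rcu.strobe_into_siu(ddl)         # SIU controller: strobe data
copied = prorc.dma_copy(ddl)            # pRORC: DMA into bigphys area
```

The early strobe attempt returns `False`, which is the point of the handshake: the sender holds the data in SRAM until the receiving side has allocated memory and signalled readiness.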
20 RCU system for TPC test 2002
Trigger
FEE-boards
FEE-bus
LINUX RH7.x, DATE, PLDA/PCI-tools, RCU-API, device driver
SIU
FPGA
interface
RCU prototype II/I
FEE-bus controller, SIU controller
Manager
PCI core
SIU
SRAM
FLASH
ext. SRAM
PCI bus
DDL
LINUX RH7.x, DATE, DDL/PCI-tools, pRORC-API, device driver
DIU
PCI bridge
Glue logic
interface
DIU
pRORC
PCI bus
21 Programming model
- TPC test version, summer 2002
DATE
FEE configurator
PC LINUX RH7.1 (2.4.2)
PCI-tools
RCU-API
device driver
PCI core mailbox memory
Prototype II (Prototype I)
RCU resource priority manager
SIU controller
FEE bus controller
FEE bus
SIU
FEE boards
DDL
22 TPC PCI-RORC
PCI bus
DIU-CMC
FPGA
Memory
PCI bridge
Glue logic
interface
Coprocessor
D32
≥
internal
2 MB
DIU card
SRAM
2 MB
Memory D32
23 HLT architecture overview
- Not a specialized computer, but a generic large scale (>500 node) multi processor cluster
- A few nodes have additional hardware (PCI RORC)
- has to be operational in off-line mode also
- Use of commodity processors
- Use of commodity networks
- Reliability and fault tolerance are mandatory
- Use standard OS (Linux)
- Use of on-line disks as mass storage
[Diagram: Optical Links to Front-End; Receiver Processors / HLT Processor, each with RcvBd, PCI and NIC; HLT Network; Distributed Farm Controller (PCI, NIC); Monitoring Server (PCI, NIC); HLT Processors, each with PCI and NIC]
24 HLT - Cluster Slow Control
- Features
  - Battery backed: completely independent of host
  - Power controller: remote powering of host
  - Reset controller: remote physical RESET
  - PCI bus: perform PCI bus scans, identify devices
  - Floppy/flash emulator: create remotely defined boot image
  - Keyboard driver: remote keyboard emulation
  - Mouse driver: remote mouse emulation
  - VGA: replace graphics card
  - Price: very low cost
- Functionality
  - complete remote control of PC, like a terminal server but already at BIOS level
  - intercept port 80 messages (even remotely diagnose dead computer)
  - interoperate with remote server, providing status/error information
  - watchdog functionality
  - identify host and receive boot image for host
  - RESET/power maintenance
25 HLT Networking (TPC only)
All data rates in kB/sec (readout not included here)
180 links, 200 Hz
Individual rates (diagram labels, with spare links): 92 000; 65 000; 7 000
Aggregate: 17 000 000; 2 340 000; 252 000; ?
cluster finder: 180+36 nodes
Track segments: 108+36 nodes
Track merger: 72+36 nodes
Global L3: 12 nodes
Assume 40 Hz coincidence trigger plus 160 Hz TRD pretrigger with 4 sectors per trigger
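The aggregate figures on this slide are consistent with simple products of the smaller rates. The check below is a sketch under two assumptions that the slide does not state explicitly: that 92 000 kB/sec is the rate per link, and that 65 000 and 7 000 kB/sec are per-sector rates, with 36 TPC sectors.

```python
LINKS = 180                  # quoted on the slide
SECTORS = 36                 # assumption: number of TPC sectors
RATE_PER_LINK = 92_000       # kB/sec, assumed to be per link
RATE_STAGE_2 = 65_000        # kB/sec, assumed to be per sector
RATE_STAGE_3 = 7_000         # kB/sec, assumed to be per sector

links_per_sector = LINKS // SECTORS

aggregate_input = LINKS * RATE_PER_LINK      # slide quotes 17 000 000 (rounded)
aggregate_stage_2 = SECTORS * RATE_STAGE_2   # slide quotes 2 340 000
aggregate_stage_3 = SECTORS * RATE_STAGE_3   # slide quotes 252 000

# Trigger mix assumed on the slide: coincidence trigger plus TRD pretrigger.
total_trigger_rate_hz = 40 + 160

print(f"links per sector:  {links_per_sector}")
print(f"aggregate input:   {aggregate_input:,} kB/sec")
print(f"aggregate stage 2: {aggregate_stage_2:,} kB/sec")
print(f"aggregate stage 3: {aggregate_stage_3:,} kB/sec")
print(f"trigger rate:      {total_trigger_rate_hz} Hz")
```

Under these assumptions the two smaller aggregates reproduce the quoted values exactly, and the input aggregate of about 16.6 million kB/sec rounds to the 17 000 000 on the slide.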
26 HLT Interfaces
- HLT is an autonomous system with high reliability standards (part of data path)
- HLT has a number of operating modes
  - on-line trigger
  - off-line processor farm
  - possibly combination of both
- very high input data rates (20 GB/sec)
- high internal networking requirements
- HLT front-end is first processing layer
- Goal: same interface for data input, internal data exchange and data output
HLT internal, input and output interface: Publish/subscribe
- When local, do not move data; exchange pointers only
- Separate processes, multiple subscribers for one publisher
- Network API and architecture independent
- Fault tolerant (can lose a node)
- Consider monitoring
- Standard within HLT and for input and output
- Demonstrated to work on both shared memory paradigm and sockets
- Very lightweight
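The publish/subscribe points on this slide (one publisher, several subscribers, references instead of copies, tolerance to a lost subscriber) can be sketched in a few lines. This is an in-process illustration only; the class `Publisher` and its methods are hypothetical and are not the HLT framework interface.

```python
class Publisher:
    """Keeps event blocks in place and hands out references to subscribers."""

    def __init__(self):
        self._blocks = {}       # event_id -> data block, never copied
        self._subscribers = {}  # name -> callback

    def subscribe(self, name, callback):
        self._subscribers[name] = callback

    def publish(self, event_id, block):
        self._blocks[event_id] = block
        for name, callback in list(self._subscribers.items()):
            try:
                # Only the reference is passed; the data is not moved.
                callback(event_id, block)
            except Exception:
                # A failing subscriber is dropped; the others keep running.
                del self._subscribers[name]

    def release(self, event_id):
        self._blocks.pop(event_id, None)

    @property
    def subscriber_names(self):
        return sorted(self._subscribers)


received = []


def tracker(event_id, block):
    received.append(("tracker", event_id, block))


def monitor(event_id, block):
    received.append(("monitor", event_id, block))


def broken(event_id, block):
    raise RuntimeError("node lost")


pub = Publisher()
pub.subscribe("tracker", tracker)
pub.subscribe("broken", broken)
pub.subscribe("monitor", monitor)

data = bytearray(b"\x01\x02\x03")
pub.publish(1, data)
```

Both healthy subscribers receive the very same `bytearray` object, and the failing one is removed without disturbing the others. Across nodes the reference would have to be replaced by a transfer over sockets, which this sketch does not cover.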
27 HLT system structure
TRD trigger
Dimuon trigger
PHOS trigger
Trigger detectors
Level-1
Pattern Recognition
TPC: fast cluster finder, fast tracker, Hough transform, cluster evaluator, Kalman fitter
Dimuon arm tracking
Level-3
Extrapolate to ITS
...
Extrapolate to TOF
Extrapolate to TRD
(Sub)-event Reconstruction
28 Preprocessing per sector
raw data, 10-bit dynamic range, zero suppressed, Huffman encoding (and vector quantization)
RCU
detector front-end electronics
Huffman decoding, unpacking, 10-to-8 bit conversion
fast cluster finder: simple unfolding, flagging of overlapping clusters
RORC
fast track finder initialization (e.g. Hough transform)
cluster list
fast vertex finder
Hough histograms Peakfinder
receiver node
global node
vertex position
raw data
29 FPGA coprocessor: cluster finder
- Fast cluster finder
- up to 32 padrows per RORC
- up to 141 pads/row and up to 512 timebins/pad
- internal RAM: 2x512x8bit
- timing (in clock cycles, e.g. 5 nsec) [1]
- (cluster-timebins per pad) / 2 clusters
- outer padrow: 150 nsec/pad, 21 µsec/row
- centroid calculation: pipelined array multiplier
[1] Timing estimates by K. Sulimma, EDA group, Department of Computer Science, University of Frankfurt
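The centroid calculation mentioned on this slide amounts to a charge-weighted mean over the time bins of a cluster. The sketch below shows that calculation in software for a single pad; the threshold, the sample values and the function name `find_clusters` are illustrative, and the code makes no attempt to model the pipelined FPGA implementation or the unfolding of overlapping clusters.

```python
def find_clusters(timebins, threshold=0):
    """Return (centroid_time, total_charge) for each run of samples above threshold."""
    clusters = []
    charge_sum = 0
    weighted_sum = 0
    for t, adc in enumerate(timebins):
        if adc > threshold:
            charge_sum += adc
            weighted_sum += t * adc
        elif charge_sum:
            # The run has ended: emit the charge-weighted mean time.
            clusters.append((weighted_sum / charge_sum, charge_sum))
            charge_sum = weighted_sum = 0
    if charge_sum:
        clusters.append((weighted_sum / charge_sum, charge_sum))
    return clusters


pad = [0, 0, 2, 6, 2, 0, 0, 0, 4, 4, 0]
clusters = find_clusters(pad)

# Cross-check of the timing figures quoted on the slide.
PADS_PER_ROW = 141
NS_PER_PAD = 150
CLOCK_NS = 5
row_time_ns = PADS_PER_ROW * NS_PER_PAD   # time per outer padrow
cycles_per_pad = NS_PER_PAD // CLOCK_NS   # clock cycles spent per pad
```

The timing check gives 21 150 ns per row, matching the quoted 21 µsec/row, and 30 clock cycles per pad at the example 5 nsec clock.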
30 FPGA coprocessor: Hough transformation
- Fast track finder: Hough transformations [2]
- (row,pad,time)-to-(2/R,?,?) transformation
- (n-pixel)-to-(circle-parameter) transformation
- feature extraction: local peak finding in parameter space
[2] E.g. see Pattern Recognition Algorithms on FPGAs and CPUs for the ATLAS LVL2 Trigger, C. Hinkelbein et al., IEEE Trans. Nucl. Sci. 47 (2000) 362.
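A point-to-circle-parameter Hough transform with peak finding can be sketched as follows. The parametrisation is an assumption made for illustration, not necessarily the one on the slide: tracks are taken as circles through the origin in the transverse plane, described by the emission angle psi and the curvature kappa = 1/R, so a point at polar coordinates (r, phi) votes along kappa = 2 sin(phi - psi) / r. Bin counts and ranges are arbitrary.

```python
import math


def hough_circles(points, psi_range, psi_bins, kappa_range, kappa_bins):
    """Fill a (psi, kappa) histogram; each point votes once per psi bin."""
    psi_min, psi_max = psi_range
    k_min, k_max = kappa_range
    hist = [[0] * kappa_bins for _ in range(psi_bins)]
    for x, y in points:
        r = math.hypot(x, y)  # points at the origin are not supported
        phi = math.atan2(y, x)
        for i in range(psi_bins):
            psi = psi_min + (i + 0.5) * (psi_max - psi_min) / psi_bins
            kappa = 2.0 * math.sin(phi - psi) / r
            j = math.floor((kappa - k_min) / (k_max - k_min) * kappa_bins)
            if 0 <= j < kappa_bins:
                hist[i][j] += 1
    return hist


def find_peak(hist):
    """Return (votes, psi_bin, kappa_bin) of the first global maximum."""
    best = (0, 0, 0)
    for i, row in enumerate(hist):
        for j, votes in enumerate(row):
            if votes > best[0]:
                best = (votes, i, j)
    return best


# Synthetic track: eight points on one circle through the origin.
true_psi, true_kappa = 0.305, 0.0105
points = []
for step in range(1, 9):
    d = 0.1 * step
    r = 2.0 * math.sin(d) / true_kappa
    points.append((r * math.cos(true_psi + d), r * math.sin(true_psi + d)))

hist = hough_circles(points, (0.0, 0.6), 60, (-0.02, 0.02), 40)
votes, psi_bin, kappa_bin = find_peak(hist)
```

All eight points vote for the same histogram cell, whose centre corresponds to the generated psi and kappa. With real data several tracks share the histogram, so a local peak finder would replace the single global maximum used here.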
31 Processing per sector
raw data, 8-bit dynamic range, decoded and unpacked
vertex position, cluster list
slicing of padrow-pad-time space into sheets of pseudo-rapidity, subdividing each sheet into overlapping patches
RORC
sub-volumes in r, φ, η
fast track finder B: 1. Hough transformation
fast track finder A: track follower
fast track finder B: 2. Hough maxima finder, 3. tracklet verification
track segments
receiver node
cluster deconvolution and fitting
updated vertex position, updated cluster list, track segment list