Title: Data Reduction Schemes for MicroBoone
1 Data Reduction Schemes for MicroBoone
2 Summary
- Waveform digitization is a necessary readout approach for TPC detectors, but it creates a large volume of data.
- It is necessary to reduce the data volume without losing useful information.
- Accelerator neutrino data is compressed using a lossless Huffman coding scheme, with a typical (1/10) reduction ratio, to save DAQ and data storage cost.
- Dynamic Decimation and Huffman coding are applied to supernova data, with a (1/60) to (1/100) total reduction ratio, so that supernova data can be taken within a reasonable equipment budget.
3 Basis of Study
[Figure: panels for the Collection, Induction 2 and Induction 1 planes; axes: Wire Number and Drift Time]
Data from BO detector of FNAL
This is not a simulation
- Hit waveforms in the TPC carry useful information (e.g. track angle).
- Digitizing the waveforms creates a large volume of data.
- Data reduction without losing useful information is necessary.
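4 Slow Variation of Raw Data
[Diagram: a D flip-flop (DFF, input D, output Q) holds the previous sample U(n); a subtractor with inputs A = U(n+1) and B = U(n) outputs A-B = U(n+1)-U(n)]
- More than 99% of points differ from the previous point by -1, 0 or 1.
- Huffman coding can be applied to the differences of the data points.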
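The differencing step above can be sketched in a few lines. This is only an illustration: the waveform values are made up for the example and are not BO detector data, and the helper names are hypothetical.

```python
def successive_differences(samples):
    """Return U(n+1) - U(n) for each pair of neighbouring samples."""
    return [b - a for a, b in zip(samples, samples[1:])]

def fraction_within(diffs, limit=1):
    """Fraction of differences whose magnitude is at most `limit`."""
    return sum(1 for d in diffs if abs(d) <= limit) / len(diffs)

# Made-up ADC samples: a flat baseline with one short pulse.
waveform = [400, 400, 401, 400, 399, 400, 400, 405, 412, 406, 401, 400]
diffs = successive_differences(waveform)
small = fraction_within(diffs, limit=1)
```

On real TPC data most samples are baseline, which is why the slide can quote a fraction above 99%; this toy waveform is dominated by its pulse and gives a much lower fraction.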
5 The Huffman Coding
- The U(n+1)-U(n) value with the highest probability is assigned the shortest code, i.e., the single bit 1.
- Values with lower probabilities are assigned longer codes, e.g., 01, 001, 0001, etc.
- Huffman coded words and regular words are distinguished by bit 15.
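[Figure: 16-bit word formats]
Regular word: regular ADC data, used for the first point or when U(n+1)-U(n) is outside the range -3 to +3; leading bits 0 0, followed by the ADC value (13-bit).
Huffman coded word: the codes for the differences -1, 0, 0, 0, 1, 2 are packed into one word; the remaining bits are padding or continue into the next word.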
In this example, 6 differences of the data
samples are packed in the 16-bit data word.
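A minimal sketch of this coding step follows. The code table is an assumption: the slides state only that the most probable difference gets the single bit 1 and that less probable values get 01, 001, 0001 and so on. Which difference maps to which code, and the choice of 0 as the most probable value, are hypothetical here. Bit ordering inside the 16-bit word is not modelled; the sketch only produces the code strings and counts bits.

```python
# Hypothetical code table (assumption): difference -> unary-style code.
CODES = {0: "1", -1: "01", 1: "001", -2: "0001",
         2: "00001", -3: "000001", 3: "0000001"}

def encode_differences(samples):
    """Return tokens: ('raw', adc_value) or ('huff', code_string)."""
    tokens = [("raw", samples[0])]          # the first point is always a regular word
    for prev, cur in zip(samples, samples[1:]):
        diff = cur - prev
        if diff in CODES:
            tokens.append(("huff", CODES[diff]))
        else:
            tokens.append(("raw", cur))     # difference out of range: regular word
    return tokens

def huffman_payload(tokens):
    """Concatenate the Huffman codes, ignoring regular words."""
    return "".join(code for kind, code in tokens if kind == "huff")

# Made-up samples whose differences are -1, 0, 0, 0, 1, 2 as in the slide example.
samples = [400, 399, 399, 399, 399, 400, 402]
tokens = encode_differences(samples)
payload = huffman_payload(tokens)
```

Under this assumed table the six differences need 13 code bits, so together with the flag in bit 15 they fit in one 16-bit word, which agrees with the example on the slide.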
6 The Huffman Coding Block
- The block is able to operate at up to a 250 MHz clock in Altera Cyclone III FPGA devices.
- The block uses 245 logic cells, taking 0.6% of an EP3C40F484C6 device (129) containing 39600 logic cells.
7 The Compression Ratio of Huffman Coding
[Diagram: N in, N/(10.7) out]
- On typical TPC events a compression ratio of about 10 can be achieved.
- The compression ratio is sensitive to high-frequency noise.
8 Dynamic Decimation (DD)
- Only small time intervals, i.e., regions of interest (ROI), must be sampled at a high rate.
- Most time intervals can be sampled at a lower rate without losing useful information.
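The idea can be sketched as follows. The slides do not say how a region of interest is detected, so the rule used here (a block is an ROI if any sample deviates from a fixed baseline by more than a threshold) is an assumption, as are the function name, the parameters and the made-up input samples.

```python
def dynamic_decimate(samples, baseline, threshold, d):
    """Keep every sample inside an ROI block; elsewhere emit one average per d samples.

    Returns (index, value) pairs so the original time position is preserved.
    The ROI test (any sample further than `threshold` from `baseline`) is a
    hypothetical choice, not taken from the slides.
    """
    out = []
    for start in range(0, len(samples), d):
        block = samples[start:start + d]
        if any(abs(s - baseline) > threshold for s in block):
            # Region of interest: keep the full sampling rate.
            out.extend((start + k, s) for k, s in enumerate(block))
        else:
            # Quiet region: one averaged sample represents the whole block.
            out.append((start, sum(block) / len(block)))
    return out

# Made-up input: flat baseline with a single pulse.
samples = [400] * 20 + [400, 420, 450, 430, 405] + [400] * 25
reduced = dynamic_decimate(samples, baseline=400, threshold=3, d=5)
```

In this toy input 50 samples become 14 output values. The reduction on real data depends on how sparse the hits are and on the filter used in the quiet regions.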
9 Dynamic Decimation Block
[Diagram: All data (N) in, Supernova Data (N/(10)) out]
- The two blocks are able to operate at up to a 250 MHz clock.
- The Dynamic Decimation in our case reduces data by a factor of 10.
- The supernova data will go through two compression stages.
10 The Data Paths
[Diagram: data-path block diagram. Blocks shown: ADC; Serial to Parallel Conversion and 16MHz to 2MHz Decimation (repeated for each input); Data Merging RAM; External Memory; Huffman Coding; Output Interface. The Accelerator Neutrino Events path uses Huffman Coding; the Supernova Data path uses Dynamic Decimation followed by Huffman Coding.]
11 Any Differences?
[Figure: two panels, Raw and With Dynamic Decimation]
12 Are you sure?
- If the data is very noisy, can your Huffman Coding block still compress data by a factor of 10?
- This is a valid question.
- We have experienced this problem in our study.
- Solution: filtering, filtering, filtering. (See backup slides)
Nov 4-5-6 2009
CD-1 Readiness Directors Review
13 The Knobs of Data Volume Control
[Diagram: same data-path block diagram as slide 10 (The Data Paths)]
Total compression ratio: 60 to 100 on BO events.
- The filtering schemes and parameters in the Dynamic Decimation block are knobs for data volume control.
- Most analog noise can be filtered out.
14 Internal Review Recommendation
- Care should be taken in stating the conclusions of the data reduction study. The Huffman coding compression ratio of 10:1 was achieved on data coming from a different detector. That detector had its own baseline noise and event frequency. It should be shown that the same compression ratio can be achieved with the MicroBoone TPC, which in turn might provide feedback for the analog front-end and digitizer specifications.
15 Our Response
- We have studied the FPGA implementation and compiled and simulated a decimation block.
- The decimation block is to be integrated into the BO detector DAQ hardware/firmware. The work is in progress.
- The firmware permits us to further study the noise performance of the data compression schemes.
16 The End
17 A Mystery of Dynamic Decimation and Huffman Coding
Dynamic Decimation alone: N in, N/10.6 out
Huffman Coding alone: N in, N/10.7 out
Dynamic Decimation followed by Huffman Coding: N in, N/60 out
- Dynamic Decimation reduces the number of samples by a factor of 10.
- Huffman Coding reduces the number of bits from raw data by a factor of 10.
- When cascaded, the combination reduces the number of bits by a factor of 60.
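The size of the mystery can be made explicit with a little arithmetic. The sketch below assumes that the cascaded ratio factorizes into the Dynamic Decimation ratio times the Huffman ratio achieved on the already decimated stream; that factorization and the variable names are assumptions for illustration, while the three input numbers are the ones quoted on this slide.

```python
def implied_second_stage_ratio(total_ratio, first_stage_ratio):
    """Ratio the second stage must achieve if the stages multiply."""
    return total_ratio / first_stage_ratio

dd_alone = 10.6        # Dynamic Decimation alone (from the slide)
huffman_alone = 10.7   # Huffman Coding alone on raw data (from the slide)
cascaded = 60.0        # both stages cascaded (from the slide)

# What one would get if Huffman coding worked equally well after decimation.
naive_product = dd_alone * huffman_alone

# What Huffman coding actually achieves on the decimated stream, under the
# factorization assumption stated above.
implied_huffman = implied_second_stage_ratio(cascaded, dd_alone)
```

The naive product is about 113, while the implied Huffman ratio after decimation is only about 5.7: Huffman coding is roughly half as effective on the decimated data as on the raw data.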
18 Huffman Coding Ratios for Dynamic Decimation
- The Huffman Coding compression ratio improves as the filter in Dynamic Decimation improves.
19 A Mystery of Huffman Coding Ratios on Down Sampled Data
5MHz data: N in, N/(10.7) out after Huffman Coding
1MHz data: (N/5) in, (N/5)/(7.5) out after Huffman Coding
- The 5MHz data is down sampled to 1MHz.
- The Huffman Coding compression ratio drops from 10.7 to 7.5 when the data is down sampled.
20 Averaging in Decimation: A Re-discovery
Nyquist Frequency < (1/2) Sampling Frequency
- Simple down-sampling is not good.
- When the decimation factor is D, averaging over D samples is not good either.
- Averaging over 2D samples is necessary.
- There is still aliasing when averaging over 2D samples, but it is less severe than when averaging over D samples.
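The three options in the bullets above can be compared on a test tone. This is a sketch under stated assumptions: the decimation factor D = 5, the tone frequency of 0.12 cycles per input sample (just above the post-decimation Nyquist frequency of 0.1) and the helper names are all chosen for illustration and do not come from the slides.

```python
import math

def decimate_average(samples, d, window):
    """Emit one output per d inputs; each output is the mean of `window` inputs."""
    out = []
    for start in range(0, len(samples) - window + 1, d):
        out.append(sum(samples[start:start + window]) / window)
    return out

def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

D = 5
# Unit-amplitude tone above the new Nyquist frequency 1/(2D); whatever
# survives decimation shows up as an aliased low-frequency signal.
tone = [math.sin(2 * math.pi * 0.12 * n) for n in range(2000)]

plain = tone[::D]                             # simple down-sampling
avg_d = decimate_average(tone, D, D)          # average over D samples
avg_2d = decimate_average(tone, D, 2 * D)     # average over 2D samples
```

Comparing `rms(plain)`, `rms(avg_d)` and `rms(avg_2d)` shows the aliased residue shrinking at each step without vanishing, which matches the ordering claimed in the bullets.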
21 Weighted Average, The CIC-2 Filter
- Filter performance can be further improved with a weighted average over 4D samples.
- The filter is called a cascaded integrator-comb filter of order 2 (CIC-2).
- The CIC-1 filter is the moving average.
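One way to see where the weights come from: cascading two moving averages is the same as a single weighted average with triangular weights. The sketch below assumes the CIC-1 stage is a moving average over 2D samples, so that CIC-2 spans 4D-1 (about 4D) samples; that window length, D = 5, the probe frequency 0.12 and the helper names are assumptions made for illustration.

```python
import cmath
import math

def boxcar(length):
    """Moving-average weights (CIC-1), normalized to unit sum."""
    return [1.0 / length] * length

def convolve(a, b):
    """Direct convolution of two weight lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def gain(weights, freq):
    """Magnitude response at `freq`, in cycles per input sample."""
    return abs(sum(w * cmath.exp(-2j * math.pi * freq * n)
                   for n, w in enumerate(weights)))

D = 5
cic1 = boxcar(2 * D)           # moving average over 2D samples
cic2 = convolve(cic1, cic1)    # two cascaded stages: triangular weights

g1 = gain(cic1, 0.12)          # just above the new Nyquist frequency 1/(2D)
g2 = gain(cic2, 0.12)
```

Because the response of the cascade is the product of the stage responses, `g2` equals `g1` squared, so out-of-band content that would alias is suppressed more strongly than with the plain moving average.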
22 Huffman Coding Ratios for 5MHz to 1MHz
- The Huffman Coding compression ratio improves as the filter in Dynamic Decimation improves.