W. Skulski, University of Rochester, July 10, 2008

About This Presentation

Slides: 75

Provided by: WojtekS9

Learn more at: https://www.pas.rochester.edu

Transcript and Presenter's Notes

1
The Universe and the FPGA: Digital Pulse
Processing for Dark Matter Search
Wojtek Skulski, University of Rochester
2
Contributors

Wojtek Skulski (University of Rochester and
SkuTek Instrumentation).
Frank Wolfs (University of Rochester).
Eryk Druszkiewicz (University of Rochester).
Large Underground Xenon Detector Collaboration
(LUX).

3
Outline

Introduction: Dark Matter Search.
Pulses from the Dark Matter LnXe detector.
Overview of signal sampling and Digital Pulse
Processing.
Why do we need a trigger?
Overview of a few trigger architectures.
Self-triggered digital DAQ becomes Digital
Trigger.
Present status of LUX Digital Trigger.
Conclusion.

4
Digital Pulse Processing for Dark Matter Search

Outputs from the Dark Matter LnXe detector are
digitized using flash A/D converters with 12 or
14 bit resolution @ up to 100 MSPS.
Pulses due to radiation and/or DM events are
detected in real time.
Waveforms containing events of interest are
recorded for analysis.
The electronics has to provide the following:
Low-noise analog front end which receives the
signals from the phototubes.
Flash A/D converters, 12 or 14 bit @ up to 100
MSPS.
Field-programmable gate arrays which detect the
pulses.
Data readout from the FPGA and archiving for
offline analysis.
GUI for diagnostic data display and control of
the experiment.
The Rochester group is responsible for designing the
LUX Trigger System.

5
History of digitizer development by the author

2002. The first single-channel digitizer DDC-1:
12 bits @ 48 MSPS.
2003. The first 8-channel digitizer DDC-8: 10
bits @ 40 MSPS.
Originally developed for the PHOBOS experiment
(advanced trigger). Unfortunately, PHOBOS was
discontinued before the DDC-8 could be used.
2004. DDC-8/XLM: a minor modification of DDC-8.
2006. DDC-8 Pro: 12/14 bits @ 64 MSPS.
Developed for student labs. Became the first
iteration of the LUXcore trigger: a single,
8-channel board serving 8 phototubes.
2007. DDC-8 LUX: 12/14 bits @ up to 80 MSPS and
System Connector.
More than 8 phototubes need to be served →
multiple boards are needed. The System Connector was
developed to link together multiple DPP boards.
The first version of the Event Builder was
developed as the receiving end for System
Connectors.
2008. DDC-8 DSP: 12/14 bits @ up to 125 MSPS and
a large FPGA.
All previous digitizers used flat-pack FPGAs,
whose capacity was exceeded by our DSP algorithms.
A transition to the BGA package was made
(finally).
2008/2009. A new version of the Event Builder
with a BGA FPGA is also planned.

6
A few facts concerning Dark Matter Search
7
The biggest mystery: where is almost Everything?

Most of the Universe is missing from the books.
Ordinary matter accounts for only 5% of the
Universe.

We are here
Source: Connecting Quarks with the Cosmos, The
National Academies Press, p.86.
8
The 1st smoking gun: galactic rotation is too
fast.

Gravitational pull reveals more matter than we
can see.

[Figure: rotation curve of the Andromeda galaxy. Axes: orbital velocity vs. distance from the center. Curves: observation, and prediction based on visible matter.]
Source: Connecting Quarks with the Cosmos, The
National Academies Press, p.87.
9
The 2nd smoking gun: large-scale gravitational
lensing.

Light from distant sources is deflected by
clusters of galaxies.
Visible mass cannot account for the observed
lensing pattern.
Reconstructed mass distribution shows mass
between galaxies.

[Figure panels: observed lensing; reconstructed mass distribution.]
Source: Connecting Quarks with the Cosmos, The
National Academies Press, p.89.
10
What is the Dark Matter composed of?

Nobody knows, but there are candidates predicted
by theory:
Axions: light particles proposed to solve the
strong CP problem.
Neutralinos: heavy particles predicted by SUSY.
The neutralino is neutral, weakly interacting,
and as massive as an atom of gold.
Very rarely it will bounce off an ordinary
nucleus and produce some ionization.
Our experiment will attempt to detect neutralinos
deep underground where the background from cosmic
rays is very low.
We will use a two-phase liquid xenon (LnXe)
detector named the Large Underground Xenon detector
(LUX).

11
Detectors for Dark Matter Search
12
Underground low-background laboratory
Cosmic particles stopped by 1 km of rock.
Dark Matter particles penetrate freely.
NB: LUX will be at DUSEL, but here I am showing
Boulby.
13
LUX detector prototype
14
LUX detector consists of many channels
144 phototubes
Gas Xe
Each phototube requires an independent ADC and
the data-processing channel
Liquid Xe
15
The principle of 2-phase xenon detector
[Figure: schematic of the 2-phase xenon detector. Labels: gas inlet, HV, gas, 1.5 cm, grids, liquid, S2, 2.5 cm, S1, quartz PMT.]
S1: scintillation in liquid Xe. S2:
electroluminescence in gas Xe.
Figure from J.T.White, Dark Matter 2002,
http://www.physics.ucla.edu/hep/DarkMatter/dmtalks.htm
Figure from T.J.Sumner et al.,
http://astro.ic.ac.uk/Research/Gal_DM_Search/report.html
16
Explanation of signal from a 2-phase xenon
detector
Primary pulse from the liquid Xe.
Electron drift time in LnXe.
Amplified pulse from the gas Xe.
Time flows this way
Figure from T.J.Sumner et al.,
http://astro.ic.ac.uk/Research/Gal_DM_Search/report.html
17
Signal processing in a single channel LnXe
detector
Primary scintillation in liquid phase.
Secondary scintillation in gas
phase (electroluminescence).

Extract the areas under S1, S2, and the
separation time between the S1 and S2.
Time-stamp the data in order to correlate pulses
in different channels.
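
A minimal Python sketch of the extraction described above follows. It is an illustration only, not the LUX firmware: the threshold-based pulse finder, the names find_pulses and s1_s2_summary, the rule "first pulse is S1, second pulse is S2", and the synthetic trace are all assumptions made for this example.

```python
def find_pulses(samples, baseline, threshold):
    """Return (start, end, area) for each run of samples above baseline + threshold.

    'end' is exclusive; 'area' is the baseline-subtracted sum over the run.
    """
    pulses = []
    start = None
    area = 0
    for i, s in enumerate(samples):
        above = (s - baseline) > threshold
        if above and start is None:
            start, area = i, 0               # a new pulse begins here
        if above:
            area += s - baseline
        elif start is not None:
            pulses.append((start, i, area))  # the pulse ended at sample i
            start = None
    if start is not None:                    # pulse still open at the end of the trace
        pulses.append((start, len(samples), area))
    return pulses


def s1_s2_summary(samples, baseline, threshold, sample_rate_hz):
    """Treat the first pulse as S1 and the second as S2 (an assumption of this sketch)."""
    pulses = find_pulses(samples, baseline, threshold)
    if len(pulses) < 2:
        return None
    (s1_start, _, s1_area), (s2_start, _, s2_area) = pulses[0], pulses[1]
    separation_s = (s2_start - s1_start) / sample_rate_hz
    return {"s1_area": s1_area, "s2_area": s2_area, "separation_s": separation_s}


# Synthetic trace: baseline of 100 counts, a small S1 at sample 10, a larger S2 at sample 74.
trace = [100] * 128
trace[10:13] = [120, 140, 120]
trace[74:79] = [150, 200, 250, 200, 150]
summary = s1_s2_summary(trace, baseline=100, threshold=5, sample_rate_hz=64e6)
print(summary)
```

The start index of each pulse, counted in ADC clock periods, doubles as a time stamp that can be compared across channels, provided all channels share the same clock.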

Figure from T.J.Sumner et al.,
http://astro.ic.ac.uk/Research/Gal_DM_Search/report.html
18
Multi-channel signal processing

Each phototube is connected to an independent
ADC and the data-processing channel, which
extracts S1, S2, and the time interval between S1
and S2.
The distribution of light among the PMTs tells
where the interaction happened within the volume.
Channels are not independent. They are
correlated. In addition to processing individual
channels, the correlated inter-channel processing
is also necessary.
The data acquisition system has a fairly
advanced architecture explained in the next
section.

144 phototubes
19
Electronics for Dark Matter Search
20
Digital Pulse Processing (DPP)

The pulse-processing electronics can be either
traditional analog, or digital. The latter has
advantages over the former: higher integration,
more flexibility, and lower cost. It also has a
slight disadvantage: it needs to be programmed.
Outputs from the Dark Matter LnXe detector are
digitized using flash A/D converters with 12 or
14 bits @ several tens of MSPS (e.g., 64 or 100
MSPS).
Pulses are detected in real time → Digital Pulse
Processing has to be implemented.
The electronics has to provide the following:
Low-noise analog front end which receives the
signals from the phototubes.
Flash A/D converters, 12 or 14 bits @ several
tens of MSPS.
Field-programmable gate arrays which detect the
pulses with DPP algorithms.
Data readout from the FPGA and archiving for
offline analysis.
GUI for diagnostic data display and control of
the experiment.

21
Functional diagram of a single DPP channel
DPP = Digital Pulse Processing.
[Block diagram labels: analog signal input, analog input stage, gain and offset control, Nyquist filter, sampling clock, ADC, sample rate processor, event rate processor, waveform memory, trigger, pulse information output, individual channel trigger output, optional external trigger in/out. The diagram marks the analog and digital domains.]
22
Functional diagram of a multichannel DPP board
[Block diagram of one DPP board. Labels: single channel (Analog, ADC), shown four times; board-level event processor (formatting, compression, etc.); digital interface (readout, monitoring, and setup); board-wide trigger logic; slow control; to event builder; to slow control; from trigger subsystem.]
23
Functional diagram of a multiboard DPP system
[Block diagram labels: signals from detectors, DPP boards (five shown), event-builder, recording, subset of signals from detectors, trigger system, slow control monitor, network.]
24
Which data is interesting?
Useful data
Not useful
Baseline, not useful
Time flows this way

Select useful data (so-called events) and reject
baseline data.
Typical rejection ratio is larger than 1000.

Figure from T.J.Sumner et al.,
http://astro.ic.ac.uk/Research/Gal_DM_Search/report.html
25
Estimated rates of interesting data samples
Assumptions: 8 channels, 14-bit @ 64 MHz; 200 μs
ADC trace per event (contains only the
interesting samples); 1000 events per second,
and 200 μs fully recorded per event. The period of
200 μs covers the electron drift time in the LUX
detector.
112 megabytes / second per channel
22.4 kilobytes / second per channel
448 kilobytes / second per DPP board
[Block diagram of one 8-channel DPP board. Labels: single channel (Analog, ADC), shown four times; board-level event processor (formatting, compression, etc.); digital interface (readout, monitoring, and setup); to event builder.]
26
Why is an FPGA necessary?
Sampling clock 64 MHz
ADC 14 bit
Sample rate processor
Event rate processor
Waveform memory
Trigger
112 megabytes / second per channel. Times 8
channels → 896 megabytes / second (each
board). An FPGA is the only device which can
continuously process such data rates.
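
The arithmetic behind these rates can be checked with a few lines of Python. This is only a sketch: the function name raw_rate_bytes_per_s is made up for the example, and it assumes that samples are packed at exactly their nominal bit width, with no overflow bit or padding.

```python
def raw_rate_bytes_per_s(sample_rate_hz, bits_per_sample, channels=1):
    """Raw ADC output rate, assuming samples are packed with no padding."""
    return sample_rate_hz * bits_per_sample * channels / 8


per_channel = raw_rate_bytes_per_s(64e6, 14)             # one 14-bit channel at 64 MHz
per_board = raw_rate_bytes_per_s(64e6, 14, channels=8)   # one 8-channel board

print(per_channel / 1e6, "MB/s per channel")
print(per_board / 1e6, "MB/s per board")
```

With 16-bit storage words instead of packed 14-bit samples, the same function gives 128 MB/s per channel.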
27
Why is trigger necessary?
Unrestricted data rate: 896 megabytes / second
(each board). Such data rates can be neither
managed nor recorded.
Pre-selected data rate becomes manageable and
can be recorded.
[Block diagram labels: signals from detectors, DPP boards (five shown), event-builder, recording, subset of signals from detectors, trigger system.]
Trigger subsystem pre-selects only good data
to be recorded.
28
Limitations of the backplane architecture and
the solution: point-to-point fast serial links
29
Limitations of the VME backplane readout

Assume 200 μs of waveform memory per channel.
14-bit ADC means 15 bits (because of the ADC
overflow bit).
For simplicity let's say 1 sample = 16 bits.
Full waveform is 12,800 samples @ 64 MSPS, or
20,000 samples @ 100 MSPS (one ADC channel).
Trigger data means (pulse area, pulse width, time
stamp): four 16-bit words = 8 bytes.
If using the VME interface, then two types of data
transfer cycle are available:
MBLT transfer, 4 bytes (32 bits) per transfer →
40 MB/s = 40 bytes / μs.
2eVME transfer, 8 bytes (64 bits) per transfer →
80 MB/s = 80 bytes / μs.

Full event data (full waveforms): 20,000 16-bit
words per channel.
Trigger data: four 16-bit words per channel.
Trigger data → event builder: 144 channels × 8
bytes = 1152 bytes → 29 μs (MBLT).
Waveforms → event builder: 144 channels × 40
kbytes = 5.760 Mbytes → 144 ms (MBLT).
It means that the VME system can read only 7 full
events per second using the 32-bit MBLT protocol,
or 14 events per second using the 2eVME protocol.
The limitations of the PCI readout will be roughly
similar.
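
The estimate above can be reproduced with a short Python sketch. The constants come from this slide (144 channels, 20,000 16-bit words per waveform, 8 bytes of trigger data per channel, 40 and 80 MB/s bus rates); the function names are made up for the example, and the sketch assumes the bus is 100% busy with no protocol overhead.

```python
CHANNELS = 144
BYTES_PER_SAMPLE = 2            # one sample stored as a 16-bit word
SAMPLES_PER_WAVEFORM = 20_000   # full waveform of one channel at 100 MSPS
TRIGGER_BYTES = 8               # four 16-bit words per channel


def transfer_time_s(total_bytes, bus_bytes_per_s):
    """Time to move total_bytes over a bus with the given sustained rate."""
    return total_bytes / bus_bytes_per_s


def readout_estimate(bus_bytes_per_s):
    """Transfer times for trigger data and full waveforms of all channels."""
    trigger_bytes = CHANNELS * TRIGGER_BYTES
    waveform_bytes = CHANNELS * SAMPLES_PER_WAVEFORM * BYTES_PER_SAMPLE
    waveform_time = transfer_time_s(waveform_bytes, bus_bytes_per_s)
    return {
        "trigger_time_s": transfer_time_s(trigger_bytes, bus_bytes_per_s),
        "waveform_time_s": waveform_time,
        "full_events_per_s": 1.0 / waveform_time,
    }


mblt = readout_estimate(40e6)    # MBLT at 40 MB/s
vme2e = readout_estimate(80e6)   # 2eVME at 80 MB/s
print(mblt)
print(vme2e)
```

The gap between the trigger-data time (tens of microseconds) and the waveform time (over a hundred milliseconds) is the reason why reading trigger data first and waveforms only on demand is attractive.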
30
Implications of the backplane performance estimate
The VME system can read no more than 7 full
events per second, using the 32-bit MBLT transfer
@ 40 MB/s. (No more than 14 full events/s using
the 2eVME protocol @ 80 MB/s.) Moreover, these
rates will be achieved at 100% dead time, which is
not a good situation.
What can we do?
1. Use the trigger to pre-scale the event rate to
only 7 interesting events per second.
2. Do not read full waveforms. Read only the
pulses, and skip the quiescent baseline.
3. Compress the waveforms to reduce the transfer
rate.
4. Use a faster transfer rate.
Ad 1. Now we see why the trigger is of such
importance in this project.
Ad 2. Baseline suppression seems unavoidable in
this situation.
Ad 3. Real-time compression requires appropriate
FPGA resources.
Ad 4. A point-to-point serial data link offers a
MUCH HIGHER data transfer rate than VME.
31
Backplane switching currents may induce noise
An additional consideration is digital noise,
which may be injected into the sensitive analog
inputs by the large switching currents inherent in
a single-ended backplane such as VME or PCI.
Both are old-style single-ended bus interfaces
involving large transient currents. In such a
high-current environment it may not be possible
to attain a millivolt noise level. It means that
during the measurement the VME interface has to
be kept inactive. The measurement cannot be
restarted until the VME readout is over. The VME
readout period thus defines the dead time of the
DAQ system.
Vendors of commercial VME or PCI
digitizers claim that the above does not happen.
However, in an experiment which is pushing the
detection limits towards low-amplitude signals,
it is prudent to verify whether or not digital
switching currents are indeed harmless.
A radical solution is to use low-noise
differential signalling, such as Low-Voltage
Differential Signalling (LVDS), carried over
USB-2 or HDMI cables. Only 3.5 mA per
link is being switched in a differential mode,
which avoids inducing noise in nearby circuits.
32
Fast serial link → fast trigger decision, but
similar event readout

Do not use a backplane such as VME or PCI. Use
differential point-to-point data links.
HDMI: four differential pairs carry up to four
DDR data streams from each DPP board to the event
builder. A very high data rate can be pushed
through a short HDMI cable.
Flat-pack FPGA packages limit the signaling rate
to 200 Mbps. (Implies limitation: 200 Mbits/s × 4
links = 100 Mbytes/s per cable.)
BGA packages allow a signaling rate of 622
Mbps. (Implies limitation: 622 Mbits/s × 4 links
= 311 Mbytes/s per cable.)
100 MB/s from each board means a huge improvement
over a backplane readout, because all links
operate in parallel.
The most dramatic improvement is when
transferring short trigger packets from the DPP
boards to the Event Builder, because the trigger
information can fit into the on-chip FPGA memory
at the receiving end. The trigger readout rate is
dramatically improved.
There is not enough memory in the receiving Event
Builder FPGA to accept the entire waveforms from
all DPP boards. They need to be transferred out
at the USB-2 rate, which is similar to the
backplane rate. Therefore, the full event readout
rate is not improved.
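
The per-cable figures quoted on this slide follow from simple arithmetic, sketched below in Python. The function name cable_rate_bytes_per_s is made up for the example, and the sketch assumes all four pairs carry payload data with no encoding or framing overhead. The aggregate figure assumes eight links running in parallel, which is the number of System Connectors on the Event Builder described later in this talk.

```python
def cable_rate_bytes_per_s(bits_per_s_per_pair, pairs=4):
    """Payload rate of one cable, assuming every pair carries data with no overhead."""
    return bits_per_s_per_pair * pairs / 8


flat_pack = cable_rate_bytes_per_s(200e6)   # flat-pack FPGA, 200 Mbps per pair
bga = cable_rate_bytes_per_s(622e6)         # BGA FPGA, 622 Mbps per pair

# Point-to-point links operate in parallel, so the aggregate scales with the link count.
links = 8
aggregate_flat_pack = links * flat_pack

print(flat_pack / 1e6, "MB/s per cable (flat-pack)")
print(bga / 1e6, "MB/s per cable (BGA)")
print(aggregate_flat_pack / 1e6, "MB/s over", links, "parallel links")
```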

33
Bottlenecks of the digital pulse processing system
34
Three bottlenecks of the digital recording system

A typical digital system has three bottlenecks,
which have to be tackled in system design.
1. Raw Data Bottleneck.
The rate at which a digital system can produce
raw data is staggering. For example, a modest
system of 100 channels, 14 bits @ 100 MSPS,
produces 17.5 gigabytes per second. No matter
how large the on-board waveform memories are, they
will quickly overflow with raw data. A digital
system must either a) provide a method to offload
the waveform memories at the same rate at which
they are being filled, or b) limit the
rate of data production to a more manageable
level.
2. Data Recording Bottleneck.
Data must be recorded. A typical recording medium
(a disk or a tape) can take roughly 100 MB/s. The
disparity between 1 and 2 is roughly 10^3.
3. Analysis Bottleneck.
Data must be analyzed. If we record at 100 MB/s,
then we are recording 1 GB every ten seconds.
Analysis can take 10x more CPU effort than
recording (optimistically). Therefore, improving
the recording rate (bottleneck 2) will
exacerbate the analysis glut.

35
How to remedy the three bottlenecks?
Some digital DAQ systems take only occasional
records of data, followed by long periods of
inactivity (e.g., a digital oscilloscope which is
recording occasional pulses). In such cases, the
average data rate is low, and there is no
problem. However, in some cases the DAQ has to
work full time, all the time. In such cases we
have to tackle the aforementioned bottlenecks.
Raw Data Bottleneck can be addressed
with real-time data reduction (i.e., digital
pulse processing) which extracts only the
relevant pulse characteristics, such as
amplitude, duration, and pulse shape parameters.
Rather than storing the full waveform, we can
store only a few numbers. The reduction factor
can be 10x or more. Even larger reduction can be
achieved when pulses are infrequent, separated
by long stretches of uninteresting baseline. The
baseline does not have to be recorded at all.
Data Recording Bottleneck cannot be
improved, because any improvement in this area
will impact the next bottleneck. (Whatever is
recorded has to be analysed.)
Analysis Bottleneck can be eased if the pulse
data have already been preprocessed.
Real-time pulse processing is the key to
designing a large digital DAQ system.
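
Baseline suppression, the simplest form of the data reduction discussed above, can be sketched in Python as follows. This is an illustration under stated assumptions, not the firmware algorithm: the function name suppress_baseline, the fixed known baseline, the simple amplitude threshold, and the pre/post margins are all choices made for this example.

```python
def suppress_baseline(samples, baseline, threshold, pre=2, post=2):
    """Keep only samples near pulses; return a list of (start_index, segment) pairs.

    A sample is kept if it lies within 'pre' samples before or 'post' samples
    after any sample that deviates from the baseline by more than 'threshold'.
    """
    n = len(samples)
    keep = [False] * n
    for i, s in enumerate(samples):
        if abs(s - baseline) > threshold:
            for j in range(max(0, i - pre), min(n, i + post + 1)):
                keep[j] = True

    # Collect contiguous runs of kept samples, remembering where each run starts.
    segments = []
    i = 0
    while i < n:
        if keep[i]:
            start = i
            while i < n and keep[i]:
                i += 1
            segments.append((start, samples[start:i]))
        else:
            i += 1
    return segments


# Synthetic trace: 1000 samples of flat baseline with one short pulse at sample 500.
trace = [100] * 1000
trace[500:503] = [130, 160, 130]
segments = suppress_baseline(trace, baseline=100, threshold=10)
kept = sum(len(seg) for _, seg in segments)
print(segments)
print("reduction factor:", len(trace) / kept)
```

Storing the start index with each segment keeps the timing information, so the pulse can still be placed correctly on the time axis after the baseline has been dropped.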
36
What is needed to implement real-time pulse
processing?
Real-time pulse processing is the key to
achieving the data reduction necessary in a large
digital DAQ system. There are three key
ingredients of real-time digital pulse
processing.
1. Adequate raw processing power needs to
be present. One has to employ FPGAs on the
digitizer boards because only FPGAs provide
parallel operation on parallel sample streams.
Both FPGAs and digital signal processors (DSPs)
may be used on the downstream boards, depending
on the requirements of a particular DAQ system.
2. Pulse processing algorithms have to be
implemented in the FPGA fabric. The FPGA
implementation of even simple algorithms (e.g.,
pulse detection) can be tricky. It requires
expertise, which is not always available in the
physics community.
3. Confidence that the digital pulse processing
is working and is safe.
Physicists are very reluctant when it comes to
discarding data. Data reduction does lead to
discarding data, and therefore it is viewed
with suspicion. From the safety point of view,
the preferred solution is to record every sample,
and then to analyze the recorded data from disk.
Such a preference leads to the bottlenecks
mentioned on the previous slide.
37
Can we trust real-time pulse processing?

Physicists are very reluctant to discard
data. Digital pulse processing leads to
discarding data. It is therefore a valid
question whether real-time digital pulse
processing is a good idea.
Pulse processing needs to be adequately
developed, understood, and tested in order to be
reliable.
The critical portion of waveforms can be recorded
in order to perform offline crosschecks with the
DPP results.
In the good old days, analog pulse processing
was performed all the time. We did not worry
that by doing so we were discarding good data,
because there were no good data to be recorded
other than the output from our shaping
amplifiers. Nowadays we need to realize that
shaping amplifiers could as well be named
real-time analog filters. Such analog filters are
neither better nor worse than digital
filters. If we used to trust the former, we
should also accept the latter. Whether the filter
is digital or analog, it will be equally
trustworthy when it is properly designed.
Digital pulse processing is necessary. It needs
to be accepted after it is understood.

38
How is trigger implemented?
39
What is trigger doing?
Interesting events Background events Noise
From detector
Strobe signal to DAQ
Trigger system
Trigger subsystem pre-selects all interesting
events, a representative sample of background
events, and also some noise. A strobe signal (a
pulse) is sent to DAQ to trigger its
operation, such that the accepted data can be
digitized and recorded.
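
The pre-selection described above (all interesting events plus a representative sample of the background) is often done with a prescaler. The Python sketch below illustrates the idea only: the class name PrescaledTrigger, the prescale factor, and the assumption that some earlier stage has already classified each event as interesting or not are all hypothetical.

```python
class PrescaledTrigger:
    """Accept every interesting event and every Nth background event."""

    def __init__(self, background_prescale):
        self.background_prescale = background_prescale
        self._background_seen = 0

    def decide(self, is_interesting):
        """Return True if a strobe should be sent to the DAQ for this event."""
        if is_interesting:
            return True
        self._background_seen += 1
        # Keep one background event out of every 'background_prescale'.
        return self._background_seen % self.background_prescale == 0


trigger = PrescaledTrigger(background_prescale=4)
events = [False, True, False, False, False, True, False, False, False, False]
strobes = [trigger.decide(e) for e in events]
print(strobes)
```

In this run both interesting events produce a strobe, and so does every fourth background event.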
40
How is trigger implemented?

Trigger is such an important topic, that we need
to look at the implementation details of the
trigger.
Three different ways of implementing the trigger:
Analog computer built from off-the-shelf NIM 1)
modules.
Small DPP system processing a subset of the
detector signals.
The main DPP system also performs the triggering
function (i.e., the DAQ system is
self-triggered).
1) Nuclear Instrumentation Module (NIM) standard.
NIM modules are shown later.

41
How is trigger connected to the DAQ?
The role of the trigger is to pre-select
interesting events.
[Block diagram labels: DAQ, signals from detectors, DPP boards (five shown), event-builder, recording, subset of signals from detectors, trigger system.]
Strobe signal (a pulse) to the DAQ.
42
Trigger 1: analog computer built from NIM modules
Analog signals processed in analog domain
Five crates full of NIM modules
Very many LEMO cables
43
Pros and cons of NIM implementation

Advantages
Knowledge about programmable logic is not
needed.
Very little training required to work on a NIM
system. (Every physicist knows how to use an
oscilloscope and how to adjust trimming
potentiometers.)
Practically unlimited number of analog inputs.
(However, system complexity grows very quickly
with the number of analog inputs, as shown in the
previous slide.)
Disadvantages
Tedious to document. The only documentation is
oscilloscope screen shots and a written record in
a logbook.
Very difficult to reproduce the settings.
Flaky operation because of many knobs, switches,
and connectors.
Tuning cannot be performed remotely.

44
The main advantages of NIM implementation
Why is the antiquated analog approach still being
used?

Very little learning is required to operate the
equipment.
Signals in NIM electronics can be examined
at any time with an oscilloscope. No planning is
required concerning signal diagnostics.
Very wide bandwidth and fast rise time of pulses
(approx. 1 ns).
Timing is adjusted in analog domain. According to
common wisdom, analog means infinitely fine time
adjustments between pulses. (But in practice the
fine time adjustment is very tricky.)
Timing can be measured and adjusted to a few tens
of picoseconds.

45
Trigger 2: a single Digital Pulse Processor
DDC-8 Pro (LUXcore trigger system)
Analog signals digitized and then processed in
digital domain
FPGA Spartan3-400 performs digital processing in
real time
8 analog signals from phototubes
NIM IN, 2 lines from optional NIM analog computer
NIM OUT, 2 lines
The NIM OUT pulse is sent to DAQ system
46
Pros and cons of a single DPP solution

Advantages
Tuning can be performed remotely.
Trigger can be documented in full by saving all
the settings and configuration files.
Settings can be reproduced exactly by reloading
configuration files.
Mechanical knobs, switches, and connectors are
eliminated → reliable operation.
Disadvantages
Only a few out of many phototubes can
participate in the trigger decision.
Knowledge about programmable logic is needed to
develop or modify any DPP system. Very few
physicists have working knowledge about
programmable logic.

47
Trigger 3: a few Digital Pulse Processors
This architecture will be used for the LUX trigger
A few front-end boards
A single Level-2 decision board
Decision pulse sent to DAQ system
Fast uni-directional data links
48
Pros and cons of the several-DPP solution

Advantages
(A few advantages similar to the previous case
of a single DPP board.)
Much larger number of inputs.
Disadvantages
(A few disadvantages similar to the previous
case of a single DPP board.)
This architecture is in fact the same as the
DAQ architecture. Plainly speaking, this is a
small and less powerful DAQ system. Why invest
time and effort in a trigger-only solution? Why not
design and build a COMPLETE self-triggered DAQ
instead? Why bother having two PARTIAL systems
rather than one COMPLETE system?

49
Trigger 4: self-triggered COMPLETE DAQ system
Front-end DPP boards
Level-2 event builders
Level-3 event builder
Recording
Fast bidirectional data links. (Bidirectional in
order to send the trigger decision back to the
front-end boards.)
50
Self-triggered DAQ system: how it works.
Front-end DPP boards
Level-2 event builders
Level-3 event builder
Recording
Full event data (full waveforms)
Trigger data
1. All front-end boards send trigger data
downstream to the central event builder.
2. The event builder makes the trigger decision
(accept, reject) and sends it back.
3. If the decision is accept, then the front-end
boards send the waveforms.
The communication is carried over very
fast, point-to-point, bidirectional data links.
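
The three-step exchange can be modelled in a few lines of Python. This is a toy model, not the actual firmware or link protocol: the class names FrontEndBoard and EventBuilder, the use of pulse area as the only trigger datum, and the summed-area cut are all assumptions made to illustrate the message flow.

```python
class FrontEndBoard:
    """One front-end DPP board holding a captured waveform and its summary."""

    def __init__(self, board_id, waveform, pulse_area):
        self.board_id = board_id
        self.waveform = waveform
        self.pulse_area = pulse_area

    def trigger_data(self):
        # Step 1: a short summary is sent downstream instead of the waveform.
        return {"board": self.board_id, "pulse_area": self.pulse_area}

    def read_waveform(self):
        # Step 3: the full waveform is sent only after an accept decision.
        return self.waveform


class EventBuilder:
    """Central decision point; the summed-area cut is a hypothetical criterion."""

    def __init__(self, min_total_area):
        self.min_total_area = min_total_area

    def decide(self, summaries):
        # Step 2: accept or reject using the trigger data from all boards.
        return sum(s["pulse_area"] for s in summaries) >= self.min_total_area


def process_event(boards, builder):
    summaries = [b.trigger_data() for b in boards]
    if not builder.decide(summaries):
        return None                      # reject: no waveforms are transferred
    return {b.board_id: b.read_waveform() for b in boards}


boards = [FrontEndBoard(0, [1, 5, 1], 7), FrontEndBoard(1, [0, 2, 0], 2)]
accepted = process_event(boards, EventBuilder(min_total_area=8))
rejected = process_event(boards, EventBuilder(min_total_area=20))
print(accepted, rejected)
```

The point of the design is visible in process_event: the expensive waveform transfer happens only on the accept branch, while the cheap trigger data always flows.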
51
Pros and cons of the self-triggered DAQ system

Advantages
Merging the DAQ with the trigger will avoid
duplication of effort.
All phototubes can participate in trigger
decision.
Disadvantages
New boards need to be developed (done in 2008!).
Commercial boards do not provide sufficient level
of system integration to build such a system.
This architecture is quite advanced. If DPP
looks like black magic, then the new advanced
trigger/DAQ system will look even more like black
magic.

52
Inter-board Data Link (System Connector)
53
The role of the System Connector
Recording
Fast bidirectional data links with gigabit/s
performance.
The point-to-point data links are dedicated. Each
front-end board can offload its data with
gigabit/s performance without waiting for other
boards.

Each DPP board has its own link to the back-end
Event Builder.
The link provides all the facilities for system
integration.
Therefore, the link will be named System
Connector.

54
Physical standard of the System Connector
The Fast Data Link cable and connectors have to
be properly designed for fast data transfer. A
home-made ribbon cable does not guarantee
sufficient signal integrity. The video-oriented
HDMI cable provides good signal integrity and it
is relatively cheap.
HDMI Type A or Type C connector/cable: 19 pins
total: 7 single-ended, 4 differential pairs, 4
ground pins. 4 differential pairs (3 data +
clock), nominally 340 MHz per pair (category 2
HDMI cable).
One-meter cable: $13, PCB socket: $5.61.
USB for comparison
55
System Connector wire diagram (nominal)
The HDMI wiring table will be used as guidance.
In our application the wires will be used
differently because we are transmitting arbitrary
digital data rather than video data. Most
important: we are using the connector/cable
in compliance with its electrical standard.
Therefore, signal integrity is guaranteed.
7 wires for slow control signals.
Between DPP board and Event Builder: 4 differential
pairs (numbered 1 to 4), 340 MHz per pair (nominal).
56
Fast data links and RocketIO

The newest high-performance Virtex5 is equipped
with several RocketIO transceivers, which provide
multi-gigabit signaling rates per
differential pair. It is therefore a valid
question whether or not RocketIO should be used
to implement the Fast Data Link.
I believe it is not a good idea to use RocketIO
in this case. RocketIO requires special stripline
design techniques, as well as connectors of
higher performance than HDMI. Rather than using
RocketIO, gigabit rates can be achieved by
several (up to four in case of HDMI) differential
pairs with lower signaling rate, as well as
inexpensive HDTV-grade connectors.
Low-cost Spartan-3 series FPGAs do not support
RocketIO.
I may consider RocketIO in the future versions of
my boards.

57
System clocking and time stamping
58
ADC clocking

In order to correlate the waveforms, all ADCs
must be clocked off a single clock source. The
clock signal is distributed to all DPP boards and
connected to ADCs and to the FPGAs.
The clock must be high quality: stable frequency
and low jitter, as required by the ADC specs.
The simplest solution: the ADC clock is
distributed at its actual frequency (e.g., 64 MHz
or 100 MHz). A more complex solution is to
multiply the clock. In case it is multiplied
on-board, then the multiplication circuit must be
of high quality (e.g., a phase-locked loop). The
Digital Clock Managers provided by the FPGAs may
not meet the ADC specs.
The ADC clock is best distributed using its own
shielded cable in order to avoid noise pickup
from adjacent signals. Backplane clocks are to be
avoided.
Hopefully, clock quality will be good enough when
it is distributed with LEMO cables and NIM
fan-in/fan-out. This needs to be tested.
In case the single-ended clock quality is not
sufficient, differential circuits must be used.
Mini-USB is a possible candidate differential
cable/connector (used in a non-standard way).

59
Time stamping

In addition to the ADC clock, a separate
time-stamp clock may be used in the system. This
clock provides a universal time base for the
entire system, especially in those cases, where
some DPP boards operate at a different ADC
frequency from other boards. (E.g., commercial
DPP boards from various vendors are used in
addition to our DPP boards.)
The time-stamping clock need not be of very high
quality. Backplane clocks or wire bundles are OK.
The System Connector is also OK to distribute
this clock.
The time stamping clock can be distributed at a
lower frequency and multiplied on-board, using
the Digital Clock Manager (DCM) provided by the
FPGAs. The DCM timing jitter will not impact
time-stamping. (The DCM should not be used for
ADC clocking, or used only with caution and after
verifying that it meets the ADC specs).
A logic signal "reset time stamp" is used to
synchronously start the time-stamping counters in
all DPP boards. This signal may be distributed over
the System Connector, or it can use its own
cable. Effectively this signal assumes the role
of "start a new run".
→ Two signals are sufficient to implement time
stamping: the clock and the reset.

60
Present status of Rochester electronics for Dark
Matter Search
61
The first DPP board: DDC-8 Pro (limited to 8
phototubes). Developed in 2006 and well tested.
ADC, 8 channels, 12 or 14 bits @ 64 MHz → 768 or
896 megabytes/s
JTAG
FPGA Spartan3-400: 400,000 gates
Analog signal IN: 8 channels, +/- 1 Volt
FX2 USB-2, control and data readout
NIM IN, 2 lines
NIM OUT, 2 lines
Spy channel out: 64 MHz, 12 bits
USB connector
RS-232
Diag LED display
62

DDC-8 LUX prototype manufactured in 2007 and used
now

FX2 USB-2
FPGA Spartan3E 500,000 gates
Power
ADC up to 80 MHz
RS-232
JTAG
USB connector
Active Filters
HDMI connector
Passive Filters
Power
CLK
8 channel inputs
2 NIM in, used for time-stamping
Pulse out (spy channel)
2 NIM in, 2 NIM out
63

DDC-8 DSP prototype manufactured in July 2008

FX2 USB-2
FPGA Spartan3A DSP 3.4M gates
Power
RS-232
ADC up to 125 MHz
JTAG
USB connector
Active Filters
HDMI connector
Passive Filters
CLK
8 channel inputs
2 NIM in, used for time-stamping
Pulse out (spy channel)
2 NIM in, 2 NIM out
64
The Trigger Event Builder
The Event Builder is a supervisor board
developed for the LUX trigger. Revision 0 was
manufactured in December 2007.
NB: the photo shows an unpopulated PCB. The
actual board was manufactured and tested, but I
did not take a photo.
Clock IN: 1, NIM IN: 5, NIM OUT: 4
Spartan3E: 500,000 gates
Eight System Connectors connected to DDCs. Up to
100 MB/s per link
FX2 USB-2
USB 2.0 up to 40 MB/s
65
Present status of the Rochester Digital Trigger
(Jul/2008)

The digitizer is finalized: 8 channels, 14 bits @
up to 125 MSPS, FPGA with 3.4M gates, USB-2, and
a fast System Connector transferring data to the
downstream Event Builder.
Minor corrections of the DDC-8 DSP Rev. 0 are
possible. Major changes are not planned.
Noise performance of the previous editions of
DDC-8 was excellent. I expect similar great
performance from the new DDC-8 DSP.
Testing and firmware porting will start in
July 2008 and continue through Fall 2008.
Revision 0 of the Event Builder (EB) is
finished and tested.
This revision of the EB still uses a small FPGA
with 0.5M gates.
LUX trigger firmware most likely will fit into
this FPGA.
Firmware development and system integration will
continue through Fall/2008.
A new EB version with a large FPGA is planned.
The urgency of this board depends on the
performance and limitations of the Rev. 0 Event
Builder.

66
Software and firmware for the self-triggered
DAQ system
67
Approaching system integration

Previous slides showed the components of the
self-triggered DAQ system: the boards and the
history of their development. As of July/2008 the
components are mostly finished.
The digitizer DDC-8 DSP is finalized. Testing
will be done in Summer and Fall 2008. After the
tests I plan to make as many digitizers as
needed, subject to available funding.
The Event Builder Rev. 0 would benefit from minor
layout improvements. The present prototype EB
board will be used in Fall 2008 to develop the
LUX trigger system. We will need a few EB spares,
and therefore I plan to implement minor
corrections to the EB Rev. 0 before making a few
copies of it.
On a longer time scale, a more powerful EB will
be designed (with a larger FPGA). The details of
the new EB will be decided after the first
version of LUX trigger is working.
The system integration consists of cabling (a
minor issue) and of software development.
Single-board software is well developed.
Multi-board software is in planning stages.
The digitizer FW/SW is well advanced (waveform
capture, readout, etc.). It is being used.
The EB firmware/software development has recently
started.

68
Software
Data readout, control panels, the GUI, spectrum
displays, etc., are all extremely well advanced
after many years of digitizer development (since
2002). As an example, I am showing a screenshot
from circa 2004. This software is a very robust
basis for further development. We can show many
more such sleek pictures.
69
Digitizer development system
This photo shows the digitizer development system
in Spring 2004. Since then significant progress
was made in hardware, software, and firmware.
70
Digitizer performance
71
Noise performance
There are many aspects of this project which I am
not going to even mention in this last section,
such as the design of digital filters and their
sensitivity to pulses of various shapes, trigger
latencies, or readout speed. All this is ongoing
and well advanced, but also very detailed and to
a large extent specific to the LUX experiment. In
this last section I want to show the noise
performance of my designs, which turns out to be
excellent. It should be obvious to an informed
reader that low noise is the key to everything
else. Noise affects the trigger sensitivity (its
ability to detect small pulses) as well as the
energy resolution in spectroscopic
applications. After seven years of development I
have built a device whose performance is
adequate for serious scientific work. The work is
continuing towards applications of the digitizer
in one of the most demanding experiments. Many
applications are possible in addition to LUX. The
DDC is a general-purpose digitizer equipped with
a powerful on-board pulse processing engine,
which can execute a variety of
application-specific algorithms.
72
Low noise → low threshold
I designed the analog part of my digitizers in
2003 and I have been using it with very few
modifications since then. Its noise performance
is excellent. As an example, I am showing γ-ray
histograms collected with a 1-inch by 1-inch NaI(Tl)
detector with the original 8-channel DDC-8. A
very low threshold of 5 keV will be very important
for LUX.
73
Low noise of the waveforms → high sensitivity
The waveform was collected with a low-noise
signal source connected to the 12-bit version of
the digitizer DDC-8 Pro. The noise in this
waveform is dominated by the flash A/D chip, and
therefore it can hardly be any better. Very low
sample-to-sample noise seen in these data will be
very important for pulse detection in LUX.
RMS noise: 0.214 mV
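For readers who want to reproduce such a figure from their own data, the RMS noise of a baseline waveform can be computed as the standard deviation of the samples about their mean. The sketch below is a minimal illustration, not the analysis used for this slide. The sample values are synthetic, and the 1 V full-scale input range used to express the noise in ADC counts is a hypothetical value, since the slides do not state the input range of the digitizer.

```python
import math


def rms_noise(samples):
    """RMS deviation of the samples about their mean (the baseline)."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))


def mv_to_lsb(noise_mv, full_scale_mv, bits):
    """Express a noise value in units of one ADC count (LSB)."""
    lsb_mv = full_scale_mv / 2 ** bits
    return noise_mv / lsb_mv


# Synthetic baseline in mV, for illustration only (not measured data).
baseline_mv = [0.2, -0.2, 0.2, -0.2]
print(f"RMS noise: {rms_noise(baseline_mv):.3f} mV")

# HYPOTHETICAL full-scale range of 1000 mV for a 12-bit converter.
print(f"0.214 mV = {mv_to_lsb(0.214, 1000.0, 12):.2f} LSB")
```

Expressing the noise in LSB makes it easy to compare against the resolution of the converter itself, which is the comparison the slide makes when it attributes the noise to the flash A/D chip.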
74
Summary

Digital Pulse Processing was explained.
Several trigger architectures were presented.
A novel fully Digital Trigger is composed of a
few digitizer boards and one Event Builder. This
architecture will be used for LUX trigger.
The same approach can be used to build a complete
self-triggered DAQ, which can be expanded to
hundreds of channels by cascading Event Builders.
Such a DAQ does not need a separate trigger. It
can trigger itself by quickly transferring short
trigger packets to one central location, which
then issues a system-wide trigger.
DDC-8 LUX board was manufactured in December
2007. Parameters: 12/14 bits @ up to 80 MSPS,
FPGA with 0.5 million gates, external clocking
and time stamping, several NIM in and out
connectors, and a System Connector. It is being
used now to develop the Digital Trigger.
DDC-8 DSP board was manufactured in July 2008.
Similar to the LUX board, but with 3.4 million
gates, and sampling up to 125 MSPS. It will
replace the LUX board in Fall/2008.
The Event Builder Revision 0 was manufactured in
December 2007. It will be used to integrate the
LUX Digital Trigger. A more powerful Event
Builder is planned.
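The self-triggering idea summarized above (digitizers send short trigger packets to one central location, which issues a system-wide trigger) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the LUX firmware: the packet layout (`board_id`, `timestamp`), the coincidence window, and the "at least N distinct boards" rule are all hypothetical choices made for illustration. The actual trigger decision is implemented in the Event Builder FPGA and is not specified in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerPacket:
    """HYPOTHETICAL short packet sent by a digitizer to the Event Builder."""
    board_id: int
    timestamp: int  # in clock ticks of the common clock


def system_triggers(packets, window_ticks, min_boards):
    """Return the timestamps at which a system-wide trigger is issued.

    ASSUMED rule: a trigger fires when packets from at least `min_boards`
    distinct boards arrive within `window_ticks` of the first packet.
    """
    ordered = sorted(packets, key=lambda p: p.timestamp)
    triggers = []
    i = 0
    while i < len(ordered):
        start = ordered[i].timestamp
        boards = set()
        j = i
        # Collect all packets that fall inside the coincidence window.
        while j < len(ordered) and ordered[j].timestamp - start <= window_ticks:
            boards.add(ordered[j].board_id)
            j += 1
        if len(boards) >= min_boards:
            triggers.append(start)
            i = j       # packets in this window are consumed by the trigger
        else:
            i += 1      # no coincidence; try a window starting at the next packet
    return triggers


packets = [
    TriggerPacket(0, 100), TriggerPacket(1, 102),   # two boards in coincidence
    TriggerPacket(2, 500),                          # isolated packet, no trigger
    TriggerPacket(0, 900), TriggerPacket(3, 903), TriggerPacket(1, 904),
]
print(system_triggers(packets, window_ticks=5, min_boards=2))
```

Because only short packets travel to the central location, the decision logic stays small, while the waveforms themselves remain in the digitizers until a system-wide trigger requests them.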