Title: Muon Global Run Quality
1Muon Global Run Quality
- OUTLINE
- Results from May 1st to May 31st and prospects
for June 2004.
- Muon Event Quality
- Muon Lumy Block Quality
2With
- Victor Abazov, Fritz Bartlett, Victor Bezzubov,
Dmitri Denisov, Gaston Gutierrez, Al Ito, Rob
McCroskey, Martijn Mulders, Geoff Savage, Dennis
Shpakov, Stefan Soldner, Linda Stutte, Jeff
Temple, Valerie Tokmenin, Markus Wobisch, Yuri
Yatsunenko
Experts on the muon detector hardware, muon L1
trigger, muon ID, and controls and monitoring.
3Muon Run Quality Updated
2002 to 2004
- May 2004 has 35.3 pb-1
- 2.5 pb-1 Bad
- Toroid Magnet (1.8 pb-1)
- Miscellaneous (0.7 pb-1)
- 2002 18.2% bad runs
- 2003 2.6% bad runs
- 2004 2.0% bad runs (was 0.3% before May)
(all these are luminosity weighted).
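The luminosity weighting behind these numbers can be sketched as follows. This is only an illustration, not the actual Run Quality Database query: the function name and the (luminosity, grade) layout are assumptions, and the May 2004 totals are lumped into three entries rather than listed run by run. Grades C and D are treated as reasonable and everything else as Bad.

```python
def lumi_weighted_bad_fraction(runs, good_grades=("C", "D")):
    """Return the fraction of integrated luminosity in runs graded Bad.

    runs: iterable of (luminosity_pb, grade) pairs (hypothetical layout).
    """
    total = sum(lumi for lumi, _ in runs)
    if total == 0:
        return 0.0
    bad = sum(lumi for lumi, grade in runs if grade not in good_grades)
    return bad / total


# May 2004 totals from the slide, lumped into three illustrative entries.
may_2004 = [
    (32.8, "C"),  # remaining luminosity, assumed reasonable
    (1.8, "F"),   # toroid magnet problem
    (0.7, "F"),   # miscellaneous
]
print(f"{100 * lumi_weighted_bad_fraction(may_2004):.1f}% bad")
```

Weighting by luminosity rather than by run count keeps a short bad run from counting as much as a long good one.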
4Toroid Operation Failure
- The Problem
- During Toroid Magnet Reversal the reversing
switch stops before contact is made with the bus
to the coils. Current is shorted.
- Current appears normal (1500 A), voltage required
to produce it is low (2 V instead of 90 V).
- Discovered when coil cooling water was colder
than expected.
- Impact is that the toroid magnet field was very
small.
- The Run Ranges
- May 18th to May 19th: runs 193107-193111, 1.7 pb-1.
- June 9th: runs 193834-193837, 0.8 pb-1.
5Toroid Operation Failure
- The Main Symptoms in the data (graded Bad)
- Local muon momenta will be overestimated but
not infinite.
- Multiple scattering in the calorimeter and toroid
causes deflection θ ∝ 1/p.
- RecoCert Plot is local muon pT and η in toroid
problem run and normal run.
6Toroid Operation Failure
- New alarms added on the Toroid Magnet voltage
should decrease amount of data recorded in this
condition.
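The kind of check such a voltage alarm performs can be sketched as below. This is a minimal illustration, not the actual alarm configuration: the function name, the 10% current tolerance, and the 50% voltage threshold are assumptions. Only the nominal values (1500 A, 90 V) and the failure reading (2 V) come from the slides.

```python
def toroid_looks_shorted(current_a, voltage_v,
                         nominal_current_a=1500.0,
                         nominal_voltage_v=90.0,
                         current_tolerance=0.1,
                         min_voltage_fraction=0.5):
    """Return True if the current looks normal but the voltage is far too low.

    The tolerances are hypothetical; a real alarm would use tuned limits.
    """
    current_ok = (abs(current_a - nominal_current_a)
                  <= current_tolerance * nominal_current_a)
    voltage_low = voltage_v < min_voltage_fraction * nominal_voltage_v
    return current_ok and voltage_low


print(toroid_looks_shorted(1500.0, 2.0))   # failure condition from the slide
print(toroid_looks_shorted(1500.0, 90.0))  # normal operation
```

Checking the voltage together with the current matters here because the current readback alone looked normal during the failure.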
7Prospects for June 2004
- More than 41 pb-1 delivered so far this month.
- Bad Things
- Toroid magnet problem 0.8 pb-1
- A coincidence of dead PDTs on June 16th to 17th
(due to dsp code test?) affected runs 194073 to
194135. Fixed on controlled access.
8Muon Event Quality
- Muon ID has content and access methods for muon
error words by crate. See
- www-d0.fnal.gov/phys_id/muon_id/d0_private/certif/p14/index.html
- Xing (not turn) !@(
- RVS (a weak hardware checksum)
- SRQ error (but we don't write such events)
9Muon Event Quality
- Gavin H. looked at data from run 173521 to 180965
(Feb to Oct 2003).
- 5.6% of runs had some problem.
10Muon Event Quality
- Gavin says 1.5% of events showed some problem
- That's more than I thought. We don't see that in
the control room, typically.
- Lots of these occurred in run 179141. Problem
with the central scintillator data occurred then.
We will be changing this run's quality score.
- On the general question of what to do
- I'm not sure we should trivially reject events
with the problem.
- There is a lot of redundancy in part of the muon
system.
- Cases we might reject the event include
- Central scintillator because it reads out A & C in
one crate
- Suggests we should include module-level
information
- We need to get some experience with this
information before we make a recommendation.
11Muon Lumy Block Quality
- Plan to compile information from the SES and mark
luminosity blocks in the database that have
run-pausing alarms.
- No progress to report.
12Summary
- The Muon Good Run lists can be found in the RUNS
QUALITY DATABASE.
- May 2004 has 2.5 pb-1 Bad muon runs out of 35
pb-1.
- Toroid magnet problems were the main cause and we
modified the alarms so that this problem will be
avoided.
- Use at least reasonable grade runs.
- We have event level data quality in tmbs and are
beginning to use it in muon_id.
- We need to get some experience with it before we
can say how errors impact the physics.
13- My talk ends here.
- Slides further in are in long-term storage.
14Procedure
- I finish within a few days of the end of a given
month (See D0Note 3938).
- Use runs query database to produce a list of runs
that were taken with trigger global.
- Verify whether or not all readout crates were
part of the run.
- Read muon, and sometimes captain, global monitor,
and DAQ logbooks for signs of problems or fixes.
Check examine plots.
- Check the L1 trigger cross sections, the number
of muon triggers, and the fraction of events
which came from muon triggers in each run.
- I don't really check L2 or L3. That should be
another pair of lists.
- Check with experts from all muon subsystems.
- Input from folk looking at Reco output.
- At times parts of this process are automated via
Python scripts.
- Enter into the Run Quality Database. C & D are
reasonable. Lower grades are Bad.
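The per-run L1 trigger checks in this procedure could be automated along the following lines. This is a sketch under stated assumptions, not the scripts actually used: the function names, the 20% tolerance, the reference cross section, and the example counts are all hypothetical, and prescales are ignored.

```python
def l1_cross_section_nb(n_l1_accepts, lumi_inv_nb):
    """Trigger cross section in nb: L1 accepts divided by luminosity in nb-1."""
    if lumi_inv_nb <= 0:
        raise ValueError("integrated luminosity must be positive")
    return n_l1_accepts / lumi_inv_nb


def muon_trigger_fraction(n_muon_trigger_events, n_events):
    """Fraction of recorded events that came from muon triggers."""
    return n_muon_trigger_events / n_events if n_events else 0.0


def cross_section_is_suspect(xsec_nb, reference_nb, rel_tolerance=0.2):
    """True if the cross section is too high or too low versus a reference."""
    return abs(xsec_nb - reference_nb) > rel_tolerance * reference_nb


# Hypothetical run: 5000 L1 muon accepts in 100 nb-1, reference of 52 nb.
xsec = l1_cross_section_nb(5000, 100.0)
print(xsec, cross_section_is_suspect(xsec, reference_nb=52.0))
print(muon_trigger_fraction(250, 1000))
```

A run flagged this way would still need the logbook and expert checks listed above before it is graded.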
15Run Grades
- C is grade for runs with no known missing
(front-end) modules, high voltage problems,
serious sync problems, etc.
- D if not a C
- so long as the data determination of effys using
Tight and Loose muons could provide a
measurement of the effy of all of the components
of muon local tracking criteria. That is
- S is for special runs, especially those that are
from global trigger lists.
- Otherwise F.
- Missing readout crates, trigger off, magnets
off, etc.
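The grading rules above can be read as a small decision function. The sketch below is one possible reading, not the procedure as actually implemented: the function name, its arguments, and the choice to test for special runs first are assumptions.

```python
def grade_run(is_special, known_problems, efficiency_measurable):
    """Assign a muon run grade (S, C, D or F) from the rules on the slide.

    is_special: True for special runs (checked first here, by assumption).
    known_problems: list of known problems (missing modules, HV, sync, ...).
    efficiency_measurable: True if Tight/Loose muon data could still
        measure the efficiency of the local tracking components.
    """
    if is_special:
        return "S"
    if not known_problems:
        return "C"
    if efficiency_measurable:
        return "D"
    return "F"  # e.g. missing readout crates, trigger off, magnets off


print(grade_run(False, [], True))
print(grade_run(False, ["HV trip"], True))
print(grade_run(False, ["toroid off"], False))
```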
16A & B Quality Runs
- A or B is not applicable, yet. Getting no closer!
- Event level raw data integrity flags in the
thumbnail. Done by muon algorithms people as of
Beaune.
- Required to get this into RecoCert
- The latency loophole in alarm system persists.
Solved (Michael B. and Geoff S.). Prior to Beaune
we documented and checked that we have
run-pausing alarms covering important muon
failures
- But at the moment info is in flat files located
on SAM (data-tier significant-event) and for
recent data on disc at /online/log/ses (-> SAM).
- SES run-pausing alarms need to go to Lumy
database
- More on this later in the talk.
17Bad run causes 11/22/03 to 04/30/04
- Numbers of occurrences.
- These may span many runs in the 144 pb-1 sample.
- PDTs: missing PDT crate x34, 18 nb-1 on May 8th.
- L1 trigger cross section too high or too low
once for <1 nb-1. 11/26/2003, a 293 event run. A
couple other instances amounting to 1 or 2 nb-1.
- L1Mu triggers out-of-sync for a big fraction of
run 192577 on May 3rd, 367 nb-1. A candidate for
quality by lumy block.
- MDTs: missing readout crate once for 23 nb-1,
01/01/2004.
- Missing MDT crate, 3 nb-1 on March 9th.
- Forward & Central scint.: missing readout crate
once for 428 nb-1. Failed front-end crate lvps
01/18/2004.
18Warnings Since Sept. 2003
- The Run Quality Database (muon part) handles
- Muon L1 Trigger (muon part)
- But NOT the Muon L2, Muon L3, or CTT.
- You have to measure those on your own.
- Runs 192009 to 192015 inclusive: central muon L2
was rejecting all muon triggers. If we scored on
L2 we'd apply F.
19SES Alarms Example
- There was a Solenoid magnet quench on Feb 17,
2004 in run 189395. L/R (0.5 H / 50x10-3 Ohms) =>
takes 10 seconds for the field to drop to 1/e.
Field dropping from 4800 A to 0 A.
- It took us 1 s to alarm on the change. In that
time the field changed by about 6%.
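The L/R arithmetic can be checked with a few lines. This is a minimal sketch assuming an ideal exponential decay with the inductance and resistance given on the slide. The function names are illustrative, and the field during a real quench need not follow this model, so the result should not be expected to reproduce the observed change exactly.

```python
import math


def time_constant_s(inductance_h, resistance_ohm):
    """L/R time constant of an ideal inductor discharging through a resistor."""
    return inductance_h / resistance_ohm


def fractional_drop(elapsed_s, tau_s):
    """Fraction by which an ideally decaying current has dropped after elapsed_s."""
    return 1.0 - math.exp(-elapsed_s / tau_s)


tau = time_constant_s(0.5, 50e-3)  # values from the slide
print(f"tau = {tau:.1f} s")
print(f"ideal drop after 1 s: {100 * fractional_drop(1.0, tau):.1f}%")
```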
20SES Alarms Example
- I looked in /online/log/ses/se_log.20040217-000000CST
and found the alarm in LBN 3099452 (one
more than the LBN in which it occurred). I
searched for field run_paused 189395.
grep run_paused se_log.20040217-000000CST | grep 189395
v4 f() 1077045274.85 run
ONL_RUN_189395/run_paused 0 d0olc.fnal.gov 0 none
none none none none run_paused 189395 3099452
'Comment' 'Run paused automatically.',
'Pause_Reason' 'DMAX_MAG_SOL/AMPS', 'autopause'
1, 'comics_runtype' 'data', 'configname'
'official/global_CMT-12.34', 'physics' 1,
'runtype' 'physics', 'sdaq_type' 'MONITOR'
- I could also have looked in /online/log/ses/lbn
files, a subset of all SES messages for
run-pausing alarms from April 2002 to end of Oct.
2003. 15 magnet monitoring failures of which 3
are magnet quenches. Also, 133 muon HV trips.
1139 total run-pausing alarms.
- This stuff to go to Lumy database by LBN and
system.
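The same search could be scripted to pull out the LBN of each run-pausing alarm for a run, as a first step toward marking luminosity blocks. The sketch below assumes each SES message sits on one line and that the fields "run_paused <run> <LBN>" appear in that order, as in the example message above. The function name is illustrative, and the sample line is an abbreviated copy of the message shown on the slide.

```python
import re

# Matches "run_paused <run number> <LBN>" (layout assumed from the example).
RUN_PAUSED = re.compile(r"run_paused\s+(\d+)\s+(\d+)")


def find_run_pauses(lines, run_number):
    """Return the LBNs of run_paused messages logged for the given run."""
    lbns = []
    for line in lines:
        match = RUN_PAUSED.search(line)
        if match and int(match.group(1)) == run_number:
            lbns.append(int(match.group(2)))
    return lbns


sample = ("v4 f() 1077045274.85 run ONL_RUN_189395/run_paused 0 "
          "d0olc.fnal.gov 0 none none none none none "
          "run_paused 189395 3099452 'Comment' 'Run paused automatically.'")
print(find_run_pauses([sample], 189395))
```

In practice the lines would come from iterating over an open log file rather than from a hard-coded string.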
21L1 Triggers, For Instance
Run 176185 D
Run 176211 F
Run 176538 F