Title: Analysis of Surveillance Data
1. Analysis of Surveillance Data
Arnold Bosman, 2006
- Source: Denis Coulombier, WHO
2. Steps in Surveillance Analysis
- Data quality
- Descriptive analysis
  - Time
  - Place
  - Persons
- Generate hypothesis
- Test hypothesis
3. Data Quality Issues
- Missing values
- Attraction to round figures
- Data entry errors
- Bias related to lack of representativeness
  - More severe cases over-represented
  - Urban > rural
  - Sources not represented (private sector, GPs)
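Two of the checks above (missing values and attraction to round figures) can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the record layout, the field names `age` and `onset`, and the helper names are all hypothetical. As a rough guide, if ages were spread evenly, about one value in five would be a multiple of 5, so a much larger share suggests digit preference.

```python
def missing_fraction(records, field):
    """Share of records where `field` is absent, None, or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def round_figure_share(values, base=5):
    """Share of non-missing values that are exact multiples of `base`."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return sum(1 for v in present if v % base == 0) / len(present)


# Hypothetical line list (illustrative values only).
records = [
    {"age": 30, "onset": "2006-01-03"},
    {"age": 35, "onset": ""},
    {"age": 41, "onset": "2006-01-05"},
    {"age": None, "onset": "2006-01-09"},
]
ages = [r["age"] for r in records]

print(missing_fraction(records, "onset"))  # 0.25
print(round_figure_share(ages, base=5))    # 2 of the 3 reported ages
```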
4. Notifications of All Notifiable Diseases by Date of Onset, USA, 1989
5. Measles and ARI by Month, Haiti, 1992-1993, 38 Sentinel Sites
[Figure: monthly cases (x 1000) of measles and ARI, January 1992 to December 1993]
6. Analysis of time characteristics
7. Descriptive Analysis of Time
- Graphical analysis
  - Requires aggregation on appropriate time unit
  - Choice of the time variable
    - Date of onset
    - Date of notification
  - To describe trend, seasonality, and residuals
- Use of rates when denominator changes over time
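The aggregation and rate steps above can be sketched as follows. This is only an illustration under stated assumptions: the ISO week is used as the time unit, the date of onset as the time variable, and the dates and population figure are invented. Swapping in the date of notification would only change the input list.

```python
from collections import Counter
from datetime import date


def weekly_counts(dates):
    """Count cases per (ISO year, ISO week)."""
    counts = Counter()
    for d in dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


def rate_per_100k(cases, population):
    """Cases per 100,000 population, for use when the denominator changes."""
    return cases / population * 100_000


# Hypothetical dates of onset.
onsets = [date(2006, 1, 2), date(2006, 1, 4), date(2006, 1, 8), date(2006, 1, 9)]

counts = weekly_counts(onsets)
print(counts)  # {(2006, 1): 3, (2006, 2): 1}

# Hypothetical population for the reporting area.
print(rate_per_100k(counts[(2006, 1)], 150_000))
```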
8. Descriptive Analysis of Time: Graphical analysis
9. Descriptive Analysis of Time: Components of Surveillance Data
10. Descriptive Analysis of Time: Smoothing Techniques
11. Notification of giardiasis in Delaware, 03/1991-03/1995
12. Effect of the Moving Average Window Size: Weekly Notifications of Salmonellosis, Georgia, 1993-1994
13. Cases of Gonorrhea in Michigan: Week 10 of 1994 and 208 Previous Weeks
14. Descriptive Analysis of Time: Size of the Moving Average Window
- Showing seasonality: smooth residuals
  - Empirical approach
  - Window increases with variance
  - 5 to 15 weeks
- Showing trend: smooth residuals and seasonality
  - 52 weeks
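A moving average of the kind discussed above can be sketched like this. A trailing window is assumed here for simplicity so that any window size (including 52) works; a centred window is a common alternative. The function name and the sample counts are illustrative only.

```python
def moving_average(series, window):
    """Trailing moving average.

    Position i holds the mean of the `window` values ending at i.
    The first window-1 positions are None because the window is incomplete.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = [None] * len(series)
    for i in range(window - 1, len(series)):
        smoothed[i] = sum(series[i - window + 1 : i + 1]) / window
    return smoothed


# Hypothetical weekly counts.
weekly = [1, 2, 3, 4, 5]
print(moving_average(weekly, 3))  # [None, None, 2.0, 3.0, 4.0]
```

A short window (for example 5 to 15 weeks) leaves the seasonal pattern visible, while a window covering a full year averages the seasons out and leaves the trend.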
15. Malaria by year, United States, 1930-1992
[Figure: two panels of malaria cases per 100,000 population by year, 1931 to 1991, one on a semi-log scale and one on an arithmetic scale. Annotations: relapse in overseas cases; relapses, Korean veterans; Vietnam veterans; immigration.]
16. Testing for Time Hypothesis
- Remove confounding (rates)
- Removing time dependency
  - Trend and seasons
  - By restriction or modelling
- Test for detection of outbreaks
  - More cases than expected?
- Test for changes in trend
  - Departure from historical trend?
17. Accounting for Time Dependency: Airline Passengers in the US, Monthly Data, 1949 - 1960
[Figure: monthly passenger counts plotted in time order, with one point highlighted in red. Question posed: Is the red dot consistent with the data?]
18. Tests not accounting for time dependency: Mean ± 1.96 Standard Deviations
[Figure: randomly ordered data with the mean and the 95% CI band. Answer to the question: Yes]
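The mean ± 1.96 standard deviations check on this slide can be sketched as below. The values are invented for illustration and the function names are hypothetical. Note that the check ignores the order of the observations, which is exactly its weakness for a series with trend and seasonality: the band becomes so wide that almost any value looks consistent.

```python
from statistics import mean, stdev


def sd_limits(history, z=1.96):
    """Return (lower, upper) limits as mean -/+ z sample standard deviations."""
    m = mean(history)
    s = stdev(history)
    return m - z * s, m + z * s


def is_consistent(value, history, z=1.96):
    """True if `value` lies inside the limits computed from `history`."""
    lower, upper = sd_limits(history, z)
    return lower <= value <= upper


# Hypothetical past observations; their order plays no role here.
history = [100, 120, 140, 160, 180]

print(sd_limits(history))
print(is_consistent(190, history))  # True
print(is_consistent(260, history))  # False
```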
19. Tests accounting for time dependency
[Figure: chronologically ordered data by month. Residuals, after removing trend and seasonality, plotted with the mean and the 95% CI band.]
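One simple way to obtain residuals of this kind is sketched below. The slide does not say which method produced its figure, so this is an assumption: the series is log-transformed, a straight-line trend is fitted by least squares, and the average effect of each calendar position is subtracted. The function name and the sample series are hypothetical, and the input needs positive values covering at least one full seasonal cycle.

```python
import math
from statistics import mean


def detrend_deseasonalise(series, period=12):
    """Residuals of log(series) after removing a linear trend and the
    mean effect of each position within the seasonal cycle."""
    y = [math.log(v) for v in series]
    t = list(range(len(y)))
    t_bar, y_bar = mean(t), mean(y)

    # Least-squares straight line through (t, y).
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar
    detrended = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

    # Average detrended value for each position in the cycle (e.g. each month).
    seasonal = [mean(detrended[k::period]) for k in range(period)]
    return [d - seasonal[i % period] for i, d in enumerate(detrended)]


# Hypothetical 36-month series with steady growth and one doubled month.
series = [100 * math.exp(0.01 * month) for month in range(36)]
series[20] *= 2

residuals = detrend_deseasonalise(series)
largest = max(range(len(residuals)), key=lambda i: residuals[i])
print(largest)  # index of the month with the largest residual
```

Once trend and seasonality are removed, a mean ± 1.96 standard deviations band on the residuals becomes a much sharper test than the same band on the raw series.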
20. Statistical Tests for Time Series
- For time series with no trend and no seasonality (random series)
  - Tests not accounting for time dependency (TD)
  - Chi square, Poisson
- For time series with seasonality and no trend
  - Tests accounting for TD by restriction
  - Similar historical period mean/median
- For all time series
  - Tests accounting for TD by modelling
  - Linear regression corrected for season
  - Fourier analysis and SARIMA models
21. Olympic Games Surveillance, Athens 2004: Septic Shocks, Syndromic Surveillance
- Poisson test
  - Count of cases / average of previous 7 days (λ)
[Figure legend: P-value between 1-4; P-value <1]
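The Poisson test named on this slide can be sketched as an upper-tail probability: how likely is a count at least as large as today's if the expected count λ equals the average of the previous 7 days? The daily counts below are invented, and the function name is hypothetical.

```python
import math


def poisson_upper_tail(observed, lam):
    """P(X >= observed) for X ~ Poisson(lam)."""
    if observed <= 0:
        return 1.0
    # P(X <= observed - 1), summed term by term.
    cdf = sum(
        math.exp(-lam) * lam**k / math.factorial(k) for k in range(observed)
    )
    return max(0.0, 1.0 - cdf)


# Hypothetical counts for the previous 7 days and for today.
previous_7_days = [2, 1, 3, 2, 2, 1, 3]
today = 6

lam = sum(previous_7_days) / len(previous_7_days)  # expected daily count
p_value = poisson_upper_tail(today, lam)
print(lam, round(p_value, 4))
```

A small p-value means the current count would rarely arise by chance at the recent average level, so the day is worth a closer look.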
22. MMWR Figure 1: Accidental variations?
- Test based on mean and standard deviation
- Can also be used with median and percentiles, which is better for reducing the effect of past epidemics
23. Thresholds Based on Median and Percentiles: Diarrhoea in Madaba district, Jordan, 2000-2001
- Accounting for TD by restriction
  - 5 weeks centred around current week, past 5 years (25 weeks)
  - 5th and 95th percentile threshold
[Figure: weekly counts showing the current week, the five 5-week historical periods, the 5th and 95th percentile thresholds, and a 52-week forecast.]
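The restriction approach on this slide can be sketched as follows: collect the 5 weeks centred on the current week from each of the past 5 years (25 values), then take the 5th and 95th percentiles as thresholds. The data layout, function names, and counts are hypothetical, the percentile uses simple linear interpolation, and weeks that would fall outside a year are skipped rather than wrapped.

```python
def historical_window(history, week, half_width=2):
    """Counts from weeks week-half_width..week+half_width of every past year.

    `history` maps year -> list of weekly counts (index 0 is week 1).
    """
    values = []
    for counts in history.values():
        for w in range(week - half_width, week + half_width + 1):
            if 1 <= w <= len(counts):
                values.append(counts[w - 1])
    return values


def percentile(values, p):
    """Percentile p (0 to 100) by linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no values")
    pos = (len(xs) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


# Hypothetical history: 5 past years with 10 weeks of counts each.
history = {year: list(range(1, 11)) for year in range(1996, 2001)}

window = historical_window(history, week=5)
low, high = percentile(window, 5), percentile(window, 95)
print(len(window), low, high)
```

A current-week count above the 95th percentile (or below the 5th) would then be flagged for review.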
24. Comparison expressed in SD between notifications of weeks 31/97 to 34/97 and previous 5 years, same period, France
[Figure: bar chart of Z-scores (from -5.5 to 5) by disease: botulism, brucellosis, typhoid-paratyphoid fever, legionellosis, meningococcemia, AIDS, foodborne outbreaks, tuberculosis, tetanus. Alert area: Z-score beyond 1.65. Bars are coded by the probability of observing such a departure from historical data: >10, between 5 and 10, <5.]
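The comparison on this slide can be sketched as a Z-score: the current count minus the mean of the same period in the previous 5 years, divided by their standard deviation. The counts are invented and the function name is hypothetical. The 1.65 cut-off is taken from the slide; treating only positive departures as alerts is an assumption made here.

```python
from statistics import mean, stdev

ALERT_Z = 1.65  # alert threshold shown on the slide


def z_score(current, historical):
    """Departure of `current` from the historical mean, in sample SDs."""
    s = stdev(historical)
    if s == 0:
        raise ValueError("historical counts have zero variance")
    return (current - mean(historical)) / s


# Hypothetical notifications for the same 4-week period in the previous 5 years.
previous_years = [10, 12, 8, 11, 9]

for current in (11, 14):
    z = z_score(current, previous_years)
    print(current, round(z, 2), "alert" if z > ALERT_Z else "no alert")
```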
25. Notification of Foodborne Outbreaks in France, 1995-1998
26. Interpreting the results
- Role of chance
- Role of bias
- True disease pattern
27. Conclusions
- Analysis to draw attention
- Validation by investigation