Basic Statistics in Public Health
Provided by: swlondonpu
1
Basic Statistics in Public Health
2
  • Types of data
  • Descriptive statistics
  • Confidence Intervals

3
TYPES OF DATA
Age?
Sex?
Social class?
4
TYPES OF DATA
BMI?
Obese or not?
Underweight / normal / overweight / obese?
5
Type of data?
Source: Compendium of Clinical and Health
Indicators / Health Surveys for England
6
SUMMARISING DATA
7
Summarising Numerical Data
  • Measures of central tendency / location:
    • Mean
    • Median
    • Mode
  • Measures of spread / variability:
    • Range
    • Interquartile range
    • Variance
    • Standard deviation

8
  • Mean? Median?
  • 3, 4, 5, 6, 7
  • 9, 10, 20, 21
  • 1, 2, 3, 4, 990

9
Mean or Median?
  • For symmetric data, mean = median
    • Choose the mean (easily understood, better
      statistical properties)
  • For skewed data, the mean is drawn towards the tail
    of the distribution
    • The median can be a better reflection of the
      centre of the data
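
The three small datasets from the previous slide can be used to check this. The snippet below is only an illustrative sketch using Python's standard `statistics` module; the dataset labels are added here for readability and are not part of the slides.

```python
from statistics import mean, median

# The three example datasets from the "Mean? Median?" slide.
# The dictionary keys are descriptive labels chosen for this sketch.
datasets = {
    "symmetric": [3, 4, 5, 6, 7],
    "even number of values": [9, 10, 20, 21],
    "one extreme value": [1, 2, 3, 4, 990],
}

for label, values in datasets.items():
    # mean() uses every value; median() only depends on the middle of the ranked data
    print(f"{label}: mean = {mean(values)}, median = {median(values)}")
```

For the first two datasets the mean and median agree (5 and 15). For the third, the single value of 990 pulls the mean up to 200, while the median stays at 3.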

10
Positively skewed data ...
Median = 2
11
Negatively skewed data ...
Median = 40
12
And the Mode?
  • Mode = most frequently occurring value
  • Hardly ever used
  • Depends on how the data are grouped, and not
    always unique

13
The Range
  • Range = maximum - minimum
  • In practice, usually presented as (min, max)
  • Poor measure of spread:
    • very dependent on sample size
    • affected by outliers
    • but people often want to know it! Present as an
      extra

14
Interquartile Range (IQR)
  • Rank data from smallest to largest
  • Lower Quartile has 1/4 of values smaller than it
  • Upper Quartile has 1/4 of values larger than it
  • Interquartile Range = Upper Quartile - Lower
    Quartile
  • But usually written (Lower Quartile, Upper
    Quartile)
  • Better than range - not influenced by outliers or
    sample size
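
As a rough illustration of why the IQR is more robust than the range, the sketch below compares the two on a made-up dataset with and without one extreme value. The data and the helper name `range_and_iqr` are hypothetical. It relies on `statistics.quantiles`, which uses one of several accepted conventions for locating quartiles, so other software may give slightly different quartile values for the same data.

```python
from statistics import quantiles

def range_and_iqr(values):
    """Return (range, interquartile range) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points: lower quartile, median, upper quartile
    lower_q, _median, upper_q = quantiles(values, n=4)
    return max(values) - min(values), upper_q - lower_q

# Hypothetical data: identical except for the largest value
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1100]

print(range_and_iqr(clean))
print(range_and_iqr(with_outlier))
```

The range jumps from 10 to 1099 when the outlier is introduced, while the IQR is 6 in both cases.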

15
Interquartile Range (IQR) - Note
  • Note: quartiles are, strictly, observations
  • There are 3 quartiles:
    • lower quartile
    • ?
    • upper quartile
  • But the word "Quartiles" is now often used to
    mean "Quarters", i.e. the 4 groups of ranked
    observations
  • Similarly, "Quintiles" is often used to mean
    "Fifths", i.e. the 5 groups of ranked
    observations

16
Variance
  • Step 1: Calculate Deviations = the difference
    between each observation and the mean of the data
  • Step 2: Square these Deviations
  • Step 3: Average the Squared Deviations
    • this is the Variance
  • (Strictly, divide by n-1, not n)
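
The three steps can be written out directly. This is a minimal sketch that uses the n-1 divisor; the function name `sample_variance` and the example data are chosen for illustration only. The result is compared with `statistics.variance`, which also divides by n-1.

```python
from statistics import variance

def sample_variance(values):
    """Variance computed step by step, dividing by n-1."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [x - mean_value for x in values]   # Step 1: deviations from the mean
    squared = [d ** 2 for d in deviations]          # Step 2: square the deviations
    return sum(squared) / (n - 1)                   # Step 3: "average", using n-1 rather than n

data = [3, 4, 5, 6, 7]
print(sample_variance(data))   # squared deviations 4, 1, 0, 1, 4 sum to 10; 10 / 4 = 2.5
print(variance(data))          # library function gives the same value
```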

17
Standard Deviation (SD)
  • Step 4: Take the square root of the Variance
    • this is the Standard Deviation

This returns the statistic to the same units as
the data
  • Both Variance and Standard Deviation use all of
    the data
  • But as a result, they can be over-influenced by
    outliers
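
A short sketch of Step 4 and of the sensitivity to outliers, reusing two of the small example datasets from the "Mean? Median?" slide. The variable names are illustrative only.

```python
import math
from statistics import stdev, variance

symmetric = [3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 990]

# Step 4: the standard deviation is the square root of the variance
sd_symmetric = math.sqrt(variance(symmetric))
sd_outlier = math.sqrt(variance(with_outlier))

# statistics.stdev performs the same calculation in one call
print(sd_symmetric, stdev(symmetric))
print(sd_outlier, stdev(with_outlier))
```

The first dataset has an SD of about 1.6. In the second, the single value of 990 gives an SD of more than 400, even though four of the five values lie between 1 and 4.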

18
Symmetric data ...
  • Summarise using Mean and Standard Deviation

Skewed data ...
  • Summarise using Median and Interquartile Range
19
Useful Fact
If a dataset is Normally distributed, or at least
fairly symmetrical, then the central 95% of the
data will be included in the range
Mean ± 2 Standard Deviations
Sometimes called a "reference range" or "normal range"
Strictly, Mean ± 1.96 Standard Deviations
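
This can be checked by simulation. The sketch below draws values from a Normal distribution and counts how many fall inside Mean ± 1.96 SD. The simulated values (mean 100, SD 15), the fixed seed and the helper name `reference_range` are all assumptions made for illustration; they are not real health data.

```python
import random
from statistics import mean, stdev

def reference_range(values, multiplier=1.96):
    """Return (low, high) = mean -/+ multiplier * SD."""
    centre = mean(values)
    spread = stdev(values)
    return centre - multiplier * spread, centre + multiplier * spread

random.seed(1)  # fixed seed so the sketch is repeatable
# Simulated, Normally distributed values (hypothetical, not real measurements)
values = [random.gauss(100, 15) for _ in range(10_000)]

low, high = reference_range(values)
inside = sum(low <= v <= high for v in values) / len(values)
print(f"reference range: {low:.1f} to {high:.1f}; fraction inside: {inside:.3f}")
```

The fraction inside should come out close to 0.95. For strongly skewed data the same calculation can give a misleading range, which is why the slide restricts it to Normal or fairly symmetrical data.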
20
Central 95% of data
[Figure: distribution curve with the central 95% of the data lying between mean - 2SD and mean + 2SD, and 2.5% in each tail]
21
CONFIDENCE INTERVALS
22
Obesity data, England, 2006
  • Age-standardised percentage obese: 24.1%
  • 95% Confidence Interval: 23.2% to 25.0%

?
23
Sample estimates of Population values
  • The obesity data was based on a sample
  • But has this sample given the right answer?
  • First need to eliminate bias, e.g. take a random
    sample
  • But even when samples are unbiased, different
    samples will still give different answers - this
    is known as sampling error or random variation

24
We would like to know: how imprecise might the
sample estimate be, just as a result of sampling
variation? I.e. how far away might the sample
estimate be from the true population value?
  • Depends on:
    • Sample size
    • Variability of data (SD)
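
The slides do not give a formula, but one common way to see how sample size and SD combine is the large-sample 95% confidence interval for a mean: sample mean ± 1.96 × SD / √n. The sketch below assumes that Normal approximation; the function name `ci_95_for_mean` and all input values are hypothetical.

```python
import math

def ci_95_for_mean(sample_mean, sd, n):
    """Approximate 95% CI for a mean, assuming a large-sample Normal approximation."""
    standard_error = sd / math.sqrt(n)   # shrinks as the sample size grows
    half_width = 1.96 * standard_error
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical inputs: same mean, varying sample size and SD
print(ci_95_for_mean(50, sd=10, n=100))   # baseline
print(ci_95_for_mean(50, sd=10, n=400))   # four times the sample size: half the width
print(ci_95_for_mean(50, sd=20, n=100))   # twice the SD: twice the width
```

Quadrupling the sample size only halves the width of the interval, whereas doubling the variability doubles it.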

25
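A 95% confidence interval provides a measure of
the precision of a sample estimate:
Loosely speaking, there is a 95% probability that
the true population value lies within the 95%
confidence interval. (Strictly, 95% of intervals
calculated this way from repeated samples would
contain the true value.)
Narrow 95% CI = precise estimate
Wide 95% CI = imprecise estimate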
26
  • Age-standardised percentage obese: 24.1%
  • 95% Confidence Interval: 23.2% to 25.0%

We are 95% confident that the true
age-standardised percentage obese for England,
2006, is somewhere between 23.2% and 25.0%.
27
FOR DISCUSSION
  • Why do we calculate Confidence Intervals when our
    estimates are based on total population data,
    e.g. SMRs for cancer?

28
Presenting 95% Confidence Intervals on graphs
Self-reported smoking status in women (%), by
ethnic group with 95% confidence intervals
(England, 2004)
29
Interpreting 95% Confidence Intervals from graphs
  • What can you say about the true smoking
    prevalence for the general population?
  • For which ethnic groups is the prevalence of
    smoking significantly different from 25%?
  • Is the prevalence of smoking significantly
    different between the Black Caribbean and Black
    African populations?
  • Is the prevalence of smoking significantly
    different between the Pakistani and Bangladeshi
    populations?

30
Note: In general, it is better to perform a
statistical significance test than to look for
overlapping or non-overlapping confidence
intervals!
31
Food for thought
  • What is the difference between a 95% confidence
    interval and a 95% reference range?