Sampling and Sample Size Calculation

About This Presentation

Title:

Sampling and Sample Size Calculation

Description:

Thomas Grein, Denis Coulombier, Philippe Sudre, Mike Catchpole, Denise Antona -IDEA. Brigitte Helynck, Philippe Malfait, Institut de veille sanitaire ... – PowerPoint PPT presentation

Slides: 50

Provided by: biagiop

Transcript and Presenter's Notes

Title: Sampling and Sample Size Calculation

1
Sampling and Sample Size Calculation
Lazereto de Mahón, Menorca, Spain September 2006
Sources: EPIET Introductory course, Thomas
Grein, Denis Coulombier, Philippe Sudre, Mike
Catchpole, Denise Antona; IDEA, Brigitte
Helynck, Philippe Malfait, Institut de veille
sanitaire
Modified: Viviane Bremer, EPIET 2004;
Suzanne Cotter 2005; Richard Pebody 2006
2
Objectives: sampling

To understand
Why we use sampling
Definitions in sampling
Sampling errors
Main methods of sampling
Sample size calculation

3
Why do we use sampling?

Get information from large populations with
Reduced costs
Reduced field time
Increased accuracy
Enhanced methods

4
Definition of sampling

Procedure by which some members
of a given population are selected as
representatives of the entire population

5
Definition of sampling terms

Sampling unit (element)
Subject under observation on which information is
collected
Example: children <5 years, hospital discharges,
health events
Sampling fraction
Ratio between sample size and population size
Example: 100 out of 2000 (5%)

6
Definition of sampling terms

Sampling frame
List of all the sampling units from which sample
is drawn
Lists, e.g. children <5 years of age,
households, health care units
Sampling scheme
Method of selecting sampling units from sampling
frame
Example: random sample, convenience sample

7
Survey errors

Systematic error (or bias)
Sample not typical of population
Inaccurate response (information bias)
Selection bias
Sampling error (random error)

8
Representativeness (validity)

A sample should accurately reflect the distribution
of the relevant variable in the population
Person e.g. age, sex
Place e.g. urban vs. rural
Time e.g. seasonality
Representativeness essential to generalise
Ensure representativeness before starting,
Confirm once completed

9
Sampling and representativeness
Target population → Sampling population → Sample
10
Sampling error

Random difference between sample and population
from which sample drawn
Size of error can be measured in probability
samples
Expressed as standard error
of mean, proportion
Standard error (or precision) depends upon
Size of the sample
Distribution of character of interest in
population

11
Sampling error
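When a simple random sample of size n is selected
from a population of size N, the standard error (SE) of the
estimated population mean or proportion is
Mean: $SE = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation
Proportion: $SE = \sqrt{\frac{p(1-p)}{n}}$
Used to calculate 95% confidence intervals
Estimated 95% confidence interval: estimate $\pm 1.96 \times SE$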
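As a rough illustration of the standard error and confidence interval for a proportion, here is a minimal Python sketch. The function name `proportion_ci` is hypothetical, and the example values (p = 0.15, n = 544) are borrowed from the descriptive survey sample size slide later in this presentation. It assumes a simple random sample and the normal approximation.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Return (standard_error, lower, upper) for a sample proportion p observed in n units.

    Assumes simple random sampling and the normal approximation (z = 1.96 for 95%).
    """
    se = math.sqrt(p * (1 - p) / n)   # standard error of a proportion
    lower = max(0.0, p - z * se)      # keep the interval inside [0, 1]
    upper = min(1.0, p + z * se)
    return se, lower, upper

se, lower, upper = proportion_ci(p=0.15, n=544)
print(f"SE = {se:.4f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

With these inputs the half-width of the interval comes out close to 0.03, the absolute precision used on that later slide.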
12
Quality of a sampling estimate
Precision and validity
13
Survey errors example

Measuring height
Measuring tape held differently by different
investigators
→ loss of precision
Large standard error
Tape shrunk/wrong
→ systematic error
Bias (cannot be corrected afterwards)

[Figure: measuring scale marked from 173 to 179]
14
Types of sampling

Non-probability samples
Probability samples

15
Non probability samples

Convenience samples (ease of access)
Snowball sampling (friend of a friend, etc.)
Purposive sampling (judgemental)
You choose who you think should be in the study

Probability of being chosen is unknown
Cheaper, but unable to generalise; potential for bias
16
Probability samples

Random sampling
Each subject has a known probability of being
selected
Allows application of statistical sampling theory
to results, in order to
Generalise
Test hypotheses

17
Methods used in probability samples

Simple random sampling
Systematic sampling
Stratified sampling
Multi-stage sampling
Cluster sampling

18
Simple random sampling

Principle
Equal chance/probability of drawing each unit
Procedure
Take sampling population
Need listing of all sampling units (sampling
frame)
Number all units
Randomly draw units

19
Simple random sampling

Advantages
Simple
Sampling error easily measured
Disadvantages
Need complete list of units
Does not always achieve best representativeness
Units may be scattered and poorly accessible

20
Simple random sampling

Example: evaluate the prevalence of tooth decay
among 1200 children attending a school
List of children attending the school
Children numbered from 1 to 1200
Sample size: 100 children
Random sampling of 100 numbers between 1 and 1200

How to randomly select?
21
EPITABLE random number listing
22
EPITABLE random number listing
Also possible in Excel
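The same draw can also be done in a few lines of Python. This is only a sketch of the tooth decay example (100 numbers out of 1 to 1200) using the standard library; the seed value is arbitrary and is there only so the draw can be reproduced.

```python
import random

rng = random.Random(2006)       # fixed, arbitrary seed so the same sample can be redrawn
children = range(1, 1201)       # children numbered 1 to 1200 (the sampling frame)

# Draw 100 distinct numbers without replacement: each child has the same chance
sample = sorted(rng.sample(children, k=100))
print(sample[:10])              # first ten selected numbers
```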
23
Simple random sampling
24
Systematic sampling

Principle
Select sample at regular intervals based on
sampling fraction
Advantages
Simple
Sampling error easily measured
Disadvantages
Need complete list of units
Periodicity

25
Systematic sampling

N = 1200 and n = 60
→ sampling fraction = 60/1200 = 1/20, so sampling interval = 1200/60 = 20
List persons from 1 to 1200
Randomly select a number between 1 and 20 (e.g.
8)
→ 1st person selected = the 8th on the list
→ 2nd person = 8 + 20 = the 28th, etc.
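The procedure on this slide can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes the population size is an exact multiple of the sample size, as in the example (1200 and 60).

```python
import random

def systematic_sample(population_size, sample_size, rng=random):
    """Select every k-th unit after a random start, where k = population_size // sample_size."""
    interval = population_size // sample_size   # sampling interval, 1200 // 60 = 20 here
    start = rng.randint(1, interval)            # random start between 1 and interval, inclusive
    return [start + i * interval for i in range(sample_size)]

selected = systematic_sample(1200, 60, rng=random.Random(1))
print(selected[:3], "...", selected[-1])
```

If the random start were 8, the list would begin 8, 28, 48 and end at 1188.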

26
Systematic sampling
27
Stratified sampling

Principle
Divide sampling frame into homogeneous subgroups
(strata) e.g. age-group, occupation
Draw a random sample in each stratum.

28
Stratified sampling

Advantages
Can acquire information about whole population
and individual strata
Precision increased if variability within strata
is less (homogeneous) than between strata
Disadvantages
Can be difficult to identify strata
Loss of precision if small numbers in individual
strata
Resolve by sampling proportionate to stratum
population
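Sampling proportionate to stratum population can be sketched as below. The strata, their sizes and the function name `proportionate_stratified_sample` are invented for illustration. Rounding the allocation may make the total differ slightly from the target when the shares are not whole numbers.

```python
import random

def proportionate_stratified_sample(strata, total_sample_size, rng):
    """strata maps a stratum name to its list of unit identifiers."""
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Allocate the sample in proportion to the stratum's share of the population
        k = round(total_sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, k)   # simple random sample within the stratum
    return sample

# Hypothetical age-group strata (sizes invented for illustration)
strata = {
    "0-4 years": list(range(1, 301)),      # 300 units
    "5-14 years": list(range(301, 901)),   # 600 units
    "15+ years": list(range(901, 1201)),   # 300 units
}
stratified = proportionate_stratified_sample(strata, 120, random.Random(42))
print({name: len(units) for name, units in stratified.items()})
```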

29
Multiple stage sampling

Principle
Consecutive sampling
Example: sampling unit = household
1st stage: draw neighbourhoods
2nd stage: draw buildings
3rd stage: draw households
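A minimal sketch of the three stages, using an invented frame of neighbourhoods, buildings and households (all names and counts are hypothetical, as are the numbers drawn at each stage):

```python
import random

rng = random.Random(7)   # arbitrary seed for reproducibility

# Hypothetical frame: 10 neighbourhoods, each with 8 buildings of 6 households
frame = {
    f"N{n}": {
        f"N{n}-B{b}": [f"N{n}-B{b}-H{h}" for h in range(1, 7)]
        for b in range(1, 9)
    }
    for n in range(1, 11)
}

households = []
for neighbourhood in rng.sample(sorted(frame), k=3):             # 1st stage: neighbourhoods
    buildings = frame[neighbourhood]
    for building in rng.sample(sorted(buildings), k=2):          # 2nd stage: buildings
        households.extend(rng.sample(buildings[building], k=2))  # 3rd stage: households

print(len(households), households[:3])
```

Only the units selected at one stage need to be listed for the next, which is why a complete list of households is not required up front.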

30
Cluster sampling

Principle
Sample units not identified independently but in
a group (or cluster)
Provides logistical advantage.

31
Cluster sampling

Principle
Whole population divided into groups e.g.
neighbourhoods
Random sample taken of these groups (clusters)
Within selected clusters, all units e.g.
households included (or random sample of these
units)
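The principle can be sketched as follows, with invented clusters: a random sample of neighbourhoods is drawn, then every household in the chosen neighbourhoods is included. Cluster names, sizes and the number of clusters drawn are all hypothetical.

```python
import random

rng = random.Random(11)   # arbitrary seed for reproducibility

# Hypothetical population: 20 neighbourhoods (clusters) of 15 households each
clusters = {
    f"neighbourhood-{c}": [f"c{c}-household-{h}" for h in range(1, 16)]
    for c in range(1, 21)
}

chosen = rng.sample(sorted(clusters), k=5)   # random sample of clusters
# Include all units within each selected cluster
cluster_sample = [unit for name in chosen for unit in clusters[name]]
print(chosen, len(cluster_sample))
```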

32
Example Cluster sampling
Section 2
Section 1
Section 3
Section 5
Section 4
33
Cluster sampling

Advantages
Simple as complete list of sampling units within
population not required
Less travel/resources required
Disadvantages
Cluster members are more likely to resemble
each other than members of another
cluster (homogeneous).
This dependence needs to be taken into account
in the sample size and in the analysis (design
effect)

34
Selecting a sampling method

Population to be studied
Size/geographical distribution
Heterogeneity with respect to variable
Availability of list of sampling units
Level of precision required
Resources available

35
Sample size estimation

Estimate number needed to
reliably measure factor of interest
detect significant association
Trade-off between study size and resources.
Sample size determined by various factors
significance level (alpha)
power (1-beta)
expected prevalence of factor of interest

36
Type 1 error

The probability of finding a difference in our
sample when there really isn't one in the
population
Known as α (the type 1 error)
Usually set at 5% (or 0.05)

37
Type 2 error

The probability of not finding, in our sample, a
difference that actually exists in the
population
Known as β (the type 2 error)
Power is (1 - β) and is usually set at 80%

38
A question?

Are the English more intelligent than the Dutch?
H0 (null hypothesis): the English and Dutch have
the same mean IQ
Ha (alternative hypothesis): the mean IQ of the
English is greater than that of the Dutch

39
Type 1 and 2 errors

             Truth
Decision     H0 true            H0 false
Reject H0    Type I error       Correct decision
Accept H0    Correct decision   Type II error

40
Power

The easiest ways to increase power are to
increase sample size
increase the difference (or effect size) to be detected
relax the significance level (increase alpha), e.g. to 10%
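To see how each of these three levers affects power, here is a rough sketch for a one-sided comparison of two group means using the normal approximation. The function name `power_two_means`, the group sizes, the differences and the standard deviation of 15 are all assumptions chosen for illustration, not values from the slides.

```python
from statistics import NormalDist

def power_two_means(n_per_group, difference, sd, alpha=0.05):
    """Approximate power of a one-sided z-test comparing two group means.

    Assumes equal group sizes and a known, common standard deviation.
    """
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5          # standard error of the difference in means
    return z.cdf(difference / se - z.inv_cdf(1 - alpha))

base = power_two_means(n_per_group=50, difference=5, sd=15)
bigger_sample = power_two_means(n_per_group=100, difference=5, sd=15)
bigger_effect = power_two_means(n_per_group=50, difference=10, sd=15)
relaxed_alpha = power_two_means(n_per_group=50, difference=5, sd=15, alpha=0.10)

for label, value in [("baseline", base), ("larger sample", bigger_sample),
                     ("larger difference", bigger_effect), ("alpha = 10%", relaxed_alpha)]:
    print(f"{label}: power = {value:.2f}")
```

Each of the three changes gives a higher power than the baseline, at the cost of more resources, a less ambitious detectable difference, or a higher type 1 error risk respectively.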

41
Steps in estimating sample size for descriptive
survey

Identify major study variable
Determine type of estimate (%, mean, ratio, ...)
Indicate expected frequency of factor of interest
Decide on desired precision of the estimate
Decide on acceptable risk that estimate will fall
outside its real population value
Adjust for estimated design effect
Adjust for expected response rate

42
Sample size for descriptive survey
Simple random / systematic sampling
$$n = \frac{z^2 \, p \, q}{d^2} = \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 544$$
Cluster sampling
$$n = g \, \frac{z^2 \, p \, q}{d^2} = 2 \times \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 1088$$
z = alpha risk expressed in z-score
p = expected prevalence
q = 1 - p
d = absolute precision
g = design effect
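The calculation on this slide can be reproduced with a short Python sketch. The function name `descriptive_sample_size` is hypothetical. The `response_rate` parameter is an added assumption reflecting the "adjust for expected response rate" step listed on the previous slide, and the 80% figure used below is invented for illustration.

```python
def descriptive_sample_size(p, d, z=1.96, design_effect=1.0, response_rate=1.0):
    """Sample size for estimating a proportion: n = g * z^2 * p * q / d^2.

    p: expected prevalence, d: absolute precision, z: z-score for the alpha risk,
    design_effect: g (1 for simple random or systematic sampling),
    response_rate: expected fraction of selected units that respond (assumption).
    """
    q = 1 - p
    n = design_effect * z ** 2 * p * q / d ** 2
    return n / response_rate                    # inflate for expected non-response

srs = descriptive_sample_size(p=0.15, d=0.03)                        # slide value: 544
cluster = descriptive_sample_size(p=0.15, d=0.03, design_effect=2)   # slide value: 1088
adjusted = descriptive_sample_size(p=0.15, d=0.03, design_effect=2, response_rate=0.8)

print(round(srs), round(cluster), round(adjusted))
```

The function returns the unrounded value; the slide figures correspond to rounding to the nearest whole number, whereas in practice the result is often rounded up.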
43
Case-control sample size issues to consider

Number of cases
Number of controls per case
Odds ratio worth detecting
Proportion of exposed persons in source
population
Desired level of significance (α)
Power of the study (1 - β)
to detect at a statistically significant level a
particular odds ratio

44
Case-control: STATCALC sample size
45
Case-control: STATCALC sample size

Risk of alpha error: 5%
Power: 80%
Proportion of controls exposed: 20%
OR to detect: > 2
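The STATCALC screens themselves are not preserved in this transcript. As a rough stand-in, the sketch below applies the Kelsey formula for an unmatched case-control study to the parameters above. Whether this is the formula behind the figures shown on the slides is an assumption, and other formulas (for example Fleiss, with or without continuity correction) give slightly different numbers. The function name `case_control_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist

def case_control_sample_size(p_controls, odds_ratio, alpha=0.05, power=0.80,
                             controls_per_case=1):
    """Kelsey-type sample size for an unmatched case-control study (two-sided alpha)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    r = controls_per_case
    p0 = p_controls
    # Proportion of cases exposed implied by the odds ratio to detect
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p1 + r * p0) / (1 + r)             # weighted average exposure
    n_cases = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (r + 1)
               / (r * (p1 - p0) ** 2))
    cases = math.ceil(n_cases)
    return cases, cases * r

cases, controls = case_control_sample_size(p_controls=0.20, odds_ratio=2)
print(cases, controls)
```

Raising `controls_per_case` lowers the number of cases needed, which is the trade-off illustrated by the power chart for different control-to-case ratios.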

46
Case-control: STATCALC sample size
47
Statistical power of a case-control study for
different control-to-case ratios and odds ratios
(with 50 cases)
48
Conclusions

Probability samples are the best
Ensure
Representativeness
Precision
... within available constraints
