Title: Selecting Your Sample
1 Selecting Your Sample
2 Determining the Sample Size
- To calculate the specific sample size required for your site, you need the following information:
- Level of confidence
- Acceptable margin of error
- Starting or baseline levels of the indicators (diabetes if doing Step 3)
- Design effect
3 Level of Confidence
- The level of confidence for cross-sectional studies is 95%.
- For example, if you have a prevalence of 12.0% and a 95% CI of (11.0%, 13.0%), then you would be able to say that if the study were repeated 100 times, about 95 of the resulting intervals would be expected to contain the true prevalence of factor X.
- A 95% CI translates into Z = 1.96 for calculating sample size.
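As a minimal sketch of where the value 1.96 comes from, the two-sided critical value for a given level of confidence can be read from the standard normal distribution. The function name `z_for_confidence` is illustrative only and is not part of the STEPS tools.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Return the two-sided standard normal critical value for a confidence level in (0, 1)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Split the remaining probability equally between the two tails.
    upper_quantile = 1.0 - (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(upper_quantile)


z = z_for_confidence(0.95)
print(round(z, 2))  # 1.96
```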
4 Acceptable Margin of Error
- The margin of error is the expected maximum difference between the population proportion and the sample proportion at the chosen level of confidence.
- The standard choices range from 1% to 5%.
5 Determining the Baseline Level for Indicators
- Select the lowest prevalence of the indicators that you are surveying.
- If you are doing Step 3, this is most likely going to be diabetes.
6 Design Effect
- Sample size is commonly calculated assuming a simple random sample.
- The design effect (DE) provides a correction for the loss of sampling efficiency from using cluster sampling.
- If no information on previous surveys is available, assume a DE of 2.
7 Create Table of Information
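8 Equation for Calculating Sample Size
$n = \frac{Z^2 \, P (1 - P)}{E^2}$
9 Equation for Calculating Sample Size
- Z = level of confidence
- P = baseline level for indicator
- E = margin of error
$n = \frac{Z^2 \, P (1 - P)}{E^2}$
10 Create Table of Information
Z = 1.96, E = 0.05, P = 0.10
11 Equation for Calculating Sample Size
Z = 1.96, E = 0.05, P = 0.10
$n = \frac{1.96^2 \times 0.10 \times (1 - 0.10)}{0.05^2}$
12 Equation for Calculating Sample Size
$n = 138.3$
13 Equation for Calculating Sample Size
$n = 138.1$
Number will only change if the sample size exceeds 5% of the total population.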
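The calculation on these slides can be sketched in a few lines. The sketch assumes the standard formula for estimating a proportion under simple random sampling, n = Z² × P × (1 − P) / E², which reproduces the 138.3 shown for Z = 1.96, P = 0.10 and E = 0.05. The function name is illustrative, not part of the STEPS tools.

```python
def simple_random_sample_size(z: float, p: float, e: float) -> float:
    """Sample size for estimating a proportion under simple random sampling.

    z: critical value for the level of confidence (1.96 for 95%)
    p: baseline level of the indicator, as a proportion
    e: acceptable margin of error, as a proportion
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0.0:
        raise ValueError("e must be positive")
    return z ** 2 * p * (1.0 - p) / e ** 2


n = simple_random_sample_size(z=1.96, p=0.10, e=0.05)
print(round(n, 1))  # 138.3
```

Because P × (1 − P) grows as P moves toward 0.5, the same margin of error requires a larger sample for more common indicators.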
14 Including Age-Sex Strata
Sample size = 138.1
- Take into account:
- that we want to make age-sex comparisons
- cluster sampling
Sample size × (number of age-sex strata) × design effect
Sample size = 138.1 × 8 × 2 ≈ 2,211
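As a rough sketch, the adjustment for age-sex strata and cluster sampling is a single multiplication. The function name below is hypothetical. With the rounded inputs shown on the slide the product comes to about 2,210, slightly below the 2,211 reported there; the small gap probably reflects rounding of intermediate values.

```python
def adjust_for_strata_and_design(n_srs: float, n_strata: int, design_effect: float) -> float:
    """Scale a simple-random-sample size by the number of strata and the design effect."""
    if n_strata < 1 or design_effect <= 0:
        raise ValueError("n_strata must be at least 1 and design_effect must be positive")
    return n_srs * n_strata * design_effect


# 8 age-sex strata and a design effect of 2, as in the example.
n_design = adjust_for_strata_and_design(138.1, n_strata=8, design_effect=2.0)
print(round(n_design, 1))  # 2209.6
```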
15 Adjust Sample Size for Non-Response
- The sample size as calculated (2,211) adjusts for the cluster sample design.
- Assume an 80% response rate and inflate the sample size accordingly.
- 2,211 × 0.20 = 442
- Final sample size = 2,211 + 442 = 2,653
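The non-response adjustment can be sketched as follows. This follows the additive approach used in the example, adding the expected non-response share on top of the calculated sample. The function name is hypothetical.

```python
def inflate_for_nonresponse(n: int, response_rate: float) -> int:
    """Add the expected non-response share (1 - response_rate) to the sample size."""
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    extra = round(n * (1.0 - response_rate))  # additional participants to recruit
    return n + extra


final_n = inflate_for_nonresponse(2211, response_rate=0.80)
print(final_n)  # 2653
```

A common alternative is to divide the sample size by the response rate, which gives a somewhat larger figure; the additive approach is kept here to match the example.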
16 STEPS Sampling Spreadsheet
17 Add Districts to PSU Spreadsheet
- Add names of districts and the estimated size of
each district.
18 Select Number of Clusters
- Type in the number of clusters to select.
- Type in the random number.
19 Selected Districts
20 Population Distribution
21 Sample Distribution
Sample = Total sample size × Proportion of population
Castries: total sample size = 2,653; proportion of population = 0.5511
22 Determine Cluster Size
- Assume 50 participants per cluster
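The distribution of the sample across districts, and the number of 50-person clusters each district needs, can be sketched as below. The district names and populations are hypothetical placeholders; only the total sample size (2,653), the 0.5511 share and the cluster size of 50 come from the example.

```python
def allocate_sample(total_sample: int, populations: dict, cluster_size: int = 50) -> dict:
    """Split the total sample across districts in proportion to population,
    then convert each district's share into a number of clusters."""
    total_pop = sum(populations.values())
    plan = {}
    for district, pop in populations.items():
        share = pop / total_pop
        n_district = round(total_sample * share)
        plan[district] = {
            "proportion": round(share, 4),
            "sample": n_district,
            "clusters": max(1, round(n_district / cluster_size)),
        }
    return plan


# Hypothetical populations; District A holds 55.11% of the total.
populations = {"District A": 55110, "District B": 30000, "District C": 14890}
plan = allocate_sample(2653, populations)
for district, row in plan.items():
    print(district, row)
```

Rounding each district separately means the cluster counts may not add up exactly to the total sample divided by 50, so the totals are worth checking by hand.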
23 Clustering SSU Spreadsheet
- Allows selection of settlements by PPS (probability proportional to size) sampling.
- Use Duplicate Spreadsheet until you have five spreadsheets (one for each district selected).
24 Record Names of Selected Districts
- You will see the spreadsheets at the bottom of the workbook.
- Rename the spreadsheets to reflect the names of the selected districts.
25 Select Settlements for Each District
- Select Anse La Raye settlements.
- Need to select three (3) clusters.
26 Type in Settlements and Estimated Population Size
- Label the settlements with either numbers or names.
- Type in the estimated size of the sampling units.
27 Determine the Number of Clusters
- Type in the number of clusters to select.
- Type in the random number.
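The spreadsheet's internal calculation is not shown here. One common way to carry out PPS selection from a list of settlements, a number of clusters and a random number is systematic sampling along the cumulative population, sketched below. The function, its `random_start` parameter and the settlement data are all assumptions for illustration.

```python
import random


def pps_systematic(units, n_clusters, random_start=None, seed=None):
    """Systematic PPS selection.

    units: list of (name, estimated_size) pairs, in list order.
    Returns one name per cluster; a large unit can be selected more than once.
    """
    total = sum(size for _, size in units)
    interval = total / n_clusters  # sampling interval
    if random_start is None:
        # Random start in [0, interval).
        random_start = random.Random(seed).random() * interval
    points = [random_start + k * interval for k in range(n_clusters)]

    selected = []
    cumulative = 0
    next_point = 0
    for name, size in units:
        cumulative += size
        # Every selection point that falls inside this unit's range selects it.
        while next_point < n_clusters and points[next_point] < cumulative:
            selected.append(name)
            next_point += 1
    return selected


# Hypothetical settlements with estimated sizes (total 800, interval about 266.7).
settlements = [("S1", 120), ("S2", 40), ("S3", 300), ("S4", 90), ("S5", 250)]
print(pps_systematic(settlements, n_clusters=3, random_start=100.0))
# ['S1', 'S3', 'S5']
```

Larger settlements cover a longer stretch of the cumulative total, so they are proportionally more likely to contain a selection point.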
28 Anse La Raye
- Cluster 3 is associated with Au Tabor Hill.
- Au Tabor Hill has fewer than 50 households, so combine it with settlement 4 and select 50 households at random from the combined list.
29 Anse La Raye
- Cluster 15 is associated with Millet Caico.
- Millet Caico has 64 households.
30 Select the Settlements for the Remaining Districts
- Enter the settlement information for each district into the corresponding spreadsheet.
- Use the table below to determine the number of clusters to select.
31 Select Households from Each Settlement
- Use the Rand Hhold spreadsheet to select households to sample randomly.
- Determine the number of households in each settlement:
- If you have an address list, use it to list each household by address.
- Otherwise, when you are visiting a settlement, list all the households by location.
32 Duplicate Rand Hhold for Number of Settlements Selected
- There are 52 clusters in the sample.
- Duplicate the Rand Hhold spreadsheet 51 times
using the Duplicate Spreadsheet button.
33 Rename Rand Hhold for Selected Settlements
- You will see the spreadsheets at the bottom of the workbook.
- Rename the spreadsheets to reflect the names of the selected settlements.
34 Select Households for Each Settlement
- Determine how many households there are in each settlement.
- List all the households on the Rand Hhold spreadsheet and randomly select 50 households.
- One participant will be selected from each household.
35 Millet Caico Rand Hhold
- Millet Caico has 64 households, of which 50
will be selected.
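Outside the spreadsheet, the same household draw can be sketched as a simple random sample without replacement. The function name and the numbering of households from 1 to 64 are assumptions for illustration.

```python
import random


def select_households(households, n_select=50, seed=None):
    """Draw n_select households at random, without replacement."""
    if len(households) < n_select:
        # Too few households: combine with a neighbouring settlement first.
        raise ValueError("fewer households than the number to select")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(households, n_select))


# 64 listed households, numbered in listing order.
households = list(range(1, 65))
chosen = select_households(households, n_select=50, seed=1)
print(len(chosen))  # 50
```

Recording the seed alongside the household list lets someone else reproduce and check the selection later.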
36 Sampling Options
- This example has shown only one possibility. There are many other ways to draw and divide the same population.
- You could divide the sample into constituencies instead of districts.
- You could select all the districts to sample and then select a smaller, more representative group from all the districts.
37 Determining Your Sample
- Your sample will be a balance between:
- What information is available (sampling frame)
- The scope of the survey
- The budget (finance and human resources)
38 Statistical Assistance
- You should always have someone review your sampling methodology.
- If you need assistance, contact CAREC, PAHO, or WHO-Geneva.