Title: Introduction to Survey Sampling
1Introduction to Survey Sampling
- Linda Owens
- Survey Research Laboratory
- University of Illinois
2Why Sample?
- Complete enumeration often impossible
- Greater speed, scope, accuracy
- Provide statistical data on a wide range of subjects
- Classical statistical inference is based on the assumption of a random sample
3Developing and Evaluating Sample Designs
- How good must the sample be?
- Use of biased samples for screening
4Defining the Target Population
- What is the unit?
- Individuals or households, etc.
- Geography
- Age
- Other demographic variables
5Defining the Target Population (cont.)
- Sampling Frame
- Source from which sample will be drawn
- Examples
- Lists
- Phone numbers
- Maps
- Be aware of deficiencies
6Simple and Pseudo-Simple Random Sampling
- Probability samples: measurable chance of selection
- Simple random sample
- p of selection equal for all elements
- sampling done in one stage
- Random numbers and their use
- random ≠ haphazard
- random = well-designed probability mechanism
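
As a minimal sketch of a simple random sample, the Python snippet below draws n elements without replacement in a single stage, so every element has the same chance n/N of selection. The frame of 1,000 numeric IDs and the function name `simple_random_sample` are assumptions made for illustration; a pseudo-random generator stands in for a random number table.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; each element has chance n/N."""
    rng = random.Random(seed)      # seeded generator so the draw can be repeated
    return rng.sample(frame, n)    # one-stage, equal-probability selection

# Hypothetical sampling frame: 1,000 element IDs.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=42)
```

Passing a seed is only a convenience for reproducing the draw; it does not change the selection probabilities.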
7Simple and Pseudo-Simple Random Sampling (cont.)
- Random number tables
- Systematic Sampling
- How?
- calculate sampling interval
- pick random start
- take every i-th case
- length measures
- periodicity
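
The three systematic sampling steps above can be sketched in Python as follows. This is an illustration under simplifying assumptions: the interval is the integer part of N/n, and the frame is a hypothetical list of 1,000 IDs. If the frame is ordered with a cycle that matches the interval (periodicity), the sample can be badly unrepresentative.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Systematic sample: random start, then every i-th case."""
    rng = random.Random(seed)
    interval = len(frame) // n          # sampling interval i (integer part of N/n)
    start = rng.randrange(interval)     # random start within the first interval
    return frame[start::interval][:n]   # every i-th case, truncated to n cases

# Hypothetical frame of 1,000 IDs; interval is 1000 // 50 = 20.
frame = list(range(1000))
sample = systematic_sample(frame, 50, seed=3)
```

When N is not a multiple of n, truncating to n cases gives elements near the end of the list a slightly lower chance of selection; a fractional interval is one common way around that.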
8Stratified Sampling
- Divide population into subgroups
- Sample from each subgroup
- Need population totals for each stratum
- Major use is comparison of subgroups
- Homogeneous subgroups
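
A rough Python sketch of stratified selection follows: the population is divided into subgroups and a simple random sample is drawn from each. Proportionate allocation is an assumption made here for simplicity (comparing subgroups often calls for equal or disproportionate allocation instead), and the stratum names and sizes are hypothetical.

```python
import random

def stratified_sample(strata, n, seed=None):
    """strata: dict mapping stratum name -> list of elements in that stratum."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())  # population total
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / total)    # proportionate allocation (assumed)
        sample[name] = rng.sample(units, n_h)  # SRS within the stratum
    return sample

# Hypothetical strata with known population totals of 600, 300, and 100.
strata = {
    "urban": list(range(0, 600)),
    "suburban": list(range(600, 900)),
    "rural": list(range(900, 1000)),
}
picked = stratified_sample(strata, 100, seed=1)
```

The allocation step is where the population total for each stratum is needed.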
9Stratified Sampling (cont.)
- Poor uses of stratification
- Quota samples
- Make up for poor response in one stratum
- Post-stratification
- Weights
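
One way to picture post-stratification weights is the sketch below, where each respondent in a stratum is weighted by the stratum's population total divided by the number of respondents in it. The function name, the strata, and all counts are hypothetical.

```python
def poststratification_weights(pop_totals, sample_counts):
    """Weight per respondent = stratum population total / stratum respondents."""
    return {s: pop_totals[s] / sample_counts[s] for s in pop_totals}

# Hypothetical population totals and achieved sample counts.
pop_totals = {"men": 4800, "women": 5200}
sample_counts = {"men": 40, "women": 60}
weights = poststratification_weights(pop_totals, sample_counts)

# Weighted respondents add back up to the population total.
weighted_total = sum(weights[s] * sample_counts[s] for s in weights)
```

Here men are under-represented in the sample relative to the population, so each male respondent carries a larger weight than each female respondent.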
10Cluster Sampling
- Typically used in face-to-face surveys
- Sample population elements that are in close proximity
- Reasons for cluster sampling
- Reduction in cost
- No satisfactory sample frame available
- Clusters small enough to provide savings
11Multistage Samples
- Several stages of selection
- Area sampling
- face-to-face
- based on geography
- Selection of PSUs (primary sampling units)
- problems with random selection
- can't control sample size
- varying PSU sizes increase sampling variance
12Multistage Samples
- Sampling with probabilities proportionate to size
- Sampling PPS at several stages
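
A sketch of one common way to select PSUs with probabilities proportionate to size is systematic selection along the cumulated size measures, shown below in Python. The slides do not say which PPS procedure is intended, so this particular method, the function name `pps_systematic`, and the PSU sizes are all assumptions for illustration.

```python
import bisect
import random
from itertools import accumulate

def pps_systematic(sizes, n_psus, seed=None):
    """Select n_psus PSUs with probability proportionate to size.

    sizes: dict mapping PSU name -> measure of size (e.g. population).
    """
    rng = random.Random(seed)
    names = list(sizes)
    cumulative = list(accumulate(sizes[name] for name in names))
    interval = cumulative[-1] / n_psus    # selection interval on the size scale
    start = rng.random() * interval       # random start in [0, interval)
    selected = []
    for k in range(n_psus):
        point = start + k * interval
        # First PSU whose cumulated size exceeds the selection point.
        idx = bisect.bisect_right(cumulative, point)
        selected.append(names[min(idx, len(names) - 1)])
    return selected

# Hypothetical PSU sizes, not taken from the slides.
sizes = {"A": 1000, "B": 5000, "C": 1500, "D": 9000, "E": 7500, "F": 15000}
psus = pps_systematic(sizes, n_psus=3, seed=7)
```

A PSU larger than the selection interval can be hit more than once; in practice such PSUs are often taken with certainty and handled separately.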
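13Random Sample of PSUs (Example)
| County | Population | Selected | Sample Size |
|---|---|---|---|
| 1 | 1000 | | |
| 2 | 5000 | x | 37 |
| 3 | 1500 | | |
| 4 | 9000 | | |
| 5 | 7500 | | |
| 6 | 15000 | x | 112 |
| 7 | 3500 | | |
| 8 | 6000 | | |
| 9 | 7400 | | |
| 10 | 500 | x | 4 |
| 11 | 35000 | | |
| 12 | 4400 | | |
| 13 | 8600 | | |
| 14 | 2400 | x | 18 |
| 15 | 9500 | | |
| 16 | 12500 | | |
| 17 | 65000 | | |
| 18 | 14700 | x | 110 |
| 19 | 21500 | | |
| 20 | 38000 | | |
| Total | 268000 | | 281 |
- Select 5 PSUs
- Sample size: 500
- Sample interval: 1 in 536
- Sample rate within each selected county: 1/134 (1/4 × 1/134 = 1/536)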
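
The arithmetic behind this example can be checked with a short Python sketch. The county populations, the target of 500, and the 5-of-20 selection come from the slide; the variable names are illustrative, and rounding each county's expected size to the nearest whole number is an assumption that happens to reproduce the slide's figures.

```python
from fractions import Fraction

# Populations of the five selected counties (from the slide).
selected = {2: 5000, 6: 15000, 10: 500, 14: 2400, 18: 14700}
total_population = 268000
target_n = 500

overall_rate = Fraction(target_n, total_population)  # 1 in 536
psu_rate = Fraction(5, 20)                           # 5 of 20 counties = 1/4
within_rate = overall_rate / psu_rate                # 1/134 within each county

# Expected sample size per selected county, rounded to whole cases.
sample_sizes = {c: round(pop * within_rate) for c, pop in selected.items()}
realized_n = sum(sample_sizes.values())
```

The realized total is 281 rather than the target of 500, which illustrates the problem with equal-probability selection of PSUs of varying size: the sample size cannot be controlled.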
14Example
15How Big Should the Sample Be?
- Depends on variation in subject of interest
- Sample size largely independent of population size (unless the sample is a sizable fraction of the population)
- Formula for simple random samples
- based on confidence interval of proportion
- $\text{precision} = \text{CI} \times (pq/n)^{1/2}$
- e.g. $0.05 = 1.96 \times \left((0.5 \times 0.5)/n\right)^{1/2}$
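
Solving the formula above for n gives n = z²pq / precision², where z is the confidence multiplier (1.96 for 95% confidence) and q = 1 - p. The Python sketch below applies it to the slide's example; the function name and the choice to round up to a whole case are assumptions.

```python
import math

def srs_sample_size(precision, p=0.5, z=1.96):
    """Sample size for a proportion under simple random sampling.

    Solves precision = z * sqrt(p * (1 - p) / n) for n and rounds up.
    """
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)

# Slide example: precision of .05, 95% confidence, p = q = .5.
n = srs_sample_size(0.05)
```

Using p = .5 is the conservative choice because pq is largest there; a proportion further from .5 needs a smaller sample for the same precision.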
16How Big Should the Sample Be (cont.)?
- Finite population correction (1-n/N)
- Design effects
- ratio of variance of a design to variance of SRS
- Analysis of subgroups
- Increase size to accommodate non-response
- Budget constraints
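
A rough sketch of how these adjustments might be chained is given below in Python. The order of the steps, the function name, and the example inputs are assumptions for illustration. The finite population step solves a variance that includes the factor (1 - n/N), which gives n = n0 / (1 + n0/N).

```python
import math

def adjusted_sample_size(n_srs, population=None, deff=1.0, response_rate=1.0):
    """Adjust an SRS sample size for fpc, design effect, and non-response."""
    n = float(n_srs)
    if population is not None:
        n = n / (1 + n / population)   # finite population correction
    n = n * deff                       # inflate by the design effect
    n = n / response_rate              # inflate for expected non-response
    return math.ceil(n)

# Hypothetical inputs: SRS size of 385 drawn from a population of 2,000.
n_fpc = adjusted_sample_size(385, population=2000)
n_full = adjusted_sample_size(385, population=2000, deff=1.5, response_rate=0.6)
```

Subgroup analysis and budget limits are not modeled here; they usually enter as a minimum size per subgroup and a cap on the total.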
17Sampling Variance
- What are sampling variances?
- Repeated samples
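- Variance between values of statistics observed on separate samples
- Sampling method affects size of variance
- Many software packages underestimate variance for complex designs because they assume a simple random sample
- WesVar, SUDAAN, and Stata can estimate variance for complex designs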
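
The idea of sampling variance as variation across repeated samples can be illustrated with a small simulation. The Python sketch below draws many simple random samples from a hypothetical population, takes the mean of each, and compares the variance of those means with the textbook SRS value, (S²/n)(1 - n/N). The population, sample size, and number of replicates are all made up for illustration.

```python
import random
import statistics

def sampling_variance_of_mean(population, n, replicates=1000, seed=0):
    """Variance of the sample mean across repeated simple random samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(replicates)]
    return statistics.variance(means)

# Hypothetical population of 10,000 values.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10000)]

empirical = sampling_variance_of_mean(population, n=100)
# SRS formula with finite population correction.
theoretical = statistics.variance(population) / 100 * (1 - 100 / 10000)
```

For clustered or multistage designs the same simulation would generally show a larger variance than the SRS formula predicts, which is why design-aware software is needed.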