Title: Sampling
1Sampling
2Sampling Issues
Sampling Terminology
Probability in Sampling
Probability Sampling Designs
Non-Probability Sampling Designs
Sampling Distribution
3Sampling Terminology
4Two Major Types of Sampling Methods
Probability Sampling
- uses some form of random selection
- requires that each unit have a known (often equal) probability of being selected
Non-Probability Sampling
- selection is systematic or haphazard, but not random
5Groups in Sampling
- Who do you want to generalize to? The Theoretical Population
- What population can you get access to? The Study Population
- How can you get access to them? The Sampling Frame
- Who is in your study? The Sample
13Where Can We Go Wrong?
The Theoretical Population
The Study Population
The Sampling Frame
The Sample
17Statistical Terms in Sampling
- Variable: responsibility, a response on a scale of 1 2 3 4 5
- Statistic: Average = 3.72 (sample)
- Parameter: Average = 3.75 (population)
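The difference between a statistic and a parameter can be sketched in a few lines of Python. This is only an illustration: the population below is made-up data (1,000 random responses on a 1 to 5 scale), and the names `population`, `parameter`, and `statistic` are assumptions of this sketch, not part of the slides.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 responses on a 1-5 scale (made-up data).
population = [rng.randint(1, 5) for _ in range(1000)]

# Parameter: the average computed over the whole population.
parameter = statistics.mean(population)

# Statistic: the same average, computed from a sample of 50 units.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)

print(f"parameter (population average): {parameter:.2f}")
print(f"statistic (sample average):     {statistic:.2f}")
```

The two averages will usually be close but not equal, which is the gap between 3.72 and 3.75 on the slide.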
23Statistical Inference
- Statistical inference makes generalizations about a population from a sample.
- A population is the set of all the elements of interest in a study.
- A sample is a subset of elements in the population chosen to represent it.
- Quality of the sample = quality of the inference
- Would this class be a good representation of all Persian Doctors? Why or why not?
This class would not be a good sample of all Persian Dentists: we are more interested in research methodology, so we are different!
24The Sampling Distribution
[Figure: several samples drawn from the population, each with its own Average]
The Sampling Distribution is the distribution of a statistic across an infinite number of samples.
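A sampling distribution can be approximated by simulation: draw many samples, compute the average of each, and look at how those averages spread out. The sketch below uses a finite number of samples (2,000) as a stand-in for "infinite", and a made-up population of 1 to 5 responses; the sample size of 50 and all variable names are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 responses on a 1-5 scale (made-up data).
population = [rng.randint(1, 5) for _ in range(10_000)]

def sample_mean(n):
    """Draw one simple random sample of size n and return its average."""
    return statistics.mean(rng.sample(population, n))

# Approximate the sampling distribution of the average with 2,000 samples.
means = [sample_mean(50) for _ in range(2000)]

print(f"population average:        {statistics.mean(population):.3f}")
print(f"average of sample averages: {statistics.mean(means):.3f}")
print(f"spread (standard error):    {statistics.stdev(means):.3f}")
```

The averages cluster around the population average, and their standard deviation is an estimate of the standard error.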
28Random Sampling
29Types of Probability Sampling Designs
- Simple Random Sampling
- Stratified Sampling
- Systematic Sampling
- Cluster Sampling
- Multistage Sampling
30Some Definitions
- N = the number of cases in the sampling frame
- n = the number of cases in the sample
- NCn = the number of combinations (subsets) of n from N
- f = n/N = the sampling fraction
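These quantities are easy to compute directly. A minimal sketch, assuming illustrative values N = 100 and n = 20 (the same numbers used in the systematic sampling example later in the deck); `math.comb` is the standard-library function for the number of combinations.

```python
import math

N = 100  # cases in the sampling frame (illustrative value)
n = 20   # cases in the sample (illustrative value)

num_samples = math.comb(N, n)  # NCn: number of distinct subsets of size n
f = n / N                      # sampling fraction

print(f"NCn = {num_samples}")
print(f"f   = {f}")
```

Even for a small frame, NCn is enormous, which is why samples are drawn at random and never enumerated.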
31Simple Random Sampling
- Objective: select n units out of N such that each of the NCn possible samples has an equal chance of being selected
- Procedure: use a table of random numbers, a computer random number generator, or a mechanical device
- can sample with or without replacement
- f = n/N is the sampling fraction
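With a computer random number generator, the procedure takes only a few lines. The sketch below is one possible way to do it with Python's standard `random` module; the function name `simple_random_sample` and the frame of 100 made-up resident IDs are assumptions for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None, replace=False):
    """Select n units from the sampling frame, each with equal probability."""
    rng = random.Random(seed)
    if replace:
        return rng.choices(frame, k=n)  # with replacement: repeats possible
    return rng.sample(frame, n)         # without replacement: all distinct

# Hypothetical sampling frame of N = 100 residents.
frame = [f"resident_{i}" for i in range(1, 101)]

sample = simple_random_sample(frame, 20, seed=1)
print(sample[:5])
print("sampling fraction f =", len(sample) / len(frame))
```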
32Simple Random Sampling
Example
- People who subscribed to Novin Pezeshki last year
- People who visit our site
- draw a simple random sample with sampling fraction n/N
33Simple Random Sampling
List of Residents
Random Subsample
35Stratified Random Sampling
- sometimes called "proportional" or "quota" random sampling
- Objective: population of N units divided into non-overlapping strata N1, N2, N3, ... Ni such that N1 + N2 + ... + Ni = N, then do a simple random sample of n/N in each stratum
36Stratified Sampling
- The population is first divided into groups called strata.
- Use when stratification is evident. Example: medical students in preclinical, clerkship, and internship stages
- Best results when intra-stratum variance is low and inter-stratum variance is high
- A simple random sample is taken from each stratum.
- Advantage: if strata are homogeneous, this method is more precise than simple random sampling of the same sample size, or as precise with a smaller total sample size.
- If there is a dominant stratum and it is relatively small, you can enumerate it and sample the rest.
37Stratified Sampling - Purposes
- to ensure representation of each stratum; oversample smaller population groups
- sampling problems may differ in each stratum
- increase precision (lower variance) if strata are homogeneous within (like blocking)
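Proportional stratified sampling can be sketched as "group the units by stratum, then take a simple random sample of the same fraction in each group". The code below is a minimal illustration, not a full survey design: the function name `stratified_sample`, the 60/30/10 split of students, and the 20% fraction are all assumptions made up for the example.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take a simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # non-overlapping groups
    sample = {}
    for name, members in strata.items():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical frame: 60 preclinical, 30 clerkship, 10 internship students.
students = (
    [("preclinical", i) for i in range(60)]
    + [("clerkship", i) for i in range(30)]
    + [("internship", i) for i in range(10)]
)

picked = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.2, seed=3)
for name, members in picked.items():
    print(name, len(members))
```

To oversample a small stratum, one could pass a different fraction per stratum instead of a single value; that variation is left out to keep the sketch short.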
38Stratified Random Sampling
List of Residents
Strata: surgical, medical, non-clinical
Random Subsamples of n/N
41Systematic Random Sampling
Procedure
- number units in population from 1 to N
- decide on the n that you want or need
- k = N/n = the interval size
- randomly select a number from 1 to k
- then take every kth unit
42Systematic Random Sampling
- Assumes that the population is randomly ordered
- Advantages: easy; may be more precise than a simple random sample
- Example: Residents study
43Systematic Random Sampling
[Figure: units numbered 1 to 100, laid out in four columns of 25]
- N = 100
- want n = 20
- k = N/n = 5
- select a random number from 1 to 5: chose 4
- start with 4 and take every 5th unit: 4, 9, 14, ..., 99
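The worked example (N = 100, n = 20, interval 5, random start 4) can be reproduced with a short function. This is a sketch under the assumption that N is a multiple of n; the function name `systematic_sample` and its `start` and `seed` parameters are made up for illustration.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers chosen by systematic sampling from 1..N."""
    k = N // n  # interval size (assumes N is a multiple of n)
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k
    return list(range(start, N + 1, k))[:n]        # every kth unit

picked = systematic_sample(100, 20, start=4)
print(picked[:5])   # [4, 9, 14, 19, 24]
print(len(picked))  # 20
```

Leaving `start` unset picks the starting unit at random, which is what makes the procedure a probability design.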
48Cluster Sampling
- The population is first divided into clusters.
- A cluster is a small-scale version of the population (i.e., a heterogeneous group reflecting the variance in the population).
- Take a simple random sample of the clusters.
- All elements within each sampled (chosen) cluster form the sample.
49Cluster Random Sampling
- Advantages: administratively useful, especially when you have a wide geographic area to cover
- Example: randomly sample from city blocks and measure all homes in selected blocks
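The city-block example can be sketched as follows: sample whole blocks at random, then keep every home in the chosen blocks. The data are made up (10 blocks of 5 homes each), and the function name `cluster_sample` is an assumption of this sketch.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical city: 10 blocks with 5 homes each.
blocks = {
    f"block_{b}": [f"home_{b}_{h}" for h in range(1, 6)]
    for b in range(1, 11)
}

picked = cluster_sample(blocks, 3, seed=7)
for block, homes in picked.items():
    print(block, homes)
```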
50Cluster Sampling vs. Stratified Sampling
- Stratified sampling seeks to divide the population into homogeneous groups, so the variance within the strata is low and between the strata is high.
- Cluster sampling seeks to have each cluster reflect the variance in the population: each cluster is a mini population. Each cluster is a mirror of the total population and of each other.
51Multi-Stage Sampling
- Cluster random sampling can be multi-stage
- Any combination of single-stage methods can be used
52Multi-Stage Sampling
- Example: choosing students from medical schools
- Select all schools, then sample within schools
- Sample schools, then measure all students
- Sample schools, then sample students
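The last option, "sample schools, then sample students", is a two-stage design and might look like the sketch below. The six schools of 40 students each are made-up data, and the function name `two_stage_sample` and its parameters are assumptions for illustration.

```python
import random

def two_stage_sample(schools, n_schools, n_students, seed=None):
    """Stage 1: sample schools. Stage 2: sample students within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(schools), n_schools)
    return {
        school: rng.sample(schools[school], min(n_students, len(schools[school])))
        for school in chosen
    }

# Hypothetical frame: 6 medical schools with 40 students each.
schools = {
    f"school_{s}": [f"student_{s}_{i}" for i in range(1, 41)]
    for s in "ABCDEF"
}

picked = two_stage_sample(schools, n_schools=2, n_students=10, seed=5)
for school, students in picked.items():
    print(school, len(students))
```

Setting `n_students` to the full school size reproduces "sample schools, then measure all students"; setting `n_schools` to the number of schools reproduces "select all schools, then sample within schools".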
53Nonrandom Sampling Designs
54Types of nonrandom samples
- Accidental, haphazard, convenience
- Modal Instance
- Purposive
- Expert
- Quota
- Snowball
- Heterogeneity sampling
55Accidental or Haphazard Sampling
- Man on the street
- Medical student in the library
- available or accessible clients
- volunteer samples
- Problem: we have no evidence for representativeness
56Convenience Sampling
- The sample is identified primarily by convenience.
- It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected.
- Example: A professor conducting research might use student volunteers to constitute a sample.
57Convenience Sampling
- Advantage: relatively easy, fast, and often (but not always) cheap
- Disadvantage: it is impossible to determine how representative of the population the sample is.
- Try to offset this by collecting a large sample.
58Modal Instance Sampling
- Sample for the typical case
- Typical medical student's age?
- Typical socioeconomic class?
- Problem: may not represent the modal group proportionately
59Purposive Sampling
- Might sample several pre-defined groups (e.g., patients who do not attend follow-up visits)
- Deliberately sampling an extreme group
60Expert Sampling
- Have a panel of experts make a judgment about the representativeness of your sample
- Advantage: at least you can say that expert judgment supports the sampling
- Problem: the experts may be wrong
61Quota Sampling
- select people nonrandomly according to some quotas
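One way to picture quota sampling is "take people as they come until each group's quota is full". The sketch below assumes a made-up stream of arrivals tagged "F" or "M" and a quota of two per group; the function name `quota_sample` is hypothetical. Note that nothing here is random, which is what makes this a non-probability design.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept people in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas filled: stop recruiting
            break
    return sample

# Hypothetical arrivals, each tagged with a group label and an ID.
arrivals = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("M", 6), ("F", 7)]

picked = quota_sample(arrivals, group_of=lambda p: p[0], quotas={"F": 2, "M": 2})
print(picked)  # [('F', 1), ('F', 2), ('M', 3), ('M', 5)]
```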
62Snowball Sampling
- one person recommends another, who recommends another, who recommends another, etc.
- good way to identify hard-to-reach populations
- for example, adolescents who abuse recreational drugs
63Heterogeneity Sampling
- make sure you include all sectors: at least several of everything
- don't worry about proportions (as you would in quota sampling)
- for instance, when brainstorming issues across stakeholder groups
64Sampling
Random
- Simple
- Systematic
- Cluster
- Multi Stage
- Stratified (Proportionate, Disproportionate)
Non Random
- Haphazard
- Convenience
- Modal Instance
- Purposive
- Expert
- Snowball
- Heterogeneity
- Quota
65Any questions?