Title: SAMPLING
1. SAMPLING
2. Basic concepts
- Why not measure everything?
  - Practical reason: Measuring every member of a population is too expensive or impractical
  - Mathematical reason: Random sampling allows us to test hypotheses using inferential (probability) statistics
- Population
  - Largest group to which we intend to project the findings of a study (e.g., every inmate in Jay's prison)
  - Parameter: A summary value that describes the whole population (e.g., mean sentence length); the same measure computed from a sample is called a statistic
- Sample
  - Any subgroup of the population, however selected
  - Samples intended to represent a population must be selected in a way that makes them representative (will come up later)
- Unit of analysis
  - Persons, places, things or events under study
  - The "container" for the variables
  - Member or element of the population
  - What we call a "case" once it's been drawn into a sample
- Sampling frame
  - A listing of all elements or members of the population
- Probability sampling
  - Gold standard: every element (case) in the population has the same chance of being included in the sample
  - Random sampling is the most common probability technique
[Figure: Jay's correctional institution, showing the population and a sample drawn from it]
3. Sampling accuracy and error
- Representativeness: Samples should accurately reflect, or represent, the population from which they are drawn
  - If a sample is representative, then we can accurately make inferences (apply our findings) to the population
    - We can simply describe the population
    - Or we can test hypotheses and extend our findings to the population
  - Warning: we cannot generalize to other populations, only to the population from which the sample was drawn
- Sampling error: Unintended differences between a population parameter and the equivalent statistic from an unbiased sample
  - Inevitable result of sampling
  - Try it out in class! Calculate the parameter, mean age. Then take a random sample (more about that later) and compare the parameter to the sample statistic.
  - Any difference between the two is sampling error. It tends to decrease as sample size increases
  - Rule of thumb: To minimize sampling error, sample size should be at least 30 for populations up to about 500; for larger populations, sample size should be greater
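The in-class demonstration can also be simulated. The sketch below is not part of the original slides: the ages are randomly generated stand-ins for a real class, and the helper names `sampling_error` and `mean_abs_error` are hypothetical. It repeats the draw many times at several sample sizes and reports the average size of the error.

```python
import random
import statistics

def sampling_error(population, n, rng):
    """Sample mean minus population mean for one simple random sample of size n."""
    sample = rng.sample(population, n)  # random draw without replacement
    return statistics.mean(sample) - statistics.mean(population)

def mean_abs_error(population, n, rng, repeats=500):
    """Average size of the sampling error over many repeated samples."""
    errors = [abs(sampling_error(population, n, rng)) for _ in range(repeats)]
    return statistics.mean(errors)

rng = random.Random(42)  # fixed seed so the run is repeatable
ages = [rng.randint(18, 65) for _ in range(200)]  # made-up population of 200 ages

for n in (10, 30, 100):
    # Prints each sample size next to its average absolute error
    print(n, round(mean_abs_error(ages, n, rng), 2))
```

Under these assumptions the average error shrinks as the sample grows, and a "sample" of all 200 cases has no sampling error at all.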
4. RANDOM (PROBABILITY) SAMPLING
5. Sampling process
- Sample with or without replacement?
  - With replacement: Return each case to the population before drawing the next
    - Keeps the probability of being drawn the same
    - Makes it possible to redraw the same case
  - Without replacement: Drawn cases are not returned to the population
    - Probability of undrawn cases being selected increases as cases are drawn
  - In social science research, sampling without replacement is by far the most common
    - Most sampling frames are sufficiently large so that, as elements are drawn, changes in the probability of being drawn are small
- Sample simple or stratified? (examples on next two slides)
  - In simple random sampling we randomly draw from the entire population
  - In stratified random sampling we divide the population into subgroups according to a characteristic of interest
    - For example, male and female officers and supervisors; violent offenders and property offenders
  - Can designate strata before or after sampling
    - Proportionate: Draw a sample from the population without regard to strata, then stratify
    - Disproportionate (most common): Stratify first, then draw samples of equal size from each stratum
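The difference between the two drawing methods can be sketched in a few lines. This is an illustration only, not from the slides: the frame of 200 numbered cases and the helper `next_draw_probability` are assumptions made for the example.

```python
import random

rng = random.Random(7)
frame = list(range(1, 201))  # hypothetical sampling frame: 200 case IDs

with_repl = rng.choices(frame, k=30)    # with replacement: a case can be drawn again
without_repl = rng.sample(frame, k=30)  # without replacement: every drawn case is distinct

def next_draw_probability(frame_size, already_drawn):
    """Chance that any one remaining case is picked on the next draw (without replacement)."""
    return 1 / (frame_size - already_drawn)

p_first = next_draw_probability(200, 0)   # 1/200 on the first draw
p_last = next_draw_probability(200, 29)   # 1/171 on the thirtieth draw

print(len(set(without_repl)))  # 30 distinct cases
print(p_first, p_last)
```

With a frame of 200, the chance for an undrawn case rises only from 1/200 to 1/171 over thirty draws, which is the sense in which the change is small.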
6. Exercise - using simple random sampling to describe a population
Population: 200 inmates. Mean sentence: 2.94 years
Assignment: Draw a random sample of 10 and compare its mean to the population parameter. Then do the same with a random sample of 30. How much error is there? Does it change with sample size?
[Figure: histogram of sentence length in years (x-axis) against frequency, the number of prisoners (y-axis)]
7. Exercise - using stratified random sampling to describe a population
Population: 200 inmates. Mean sentence: 2.94 years
Assignment: Draw a random sample of 30 from each stratum and compare its mean to the corresponding population parameters. How much error is there?
Violent crimes: 50 inmates. Mean sentence: 3.12 years
Property crimes: 150 inmates. Mean sentence: 2.88 years
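As a quick arithmetic check (not in the original slides), the two stratum means reproduce the overall mean of 2.94 years only when each is weighted by its stratum size. The sketch below uses the figures given on the slide; the variable names are illustrative.

```python
# Stratum sizes and mean sentences (years) taken from the slide
strata = {"violent": (50, 3.12), "property": (150, 2.88)}

total_inmates = sum(size for size, _ in strata.values())
weighted_mean = sum(size * mean for size, mean in strata.values()) / total_inmates
unweighted_mean = sum(mean for _, mean in strata.values()) / len(strata)

print(total_inmates)              # 200
print(round(weighted_mean, 2))    # 2.94
print(round(unweighted_mean, 2))  # 3.0
```

Averaging the two stratum means without weights gives 3.00 rather than 2.94, so equal-sized stratum samples describe each stratum but should not simply be pooled to describe the whole population.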
8. Exercise - using random sampling to test a hypothesis
- Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
- You have full access to crime data for Sin City. These statistics show that in 2014 there were 200 crimes, of which 75 percent were property crimes and 25 percent were violent crimes. For each crime, you know whether the victim and the suspect were acquainted (yes/no).
- Applying what we learned from the preceding two slides:
  - 1. Identify the population.
  - 2. How would you sample?
    - A. Would you stratify before or after?
    - B. Which is better? Why?
9. Stratified proportionate random sampling
Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
Sin City: 200 crimes in 2014
50 violent (25%)
150 property (75%)
Randomly select 30 cases (15% of the population)
(expect 7.5 violent: 25%)
(expect 22.5 property: 75%)
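The expected counts follow from drawing 30 cases without regard to crime type and sorting them afterwards. A minimal sketch, assuming a made-up frame that holds only each crime's type label (the real data would carry the acquaintance variable too):

```python
import random
from collections import Counter

rng = random.Random(2014)
# Hypothetical sampling frame: one label per crime, matching the slide's counts
frame = ["violent"] * 50 + ["property"] * 150

sample = rng.sample(frame, 30)  # draw without regard to strata
counts = Counter(sample)        # stratify afterwards

# Expected count per stratum = sample size x stratum share of the population
expected = {kind: 30 * n / len(frame) for kind, n in Counter(frame).items()}
print(expected)      # {'violent': 7.5, 'property': 22.5}
print(dict(counts))  # actual counts vary from draw to draw
```

Any single draw will land above or below 7.5 violent cases; only the long-run average matches the expectation.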
10. Stratified disproportionate random sampling
Hypothesis: A pre-existing personal relationship between criminal and victim is more likely in violent crimes than in crimes against property
Sin City: 200 crimes in 2014
50 violent (25%)
150 property (75%)
Randomly select 30 cases from each category
30 property
30 violent
Compare proportions within each where suspect and victim were acquainted (Note: cannot combine results)
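A sketch of this design, under loudly stated assumptions: the acquaintance values are randomly generated with made-up rates (0.5 for violent, 0.1 for property) purely so the code has something to compare, and `proportion_acquainted` is a hypothetical helper.

```python
import random

rng = random.Random(30)
# Hypothetical records: (crime type, acquainted?) with made-up acquaintance rates
violent = [("violent", rng.random() < 0.5) for _ in range(50)]
property_ = [("property", rng.random() < 0.1) for _ in range(150)]

def proportion_acquainted(cases):
    """Share of cases in which victim and suspect were acquainted."""
    return sum(acquainted for _, acquainted in cases) / len(cases)

# Stratify first, then draw equal-sized samples from each stratum
violent_sample = rng.sample(violent, 30)
property_sample = rng.sample(property_, 30)

# Chance that any one case is selected differs sharply by stratum
p_select_violent = 30 / len(violent)     # 0.6
p_select_property = 30 / len(property_)  # 0.2

print(proportion_acquainted(violent_sample), proportion_acquainted(property_sample))
```

Because a violent crime is three times as likely to be selected as a property crime, the two samples can be compared with each other but pooling them would over-represent violent crimes.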
11. Exercise - using random sampling to test hypotheses
Hypothesis 1: Gender affects cynicism (two-tailed)
Hypothesis 2: Male cops are more cynical than female cops (one-tailed)
- Sin City Police Department has 200 officers; 150 are male and 50 are female. We wish to test the above hypotheses.
  - 1. Identify the population.
  - 2. How would you sample?
    - Would you stratify? In advance or later?
    - Which is better? Why?
12. Stratified proportionate random sampling
Hypothesis: Gender affects cynicism (two-tailed); male cops are more cynical than female cops (one-tailed)
Sin City: 200 officers
150 male (75%)
50 female (25%)
Randomly select 30 officers
Expect 7.5 females
Expect 22.5 males
Compare average cynicism scores
Is there a problem? Hint: how many females in the sample?
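One way to explore the hint is to compute how the number of females in a 30-officer draw is distributed. The sketch below is an addition to the slides; it uses the hypergeometric formula for drawing without replacement, and the helper name `hypergeom_pmf` and the cutoff of five females are illustrative choices.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """Probability of exactly k 'successes' when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

# 200 officers, 50 of them female, 30 drawn at random
pmf = [hypergeom_pmf(k, 200, 50, 30) for k in range(31)]

expected_females = sum(k * p for k, p in enumerate(pmf))  # works out to 7.5
p_fewer_than_5 = sum(pmf[:5])                             # chance of 0 to 4 females

print(round(expected_females, 2))
print(round(p_fewer_than_5, 3))
```

Even when the draw comes out near the expected 7.5, the female group is far smaller than the 30-case rule of thumb, which limits what a comparison of average cynicism scores can show.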
13. Stratified disproportionate random sampling
Hypothesis: Gender affects cynicism (two-tailed); male cops are more cynical than female cops (one-tailed)
Sin City: 200 officers
150 male (75%)
50 female (25%)
Randomly select 30 officers from each stratum
30 males
30 females
Compare average cynicism scores. Note: don't recombine these into a single sample!
14. Sampling in experiments
- Making cops kinder and gentler
  - The Anywhere Police Department has 200 patrol officers, of which 150 are males and 50 are females. Chief Jay wants to test a program that's supposed to reduce officer cynicism.
  - Hypothesis: Officers who complete the training program will be less cynical
  - Dependent variable: Score on cynicism scale (1-5, low to high)
  - Independent variable: Cynicism reduction program (yes/no)
15. Stratified disproportionate random sampling
Hypothesis: Officers who complete the training program will be less cynical
Population: 200 patrol officers
150 males (75%)
50 females (25%)
For each group, pre-measure dependent variable: officer cynicism
Apply the intervention (apply the value of the independent variable: the program)
[Diagram: each stratum is split into two groups; one receives the program (YES) and the other does not (NO)]
For each group, post-measure dependent variable: officer cynicism
Also compare within-group changes: what do they tell us?
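The design can be sketched as follows. Several details are assumptions made for illustration and are not stated on the slide: 30 officers are drawn from each stratum, each draw is split evenly at random into program and no-program groups, and the officer IDs, the helpers `sample_and_assign` and `mean_change`, and the three example scores are all made up.

```python
import random

rng = random.Random(15)
males = [f"M{i}" for i in range(150)]   # hypothetical officer IDs
females = [f"F{i}" for i in range(50)]

def sample_and_assign(stratum, n, rng):
    """Draw n officers from one stratum, then split them at random into two groups."""
    drawn = rng.sample(stratum, n)  # the draw is already in random order
    half = n // 2
    return {"program": drawn[:half], "no_program": drawn[half:]}

groups = {
    "male": sample_and_assign(males, 30, rng),
    "female": sample_and_assign(females, 30, rng),
}

def mean_change(pre, post):
    """Average post-minus-pre cynicism score; a negative value means cynicism fell."""
    return sum(after - before for before, after in zip(pre, post)) / len(pre)

# Made-up pre and post scores for three officers on the 1-5 scale
print(mean_change([4, 3, 5], [3, 3, 4]))
```

Computing `mean_change` separately for each of the four groups gives the within-group changes; comparing the program and no-program groups within a stratum addresses the hypothesis.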
16. OTHER SAMPLING TECHNIQUES
17. Quasi-probability sampling
- Systematic sampling
  - Randomly select first element, then choose every 5th, 10th, etc. depending on the size of the sampling frame (number of cases or elements in the population)
  - If done with care, can give results equivalent to fully random sampling
  - Caution: if elements in the sampling frame are ordered in a particular way, a non-representative sample might be drawn
- Cluster sampling
  - Method
    - Divide population into equal-sized groups (clusters) chosen on the basis of a neutral characteristic
    - Draw a random sample of clusters. The study sample contains every element of the chosen clusters.
  - Often done to study public opinion (city divided into blocks)
  - Rule of equally-sized clusters usually violated
  - The supposedly neutral characteristic may not be neutral and may affect outcomes!
  - Since not everyone in the population has an equal chance of being selected, there may be considerable sampling error
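Both techniques can be sketched briefly. This is an illustration rather than a prescribed procedure: the function names, the frame of 200 IDs, and the eight five-household city blocks are all hypothetical.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick a random start within the first interval, then take every k-th element."""
    k = len(frame) // n       # sampling interval
    start = rng.randrange(k)  # random first element
    return [frame[start + i * k] for i in range(n)]

def cluster_sample(clusters, m, rng):
    """Randomly choose m clusters and keep every element in each chosen cluster."""
    chosen = rng.sample(list(clusters), m)
    return [element for name in chosen for element in clusters[name]]

rng = random.Random(17)
frame = list(range(1, 201))                     # hypothetical frame of 200 IDs
every_20th = systematic_sample(frame, 10, rng)  # interval k = 20

# Eight made-up city blocks of five households each
blocks = {f"block{b}": [f"block{b}_house{h}" for h in range(5)] for b in range(8)}
households = cluster_sample(blocks, 3, rng)     # 3 blocks x 5 households

print(every_20th)
print(len(households))  # 15
```

Note that if the frame were sorted so that every 20th case shared some trait, `systematic_sample` would pick up only that kind of case, which is the ordering caution raised above.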
18. Non-probability sampling
- Accidental sample
  - Subjects who happen to be encountered by researchers
  - Example: observer ride-alongs in police cars
- Quota sample
  - Elements are included in proportion to their known representation in the population
- Purposive/convenience sample
  - Researcher uses best judgment to select elements that typify the population
  - Example: Interview all burglars arrested during the past month
- Issues
  - Can findings be generalized or projected to a larger population?
  - Are findings valid only for the cases actually included in the samples?
19. Practical exercise
20. Class assignment - non-experimental designs
- Hypothesis: Higher income persons drive more expensive cars
  - Income → Car Value
- Independent variable: income
  - Categorical, nominal: student or faculty/staff
- Dependent variable: car value
  - Categorical, ordinal: 1 (cheapest), 2, 3, 4 or 5 (most expensive)
- Assignment
  - Visit one faculty and one student lot.
  - Select ten vehicles in each lot using systematic sampling
  - Use the operationalized car values to code each car's value
  - Give each team member a filled-in copy and turn one in per team next week
  - We will complete the tables in class
  - This assignment is worth five points
PLEASE BRING THESE FORMS TO EVERY CLASS SESSION!