Title: The Logic of Sampling
1Chapter 7
2What is sampling?
- Sampling is the process of systematically
selecting observations for study
3Why is sampling important in social science?
- From the viewpoint of sampling, what's the
difference between a chemist studying the atomic
properties of methane and a political scientist
studying the political opinions of the American
electorate?
4Other reasons for sampling
- The population is very large, and resources
(e.g., time, money, personnel) are insufficient
to collect information from all members.
- It is impossible to gain access to every member
of the population.
- Research is in exploratory or pretesting phase.
5Two types of sampling designs
- (Sampling design - a plan describing how a study
sample will be chosen).
- 1. Nonprobability sampling design - a sample that
does not use mathematical probability theory in
its design.
- 2. Probability sampling design - a sample
selected in accordance with mathematical
probability theory, typically involving some
random-selection mechanism.
61. Nonprobability sampling designs
- (1) Reliance on available subjects
- (2) Snowball sampling
- (3) Purposive or judgment sampling
- (4) Voluntary response sampling
- (5) Quota sampling
7(1) Reliance on available subjects
- The researcher chooses units on the basis that
they are easily accessible to study
- Also called availability sampling or convenience
sampling
- Used in pretesting, exploratory research, or
situations involving limited resources
- Even when justified on the grounds of
feasibility, researchers should not generalize
from the data.
- Example: Kings County Schools (CA)
8(2) Snowball sampling
- The researcher starts with one person or larger
unit, then asks that person to name others who
would be interested in participating, locates
those people and asks them to suggest others,
etc.
- Most appropriate when members of a special
population are difficult to locate, but a few are
known.
- Used primarily for exploratory purposes, since
representativeness is questionable.
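The referral process described above can be sketched roughly as a breadth-first walk over "who names whom". The names, the `referrals` dictionary, and the `snowball_sample` helper are all hypothetical and only illustrate the chain of referrals; real studies would also deal with refusals and people who cannot be located.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referral chains breadth-first, starting from the known contacts."""
    sample = []
    seen = set(seeds)          # everyone already named, to avoid duplicates
    queue = deque(seeds)       # people still waiting to be contacted
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        # Ask this person to name others, and queue anyone not yet named.
        for named in referrals.get(person, []):
            if named not in seen:
                seen.add(named)
                queue.append(named)
    return sample

# Hypothetical referral network: each person lists the people they would name.
referrals = {
    "Ana": ["Ben", "Cho"],
    "Ben": ["Dee"],
    "Cho": ["Dee", "Eli"],
    "Dee": [],
    "Eli": ["Ana"],
}
sample = snowball_sample(referrals, seeds=["Ana"], max_size=4)
print(sample)
```

Because everyone in the sample is reachable from the starting contact, the result depends heavily on who that first contact is, which is one way to see why representativeness is questionable.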
9(3) Purposive or judgment sampling
- Researcher selects units into the sample on the
basis of her/his judgment about which ones will
be the most useful or representative
- Used for pretesting, exploratory research, or
instances in which certain types of units need to
be studied for theoretical purposes
- Example: Deviant cases sample - study cases that
don't fit into the more regular pattern to gain
insight into the pattern.
- Example: Sampling radical right and radical left
political organizations to study the extremes of
political thought
10(4) Voluntary response sampling
- Researcher uses units who volunteer to be members
of the sample.
- Sample members respond to an appeal from the
researcher to become part of the study.
- Examples: Kings County Schools (CA)
11(5) Quota sampling
- Researcher selects units into the sample on the
basis of characteristics of the target population
(specified in the quota matrix), so that the
total sample will have the same distribution of
characteristics as the target population.
- However, units are NOT selected randomly from
within cells of the quota matrix.
12Example of a simple quota matrix (One variable -
Gender)
13Example of more complex quota matrix (Two
variables - Gender and Educational level)
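As a rough illustration of how a two-variable quota matrix turns into cell quotas, the sketch below uses made-up population counts (they are not taken from the slides) and a hypothetical helper, `quota_cells`. It only computes how many units each cell should contribute; how units are then picked within a cell is left to the researcher, which is exactly where quota sampling departs from random selection.

```python
def quota_cells(population_counts, sample_size):
    """Allocate the sample across cells in proportion to population counts."""
    total = sum(population_counts.values())
    # Whole-number share for each cell (rounded down).
    quotas = {cell: count * sample_size // total
              for cell, count in population_counts.items()}
    # Give any leftover slots to the cells with the largest remainders.
    remainders = {cell: count * sample_size % total
                  for cell, count in population_counts.items()}
    leftover = sample_size - sum(quotas.values())
    for cell in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[cell] += 1
    return quotas

# Hypothetical target population of 10,000, cross-classified by
# gender and educational level.
population_counts = {
    ("Female", "No college"): 3000,
    ("Female", "College"): 2200,
    ("Male", "No college"): 2800,
    ("Male", "College"): 2000,
}
quotas = quota_cells(population_counts, sample_size=100)
for cell, n in quotas.items():
    print(cell, n)
```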
14(5) Quota sampling - problems
- Since the selection of units within cells of the
quota matrix is not random, biases may exist in
the sample.
- If there are errors or omissions in the list of
target population units, the quota matrix may not
accurately depict the characteristics of the
population
152. Probability sampling designs
- Advantages
- Probability samples avoid conscious or
unconscious sampling bias.
- Sampling bias - those selected for the sample are
not representative or typical of the larger
population, often because of some subjective bias
of the researchers.
- Probability samples permit quantitative estimates
of the degree to which a sample is likely to
represent the population.
16Probability sampling terminology
- Element - the unit of analysis, the unit about
which information is collected and that provides
the basis for the analysis.
- Population - the theoretically specified
aggregation of elements.
- Study population - the collection of elements
from which the sample is actually selected.
17Population vs. Study population
18Probability sampling terminology
- Sampling frame - the list of elements in the
study population from which the sample is
actually selected.
- Parameter - the summary description of a given
variable in a population.
- Statistic - the summary description of a given
variable in a sample.
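To make the parameter/statistic distinction concrete, here is a minimal sketch using an invented study population of ages. The variable names and the data are assumptions for illustration only.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical study population: ages of 1,000 people.
population_ages = [rng.randint(18, 90) for _ in range(1000)]

# Parameter: summary description of the variable in the population.
parameter_mean_age = statistics.mean(population_ages)

# Statistic: the same summary computed on a sample of 50 elements.
sample_ages = rng.sample(population_ages, 50)
statistic_mean_age = statistics.mean(sample_ages)

print(f"parameter (population mean age): {parameter_mean_age:.1f}")
print(f"statistic (sample mean age):     {statistic_mean_age:.1f}")
```

In practice the parameter is usually unknown, and the statistic is what we use to estimate it.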
192. Probability sampling designs
- (1) Simple random sampling
- (2) Systematic sampling with a random start
- (3) Stratified random sampling
- (4) Multistage cluster sampling
20(1) Simple random sampling
- EPSEM ("Equal Probability of Selection Method") -
each element on the sampling frame has an equal
chance of selection, and the selection of any one
element is independent of the selection of any
other element.
- Examples: Kings County (One physical and one
computer technique).
- Possible problems with physical techniques: Draft
Lottery of 1970
21How to draw a simple random sample
- On the L drive, open
- \faculty\jhonnold\BABBIE BASICS\n90id.sav
- Using (1) Table of random numbers (next slide)
22Table of random numbers
23How to draw a simple random sample - SPSS
- Change random number seed
- Transform > Random number seed (Pick any number
between 1 and 2,000,000,000)
- Select the sample
- Data > Select cases > Random sample of cases >
Sample - Exactly 10 cases from the first 90 cases
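For readers without SPSS, the same idea (set a seed, then draw exactly 10 cases out of 90) can be sketched in Python. The case IDs 1 to 90 and the helper name `simple_random_sample` are assumptions; this is not a reproduction of what SPSS does internally, and the same seed will not give the same cases as SPSS.

```python
import random

def simple_random_sample(frame, n, seed):
    """Draw n elements without replacement; each element has an equal chance."""
    rng = random.Random(seed)  # plays the role of the random number seed
    return rng.sample(frame, n)

# Hypothetical sampling frame: case IDs 1 through 90.
sampling_frame = list(range(1, 91))
sample = simple_random_sample(sampling_frame, n=10, seed=12345)
print(sorted(sample))
```

Re-running with the same seed reproduces the same sample, which is the practical reason for recording the seed you picked.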
24(2) Systematic sampling
- Every kth (e.g., every 4th, every 10th) element
in the sampling frame is chosen (systematically)
for inclusion in the sample after a random start
in the first sampling interval.
- Sampling interval (k) - the standard distance
between elements selected into the sample =
population size/sample size (e.g.,
10,000/500 = 20).
- Sampling ratio (1/k) - the proportion of elements
in the sampling frame that are selected into the
sample = sample size/population size (e.g.,
500/10,000 = 1/20).
25- After the sampling interval is determined, make a
random start in the first sampling interval, then
select every kth element until reaching the end
of the sampling frame.
- Example of systematic sampling from the Kings
County site.
- On the L drive, open
- \faculty\jhonnold\BABBIE BASICS\n90id.sav
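The procedure for systematic sampling with a random start can be sketched as follows, using the 10,000/500 figures from the previous slide. The frame of numbered IDs and the helper `systematic_sample` are hypothetical, and the sketch assumes the population size is an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every kth element after a random start in the first interval."""
    k = len(frame) // sample_size      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)           # random position within the first k elements
    return frame[start::k][:sample_size]

# Hypothetical sampling frame: element IDs 1 through 10,000.
frame = list(range(1, 10001))
sample = systematic_sample(frame, sample_size=500, seed=1)

print("sampling interval k:", len(frame) // len(sample))
print("first five selected:", sample[:5])
```

If the frame is ordered in a cycle that happens to match the interval k, a systematic sample can be badly unrepresentative, so it is worth checking how the list is arranged before using it.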
26(3) Stratified random sampling
- The researcher chooses characteristics of the
population to represent accurately, without error
(e.g., age, gender, class standing), divides the
population into homogeneous groups (strata) on
the basis of these characteristics, then samples
randomly from the strata.
27How to draw a stratified random sample
- (1) Decide which population characteristics you want
to represent accurately.
- Ideally, these will be characteristics that would
affect the results, if they were not represented
accurately. You will be limited to
characteristics on which you have actual
population data.
- (2) Divide the sampling frame into strata
(groups) on the basis of these characteristics.
- (3) Randomly sample the desired number from each
group, so that each group is represented in the
sample in proportion to its representation in the
population.
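The three steps above can be sketched as a proportionate stratified sample. The frame of 90 students, the class-standing counts, and the helper `stratified_sample` are invented for illustration; the sketch also assumes each stratum's share of the sample works out to a whole number, since rounding can otherwise make the total drift from the target.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(frame, stratum_of, sample_size, seed=None):
    """Sample randomly within each stratum, in proportion to its size."""
    rng = random.Random(seed)
    # Step 2: divide the sampling frame into strata.
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    # Step 3: draw a random sample from each stratum, proportionate to its size.
    sample = []
    for members in strata.values():
        n = round(sample_size * len(members) / len(frame))
        sample.extend(rng.sample(members, n))
    return sample

# Step 1 (hypothetical): stratify 90 students by class standing.
sizes = {"freshman": 36, "sophomore": 27, "junior": 18, "senior": 9}
frame = [(f"{standing}_{i}", standing)
         for standing, size in sizes.items() for i in range(size)]

sample = stratified_sample(frame, stratum_of=lambda e: e[1], sample_size=10, seed=5)
print(Counter(standing for _, standing in sample))
```

Unlike quota sampling, the units inside each group are chosen randomly, so the sample matches the population on class standing without the researcher deciding who gets in.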
28Stratification by age
29Quota vs. stratified sampling
Quota
Stratified
30Examples of stratified random sampling
- Example from Kings County.
- On the L drive, open
- \faculty\jhonnold\BABBIE BASICS\n90sort.sav
31(4) Multistage cluster sampling
- Used when a list of all of the elements in the
study population does not exist (e.g., the
American adult population, all college students
in the U.S.).
- In this case, types of probability sampling that
rely on a sampling frame of elements cannot be
used.
32Steps in multistage cluster sampling - Two-stage
example
- Sampling frame of clusters - Naturally occurring
groups (clusters) of elements in the population
are listed, and a sample of clusters is selected
using a probability method.
- Sampling frame of elements - From the selected
clusters, lists of elements are developed, and a
sample of elements is selected from each selected
cluster using a probability method.
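A minimal sketch of the two stages, assuming a hypothetical set of 20 classes with 30 students each, from which 6 classes and then 10 students per class are drawn. These numbers and the helper `two_stage_cluster_sample` are assumptions, not figures from the slides.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each one."""
    rng = random.Random(seed)
    # Stage 1: the sampling frame is the list of cluster names.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: element lists are needed only for the chosen clusters.
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return chosen, sample

# Hypothetical clusters: 20 classes of 30 students each.
clusters = {
    f"class_{c:02d}": [f"class_{c:02d}_student_{s:02d}" for s in range(30)]
    for c in range(20)
}
chosen, sample = two_stage_cluster_sample(clusters, n_clusters=6,
                                          n_per_cluster=10, seed=3)
print(chosen)
print(len(sample), "students selected")
```

Because the clusters here are all the same size, every student has the same chance of selection. With clusters of unequal size, this simple scheme would no longer give equal probabilities, and some adjustment to the design would be needed.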
33Advantages/Disadvantages of cluster sampling
- Advantages of cluster sampling
- Convenient for geographically dispersed
populations (see Area Probability Sampling)
- Reduced travel costs to contact sample elements
- Can be used when a sampling frame of elements is
not available, which prohibits use of other methods
- Disadvantages of cluster sampling
- Reduced efficiency in representing the population
when the cluster elements are similar
- Existence of two or more stages creates increased
sampling error
34Multistage cluster sampling example
Problem - Select a probability sample of 60
V.C.U. students using a cluster sampling method.
36STEP 3 - Obtain enrollment lists for the selected
clusters (classes). Randomly select a sample of
students (elements) from each of the selected
clusters (classes).
37Area probability sample - Example: GSS sample of
American adults
- There is no list of American adults, so instead
we use multistage cluster sampling techniques and
list all counties in the U.S.
- From the list of counties, we select a subsample
of counties.
- From this subsample, we list and sample smaller
and smaller areas, until we reach the block
level.
- At the block level, we send our interviewers to
randomly selected households.
- Graphic illustration
38Are probability samples always perfectly
representative?
- NO!
- Sampling error - Entirely by chance, you may draw
a probability sample whose statistics do not
accurately represent the parameters of the
population.
- If the technique has been properly executed and
the sample is large (e.g., 1,500-3,000 in GSS),
sampling error is likely to be small.
- Probability samples can be used to estimate the
range within which population parameters should
fall (confidence interval) and the likelihood
that the parameter will fall within this range
(confidence level).
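As a rough sketch of what a confidence interval looks like in practice, the code below draws a simple random sample of 1,500 from an invented population in which 40% hold some opinion, then computes a 95% interval for the proportion using the usual normal approximation. The population, the 40% figure, and the helper `proportion_confidence_interval` are assumptions for illustration, and the formula applies to simple random samples, not to more complex designs such as the GSS.

```python
import math
import random

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (z=1.96 is about 95%)."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error

rng = random.Random(42)

# Hypothetical population of 100,000 where the parameter is 0.40.
population = [1] * 40_000 + [0] * 60_000

# Draw a simple random sample and compute the statistic and its interval.
sample = rng.sample(population, 1500)
sample_proportion = sum(sample) / len(sample)
low, high = proportion_confidence_interval(sum(sample), len(sample))

print(f"sample proportion: {sample_proportion:.3f}")
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

With 1,500 cases the interval is only about five percentage points wide at most, which is the sense in which a large, properly drawn sample keeps sampling error small.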
39Recall advantages of probability sampling
- Probability samples avoid conscious or
unconscious sampling bias on the part of the
researchers.
- Probability samples permit quantitative estimates
of the degree to which a sample is likely to
represent the population.
40Other sources of error
- Even if a probability sample is unlikely to be
subject to substantial sampling error, sampling
error is not the only source of error in
research.
- Some nonsampling errors in a survey
- Missing data, data entry errors, analysis errors
- Unclear definitions of concepts, poor question
construction, poor index construction
- Interviewer errors, errors in coding open-ended
questions
41Final notes
- Care must be taken in all phases of the research
project to minimize both sampling and nonsampling
errors.
- Where it is possible to use them, probability
samples are preferable to nonprobability samples.
42Sampling as practiced by experts
- For example
- Mathematica Policy Research
- National Opinion Research Center (GSS).
- However, the basic techniques are the same.
43Remember....
We're always sampling in life. The point is to
be careful and systematic about it!