Title: Sampling and Sample Sizes
1Sampling and Sample Sizes
- Dr. John T. Drea
- Professor of Marketing
- Western Illinois University
2Some Basic Sampling Terminology
- Population: any complete group that shares a common set of characteristics.
- Sample: a subset of a larger population.
- Population element: an individual member of a population.
- Incidence: the percentage of the population that qualifies for inclusion in the sample.
- Incidence has a direct bearing on cost: the lower the incidence, the more people must be screened to find qualified respondents.
- Census: an investigation of all elements of a population; a total enumeration rather than a sample.
- Additional source: Zikmund (1999), Essentials of Marketing Research
3Procedure for Drawing a Sample
Define the population
Identify the sampling frame
Select a sampling procedure
Determine the sample size
Select the sample elements
Collect data from designated elements
4Define the population
- Who is the population for each project?
- Is it everyone within a given radius of Milwaukee, everyone who attends Brewers games, or everyone who purchased the Brewers Bonus Package?
- Remember, the population is the group you want to generalize to from the sample. Define it carefully so it is clear who is in and who is out.
5Identify the sampling frame
- Sampling frame: the list of elements from which the sample may be drawn.
- It is sometimes referred to as the working population.
- E.g., to sample physicians, my sampling frame might be a mailing list from the American Medical Association.
- Sampling frame error occurs when certain population elements are omitted from or over-represented in the sampling frame.
- Sampling unit: the element or group of elements selected for the sample (e.g., if we select every fifth train between Chicago and Milwaukee for a survey, each selected train is a sampling unit).
6Select a sampling procedure
- Random sampling error
- The difference between the result of a given sample and the result of a census conducted using identical procedures.
- It is a statistical fluctuation due to chance variations in the elements selected for a sample.
- It is a function of sample size: it tends to shrink as the sample grows.
7Select a sampling procedure
- Systematic (nonsampling) error
- Error resulting from factors other than chance fluctuations.
- Includes flaws in a study's design and imperfections in its execution.
- Ex.: highly educated people are more likely to fill out a mail survey than less educated people.
- Ex.: doing an intercept survey at a grocery store on Saturday morning may underrepresent seniors.
8Select a sampling procedure: Probability and Nonprobability Samples
- Probability sample: every member of the population has a known, non-zero chance of being included in the sample.
- Nonprobability sample: the probability of any particular member of the population being chosen is unknown.
There are no appropriate statistical techniques for measuring random sampling error from a nonprobability sample. Projecting the data beyond the sample is statistically inappropriate.
9Selecting a sampling procedure: Nonprobability samples
- Convenience sample: obtaining the people or units that are most conveniently available (e.g., college students).
- Judgment sample: selected by an experienced researcher based on judgment about appropriate characteristics of the sample members.
- Quota sample: ensures that various subgroups of the population will be represented (e.g., setting a quota of 50 people from Milwaukee and 50 from outside Milwaukee).
- Quota samples tend to over-represent people who can be easily found.
10Selecting a sampling procedure: Probability samples
- Simple random sample: a procedure that assures that each element in the population has an equal chance of being included in the sample.
- Systematic sample: a starting point is selected at random, and then every nth element on the list is selected.
- Be careful of periodicity (e.g., collecting retail information every 7th day, or once per month).
- Multistage area sample: involves a combination of two or more probability sampling techniques.
- Ex.: randomly choosing counties within a state, then randomly choosing census blocks within each county, then interviewing everyone in each chosen block.
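The first two probability procedures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the sampling frame of 1,000 numbered IDs, the sample size of 50, the function names, and the fixed seed are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n elements so that every element has an equal chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random starting point, then take every k-th element on the list."""
    k = len(frame) // n          # sampling interval (every k-th element)
    start = rng.randrange(k)     # random start within the first interval
    return frame[start::k][:n]

rng = random.Random(42)              # seeded only so the sketch is repeatable
frame = list(range(1, 1001))         # hypothetical sampling frame of 1,000 IDs

srs = simple_random_sample(frame, 50, rng)
sys_sample = systematic_sample(frame, 50, rng)

print(sorted(srs)[:5])
print(sys_sample[:5])
```

If the frame is ordered in a way that repeats every k elements (the periodicity problem noted on the slide), the systematic version will keep landing on the same kind of element, which the simple random version avoids.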
11Basic statistics underlying sample size
determination
- Central Limit Theorem: as the sample size, n, increases, the distribution of the mean, $\bar{X}$, of a random sample taken from practically any population approaches a normal distribution.
- Sampling distribution: if you took repeated samples of the same size from the same population, the sampling distribution is the distribution of these sample means.
- Standard error of the mean: the standard deviation of the sampling distribution. It equals the standard deviation of the population divided by the square root of the sample size.
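A small simulation can make the standard error concrete. The sketch below is an illustration under stated assumptions, not something from the slides: it builds a hypothetical, strongly skewed population (exponential with mean 10), draws 2,000 repeated samples at each sample size, and compares the observed standard deviation of the sample means with the predicted value, the population standard deviation divided by the square root of n.

```python
import random
import statistics

rng = random.Random(0)   # seeded only so the sketch is repeatable

# Hypothetical skewed population: 100,000 values, exponential with mean 10
population = [rng.expovariate(1 / 10) for _ in range(100_000)]
sigma = statistics.pstdev(population)   # population standard deviation

def sample_means(n, draws=2000):
    """Means of `draws` repeated random samples of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]

results = {}
for n in (4, 25, 100):
    observed = statistics.stdev(sample_means(n))   # spread of the sample means
    predicted = sigma / n ** 0.5                   # sigma / sqrt(n)
    results[n] = (observed, predicted)
    print(f"n={n:>3}  observed SE={observed:.3f}  predicted SE={predicted:.3f}")
```

The two columns should agree closely at every n, and the standard error should fall by half each time the sample size is quadrupled.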
12Estimating sample size
- To estimate a sample size, a researcher must
- estimate the standard deviation of the population (a good rule of thumb is 1/6th of the range)
- make a judgment about the allowable amount of error
- determine a confidence level
- Once these are known, the formula for calculating sample size is

$$n = \frac{Z^2 S^2}{E^2}$$

where
Z = standardized value that corresponds to the confidence level
S = sample standard deviation
E = acceptable magnitude of error
13Estimating sample size
- Suppose a researcher studying annual expenditures on lipstick wishes to have a 95% confidence level (Z = 1.96) and a range of error (E) of less than $2, and an estimate of the standard deviation is $29.

$$n = \frac{(1.96)^2 (29)^2}{2^2} \approx 808$$

If we change the range of acceptable error to $4, the sample size falls:

$$n = \frac{(1.96)^2 (29)^2}{4^2} \approx 202$$

Source: Zikmund (1999), Essentials of Marketing Research
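The lipstick example can be checked with a short Python sketch of the formula n = Z²S²/E². The function name is an assumption made for illustration. The results are rounded to the nearest whole respondent to match the slide; rounding up with `math.ceil` would be the more conservative choice.

```python
def sample_size_for_mean(z, s, e):
    """Sample size n = Z^2 * S^2 / E^2 for estimating a mean."""
    return (z * s / e) ** 2

n_tight = sample_size_for_mean(z=1.96, s=29, e=2)   # allowable error of 2
n_loose = sample_size_for_mean(z=1.96, s=29, e=4)   # allowable error of 4

print(round(n_tight))  # 808
print(round(n_loose))  # 202
```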
14Estimating sample size
- Suppose you wanted to estimate the sample size for a survey that contains the following question:
- What is your overall attitude towards Miller Park?
- Very Good 7 6 5 4 3 2 1 Very Poor
- The range of acceptable error is 0.1 points, the confidence level is 95%, and the estimated standard deviation is 1/6 of the range (6/6 = 1).

$$n = \frac{(1.96)^2 (1)^2}{(0.1)^2} \approx 384$$

If you increase the acceptable error to 0.2, the sample size drops to n = 96!
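The drop from 384 to 96 reflects a general pattern: because E is squared in the denominator, doubling the allowable error cuts the required sample to one quarter. A rough sketch, assuming the 7-point scale above and the one-sixth-of-the-range rule of thumb (the variable and function names are illustrative only):

```python
def sample_size_for_mean(z, s, e):
    """Sample size n = Z^2 * S^2 / E^2 for estimating a mean."""
    return (z * s / e) ** 2

scale_min, scale_max = 1, 7
s = (scale_max - scale_min) / 6   # rule-of-thumb standard deviation: range / 6 = 1.0

sizes = {}
for e in (0.1, 0.2, 0.4):         # each step doubles the allowable error
    sizes[e] = round(sample_size_for_mean(1.96, s, e))
    print(f"E = {e}: n = {sizes[e]}")
```

Each doubling of E divides n by roughly four, which is why a small relaxation in precision can save a large share of the fieldwork cost.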
15Sample size determination when a proportion is
present
$$S_p = \sqrt{\frac{pq}{n}}$$

where
$S_p$ = estimate of the standard error of the proportion
p = proportion of successes
q = (1 - p), or the proportion of failures

Suppose that 20% of a sample of 1,200 recall seeing an ad.

$$S_p = \sqrt{\frac{(0.2)(0.8)}{1200}} \approx 0.0115$$

$$\text{Confidence interval} = p \pm Z_{cl} S_p = 0.2 \pm (1.96)(0.0115) \approx 0.2 \pm 0.023$$

Thus, the population proportion who recall seeing the ad is between about 17.7% and 22.3%, with 95% confidence.
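The ad-recall example can be reproduced with a short Python sketch. The function name is an assumption made for illustration. Note that the code carries the unrounded standard error through the calculation, so the margin can differ in the third decimal place from a hand calculation that rounds at each step.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Return (standard error, lower bound, upper bound) for a sample proportion."""
    s_p = math.sqrt(p * (1 - p) / n)   # S_p = sqrt(pq / n)
    margin = z * s_p                   # half-width of the interval
    return s_p, p - margin, p + margin

s_p, low, high = proportion_confidence_interval(p=0.20, n=1200)
print(f"S_p = {s_p:.4f}")
print(f"95% interval: {low:.3f} to {high:.3f}")
```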
16Sample size determination when a proportion is
present (cont.)
To determine the sample size for a proportion, we need to know or estimate the following:
$Z_{cl}^2$ = square of the confidence level in standard error units (typically $1.96^2$, or 3.8416)
p = estimated proportion of successes
q = (1 - p), or the proportion of failures
$E^2$ = square of the maximum allowance for error
We insert this information into the following formula:

$$n = \frac{Z_{cl}^2 \, p q}{E^2}$$
17Sample size determination when a proportion is
present (cont.)
Example: We estimate that 60% of respondents will describe the Brewers Bonus Package as positive, with a confidence level of 95%, and the allowable error is 4%.

$$n = \frac{(1.96)^2 (0.6)(0.4)}{(0.04)^2} = \frac{(3.8416)(0.24)}{0.0016} \approx 576$$

If we assume a 70/30 split and increase the maximum allowable error to 5%, what would n be?
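The 60/40 example can be verified with a minimal Python sketch of n = Z²pq/E². The function name and keyword arguments are assumptions made for illustration; the same function can be reused to work through the 70/30 question by changing `p` and `e`.

```python
def sample_size_for_proportion(p, e, z=1.96):
    """Sample size n = Z^2 * p * q / E^2 for estimating a proportion."""
    q = 1 - p                      # proportion of failures
    return z ** 2 * p * q / e ** 2

n = sample_size_for_proportion(p=0.60, e=0.04)
print(round(n))  # 576
```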
18Overall: Estimating sample size when a proportion is the characteristic of interest
- When the split is hypothesized to be 70/30 (95% CI), by maximum allowable error:
- 1%: 7,939
- 2%: 2,009
- 3%: 895
- 5%: 322
- When the split is hypothesized to be 85/15 (95% CI), by maximum allowable error:
- 1%: 4,850
- 2%: 1,222
- 3%: 544
- 5%: 306
Small population sizes typically require a smaller sample size. If the population is 10,000, the 70/30 split sample sizes would be 1%: 4,465; 2%: 1,678; 3%: 823; and 5%: 313.
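The slides do not say how the small-population figures were obtained. One common adjustment, assumed here rather than taken from the source, is the finite population correction n = n0 / (1 + n0 / N), where n0 is the unadjusted sample size and N is the population size. A sketch under that assumption, with illustrative function names:

```python
def sample_size_for_proportion(p, e, z=1.96):
    """Unadjusted sample size n0 = Z^2 * p * q / E^2."""
    return z ** 2 * p * (1 - p) / e ** 2

def adjust_for_finite_population(n0, population_size):
    """Assumed correction (not stated in the slides): n = n0 / (1 + n0 / N)."""
    return n0 / (1 + n0 / population_size)

adjusted = {}
for e in (0.01, 0.02, 0.03, 0.05):
    n0 = sample_size_for_proportion(0.70, e)             # 70/30 split
    adjusted[e] = round(adjust_for_finite_population(n0, 10_000))
    print(f"E = {e:.0%}: unadjusted {n0:,.0f}, adjusted {adjusted[e]:,}")
```

Under this assumption the adjusted values for a 70/30 split and a population of 10,000 come out at 4,465, 1,678, 823, and 313, which agrees with the figures quoted on the slide.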