Title: Another poll ruins election suspense
1. Another poll ruins election suspense.
- On January 22, 2006 the market research firm SES
surveyed 1,200 decided voters in the upcoming
Federal election. These were the results of the
poll:
- Conservative 36.4%
- Liberal 30.1%
- NDP 17.4%
- Bloc Quebecois 10.6%
- Green/Other 5.6%
- The company claimed that these estimates were
accurate to within +/- 3 percentage points 19
times out of 20.
2. The next day (January 23, 2006) several million
Canadian voters cast their votes as follows:
- Conservative 36.3%
- Liberal 30.2%
- NDP 17.5%
- Bloc Quebecois 10.5%
- Green/Other 5.5%
- Q. How could SES Research, with a tiny sample of
1,200, predict with amazing accuracy the voting
behavior of several million Canadians and ruin
much of the suspense on election night?
- A. With careful sampling techniques!
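The "+/- 3 percentage points 19 times out of 20" claim can be roughly checked with the textbook margin-of-error formula. The sketch below is only an illustration: it assumes a simple random sample, the worst-case proportion p = 0.5, and z = 1.96 for 95% confidence. The function name `margin_of_error` is hypothetical, and SES's actual design and calculation are not described in these slides.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error (as a proportion) for a simple random sample.

    Assumes simple random sampling; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Poll and election percentages as reported on the slides
poll = {"Conservative": 36.4, "Liberal": 30.1, "NDP": 17.4,
        "Bloc Quebecois": 10.6, "Green/Other": 5.6}
vote = {"Conservative": 36.3, "Liberal": 30.2, "NDP": 17.5,
        "Bloc Quebecois": 10.5, "Green/Other": 5.5}

moe_points = 100 * margin_of_error(1200)  # in percentage points
largest_miss = max(abs(poll[party] - vote[party]) for party in poll)

print(f"Margin of error for n = 1,200: about {moe_points:.1f} points")
print(f"Largest poll vs. vote gap: about {largest_miss:.1f} points")
```

Under these assumptions the margin comes out a little under 3 points, which is in line with the company's claim, and every party's actual result falls well inside it.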
3. Aside from careful sampling, the only alternative
is to collect data from an entire population!
- The federal government does this every 5 years
when they conduct a national census.
- Except for countries with small populations,
routinely collecting data from populations is not
feasible because of the high financial costs and
time delays.
4. Advantages of Sampling
- You get the data (and can do the data analysis)
fast (sample data can be collected and analyzed
quickly, often overnight or in real time)!
- Minimizes respondent fatigue (wear and tear on the
object or subject being measured).
- Less reactive, intrusive and socially disruptive.
5. A short and painless history of sampling
- Sampling has developed in step with political
polling: elections allow researchers to test
sampling designs.
- The infamous Literary Digest Presidential poll (a
huge sample that was embarrassingly wrong in its
Presidential prediction). Their sampling frame
was compiled from subscription lists, auto
registration lists, and telephone directories.
- Gallup: from quota to probability sampling.
- Probability Sampling: A sample will be
representative of the population if all
population members have a known, non-zero chance
of being selected.
6.
- If everyone in a population were identical, any
kind or size of sample would be good enough to
make generalizations about the larger population.
- But in reality, even people who belong to
close-knit groups (e.g., families, friends, peers) vary
enormously!
- So it takes careful planning to ensure that our
sample is representative of the larger population
of interest.
9.
- In addition to obvious problems with sampling
people who are convenient to the researcher, a
researcher's personal biases (conscious or
unconscious) may affect selection in ways that
make the sample unrepresentative of the
population.
- Sampling bias means that those selected into the
sample are NOT typical or representative of the
larger population from which they have been
chosen, the population that the researcher wants
to generalize his or her results to!
10. Basic Principles in Sampling.
- A sample is representative if sample
characteristics resemble population
characteristics.
- This requires ALL members of the population to
have a known, non-zero chance of being selected
into the sample.
- EPSEM (equal probability of selection method)
samples ensure that all members of the
population have an equal chance of selection.
- Probability samples are generally more
representative than non-probability samples and
allow researchers to estimate the margin of error
(e.g., +/- 3 percentage points 19 times out of
20).
11. Concepts in Sampling.
- Element: The sampling unit about which
information is to be collected and analyzed (your
unit of analysis).
- Population: The set of all elements that exist
at the time of a given study (spatially and
temporally defined).
- Sampling Unit: An entity or set of entities that
are considered for selection at some stage of the
design. In single-stage sampling designs the unit
and element are identical. In complex designs, there
can be different units (e.g., cities, blocks,
households, adults in households).
- Sampling Frame: The list(s) of sampling units
from which a sample is to be selected.
12.
- The distinguishing feature of probability samples
is that the researcher can specify for each
sampling unit the probability that it will be
included in the sample. This is not true for
non-probability samples.
- With non-probability sampling, large groups in a
population may have no chance of being selected
in the sample.
- Studies repeated on a given population that use
probability samples should generate similar
estimates (e.g., polls prior to elections).
13. Simple Random Sampling (SRS)
- Simple random sampling (SRS) is the basic
probability design and is incorporated at some
stage in ALL probability sampling designs.
- Each unit has an n / N chance or probability
of being selected into the sample, where n is
the size of the sample and N is the size of the
population (e.g., with n = 500 and N = 300,000, the
chance of being selected is 500 / 300,000,
or about .0017).
- With SRS, you need an accurate and complete
sampling frame: each element in the population of
interest is listed once and only once!
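A minimal sketch of simple random sampling, using the slide's numbers (n = 500, N = 300,000). The numbered frame and the helper name `simple_random_sample` are assumptions for illustration; a real study would draw from an actual list of elements.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each with chance n / N."""
    rng = random.Random(seed)      # seeded only so the draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

N, n = 300_000, 500
frame = range(1, N + 1)            # hypothetical frame: each element listed once
sample = simple_random_sample(frame, n, seed=42)

inclusion_probability = n / N
print(f"Chance of selection: {inclusion_probability:.4f}")
```

Because the frame lists each element once and only once, sampling without replacement gives every element the same n / N chance of selection.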
14. Summary of Probability Sample Designs.
- Simple Random Sampling: Assign a unique number
to each sampling unit; select sampling unit
numbers using a random number table or generator.
- Systematic Random Sampling: Determine the
sampling interval; select the first unit
randomly, then select remaining units using interval
increments.
- Stratified Random Sampling: Determine strata;
select from each stratum a random sample
proportionate (or disproportionate) to the size
of the stratum in the population of interest.
- Multi-stage Area Sampling: Determine the number
of levels or areas, and from each level or area
select randomly.
15. Systematic Sampling.
- The researcher selects every kth element from the
sampling frame after a random start. For example, you
want to select a sample of 100 persons from a
population of 10,000. After a random start
between 1 and 100, you will select every one
hundredth individual: k = N / n = 10,000 / 100
= 100, where k is the sampling interval, N
is the size of the population and n is the
size of the sample.
- If your random starting number was 14, you will
pick the 14th person on your list, followed by
the 114th person, followed by the 214th person,
and so on until you have drawn your sample of 100
people.
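The selection rule above can be sketched in a few lines. The function name `systematic_sample` is hypothetical, and the sketch assumes N divides evenly by n, as in the slide's example.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based list positions chosen by systematic sampling."""
    k = N // n                                   # sampling interval k = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start between 1 and k
    return [start + i * k for i in range(n)]

# Slide example: N = 10,000, n = 100, random start of 14
picks = systematic_sample(10_000, 100, start=14)
print(picks[:3], "...", picks[-1])
```

With a start of 14 this yields positions 14, 114, 214 and so on, ending at position 9,914 for the 100th person.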
16.
- Systematic sampling is usually more efficient
than simple random sampling.
- Avoid systematic sampling if your sampling frame
has a cyclical or periodic pattern in it (e.g.,
months of the year are associated with cycles in
temperature).
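A small illustration of why a periodic frame is a problem. The frame here is made up for the example: ten years of monthly records listed in calendar order. If the sampling interval happens to match the cycle length (k = 12), every selected record falls in the same month.

```python
# Hypothetical frame: 10 years of monthly records, 1 = January ... 12 = December
frame = [month for _year in range(10) for month in range(1, 13)]

k = 12       # sampling interval equal to the length of the cycle
start = 7    # suppose the random start happened to be 7

picked = frame[start - 1::k]   # every 12th record, beginning at position 7
print(picked)                  # all picks come from the same month (July)
```

A sample like this would describe July temperatures only, not the year as a whole.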
17. Stratified Sampling.
- Stratified sampling can improve the
representativeness of our sample by ensuring
that different groups in the population are
adequately represented in the sample.
- You draw a specified number / percentage of
elements from subgroups in the population known
as strata.
- Common strata: age groups, education levels,
ethnic groups, language groups, political
affiliation, gender, occupational groups.
- Randomly select your sample within strata.
18. More on Stratified Sampling.
- For example, the student population of a college
is 1,000. Of these, 700 (70%) are from Ontario,
200 (20%) are from other provinces, and 100
(10%) are from outside Canada.
- With stratified sampling we could ensure that in
a sample of 100 students, we obtain 70 (70%) from
Ontario, 20 (20%) from other provinces, and 10
(10%) from outside Canada by randomly selecting
within these three strata.
- This is known as proportionate stratified
sampling.
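The college example can be sketched as follows. The student IDs and the helper name `proportionate_stratified_sample` are invented for illustration; only the stratum sizes (700, 200, 100) and the sample size of 100 come from the slide.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """Randomly sample within each stratum, in proportion to its population share."""
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(n * len(members) / N)   # rounding may not sum to n in general
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical sampling frame with the stratum sizes from the slide
strata = {
    "Ontario": [f"ON-{i}" for i in range(700)],
    "Other provinces": [f"OP-{i}" for i in range(200)],
    "Outside Canada": [f"OC-{i}" for i in range(100)],
}

sample = proportionate_stratified_sample(strata, 100, seed=1)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)
```

The stratum quotas are fixed by the population shares, while the choice of individuals inside each stratum is left to chance.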
19.
- Stratified sampling typically requires more work.
- Not only do you need a complete sampling frame,
but you need accurate information on the variable
you intend to stratify on (e.g., gender, academic
major, religious affiliation of respondents,
etc.).
- Common information sources: census, survey
estimates, official records, telephone screening.
20. And even more on Stratified Sampling.
- For example, if a population of executives in a
major company were 10,000, and 7,000 were male and
3,000 were female, we could divide the population
into two strata, listing all male and all female
executives.
- And then use proportionate or disproportionate
stratified sampling to construct our sample.
With proportionate stratified sampling our sample
would be 70% male and 30% female; with
disproportionate stratified sampling our sample
could be 50% male and 50% female.
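To make the contrast concrete, the sketch below compares the two allocations for the executive example. The total sample size of 100 is an assumption (the slide gives only the shares), and the function name `selection_probabilities` is hypothetical.

```python
population = {"male": 7000, "female": 3000}   # stratum sizes from the slide
n = 100                                       # assumed total sample size

total = sum(population.values())

# Proportionate: each stratum's share of the sample matches its population share
proportionate = {g: round(n * size / total) for g, size in population.items()}

# Disproportionate (here, equal allocation): the same number from each stratum
disproportionate = {g: n // len(population) for g in population}

def selection_probabilities(allocation):
    """Chance that any one member of each stratum is selected."""
    return {g: allocation[g] / population[g] for g in population}

print(proportionate, selection_probabilities(proportionate))
print(disproportionate, selection_probabilities(disproportionate))
```

In both designs every executive has a known, non-zero chance of selection. The proportionate design gives everyone the same chance, while the equal allocation gives female executives a higher chance than male executives.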
21. Multi-stage or cluster sampling.
- This design assumes that any population can be
regarded as comprising a hierarchy of sampling
units (e.g., a university can be broken down into
faculties, departments, sections and classes;
Canada can be broken down into provinces,
counties or regions, cities, city blocks, and
households).
- With cluster sampling we randomly sample down the
hierarchy of sampling units in a population of
interest.
22. More on multi-stage or cluster sampling.
- Compile a list of all cities in Ontario and
randomly select cities from this list.
- Within each of the selected cities, compile a
list of all residential city blocks and randomly
select a number of residential city blocks.
- Within each of the selected residential city
blocks, compile a list of all households and
randomly select our sample of households.
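The three stages above can be sketched as nested random draws. Everything about the data here is hypothetical: a toy frame of 5 cities, each with 4 blocks of 10 households, and arbitrary numbers selected at each stage.

```python
import random

def multistage_sample(cities, n_cities, n_blocks, n_households, seed=None):
    """Randomly sample cities, then blocks within them, then households within those."""
    rng = random.Random(seed)
    selected = []
    for city in rng.sample(sorted(cities), n_cities):            # stage 1: cities
        blocks = cities[city]
        for block in rng.sample(sorted(blocks), n_blocks):       # stage 2: blocks
            households = blocks[block]
            for household in rng.sample(households, n_households):  # stage 3
                selected.append((city, block, household))
    return selected

# Toy frame: 5 cities x 4 blocks x 10 households
cities = {
    f"city{c}": {f"block{b}": [f"house{h}" for h in range(10)] for b in range(4)}
    for c in range(5)
}

sample = multistage_sample(cities, n_cities=2, n_blocks=2, n_households=3, seed=0)
print(len(sample), "households selected")
```

Note that lists of blocks and households are only needed for the cities and blocks actually selected, which is why this design helps when a full sampling frame is hard to compile.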
23. And more on multi-stage or cluster sampling.
- Cluster designs can save you a lot of money!
- Cluster designs are especially useful when it is
difficult to put together an adequate sampling
frame.
- The mechanics of cluster sampling are
straightforward.
- On the downside, the selection of clusters
depends on the goals of the study, population
distribution, and elements to be studied.
- Cluster sampling generally produces the least
representative samples of the probability designs,
and hence the least accurate sample estimates.
24. Most major survey organizations employ the
following multi-stage sampling design:
- Divide the entire geographic residential area of
the U.S. or Canada into a numbered grid of
1,000-2,000 primary sampling units (PSUs).
- Randomly select 100-200 primary sampling units at
stage 1.
- Randomly select from a numbered grid of sampling
places within each PSU at stage 2.
- Randomly select from a numbered grid of sampling
segments within each sampling place at stage 3.
- Randomly select a dwelling within each sampling
segment at stage 4.
25. Non-Probability Sample Designs.
- The chance of a population element being selected into
the sample is unknown, as is the representativeness.
- Cheaper, easier-to-implement designs, well-suited
to preliminary studies of small, hard-to-find
populations.
- Purposive sampling: the sample is drawn based on the
experience and judgement of the researcher(s).
- Quota sampling: a non-probability analogue to
stratified sampling. The researcher selects a quota of
respondents for the sample in a non-random way.
- Convenience sampling: also called accidental sampling. It is
anything but random. Avoid accidents.
- Snowball sampling: initial members of the sample
are used as informants to find other elements.