Title: Sampling
1Sampling
2Observational vs. Experiment
- Observational: observe and measure, but do not
influence
- Experimental: impose treatments and measure
responses
3Other ways to produce data
- Simulation: when it is logistically difficult or
ethically questionable to obtain a sample or run
an experiment
- http://www.whatifsports.com/beyondtheboxscore/default.asp?article2009WorldSeriesGT139002
4Pollsters
- www.gallup.com
- http://blog.nielsen.com/nielsenwire/
- www.bls.gov/cps/
- www.norc.org/projects/gensoc.asp
5Statistical Inference
- Ability to answer specific questions based on
data with a known degree of confidence.
- Careful design of data production is the most
important prerequisite for valid inference.
6Population vs. Sample
- Population: the entire group of individuals that
we want information about
- Sample: a part of the population that we study
in order to gather information about the whole
7Parameter vs. Statistic
- Parameter: a measure from the entire population
- Statistic: a measure calculated from a sample or
experiment
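The distinction can be sketched in a few lines of Python. The population of ages below is invented purely for illustration; the point is only that the parameter is computed from everyone, while the statistic is computed from the sample and changes from sample to sample.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population: 5,000 made-up ages (an assumption, not real data)
population = [rng.randint(18, 90) for _ in range(5000)]

# Parameter: computed from the entire population (usually unknown in practice)
parameter = statistics.mean(population)

# Statistic: computed from a sample of 100 individuals
sample = rng.sample(population, 100)
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```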
8Sampling vs. Census
- Sampling: studies a part of the population
- Census: attempts to study every individual in
the population
9Bias
- Systematic favoring of certain outcomes
10Sources of Bias
- Voluntary Response: respondents choose
themselves; those with strong opinions,
particularly negative ones, are most likely to respond
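A small simulation can make the effect concrete. Everything numeric here is an assumption chosen for illustration: 30% of a hypothetical population holds a negative opinion, and those people are assumed to respond ten times as often as everyone else.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is reproducible

# Hypothetical population: 30% hold a negative opinion (assumed)
population = ["negative"] * 3000 + ["other"] * 7000

# Assumed response rates: people with negative opinions respond far more often
response_rate = {"negative": 0.20, "other": 0.02}

# Each person decides for themselves whether to respond
respondents = [opinion for opinion in population
               if rng.random() < response_rate[opinion]]

true_share = population.count("negative") / len(population)
poll_share = respondents.count("negative") / len(respondents)

print(f"negative share in population:  {true_share:.2f}")
print(f"negative share among responses: {poll_share:.2f}")
```

Under these assumed rates the poll overstates the negative share, which is the systematic favoring of one outcome that defines bias.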
11Sources of Bias
- Convenience Sampling: individuals who are
easiest to reach are the only ones included in a
sample
12Probability Sample
- A sample chosen by chance
- The use of chance to select the sample is the
essential principle of statistical sampling
13Simple Random Sample
- A sample of n individuals chosen so that
every possible set of n individuals has an equal
chance of being selected
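As a minimal sketch, Python's standard `random.sample` draws without replacement so that every set of n individuals is equally likely. The population labels and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical population of 30 labeled individuals
population = [f"person_{i:02d}" for i in range(1, 31)]

srs = simple_random_sample(population, 5, seed=1)
print(srs)
```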
14Stratified Random Sample
- Divides the population into groups of similar
individuals, then chooses a separate SRS from each
group
15Stratified Random Sampling
- Divide the population into homogeneous groups
called strata.
16Stratified Random Sampling
- Then choose a separate SRS in each stratum and
combine these SRSs to form the full sample.
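The two steps above (group into strata, then take a separate SRS in each) can be sketched as follows. The station list, the "urban"/"rural" strata, and the helper name `stratified_sample` are assumptions made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Take a separate SRS of n_per_stratum from each stratum and combine them."""
    rng = random.Random(seed)

    # Step 1: divide the population into strata
    strata = defaultdict(list)
    for individual in individuals:
        strata[stratum_of(individual)].append(individual)

    # Step 2: choose an SRS within each stratum, then combine
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(n_per_stratum, len(members))  # guard against small strata
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical stations: 20 urban and 10 rural
stations = [(f"station_{i:02d}", "urban" if i % 3 else "rural")
            for i in range(1, 31)]

chosen = stratified_sample(stations, stratum_of=lambda s: s[1],
                           n_per_stratum=4, seed=2)
print(chosen)
```

Taking the same number from every stratum is just one design choice; a real survey might instead sample in proportion to stratum size.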
17Stratified Random Sampling Example
- Radio stations owe royalties to composers when
they play their music.
18Stratified Random Sampling Example
- ASCAP collects royalties for its members
according to licensing agreements with the
stations.
19Stratified Random Sampling Example
- How does ASCAP distribute the $435 million a year
it collects to the composers of the 4 million
songs in its library?
20Stratified Random Sampling Example
- Radio stations are stratified by type of
community, geographic location, and the size of
the license fee paid to ASCAP.
- In all there are 432 strata.
21Stratified Random Sampling Example
- Tapes are made at random hours for randomly
selected stations in each stratum.
22Stratified Random Sampling Example
- The tapes are reviewed by experts who can
identify almost every piece of music ever written
and the composers are paid according to their
popularity.
23Stratified Random Sampling Example
- ASCAP samples 60,000 hours out of the 53 million
hours of radio programs each year.
24Cautions about sample surveys
- Undercoverage: some groups in the population are
left out
- Nonresponse: an individual chosen for the sample
can't or won't cooperate
25More Sources of Bias
- Response Bias
- Respondents may lie
- Respondents may not remember accurately
- Interviewer may favor certain responses
- Race or sex of interviewer may affect response
- Wording of questions: confusing or leading
26Wording of Questions
- "It is estimated that disposable diapers account
for less than 2% of the trash in today's
landfills. In contrast, beverage containers,
third-class mail and yard wastes are estimated to
account for 21% of the trash in landfills. Given
this, in your opinion, would it be fair to ban
disposable diapers?"
27Grade Inflation Article
- Students' self-reported GPA
- Overstated or Understated?
28Sampling Error
- Larger random samples give more accurate results
than smaller samples
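A quick simulation illustrates the claim. The population values (roughly normal with mean 50 and standard deviation 10) and the sample sizes 25 and 400 are assumptions; the sketch repeats the sampling many times and measures how much the sample mean varies at each size.

```python
import random
import statistics

def spread_of_sample_means(population, n, repetitions=1000, seed=0):
    """Standard deviation of the sample mean across repeated SRSs of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n))
             for _ in range(repetitions)]
    return statistics.stdev(means)

# Hypothetical population of 10,000 values (assumed, not real data)
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

spread_small = spread_of_sample_means(population, n=25)
spread_large = spread_of_sample_means(population, n=400)

print(f"spread of sample means, n = 25:  {spread_small:.2f}")
print(f"spread of sample means, n = 400: {spread_large:.2f}")
```

The smaller spread at n = 400 reflects less sampling variability. This applies to random samples; a larger sample does not remove bias from a poor sampling method.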
29Cluster Sampling
- Divides the population into groups, or clusters.
Some of these clusters are randomly selected.
30Cluster Sampling Example
- What do AP Statistics students think?
31Cluster Sampling Example
- Enough time on free-response section?
32Cluster Sampling Example
- List of all schools offering AP Stat.
33Cluster Sampling Example
- A number of schools are randomly selected and
EVERY student in that school is asked if they
felt they had enough time on the free-response
questions on the AP Stat test.
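The design described above can be sketched as follows: randomly select whole schools (the clusters), then include every student in each selected school. The school rosters and the helper name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and keep EVERY member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical list of schools offering AP Stat, with made-up rosters
schools = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(1, 4 + s)]
    for s in range(1, 7)
}

chosen_schools, students = cluster_sample(schools, n_clusters=2, seed=3)
print(chosen_schools)
print(len(students), "students surveyed")
```

Unlike stratified sampling, the randomness here is applied to the groups themselves rather than to individuals within every group.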
34Multistage Sampling Design
- Random sample chosen in stages
- Common for U.S. household surveys
- See TB p. 340, Example 5.7, "What Do AP Students
Think?", for more details
35Sampling Frame
- The list of individuals from which a sample is
actually selected
- Ideally lists every individual in the population
36Choosing SRS Using Random Digits
- TB p. 336, Example 5.5, Joan's Accounting Firm
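The general random-digits procedure can be sketched as: give each individual a two-digit label, read the digits in pairs, and keep each pair that is a valid label not already chosen. The population size of 30, the digit string, and the helper name `srs_from_digits` are assumptions for illustration and are not taken from the textbook example.

```python
def srs_from_digits(digits, population_size, n):
    """Choose n labels (01..population_size) by reading two-digit groups."""
    digits = digits.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        # Skip groups that are not valid labels, and skip repeats
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Illustrative string of random digits (assumed)
digits = "69051 64817 87174 09517 84534 06489 87201 97245"

labels = srs_from_digits(digits, population_size=30, n=5)
print(labels)  # [5, 16, 17, 20, 19]
```

Reading in pairs gives the groups 69, 05, 16, 48, 17, 87, 17, ...; 69, 48, and 87 are not labels, and the second 17 is a repeat, so both kinds are skipped.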
37Closing
- Describe the relationship between a sample and a
population.
- Describe the relationship between a statistic and
a parameter.