Where Do Data Come From, and Why We Don't (Always) Trust Statisticians

1
Where do data come from, and why we don't
(always) trust statisticians.
2
Induction vs. Deduction: the gist of statistics
  • Deduction: What is true about the whole must be
    true about a part.
  • Induction: What is true about a part might be
    true about the whole.

3
Population vs. Sample
  • Population: the entire group of individuals
    about which we want information.
  • Sample: the part of the population from which we
    actually collect information.
  • We use samples to study populations because
    populations are often impossible or impractical
    to study in full.

4
Real-Life Example of a Bad Sample
  • Ann Landers, a famous columnist, collected a
    sample of 10,000 people who wrote in to answer
    this question: "If you could do it all over
    again, would you have children?"
  • 70% of the respondents said that they would not
    have children.
  • When a sample was selected at random, 91% of the
    people said that they would have children.
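
One way to see how a write-in sample can disagree so sharply with a random one is a small simulation. This is only a sketch: the population size and the write-in rates below are invented for illustration, and only the 91% figure comes from the slide.

```python
import random

rng = random.Random(0)

# Hypothetical population: True means "would have children again".
# The 91% share mirrors the random-sample result on the slide.
population = [rng.random() < 0.91 for _ in range(100_000)]

# Assumed write-in rates (invented for illustration): people who would
# NOT have children again are far more likely to write in.
WRITE_IN_RATE = {True: 0.01, False: 0.25}

write_ins = [answer for answer in population
             if rng.random() < WRITE_IN_RATE[answer]]
random_sample = rng.sample(population, 1000)

def pct_yes(answers):
    """Percent of answers that are 'yes' (True)."""
    return 100 * sum(answers) / len(answers)

print(f"Population:      {pct_yes(population):.1f}% yes")
print(f"Write-in sample: {pct_yes(write_ins):.1f}% yes")
print(f"Random sample:   {pct_yes(random_sample):.1f}% yes")
```

Under these assumed rates, the write-in sample is dominated by "no" answers even though most of the population would say "yes", while the random sample stays close to the population.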

5
Potential problems with sample surveys
  • Undercoverage occurs when some groups in the
    population are left out of the process of
    choosing the sample.
  • Nonresponse occurs when an individual chosen for
    the sample cannot be contacted or refuses to
    respond.

6
Another Real-Life Example of a Bad Sample
  • In 1936 the Literary Digest mailed out 10,000,000
    ballots asking respondents whom they were going
    to vote for: A. Landon or F.D. Roosevelt.
  • 2,300,000 ballots were returned, predicting a
    strong win (57%) for Landon.

7
Another Real-Life Example of a Bad Sample
  • George Gallup surveyed 50,000 people chosen
    randomly.
  • Comparison of forecasts:
  • Gallup's prediction for Roosevelt: 56%
  • Gallup's prediction of the Digest's result: 44%
  • Digest's prediction for Roosevelt: 43%
  • Actual vote: 62%
  • The Literary Digest drew its mailing list from
    its subscription list, phone directories, and
    lists of car owners and club members.
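
The size of each miss can be worked out directly from the figures above. A minimal sketch, assuming the forecasts and the actual vote are all percentages for Roosevelt (the names `forecasts` and `forecast_errors` are illustrative):

```python
# Figures from the slide, assumed to be percent of the vote for Roosevelt.
actual = 62
forecasts = {
    "Gallup": 56,           # 50,000 people chosen randomly
    "Literary Digest": 43,  # 2,300,000 returned ballots
}

def forecast_errors(forecasts, actual):
    """Return the absolute error of each forecast, in percentage points."""
    return {name: abs(pct - actual) for name, pct in forecasts.items()}

errors = forecast_errors(forecasts, actual)
for name, err in errors.items():
    print(f"{name}: off by {err} percentage points")
# Gallup: off by 6 percentage points
# Literary Digest: off by 19 percentage points
```

The much larger sample produced the much larger error, which suggests that how a sample is chosen can matter more than how big it is.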

9
Right and Wrong Ways to Sample
  • A simple random sample is a sample where (1) each
    unit of the population has an equal chance of
    being chosen and (2) all units are chosen
    independently.
  • A sample is biased if at least one group of
    individuals has a greater chance of being selected
    than the others.

10
Example of a good sample
  • You want to study the effects of computers on GPA.
    You don't have the resources to study all
    students.
  • To select a sample of students for the study, you:
  • Get a list of all students,
  • Select students at random from the list,
  • Collect information from the students selected,
  • Compare those who have a computer with those who
    don't.
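
The steps above can be sketched in a few lines. The roster here is entirely made up (IDs, computer ownership, and GPAs are random placeholders), so only the sampling procedure is meaningful, not the numbers it prints.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical roster of all students: (student_id, has_computer, gpa).
roster = [(i, rng.random() < 0.6, round(rng.uniform(2.0, 4.0), 2))
          for i in range(5000)]

# Select students at random from the full list (without replacement,
# so every student has the same chance of being chosen).
sample = rng.sample(roster, 200)

# Collect the information and compare the two groups.
with_computer = [gpa for _, has_pc, gpa in sample if has_pc]
without_computer = [gpa for _, has_pc, gpa in sample if not has_pc]

print(f"with computer:    n={len(with_computer)}, "
      f"mean GPA={mean(with_computer):.2f}")
print(f"without computer: n={len(without_computer)}, "
      f"mean GPA={mean(without_computer):.2f}")
```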

11
Example of a bad sample
  • You want to study the effects of computers on GPA.
    You don't have the resources to study all
    students.
  • To select a sample of students for the study, you:
  • Use your friends.
  • Hang an ad in the computer lab.
  • Post an online questionnaire on the WKU site.

12
Stratified Random Sample
  • When we know the proportion of each group in the
    population, a stratified random sample is better
    than a simple random sample (SRS).
  • In a stratified sample, the number of people chosen
    from each group is proportional to the size of
    that group in the population.
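
Proportional allocation can be sketched as follows. The group names and sizes are hypothetical, and `stratified_sample` is an illustrative helper, not a standard library function.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a stratified sample with proportional allocation.

    strata maps a group name to the list of units in that group.
    Each group contributes a share of n equal to its share of the
    population; with rounding, the shares may not add up to exactly n.
    """
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(n * len(units) / total)
        sample[name] = rng.sample(units, k)  # random draw within the group
    return sample

rng = random.Random(2)
strata = {  # hypothetical class sizes
    "freshmen": [f"F{i}" for i in range(4000)],
    "sophomores": [f"S{i}" for i in range(3000)],
    "juniors": [f"J{i}" for i in range(2000)],
    "seniors": [f"R{i}" for i in range(1000)],
}
sample = stratified_sample(strata, 100, rng)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
# {'freshmen': 40, 'sophomores': 30, 'juniors': 20, 'seniors': 10}
```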

13
Confounding
  • Two explanatory variables are confounded when
    their effects on the response variable cannot be
    distinguished from each other.
  • Confounding is often a problem in studies that
    use sample surveys to collect data (even if the
    sampling is done right).

14
Observation vs. Experiment
  • Observational study: observes individuals and
    measures variables but does not attempt to
    influence responses.
  • Experiment: imposes a treatment on individuals to
    observe their responses.

15
How to Design an Experiment
  • The purpose of an experiment is to find out how
    one variable (the response variable) changes in
    response to a change in another variable
    (the explanatory variable).
  • Experiment: Subject → Treatment → Response

16
Placebo Effect
  • Placebo effect: a change in behavior due to
    participation in an experiment.
  • The placebo effect is a problem when an experiment
    does not have a control group (a basis for
    comparison).
  • To avoid the problem, design a randomized
    comparative experiment.

17
How to design a Randomized Comparative Experiment
  • Randomly split the subjects into two groups:
  • the control group receives no treatment,
  • the treatment group receives the treatment.
  • Compare the results.
  • Both groups will be equally affected by the
    placebo effect, so the difference between the
    groups shows whether the treatment works.
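
The random split into two groups might look like this. It is a minimal sketch: the subject labels are placeholders and `randomize` is an illustrative helper name.

```python
import random

def randomize(subjects, rng):
    """Randomly split subjects into a control group and a treatment group.

    The groups have equal size when the number of subjects is even;
    otherwise the treatment group gets one extra subject.
    """
    shuffled = list(subjects)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(3)
subjects = [f"subject_{i}" for i in range(20)]
control, treatment = randomize(subjects, rng)

print("control:  ", sorted(control))
print("treatment:", sorted(treatment))
```

Because chance alone decides who lands in which group, the two groups should differ systematically only in whether they received the treatment.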

18
How to interpret results of an experiment
  • Observe outcomes for treatment and control
    groups.
  • If the outcomes differ enough that such a
    difference would rarely occur by chance, we
    conclude that the difference is statistically
    significant.
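
One way to make "would rarely occur by chance" concrete is a permutation test. The slides do not name a specific test, so this is a substitute chosen for illustration, and the outcome values below are invented. The idea is to reshuffle the group labels many times and count how often a difference at least as large as the observed one appears.

```python
import random
from statistics import mean

def permutation_p_value(treatment, control, rng, n_perm=5000):
    """Estimate how often chance alone gives a difference in group means
    at least as large as the observed one (two-sided)."""
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_t]) - mean(pooled[n_t:]))
        if diff >= observed:
            count += 1
    return count / n_perm

rng = random.Random(4)
# Hypothetical outcomes for the two groups.
treatment = [7.1, 6.8, 7.4, 7.9, 6.9, 7.5, 7.2, 7.8]
control = [6.2, 6.5, 6.0, 6.9, 6.4, 6.1, 6.6, 6.3]

p = permutation_p_value(treatment, control, rng)
print(f"estimated p-value: {p:.4f}")
```

A small value means that random relabeling rarely reproduces the observed difference; 0.05 is a common, though conventional, cutoff for calling a result statistically significant.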

19
Population vs. Sample
  • Population: the entire group of individuals
    about which we want information.
  • Sample: the part of the population from which we
    actually collect information.
  • Based on the sample, we draw conclusions about the
    whole population.

20
Parameter vs. Statistic
  • A parameter is a number that describes the
    population.
  • A statistic is a number that describes the
    sample.
  • We use statistics to estimate parameters.

21
Sampling Distribution
  • The result of your study is a statistic, which
    can vary from sample to sample.
  • The sampling distribution of a statistic is the
    distribution of values taken by the statistic in
    all possible samples of the same size from the
    same population.
  • Estimate = True Parameter + Sampling Error
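
A sampling distribution can be approximated by simulation. The sketch below uses a made-up population and draws 2,000 random samples rather than all possible samples, so it only approximates the distribution described above.

```python
import random
from statistics import mean

rng = random.Random(5)

# Hypothetical population of 10,000 GPA-like values.
population = [rng.uniform(2.0, 4.0) for _ in range(10_000)]
parameter = mean(population)  # the true population mean (the parameter)

# Draw many samples of the same size and record the statistic each time.
sample_means = [mean(rng.sample(population, 50)) for _ in range(2000)]

# Sampling error of each estimate: estimate minus true parameter.
sampling_errors = [m - parameter for m in sample_means]

print(f"true parameter:        {parameter:.3f}")
print(f"mean of sample means:  {mean(sample_means):.3f}")
print(f"smallest, largest:     {min(sample_means):.3f}, {max(sample_means):.3f}")
```

Each sample mean is one estimate; the individual sampling errors differ from sample to sample, but for a simple random sample they average out close to zero.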

22
Bias and variability
  • A statistic is biased if the mean of its sampling
    distribution is not equal to the true value of
    the parameter being estimated.
  • The variability of a statistic is the spread of
    its sampling distribution.
  • Bias does not go away with larger samples.
  • Variability decreases with larger samples.
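
The last two points can be checked with a simulation. Everything here is assumed for illustration: a population with 62% supporters, and a biased sampling frame that leaves out many of those supporters (undercoverage).

```python
import random
from statistics import mean, pstdev

rng = random.Random(6)

# Hypothetical population: 62% supporters (1 = supporter, 0 = not).
population = [1] * 62_000 + [0] * 38_000
# Assumed biased frame: many supporters are missing from the list.
biased_frame = [1] * 30_000 + [0] * 38_000

def simulate(frame, n, reps=300):
    """Return (center, spread) of the sample proportion over repeated samples."""
    estimates = [mean(rng.sample(frame, n)) for _ in range(reps)]
    return mean(estimates), pstdev(estimates)

results = {}
for n in (100, 2_500):
    for label, frame in (("srs", population), ("biased", biased_frame)):
        results[(label, n)] = simulate(frame, n)
        center, spread = results[(label, n)]
        print(f"{label:>6} n={n:<5} center={center:.3f} spread={spread:.3f}")
```

With the larger sample size the spread shrinks for both frames, but the biased frame keeps centering on the wrong value no matter how many units are drawn.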