INTRODUCTION TO STATISTICS - PowerPoint PPT Presentation

Provided by: wild6
1
INTRODUCTION TO STATISTICS
2
Overview
"There are three kinds of lies: lies, damned
lies, and statistics." (attributed to Benjamin Disraeli)
3
What is Statistics?

4
Why do we need statistics?

5
Who uses statistics?

6
Reasons to Study Statistics
  • Being an Informed Information Consumer
  • Understanding and Making Decisions
  • Evaluating Decisions That Affect Your Life

Statistics: The Exploration and Analysis of Data,
4th ed., Devore/Peck
7
To make decisions, you must be able to do the
following:
  • Decide whether existing information is adequate
    or whether additional information is required.
  • If necessary, collect more information in a
    reasonable and thoughtful way.
  • Summarize the available data in a useful and
    informative manner.
  • Analyze the available data.
  • Draw conclusions, make decisions, and assess the
    risk of an incorrect decision.

Statistics: The Exploration and Analysis of Data,
4th ed., Devore/Peck
8
The Official Definition
  • Statistics is a collection of methods for
    planning studies and experiments, obtaining data,
    and then organizing, summarizing, presenting,
    analyzing, interpreting, and drawing conclusions
    based on the data.

9
Types of Data
10
Some Definitions
  • Data are observations (such as measurements,
    genders, survey responses) that have been
    collected.
  • A population is the complete collection of all
    elements to be studied. The collection is
    complete in the sense that it includes all
    subjects to be studied.
  • A sample is a subcollection of members selected
    from a population.
  • A census is the collection of data from every
    member of the population.

11
More Definitions
  • A parameter is a numerical measurement describing
    some characteristic of a population.
  • A statistic is a numerical measurement describing
    some characteristic of a sample.
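
The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below (ages of 1,000 members of a made-up club) and the sample size of 50 are assumptions chosen purely for illustration; they do not come from the presentation.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a made-up club.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.randint(18, 80) for _ in range(1000)]

# A parameter describes the whole population.
population_mean = statistics.mean(population)

# A statistic describes a sample (a subcollection of the population).
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers will usually be close but not identical, which is why a statistic is treated as an estimate of the parameter rather than as the parameter itself.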

12
Still More Definitions
  • Quantitative data consist of numbers
    representing counts or measurements.
  • Qualitative (or categorical or attribute) data
    can be separated into different categories that
    are distinguished by some nonnumeric
    characteristic.

13
Even More Definitions
  • Discrete data result when the number of possible
    values is either a finite number or a countable
    number.
  • Continuous (numerical) data result from
    infinitely many possible values that correspond
    to some continuous scale that covers a range of
    values without gaps, interruptions or jumps.

14
Levels of Measurement
  • The nominal level of measurement is characterized
    by data that consist of names, labels, or
    categories only. The data cannot be arranged in
    an ordering scheme.
  • Data are at the ordinal level of measurement if
    they can be arranged in some order, but
    differences between data values either cannot be
    determined or are meaningless.

15
Levels of Measurement
  • The interval level of measurement is like the
    ordinal level, with the additional property that
    the difference between any two data values is
    meaningful. However, data at this level do not
    have a natural zero starting point (where none of
    the quantity is present).

16
Levels of Measurement
  • The ratio level of measurement is the interval
    level with the additional property that there is
    also a natural zero starting point (where zero
    indicates that none of the quantity is present).
    For values at this level, differences and ratios
    are both meaningful.
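
A small worked example may help separate the interval and ratio levels. Temperature is used here as an assumed illustration (it is not part of the original slides): degrees Celsius have no natural zero, while kelvins do.

```python
# Degrees Celsius are interval-level data: differences are meaningful,
# but 0 °C does not mean "no temperature", so ratios are not meaningful.
celsius_a, celsius_b = 10.0, 20.0
difference = celsius_b - celsius_a    # 10 degrees warmer: meaningful
naive_ratio = celsius_b / celsius_a   # 2.0, yet 20 °C is not "twice as hot"

# Kelvin has a natural zero, so it is ratio-level and ratios make sense.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
true_ratio = kelvin_b / kelvin_a      # about 1.035

print(difference, naive_ratio, round(true_ratio, 3))
```

The difference of 10 degrees is the same on both scales, but only the Kelvin ratio describes a real physical relationship.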

17
Example
  • For the study, identify the following:
  • What is the population? What is the sample?
  • Are the data quantitative or qualitative?
  • Are the data discrete or continuous?
  • What is the level of measurement?

18
Critical Thinking
"Statistical thinking will one day be as
necessary for efficient citizenship as the
ability to read and write." (H. G. Wells)
19
Statistical Thinking
  • Data beats anecdotes
  • Beware the lurking variable
  • Where the data come from is important
  • Variation is everywhere
  • Conclusions are not certain

The Basic Practice of Statistics, 3rd ed., Moore
20
Sampling
  • Sample data must be collected in an appropriate
    way, such as through a process of random
    selection.
  • If sample data are not collected in an
    appropriate way, the data may be so completely
    useless that no amount of statistical torturing
    can salvage them.

21
Non-response
  • How many people are actually called in order to
    get a sample of 1000 people? The Pew Research
    Center for the People & the Press conducted a
    study on non-response and found:
  • 938 households never screened (no answer, busy,
    answering machine, not available, callback)
  • 678 households that refused
  • 221 households with no eligible person
    (language barrier, health problem, no person 18
    or older)
  • 42 households with an eligible person
    (incomplete interview)
  • 1000 households with an eligible person
    (complete interview)
  • So, Pew had to call 2879 residential phone
    numbers to get a sample of 1000 people.
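
The arithmetic behind the 2879 figure can be checked directly. The counts below are the ones listed on this slide; the dictionary keys and variable names are just labels chosen for this sketch.

```python
# Household counts from the Pew non-response study, as listed above.
call_outcomes = {
    "never screened": 938,
    "refused": 678,
    "no eligible person": 221,
    "incomplete interview": 42,
    "complete interview": 1000,
}

total_numbers_called = sum(call_outcomes.values())
completion_rate = call_outcomes["complete interview"] / total_numbers_called

print(total_numbers_called)       # 2879
print(f"{completion_rate:.1%}")   # 34.7%
```

Only about one in three numbers called ended in a completed interview, which is why non-response matters when judging how representative a sample is.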

22
Another Definition
  • A voluntary response sample (or self-selected
    sample) is one in which the respondents
    themselves decide whether to be included.

23
Watch out for . . .
  • Small Samples
  • Graphs
  • Pictographs
  • Percentages
  • Loaded Questions
  • Order of Questions
  • Nonresponse

24
Watch out for . . .
  • Missing Data
  • Correlation and Causality
  • Self-Interest Study
  • Precise Numbers
  • Partial Pictures
  • Deliberate Distortions

25
Evaluating a Research Study
  • What were the researchers trying to learn? What
    questions motivated their research?
  • Was relevant information collected? Were the
    right things measured?
  • Was the data collected in a sensible way?
  • Was the data summarized in an appropriate way?
  • Was an appropriate method of analysis selected,
    given the type of data and how the data was
    collected?
  • Are the conclusions drawn by the researchers
    supported by the data analysis?

Statistics: The Exploration and Analysis of Data,
4th ed., Devore/Peck
26
Design of Experiments
27
Planning and Conducting a Study
  • Understand the Nature of the Problem
  • Decide What to Measure and How to Measure It
  • Data Collection
  • Data Summarization and Preliminary Analysis
  • Formal Data Analysis
  • Interpretation of Results

Statistics: The Exploration and Analysis of Data,
4th ed., Devore/Peck
28
Types of Studies
  • In an observational study, we observe and measure
    specific characteristics, but we don't attempt to
    modify the subjects being studied.
  • In an experiment, we apply some treatment and
    then proceed to observe its effects on the
    subjects. (Subjects in experiments are called
    experimental units.)

29
Types of Observational Studies
  • In a cross-sectional study, data are observed,
    measured, and collected at one point in time.
  • In a retrospective (or case-control) study, data
    are collected from the past by going back in time
    (through examination of records, interviews, and
    so on).
  • In a prospective (or longitudinal or cohort)
    study, data are collected in the future from
    groups sharing common factors (called cohorts).

30
One More Definition
  • Confounding occurs in an experiment when you are
    not able to distinguish among the effects of
    different factors.

31
Example
  • Some studies have suggested that drinking wine
    rather than beer or other alcohol has added
    health benefits.
  • Wine drinkers eat less fried food, more
    vegetables and fruit.
  • They are less likely to smoke.
  • As a group, they are better educated and
    wealthier than the groups that consume beer or
    other alcohol.
  • These results may be the result of confounding
    by dietary habits and other lifestyle factors.

32
Important Factors to Consider in Designing an
Experiment
  • Control the effects of variables.
  • Use replication.
  • Use randomization.

33
Controlling Effects of Variables
  • Blinding is a technique in which the subject
    doesn't know whether he or she is receiving a
    treatment or a placebo. An experiment is
    double-blind if neither the subjects nor the
    administrators of the treatment know whether a
    subject has received a treatment or a placebo.
  • The placebo effect occurs when an untreated
    subject reports an improvement in symptoms.

34
Controlling Effects of Variables
  • A block is a group of subjects that are similar,
    but blocks are different in the ways that might
    affect the outcome of the experiment.
  • Randomized block design: if you are conducting an
    experiment to test one or more treatments, and
    there are different groups of similar subjects,
    but the groups differ in ways that are likely to
    affect the response to treatments, use this
    experimental design:
  • Form blocks (or groups) of subjects with similar
    characteristics.
  • Randomly assign treatments to the subjects within
    each block.
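
The two steps of a randomized block design can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `assign_within_blocks`, the subject labels, and the age-group blocks are all hypothetical, and dealing treatments out in rotation after a shuffle is just one simple way to randomize within a block.

```python
import random
from collections import defaultdict

def assign_within_blocks(block_by_subject, treatments, seed=0):
    """Randomly assign treatments to subjects separately within each block."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Step 1: form blocks of subjects that share a characteristic.
    blocks = defaultdict(list)
    for subject, block in block_by_subject.items():
        blocks[block].append(subject)

    # Step 2: shuffle each block, then deal treatments out in rotation so
    # each treatment appears about equally often inside every block.
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for position, subject in enumerate(members):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment

# Hypothetical subjects, blocked by age group.
block_by_subject = {
    "s1": "under 40", "s2": "under 40", "s3": "under 40", "s4": "under 40",
    "s5": "40 and over", "s6": "40 and over",
    "s7": "40 and over", "s8": "40 and over",
}
assignment = assign_within_blocks(block_by_subject, ["treatment", "placebo"])
for subject, treatment in sorted(assignment.items()):
    print(subject, block_by_subject[subject], treatment)
```

Because randomization happens inside each block, both age groups end up with the same mix of treatment and placebo, so age cannot be confounded with the treatment.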

35
Controlling Effects of Variables
  • With a completely randomized experimental design,
    subjects are assigned to different treatment
    groups through a process of random selection.
  • With rigorously controlled design, subjects are
    very carefully chosen so that those given each
    treatment are similar in the ways that are
    important to the experiment.
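
For contrast, a completely randomized design ignores subject characteristics and randomizes over the whole pool at once. The sketch below assumes eight hypothetical subjects and two treatments; the function name `completely_randomized` is invented for this example.

```python
import random

def completely_randomized(subjects, treatments, seed=0):
    """Assign treatments by shuffling the entire pool of subjects."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal treatments out in rotation over the shuffled order.
    return {
        subject: treatments[position % len(treatments)]
        for position, subject in enumerate(shuffled)
    }

subjects = [f"s{i}" for i in range(1, 9)]  # hypothetical subject labels
groups = completely_randomized(subjects, ["treatment", "placebo"])
print(groups)
```

Group sizes are balanced overall, but nothing guarantees that the groups are balanced on any particular characteristic, which is the gap a rigorously controlled or blocked design tries to close.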

36
Replication and Sample Size
  • Repetition of an experiment on sufficiently large
    groups of subjects is called replication.
  • Use a sample size that is large enough so that we
    can see the true nature of any effects and obtain
    the sample using an appropriate method, such as
    one based on randomness.

37
Randomization and Other Sampling Strategies
  • In a random sample, members from the population
    are selected in such a way that each individual
    member has an equal chance of being selected.
  • A simple random sample of n subjects is selected
    in such a way that every possible sample of the
    same size n has the same chance of being chosen.
  • A probability sample involves selecting members
    from a population in such a way that each member
    has a known (but not necessarily the same) chance
    of being selected.
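
A simple random sample can be drawn with Python's standard `random.sample`, which selects without replacement so that every subset of the requested size is equally likely. The five-person population below is made up for illustration.

```python
import math
import random

population = ["Ana", "Ben", "Cho", "Dev", "Eli"]  # hypothetical population
n = 2

rng = random.Random(1)  # fixed seed keeps the sketch reproducible
simple_random_sample = rng.sample(population, n)

# With N = 5 and n = 2 there are C(5, 2) possible samples, and a simple
# random sample gives each of them the same chance of being chosen.
num_possible_samples = math.comb(len(population), n)   # 10
chance_of_each_sample = 1 / num_possible_samples       # 0.1

print(simple_random_sample, num_possible_samples, chance_of_each_sample)
```

Every individual also has the same chance of selection here (n/N = 2/5), but the defining property of a simple random sample is the stronger one about whole samples.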

38
Randomization and Other Sampling Strategies
  • With convenience sampling, we simply use results
    that are very easy to get.
  • In systematic sampling, we select some starting
    point and then select every kth element in the
    population.
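
Systematic sampling is easy to express with list slicing. In this sketch the population is just the numbers 1 to 100 and k = 10; choosing the starting point at random within the first k elements is a common convention and is an assumption here, since the slide only says "some starting point".

```python
import random

def systematic_sample(population, k, seed=0):
    """Pick a random start among the first k elements, then every kth one."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    start = rng.randrange(k)
    return population[start::k]

population = list(range(1, 101))  # hypothetical population of 100 IDs
every_tenth = systematic_sample(population, k=10)
print(every_tenth)
```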

39
Randomization and Other Sampling Strategies
  • With stratified sampling, we subdivide the
    population into at least two different subgroups
    (or strata) so that subjects within the same
    subgroup share the same characteristics, then we
    draw a sample from each subgroup (or stratum).
  • In cluster sampling, we first divide the
    population area into sections (or clusters), then
    randomly select some of those clusters, and then
    choose all the members from those selected
    clusters.
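
The difference between the two strategies shows up clearly in code: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups. The student and city-block data below are hypothetical, as are the function names.

```python
import random
from collections import defaultdict

def stratified_sample(stratum_by_member, per_stratum, rng):
    """Draw a random sample of fixed size from every stratum."""
    strata = defaultdict(list)
    for member, stratum in stratum_by_member.items():
        strata[stratum].append(member)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, num_clusters, rng):
    """Randomly pick whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(3)  # fixed seed keeps the sketch reproducible

# Hypothetical strata: 6 first-year students and 6 seniors.
stratum_by_member = {
    f"student{i}": ("first-year" if i < 6 else "senior") for i in range(12)
}
by_stratum = stratified_sample(stratum_by_member, per_stratum=2, rng=rng)

# Hypothetical clusters: households grouped by city block.
clusters = {
    "block A": ["a1", "a2", "a3"],
    "block B": ["b1", "b2", "b3"],
    "block C": ["c1", "c2", "c3"],
    "block D": ["d1", "d2", "d3"],
}
from_clusters = cluster_sample(clusters, num_clusters=2, rng=rng)

print(by_stratum)
print(from_clusters)
```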

40
Randomization and Other Sampling Strategies
  • A multistage sample design involves the selection
    of a sample in different stages that might use
    different methods of sampling.

41
Sampling Errors
  • A sampling error is the difference between a
    sample result and the true population result;
    such an error results from chance sample
    fluctuations.
  • A nonsampling error occurs when the sample data
    are incorrectly collected, recorded, or analyzed.
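
Sampling error can be seen by drawing several correctly collected random samples from the same population and comparing each sample mean with the population mean. The simulated population (10,000 heights with mean 170 and standard deviation 10) and the sample size of 100 are assumptions made only for this sketch.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed keeps the sketch reproducible

# Hypothetical population of 10,000 heights in centimetres.
population = [rng.gauss(170, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Each sample is drawn properly, yet each sample mean misses the true
# mean by a small amount: that gap is the sampling error.
sampling_errors = []
for _ in range(5):
    sample = rng.sample(population, 100)
    sampling_errors.append(statistics.mean(sample) - true_mean)

for error in sampling_errors:
    print(f"sampling error: {error:+.2f} cm")
```

These errors arise from chance alone. A nonsampling error, such as recording heights in the wrong unit, would not shrink with a larger or better-drawn sample.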

42
Should Polls Be Banned?
  • What is your interest in polls about campaigns
    and elections?
  • Very interested
  • Somewhat interested
  • Little interest
  • No interest
  • What is your interest in polls which measure how
    Americans feel about the major political issues
    of the day, including those on which Congress is
    debating and voting?
  • Very interested
  • Somewhat interested
  • Little interest
  • No interest
  • Do you favor banning the publication of the
    polling results prior to an election?
  • Yes
  • No

43
Should Polls Be Banned? Results
44
Our Interest in Polls
  • 76% of Americans were interested in polls about
    campaigns and elections.
  • 23% said they had little or no interest in polls
    about campaigns and elections.
  • 77% of Americans were interested in the results
    of polls which measure how Americans feel about
    the major political issues of the day, including
    those on which Congress is debating and voting.
  • 22% said they had little or no interest in polls
    about political issues of the day.

45
Compare Results
  • How do our results compare?
  • What can account for any differences?