Statistics Overview - PowerPoint PPT Presentation

Slides: 16
Provided by: ukyEdukd6
Learn more at: https://www.uky.edu
Transcript and Presenter's Notes

Title: Statistics Overview


1
Statistics Overview
  • Some New, Some Old, Some to Come

2
Science of Statistics
  • Descriptive Statistics: methods of summarizing
    or describing a set of data (tables, graphs,
    numerical summaries)
  • Inferential Statistics: methods of making
    inferences about a population based on the
    information in a sample

3
Levels of Measurement
  • Nominal: The numerical values just "name" the
    attribute uniquely; no ordering of the cases is
    implied.
  • Ordinal: Attributes can be rank-ordered; here,
    distances between attributes do not have any
    meaning.
  • Interval: The distance between attributes does
    have meaning.
  • Ratio: There is always an absolute zero that is
    meaningful; this means that you can construct a
    meaningful ratio.
  • It's important to recognize that there is a
    hierarchy implied in the level of measurement
    idea. At each level up the hierarchy, the current
    level includes all of the qualities of the one
    below it and adds something new. In general, it
    is desirable to have a higher level of
    measurement.

4
Variables
  • Individuals are the objects described by a set of
    data; they may be people, animals, or things.
  • A variable is any characteristic of an individual.
  • A categorical variable places an individual into
    one of several groups or categories.
  • A quantitative variable takes numerical values for
    which arithmetic operations make sense.
  • The distribution of a variable tells us what values
    it takes and how often it takes these values.
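
The idea of a distribution can be made concrete with a small frequency table. The sketch below is only an illustration: the eye-color data and the helper name `distribution` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical categorical variable: eye color recorded for eight individuals.
eye_color = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "green"]


def distribution(values):
    """Map each distinct value to (count, relative frequency)."""
    counts = Counter(values)
    n = len(values)
    return {value: (count, count / n) for value, count in counts.items()}


dist = distribution(eye_color)
for value, (count, share) in sorted(dist.items()):
    print(f"{value:6s} count={count} share={share:.3f}")
```

The same helper works for a quantitative variable, although such values are usually grouped into intervals before counting.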

5
Correlation
  • Correlation can be used to summarize the amount
    of linear association between two continuous
    variables x and y.
  • A positive association between the x and y
    variables is indicated by an increase in x
    accompanied by an increase in y.
  • A negative association is indicated by an
    increase in x accompanied by a decrease in y.
  • For more information, see
    http://www.anu.edu.au/nceph/surfstat/surfstat-home/1-4-2.html
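
As a rough illustration of positive and negative linear association, the sketch below computes Pearson's correlation coefficient by hand. The slide does not name a specific coefficient, so the choice of Pearson's r is an assumption, and the data and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y_up rises with x, y_down falls as x rises.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

print(pearson_r(x, y_up))    # close to +1: positive association
print(pearson_r(x, y_down))  # close to -1: negative association
```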

6
Chi-square
  • A chi-square statistic is used to investigate
    whether distributions of categorical variables
    differ from one another.
  • The chi-square distributions, like the t
    distributions, form a family described by a
    single parameter, the degrees of freedom.
  • df = (r - 1) × (c - 1), where r and c are the
    numbers of rows and columns in the table
  • For a detailed example, see
    http://math.hws.edu/javamath/ryan/ChiSquare.html
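
A minimal sketch of the calculation for a contingency table follows. It assumes the usual Pearson chi-square statistic (the sum of (observed - expected)² / expected over all cells) with no continuity correction; the 2 × 2 table and the function name `chi_square` are hypothetical.

```python
def chi_square(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table: rows are groups, columns are outcomes.
observed = [[20, 30],
            [30, 20]]
statistic, df = chi_square(observed)
print(statistic, df)  # every expected count is 25, so the statistic is 4.0 with df = 1
```

The statistic would then be compared with the chi-square distribution that has `df` degrees of freedom.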

7
Hypothesis Testing
  • Hypothesis testing in science is a lot like the
    criminal court system in the United States.
    Consider: how do we decide guilt?
  • Assume innocence until "proven" guilty.
  • Proof has to be "beyond a reasonable doubt."
  • Two possible decisions: guilty or not guilty
  • The jury cannot declare someone innocent

8
Statistical Hypotheses
  • Statistical Hypotheses are statements about
    population parameters.
  • Hypotheses are not necessarily true.
  • The hypothesis that we want to prove is called
    the alternative hypothesis, Ha.
  • The hypothesis that contradicts Ha is called
    the null hypothesis, Ho.
  • After taking the sample, we must either reject
    Ho and believe Ha, or fail to reject Ho because
    there was not sufficient evidence to reject it.

9
Type I and II Error
  • Consider the jury trial.
  • If a person is really innocent, but the jury
    decides (s)he's guilty, then they've sent an
    innocent person to jail. This is a Type I error
    (rejecting Ho when it is true).
  • If a person is really guilty, but the jury finds
    him/her not guilty, a criminal is walking free on
    the streets. This is a Type II error (failing to
    reject Ho when it is false).
  • In our criminal court system, a Type I error is
    considered more serious than a Type II error,
    so we protect against a Type I error to the
    detriment of a Type II error. This is typically
    the same in statistics.
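
The two error rates can be estimated by simulation. The sketch below is an illustration under stated assumptions, not something taken from the slides: it tests Ho: mean = 0 with a two-sided z-test on normal data whose standard deviation is known to be 1, and the sample size, the alternative mean of 0.5, the number of trials, and the seed are all arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_mean, n=30, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated samples in which Ho: mean = 0 is rejected."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # z statistic for a known standard deviation of 1.
        z = (sum(sample) / n) * sqrt(n)
        if abs(z) > critical_value:
            rejections += 1
    return rejections / trials


# Ho is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)
# Ho is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type I error rate:  {type_1_rate:.3f}")
print(f"estimated Type II error rate: {type_2_rate:.3f}")
```

The estimated Type I rate should land near the chosen alpha, while the Type II rate depends on how far the true mean is from 0 and on the sample size.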

10
P-value
  • The choice of alpha (the significance level) is
    subjective.
  • The smaller alpha is, the smaller the critical
    region. Thus, the harder it is to reject Ho.
  • The p-value of a hypothesis test is the smallest
    value of alpha for which Ho would have been
    rejected.
  • If the p-value is less than or equal to alpha,
    reject Ho.
  • If the p-value is greater than alpha, do not
    reject Ho.
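
The decision rule above can be sketched in a few lines. The example assumes a two-sided z-test, which the slide does not specify; the observed z value and the helper names `two_sided_p_value` and `decide` are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha):
    """Apply the rule: reject Ho when the p-value is at most alpha."""
    return "reject Ho" if p_value <= alpha else "do not reject Ho"


z = 2.5  # hypothetical observed test statistic
p = two_sided_p_value(z)

print(f"p-value = {p:.4f}")
print("alpha = 0.05:", decide(p, 0.05))
print("alpha = 0.01:", decide(p, 0.01))
```

The same p-value leads to different decisions at the two alpha levels, which is why the choice of alpha matters.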

11
Confidence Intervals
  • Statisticians prefer interval estimates to point
    estimates.
  • Point Estimate ± Critical Value × Standard
    Error
  • The degree of certainty that we are correct is
    known as the level of confidence.
  • Common levels are 90%, 95%, and 99%.
  • Increasing the level of confidence:
      • decreases the probability of error
      • increases the critical value
      • widens the interval
  • Increasing the sample size n decreases the width
    of the interval.
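
The formula can be tried on a small sample. This sketch assumes an interval for a mean with a normal critical value; for a sample this small a t critical value would usually be more appropriate, and the data and the function name `mean_confidence_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Point estimate +/- critical value * standard error, for a mean."""
    n = len(sample)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # Normal critical value (an assumption; see the note above about t).
    critical_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin


# Hypothetical measurements.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(sample, level)
    print(f"{level:.0%} interval: ({low:.3f}, {high:.3f}) width={high - low:.3f}")
```

Running it shows the interval widening as the level of confidence increases.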

12
Gamma
  • This is a statistic used in cross-tabulation
    tables.
  • Typically viewed as a nonparametric statistic.
  • The Gamma statistic is preferable to Spearman R
    or Kendall tau when the data contain many tied
    observations. Gamma is defined in terms of
    probabilities: specifically, it is computed as
    the probability that the rank orderings of the
    two variables agree minus the probability
    that they disagree, divided by 1 minus the
    probability of ties.
  • It is basically equivalent to Kendall tau, except
    that ties are explicitly taken into account.
  • Detailed discussions of the Gamma statistic can
    be found in Goodman and Kruskal (1954, 1959,
    1963, 1972), Siegel (1956), and Siegel and
    Castellan (1988).
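
The description above corresponds to counting concordant and discordant pairs of observations and ignoring tied pairs, which gives gamma = (C - D) / (C + D). The sketch below works on raw paired ordinal scores rather than on a cross-tabulation; the data and the function name `goodman_kruskal_gamma` are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Gamma = (concordant - discordant) / (concordant + discordant)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # the pair is ordered the same way on both variables
        elif product < 0:
            discordant += 1   # the pair is ordered in opposite ways
        # product == 0 means the pair is tied on x or y; ties are left out.
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical ordinal ratings (1 = low, 3 = high) for six individuals.
x = [1, 1, 2, 2, 3, 3]
y = [1, 2, 1, 3, 2, 3]

# 7 concordant, 2 discordant, and 6 tied pairs, so gamma = 5/9.
print(goodman_kruskal_gamma(x, y))
```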

13
Gamma
  • This statistic also tells us about the strength
    of a relationship.
  • Can be used with ordinal or higher-level data.
  • For a more detailed discussion of Lambda, Gamma
    and Tau, see
    http://72.14.209.104/search?qcache8ZS4_FvVqrgJms.cc.sunysb.edu/mlebo/_private/Classes/POL501/Lecture252012.pdfgammaANDlambdaANDtauANDstatisticshlenglusctclnkcd39

14
Considering Bias
  • A sample is expected to mirror the population
    from which it comes; however, there is no
    guarantee that any sample will be precisely
    representative of the population from which it
    comes. The difference between the sample and the
    population is referred to as bias.
  • Sampling Bias: A tendency to favor selecting
    people who have a particular characteristic or
    set of characteristics. Sampling bias is usually
    the result of a poor sampling plan. The most
    notable is nonresponse bias, when people
    with specific characteristics have no chance of
    appearing in the sample.
  • Non-Sampling Error: In surveys of personal
    characteristics, unintended errors may result
    from:
      • The manner in which the response is elicited
      • The social desirability of the persons surveyed
      • The purpose of the study
      • The personal biases of the interviewer or survey
        writer

15
Enjoy the exploration!
  • Questions or comments?