Bivariate Statistics - PowerPoint PPT Presentation

About This Presentation
Title:

Bivariate Statistics


Slides: 18
Provided by: lax3

Transcript and Presenter's Notes


1
Bivariate Statistics
  • GTECH 201
  • Lecture 17

2
Overview of Today's Topics
  • Two-Sample Difference of Means Test
  • Matched Pairs (Dependent Sample) Tests
  • Chi-Square Goodness of Fit Test
  • Kolmogorov-Smirnov Test

3
Differences Between Two Samples
  • Are there significant differences between the two
    samples?
  • If sample differences are significant, then
  • We can infer that the samples were drawn from
    truly different populations, and vice versa
  • Extending hypothesis testing
  • Statistic (mean)
  • Relationship between samples
  • Independent
  • Dependent

4
Two-Sample Difference of Means
  • Numerator
  • Actual difference between sample means
  • Denominator
  • Standard error of the difference of the means (a
    measure of expected sampling error)
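
In symbols, the ratio described on this slide has the general form below. The notation is assumed here for illustration and is not taken from the original slide; which standard error formula applies depends on the assumption made about the population variances.

```latex
\text{test statistic} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}
```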

5
Pooled Variance/Separate Variance
  • If the population variances are equal, use the
    pooled variance (PV) estimate
  • If the population variances are unknown but
    assumed equal, then use the modified formula
  • If the population variances are assumed to be
    unequal, then use the separate variance (SV)
    estimate
  • Sample variances are considered best estimators
    of population variances
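
A minimal sketch of the pooled versus separate variance choice, assuming the standard textbook forms of the two standard errors (the exact formulas on the original slide are not preserved and may differ in detail). The function name `two_sample_t` and its `pooled` flag are illustrative, not from the lecture.

```python
import math
from statistics import mean, variance


def two_sample_t(x1, x2, pooled=True):
    """Difference-of-means t statistic for two independent samples.

    Assumption: sample variances (n - 1 denominator) stand in for the
    unknown population variances.
    """
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1), variance(x2)
    if pooled:
        # Pooled variance: weighted average of the two sample variances
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        # Separate variance: each sample contributes its own variance term
        se = math.sqrt(v1 / n1 + v2 / n2)
    # Numerator: actual difference between sample means
    return (mean(x1) - mean(x2)) / se
```

With equal sample sizes the two standard errors coincide; they diverge when the sample sizes and variances both differ.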

6
Matched Pairs (Dependent Sample)
  • One set of observations (units)
  • Same location and/or same individuals
  • One variable, two time periods
  • Two variables, one time period
  • Absence of two independent samples
  • When two sets of data are collected for one group
    of observations, the samples are
  • Dependent
  • Matched pairs difference test (is the appropriate
    inferential test)
  • Each unit in the sample has two values (a matched
    pair)
  • Parametric or nonparametric

7
Wilcoxon Matched-pairs Signed Ranks Test
  • Random sample
  • Ordinal or downgraded to ordinal
  • H0: ranked matched-pair differences are equal
  • Test statistic: T (a rank sum)
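
The rank sum T can be sketched as follows. This assumes one common convention: zero differences are dropped, tied absolute differences share their average rank, and T is the smaller of the positive and negative rank sums. The slide does not say which convention the lecture used, and the function name is illustrative.

```python
def wilcoxon_signed_rank_t(first, second):
    """Wilcoxon matched-pairs signed-ranks statistic T (assumed convention)."""
    # Difference for each matched pair; zero differences are dropped
    diffs = [b - a for a, b in zip(first, second) if b != a]
    # Order by absolute difference so list position determines rank
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    t_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(t_plus, t_minus)
```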

8
Matched Pairs t Test
  • The two samples are dependent; the matched pairs
    themselves are independent of each other
  • In this situation, the t-test considers the
    difference between the values for each matched
    pair
  • The greater the difference (d), the more
    dissimilar the results of the two values within
    the matched pair
  • The mean difference ($\bar{d}$) is determined for the set of
    all matched pairs in the sample
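
A minimal sketch of the calculation described on this slide, assuming the usual form of the statistic: the mean of the pair differences divided by its standard error. The names `matched_pairs_t`, `first`, and `second` are illustrative.

```python
import math
from statistics import mean, stdev


def matched_pairs_t(first, second):
    """t statistic for matched pairs: mean difference over its standard error."""
    # d: difference between the two values of each matched pair
    d = [b - a for a, b in zip(first, second)]
    d_bar = mean(d)   # mean difference across all pairs
    s_d = stdev(d)    # sample standard deviation of the differences
    return d_bar / (s_d / math.sqrt(len(d)))
```

Under the usual assumptions the result is compared against a t distribution with n - 1 degrees of freedom, where n is the number of pairs.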

9
Wilcoxon Rank Sum W
  • Non-parametric difference of means test
  • Measures magnitude of the differences in ranked
    positions
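
As a rough sketch, W can be computed by ranking the pooled observations and summing the ranks that belong to one sample. Taking the first sample's rank sum as W and averaging tied ranks are assumptions made here; textbooks differ on which sample's sum is reported.

```python
def rank_sum_w(sample1, sample2):
    """Sum of the pooled ranks belonging to sample1 (ties get average ranks)."""
    # Tag each value with its sample of origin, then sort the pooled data
    pooled = sorted([(v, 1) for v in sample1] + [(v, 2) for v in sample2])
    w = 0.0
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        # Add the shared rank once for each tied value that came from sample1
        w += avg_rank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 1)
        i = j + 1
    return w
```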

10
Goodness of Fit Tests
  • Comparing an actual or observed frequency
    distribution to some expected frequency
    distribution
  • Used to test the hypothesis that a set of data
    has a particular frequency distribution
  • Confirm or deny the relevance or validity of a
    particular theory
  • Verify assumptions about samples

11
Chi-Squared Distributions
  • The total area under a chi-squared curve is equal
    to 1
  • Chi-squared curve starts at 0 on the horizontal
    axis and extends indefinitely to the right,
    approaching but not touching the horizontal axis
  • The chi-squared curve is right skewed
  • As the number of degrees of freedom becomes
    larger, chi-squared curves look increasingly like
    normal curves

12
χ² Function
  • [Animation: shape of the chi-square distribution as
    the degrees of freedom increase (1, 2, 5, 10, 25,
    and 50)]
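
The drift toward a normal shape can be made concrete with the distribution's moments. A chi-square distribution with df degrees of freedom has mean df, variance 2·df, and skewness sqrt(8/df), so the skewness shrinks toward 0 (the value for a normal curve) as df grows. The short sketch below tabulates these for the degrees of freedom listed on the slide; the function name is illustrative.

```python
import math


def chi2_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution."""
    return df, 2 * df, math.sqrt(8 / df)


# Degrees of freedom shown in the slide's animation
for df in (1, 2, 5, 10, 25, 50):
    mean, var, skew = chi2_shape(df)
    print(f"df={df:2d}  mean={mean:2d}  variance={var:3d}  skewness={skew:.2f}")
```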

13
Goodness of Fit Tests
  • Characteristics of the expected frequency
    distribution
  • Uniform or equal
  • Proportional or unequal
  • Normal (theoretical)
  • The chi-squared statistic compares
  • Observed frequency counts of a single variable
    (organized into nominal or ordinal categories)
  • An expected distribution of frequency counts
    organized in the same categories

14
Rules for Using the χ² Test
  • Samples must be taken at random
  • Variables must be organized in nominal or ordinal
    categories
  • Must use absolute frequency counts
  • Cannot be applied if the observations or sampling
    units are relative frequencies such as
    percentages, proportions, or rates
  • If there are 2 nominal/ordinal categories, then
  • Both expected frequency counts must be at least
    five
  • If there are 3 or more nominal/ordinal
    categories, then
  • No expected frequency should be less than two
  • At the most, only one-fifth of the frequency
    counts can be less than five
  • This may be a reason to combine or reorganize
    categories
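
The expected-frequency rules above can be encoded as a quick check before running the test. This is a sketch of the rules exactly as stated on the slide; the function name `expected_counts_ok` is hypothetical.

```python
def expected_counts_ok(expected):
    """Check the slide's minimum expected-frequency rules."""
    k = len(expected)
    if k == 2:
        # Two categories: both expected counts must be at least five
        return all(e >= 5 for e in expected)
    if k >= 3:
        # Three or more: none below two, and at most one-fifth below five
        below_five = sum(1 for e in expected if e < 5)
        return min(expected) >= 2 and below_five <= k / 5
    return False  # fewer than two categories: the test does not apply
```

A `False` result suggests combining or reorganizing categories before applying the test.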

15
Test Statistic
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
Where:
  • i = 1, 2, ..., k (i.e., the different categories)
  • O_i is the observed frequency in a particular
    category
  • E_i is the expected frequency in that same
    category
  • k is the total number of categories
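
A minimal sketch of the statistic: for each category, square the gap between observed and expected counts, divide by the expected count, and sum over all k categories. The function name and the example counts in the usage note are hypothetical, not data from the lecture.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For example, with made-up observed counts `[18, 22, 20, 40]` and a uniform expectation of 25 per category, the statistic is (49 + 9 + 25 + 225) / 25 = 12.32.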
16
Interpreting the Value of χ²
  • Null and Alternative hypothesis
  • If the chi-squared value is small, i.e., the
    observed and expected frequencies are similar,
    then the goodness of fit is strong
  • Do not reject the null hypothesis
  • If the chi-squared value is large, the goodness
    of fit is weak
  • Reject the null hypothesis
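
One way to make "small" and "large" concrete is to compare the statistic with a tabulated critical value. The sketch below assumes a 0.05 significance level and k - 1 degrees of freedom (appropriate when the expected distribution is fully specified in advance); neither choice is stated on the slide, and the function name is illustrative.

```python
# Upper-tail chi-square critical values at the 0.05 level, by degrees of freedom
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def goodness_of_fit_decision(chi_sq, k):
    """Return the test decision for k categories at the 0.05 level."""
    df = k - 1  # assumption: no parameters estimated from the data
    if df not in CRITICAL_05:
        raise ValueError("critical value not tabulated for this df")
    if chi_sq > CRITICAL_05[df]:
        return "reject H0"
    return "do not reject H0"
```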

17
Kolmogorov-Smirnov
  • Goodness of fit test
  • Uses data in ordinal categories, or
    interval/ratio data downgraded to ordinal
    categories
  • Population is continuously distributed
  • Null and Alternative hypothesis
  • Cumulative relative frequencies are compared with
    cumulative frequencies expected for a normal
    distribution
  • K-S test statistic (D) is the maximum absolute
    difference between two sets of cumulative values
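
A rough sketch of D for data grouped into ordered categories: accumulate the observed relative frequencies and track the largest absolute gap from the expected cumulative values. The expected cumulative proportions are supplied by the caller here (for a normal distribution they would come from a z table); the function name and inputs are illustrative.

```python
def ks_statistic(observed_counts, expected_cumulative):
    """Maximum absolute gap between observed and expected cumulative values.

    observed_counts: frequency counts in ordered categories
    expected_cumulative: expected cumulative relative frequency per category
    """
    n = sum(observed_counts)
    running = 0
    d = 0.0
    for count, expected in zip(observed_counts, expected_cumulative):
        running += count
        # Compare the observed cumulative relative frequency with the expected one
        d = max(d, abs(running / n - expected))
    return d
```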