Nonparametric Tests of Significance - PowerPoint PPT Presentation

Provided by: drrober7

Transcript and Presenter's Notes


1
Nonparametric Tests of Significance
2
Nonparametric Tests
  • Nonparametric tests are tests that do not test
    hypotheses about population parameters.
  • We generally use nonparametric tests of
    significance under one of two conditions:
  • 1) The variable we are measuring cannot be
    tested with parametric tests.
  • 2) The underlying assumptions of parametric
    tests are not fulfilled. For example, the shape
    of the distribution is not normal (particularly
    if N is small).

3
  • This is typically the case when we have
    categorical variables (nominal scale data) or
    ordinally scaled variables.
  • Nominal scale data represent classes or
    categories. They have no quantitative properties
    (e.g., political party, religion, gender, numbers
    assigned to horses in a horse race).
  • Ordinal scale data are measurements that represent
    position or order in a series (e.g., ranking political
    candidates on popularity, movie ratings, places
    runners finished in a race).

4
Categorical Variables
  • Sometimes nominal data can only be broken down
    into two categories.
  • E.g., right or wrong, male or female, pass or
    fail, heads or tails
  • The data are then referred to as binomial and we
    have a dichotomous population.
  • In such a case, we define p as the probability of
    obtaining one category and q as the probability
    of obtaining the other.

5
The Binomial Test
  • Note that there is a special relationship between p
    and q:
  • p + q = 1
  • So q = 1 - p
  • If p = q = 0.5, we can solve binomial problems
    using Binomial Table M on page 352.

6
An Example
  • E.g., Tom claims he has a coin fixed to turn up
    heads. He flips it 20 times and gets 16 heads.
    Is his claim correct? Use normal decision rules.
  • We will call p the probability of getting heads
    in one toss.
  • p = 0.5
  • We will call q the probability of getting tails
    in one toss.
  • q = 0.5
  • N = 20, X = 16 (score on the trials)

7
  • H0: p = 0.5
  • H1: p > 0.5
  • The left column of Table M has values of N (5
    to 50).
  • There are four other columns: two for a one-tailed
    test (α at 0.05 and 0.01) and two for a
    two-tailed test (α at 0.05 and 0.01).
  • The numbers in these columns represent X or N - X
    (whichever is larger).

8
  • We will use X (16) since it is larger than N - X
    (= 4).
  • Go down the left column to where N = 20.
  • Go across to the one-tailed column where α is
    0.05. If your obtained X (or N - X) is greater
    than or equal to the number in the table, reject
    H0.
  • The number listed is 15.
  • Since our Xobs = 16 > Xcrit = 15, we reject
    H0. The coin does appear to be fixed (Xobs =
    16, p < 0.05).
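
The table lookup above can be cross-checked by computing the binomial tail probability directly. The following is a minimal sketch, not the method used in the slides: the function names `binomial_upper_tail` and `one_tailed_critical_value` are hypothetical, and it assumes the table's critical value is the smallest X whose upper-tail probability does not exceed α.

```python
from math import comb


def binomial_upper_tail(n: int, x: int, p: float = 0.5) -> float:
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def one_tailed_critical_value(n: int, alpha: float, p: float = 0.5) -> int:
    """Smallest x whose upper-tail probability is at most alpha (assumed table rule)."""
    for x in range(n + 1):
        if binomial_upper_tail(n, x, p) <= alpha:
            return x
    raise ValueError("no critical value exists for this n and alpha")


# Tom's coin: N = 20 flips, X = 16 heads, H0: p = 0.5
p_value = binomial_upper_tail(20, 16)
x_crit = one_tailed_critical_value(20, 0.05)
print(f"P(X >= 16) = {p_value:.4f}, critical X = {x_crit}")
```

Under these assumptions the computed critical value agrees with the 15 read from the table, and the tail probability for 16 heads falls below 0.05.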

9
Normal Approximation to the Binomial
  • What if p ≠ q (so neither is 0.5)?
  • If N is sufficiently large, we can treat the
    binomial distribution as though it were a normal
    distribution.
  • If Np and Nq are both greater than 10, we can use
    the normal approximation to the binomial.
  • If p = q = 0.5, we can still use the normal
    approximation to the binomial as long as N is at
    least 25.
  • In these cases, we can obtain a z-score.

10
  • Here's the formula: $z = \frac{X - Np}{\sqrt{Npq}}$

11
  • E.g., In psychology departments in Canada, ¾ of
    the students are female and only ¼ are male. A
    university administrator believes his university
    has a different percentage of males and females.
    A sample of 48 is randomly chosen and 14 are
    male. Is his claim correct? Use α = 0.05.
  • p = 0.25, q = 0.75, N = 48, X = 14
  • H0: p = 0.25
  • H1: p ≠ 0.25

12
Np = 48(0.25) = 12; Nq = 48(0.75) = 36
Both are greater than 10, so we may use the normal approximation to the binomial.
$z_{obs} = \frac{14 - 12}{\sqrt{48(0.25)(0.75)}} = \frac{2}{3} = 0.67$
Remember, zcrit for a two-tailed test with
α at 0.05 is 1.96.
Since zobs = 0.67 < zcrit = 1.96, we do not
reject H0. The university does not appear to have a
different percentage of males and females (zobs =
0.67, p > 0.05).
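
The z-score for this example can be reproduced in a few lines. This is a minimal sketch assuming the uncorrected formula z = (X - Np) / √(Npq), which is the form that matches the values reported in the slides; the function name `binomial_z` is hypothetical, and the guard simply encodes the "Np and Nq both greater than 10" rule of thumb.

```python
from math import sqrt


def binomial_z(x: int, n: int, p: float) -> float:
    """z-score for x successes in n trials under H0: P(success) = p."""
    q = 1 - p
    # Rule of thumb from the slides: both Np and Nq should exceed 10.
    if n * p <= 10 or n * q <= 10:
        raise ValueError("normal approximation not recommended: Np or Nq <= 10")
    return (x - n * p) / sqrt(n * p * q)


# Psychology students: N = 48, X = 14 males, H0: p = 0.25
z = binomial_z(14, 48, 0.25)
print(round(z, 2))  # 0.67
```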
13
Another Example
  • A police inquiry claims that 68% of all drivers
    speed on highways. A patroller uses his radar
    gun on 100 passing cars and finds that 86 of them
    were speeding. Is the inquiry accurate? Use α =
    0.01.
  • p = 0.68, q = 0.32, N = 100, X = 86
  • H0: p = 0.68
  • H1: p ≠ 0.68

14
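Np = 100(0.68) = 68; Nq = 100(0.32) = 32
Both are greater than 10, so we may use the normal approximation to the binomial.
$z_{obs} = \frac{86 - 68}{\sqrt{100(0.68)(0.32)}} = \frac{18}{4.66} = 3.86$
zcrit = 2.58
Since zobs = 3.86 > zcrit = 2.58, reject
H0. The inquiry was not accurate (zobs = 3.86, p
< 0.01).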
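
The critical values 1.96 and 2.58 used in the two examples come from the standard normal distribution. As a rough sketch of where they originate, the snippet below uses Python's standard-library `statistics.NormalDist`; the helper names `two_tailed_z_crit` and `two_tailed_p` are hypothetical and are not part of the slides.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_tailed_z_crit(alpha: float) -> float:
    """Critical z that leaves alpha/2 in each tail."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)


def two_tailed_p(z: float) -> float:
    """Two-tailed probability of a z-score at least this extreme."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


print(round(two_tailed_z_crit(0.05), 2))  # 1.96
print(round(two_tailed_z_crit(0.01), 2))  # 2.58
print(two_tailed_p(0.67) > 0.05)          # True: do not reject H0
print(two_tailed_p(3.86) < 0.01)          # True: reject H0
```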
15
χ² One Variable Case
  • If p ≠ q, we can also use χ².
  • Also referred to as a chi-square or goodness-of-fit
    test.
  • This technique provides a test of whether a
    significant difference exists between observed
    number of cases and expected number of cases.

16
Example
  • E.g., A recent survey by the Center for Applied
    Psychological Testing suggests that 55% of all
    people are introverts, whereas 45% are
    extraverts. A skeptical psychologist randomly
    selects a sample of 93 subjects and gives them a
    personality test. Based on the test, he finds
    that 38 are extraverts and 55 are introverts. Is
    he right to be skeptical? Use normal decision
    rules.

17
  • We'll make a table that compares expected
    frequencies (fe) to observed frequencies (fo).
  • H0: fo = fe
  • H1: fo ≠ fe

18
  • Now we calculate χ².

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
$\chi^2 = \frac{(38 - 42)^2}{42} + \frac{(55 - 51)^2}{51} = 0.38 + 0.31 = 0.69$
We now compare this to the χ²crit in Table B on
page 328.
19
  • The χ²crit changes as the degrees of freedom
    change.
  • df = k - 1, where k represents the number of
    categories.

df = k - 1 = 2 - 1 = 1
χ²crit = 3.841
Since χ² = 0.69 < χ²crit = 3.841, do not
reject H0. The survey is accurate (χ²(1) = 0.69,
p > 0.05).
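
The goodness-of-fit arithmetic can be sketched as a small function. This is an illustrative sketch only: the name `chi_square` is hypothetical, the expected frequencies are the rounded values 42 and 51 used in the slides, and the critical value 3.841 is taken from the text rather than computed.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Extraverts and introverts: fo = 38, 55; fe = 42, 51 (rounded, as in the slides)
chi2 = chi_square([38, 55], [42, 51])
df = 2 - 1                # k - 1 categories
chi2_crit = 3.841         # from the text, df = 1, alpha = 0.05
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 >= chi2_crit}")
```

Using the unrounded expected frequencies (93 × 0.45 = 41.85 and 93 × 0.55 = 51.15) gives a slightly smaller statistic of about 0.64, which leads to the same decision.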
20
χ² One Variable Case
  • Importantly, χ² can be used to analyze
    categorical data that are not binomial.
  • That is, it can be used to analyze data that have
    more than two outcomes.
  • E.g., A researcher wants to determine whether
    there is a difference among beer drinkers living
    in St. John's in their preference for brands of
    light beer.

21
An Example
  • 150 beer drinkers taste three brands of light
    beer. Their preferences are provided below. Is
    there a difference in the preference of the
    brands? Use α = 0.01.
  • H0: fo = fe
  • H1: fo ≠ fe

      Brand A   Brand B   Brand C   Total
fo       45        40        65      150
fe       50        50        50
22
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
$\chi^2 = \frac{(45 - 50)^2}{50} + \frac{(40 - 50)^2}{50} + \frac{(65 - 50)^2}{50} = 0.5 + 2.00 + 4.5 = 7.00$
df = k - 1 = 3 - 1 = 2
Since χ²obs = 7.00 < χ²crit = 9.210, do not
reject H0. There is no difference in preference
among the three brands (χ²(2) = 7.00, p > 0.01).
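
The beer example can be checked numerically, including the critical value. The sketch below relies on one fact that is not in the slides: for exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-χ²/2), so the critical value is -2 ln(α). This shortcut is valid only for df = 2; the variable names are illustrative.

```python
from math import exp, log

observed = [45, 40, 65]                 # Brand A, Brand B, Brand C
n = sum(observed)                       # 150 beer drinkers
expected = [n / len(observed)] * len(observed)  # equal preference under H0

chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1

alpha = 0.01
chi2_crit = -2 * log(alpha)             # closed form, valid only for df = 2
p_value = exp(-chi2 / 2)                # closed form, valid only for df = 2

print(f"chi-square = {chi2:.2f}, df = {df}")
print(f"critical value = {chi2_crit:.3f}, p = {p_value:.3f}")
```

Under this assumption the p-value comes out near 0.03, which is consistent with the decision in the slides: not significant at α = 0.01, although it would be at α = 0.05.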
23
χ² Test of Independence
  • So far, we've looked at cases in which we were
    investigating one categorical variable (e.g.,
    brand of beer, introvert/extravert).
  • However, we can also use χ² to deal with cases
    with more than one categorical variable, using the χ²
    test of independence.
  • When carrying out this test, we assume that the
    two variables are independent, i.e., one variable
    does not influence the other.

24
An Example
  • A bill in the United States has been proposed to
    lower the legal age for drinking to 18. A
    political scientist is interested in determining
    whether there is a relationship between political
    affiliation and opinions on this bill. He
    samples 200 registered Republicans and 200
    Democrats. Their opinions on the bill are
    presented below. Is there a difference in
    opinions between the Democrats and Republicans?
    Use α = 0.05.

25
  • H0: Opinions on the bill are independent of
    political affiliation.
  • H1: Opinions on the bill depend on political
    affiliation.

26
                      Opinion
                For   Undecided   Against   Row Totals
Republican       68       22        110        200
Democrat         92       18         90        200
Column Totals   160       40        200        400

Our expected frequencies will be based on
our obtained frequencies.
27
(No Transcript)
28
fe (c) = (row total)(column total)/N = (200)(200)/400 = 100
fe (d) = (row total)(column total)/N = (200)(160)/400 = 80
fe (e) = (row total)(column total)/N = (200)(40)/400 = 20
fe (f) = (row total)(column total)/N = (200)(200)/400 = 100
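
The expected-frequency rule, (row total)(column total)/N, can be applied to every cell at once. The following is a small sketch with illustrative variable names, using the observed counts from the opinion table; rows are Republican and Democrat, and columns are For, Undecided, and Against.

```python
# Observed frequencies: rows = Republican, Democrat; columns = For, Undecided, Against
observed = [
    [68, 22, 110],
    [92, 18, 90],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# fe for each cell = (row total)(column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

print(row_totals, col_totals, n)
for row in expected:
    print(row)
```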
29
Now we follow the same formula as before to
calculate χ².
30
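$\chi^2 = \frac{(68 - 80)^2}{80} + \frac{(22 - 20)^2}{20} + \frac{(110 - 100)^2}{100} + \frac{(92 - 80)^2}{80} + \frac{(18 - 20)^2}{20} + \frac{(90 - 100)^2}{100}$
$\chi^2 = 1.8 + 0.2 + 1.0 + 1.8 + 0.2 + 1.0 = 6.00$
df = (r - 1)(c - 1), where r = number of rows and c = number of columns
df = (2 - 1)(3 - 1) = 2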
31
χ²crit = 5.991
Since χ²obt = 6.00 > χ²crit = 5.991, we
reject H0. Opinion on the bill depends
on political affiliation (χ²(2) = 6.00, p < 0.05).
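
The whole test of independence can be put together in one short routine. This is a hedged sketch rather than the procedure from the slides: the function name `chi_square_independence` is hypothetical, and the p-value line uses the closed form exp(-χ²/2), which holds only when df = 2, as it does for this 2 × 3 table.

```python
from math import exp


def chi_square_independence(table):
    """Return (chi-square, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (fo - fe) ** 2 / fe
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows = Republican, Democrat; columns = For, Undecided, Against
chi2, df = chi_square_independence([[68, 22, 110], [92, 18, 90]])
p_value = exp(-chi2 / 2)  # closed form, valid only for df = 2
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

The result is close to the α = 0.05 boundary, which matches how narrowly the obtained 6.00 exceeds the critical 5.991.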