13. NONPARAMETRIC STATISTICS


Provided by: ila


1
13. NONPARAMETRIC STATISTICS

13.1 SINGLE POPULATION INFERENCE: THE SIGN TEST
13.2 THE MANN-WHITNEY U TEST
13.3 COMPARING TWO POPULATIONS: THE WILCOXON RANK
SUM TEST FOR INDEPENDENT SAMPLES
13.4 COMPARING TWO POPULATIONS: THE WILCOXON
SIGNED RANK TEST FOR THE PAIRED DIFFERENCE
EXPERIMENT
13.5 THE KRUSKAL-WALLIS H-TEST FOR A COMPLETELY
RANDOMIZED DESIGN
13.6 THE FRIEDMAN Fr-TEST FOR A RANDOMIZED BLOCK
DESIGN
13.7 SPEARMAN'S RANK CORRELATION COEFFICIENT

2
13. NONPARAMETRIC STATISTICS

13.0.1 NONPARAMETRIC STATISTICAL METHODS
Statistical techniques for comparing two or more
populations that are based on an ordering of the
sample measurements according to their relative
magnitudes, and that require fewer or less
stringent assumptions concerning the nature of
the probability distributions of the populations.
13.0.2 NONPARAMETRIC TESTS
The nonparametric counterparts of the t- and F-tests
compare the probability distributions of the sampled
populations rather than specific parameters of
these populations (such as the means and
variances).
Most nonparametric methods use the relative ranks
of the sample observations. These tests are
particularly valuable when we are unable to obtain
numerical measurements of the phenomena but are
able to rank them in comparison to each other.
Rank statistics: statistics based on the ranks of
the measurements.

3
13.1 SINGLE POPULATION INFERENCE: THE SIGN
TEST

The sign test is a relatively simple nonparametric
procedure for testing hypotheses about the central
tendency of a nonnormal probability distribution. It
provides inferences about the population median η
rather than the population mean µ.
η is the 50th percentile of the distribution and,
as such, is less affected than the mean by the
skewness of the distribution and the presence of
outliers (extreme observations).

4
Table 13a: number of defective bolts produced by each machine on days 1 to 12 (values not preserved)

The sign test provides a simple nonparametric test
in the case of paired samples.
The test consists of taking the difference
between the numbers of defective bolts for each
day and recording only the sign of the difference;
e.g., for day 1 we have 47 - 71, which is negative.
From Table 13a, we obtain a sequence of 12 signs
containing 3 pluses and 9 minuses. Under the null
hypothesis of no difference, the number of pluses
has a binomial distribution with n = 12 and p = 0.5.
Using a two-tailed test of this distribution at
the 0.05 significance level, we cannot reject the
hypothesis that there is no difference between the
machines.
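The conclusion for the bolt example can be checked with a short calculation. This is a minimal sketch, assuming the number of plus signs is Binomial(12, 0.5) under the null hypothesis; the function name `sign_test_two_tailed` is illustrative and does not come from the presentation.

```python
from math import comb


def sign_test_two_tailed(n_plus, n_minus):
    """Exact two-tailed sign test p-value under Binomial(n, 0.5)."""
    n = n_plus + n_minus
    s = max(n_plus, n_minus)  # larger of the two sign counts
    upper_tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Counts taken from the example: 3 pluses and 9 minuses over 12 days.
p_value = sign_test_two_tailed(3, 9)
print(round(p_value, 4))  # 0.146, which exceeds 0.05
```

Because the p-value is larger than 0.05, the hypothesis of no difference between the machines is not rejected, in agreement with the slide.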

5
A) Sign Test for a Population Median η

ONE-TAILED TEST
H0: η = η0
Ha: η > η0 (or Ha: η < η0)
Test statistic:
S = number of sample measurements greater
than η0 (or S = number of measurements less
than η0 when Ha: η < η0).

TWO-TAILED TEST
H0: η = η0
Ha: η ≠ η0
Test statistic:
S = larger of S1 and S2,
where S1 is the number of measurements less
than η0 and S2 is the number of measurements
greater than η0.

6
Observed significance level:
One-tailed: p-value = P(x ≥ S)
Two-tailed: p-value = 2P(x ≥ S)
where x has a binomial distribution with
parameters n and p = 0.5 (use Table
II, Appendix A).
Rejection region:
Reject H0 if p-value ≤ 0.05.
Assumption: The sample is selected
randomly from a continuous probability
distribution.
Note: No assumptions
need to be made about the shape of the
probability distribution.
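The one-tailed procedure (count S, then compute the p-value P(x ≥ S) from a binomial distribution with p = 0.5) can be sketched as follows. The data, the hypothesized median, and the function name are hypothetical. Dropping measurements exactly equal to η0 is an assumption made here; with a continuous distribution such values should not occur.

```python
from math import comb


def sign_test_upper(sample, eta0):
    """One-tailed sign test of H0: eta = eta0 against Ha: eta > eta0."""
    # Assumption: measurements equal to eta0 carry no sign and are dropped.
    kept = [x for x in sample if x != eta0]
    n = len(kept)
    s = sum(1 for x in kept if x > eta0)  # S = number of values above eta0
    p_value = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n  # P(x >= S)
    return s, n, p_value


sample = [1.2, 3.4, 5.1, 6.0, 6.3, 7.7, 8.2, 9.5]  # hypothetical measurements
s, n, p_value = sign_test_upper(sample, eta0=3.0)
print(s, n, p_value)  # 7 8 0.03515625
```

Here 7 of the 8 measurements exceed the hypothesized median, so H0 would be rejected at the 0.05 level.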
7
B) Large-Sample Sign Test for a Population
Median η

ONE-TAILED TEST
H0: η = η0
Ha: η > η0 (or Ha: η < η0)

TWO-TAILED TEST
H0: η = η0
Ha: η ≠ η0

Test statistic (both cases):
$$z = \frac{(S - 0.5) - 0.5n}{0.5\sqrt{n}}$$

8
Note: S is calculated as shown in the previous
box. We subtract 0.5 from S as the correction
for continuity. The null hypothesized mean value
is np = 0.5n, and the standard deviation is
$$\sqrt{npq} = \sqrt{n(0.5)(0.5)} = 0.5\sqrt{n}$$

Rejection region (one-tailed): z > zα
Rejection region (two-tailed): z > zα/2
where tabulated z values can be found inside the
front cover.
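As an illustration of the large-sample statistic, the sketch below applies the continuity correction and standardizes S with the binomial mean 0.5n and standard deviation 0.5√n. The counts are hypothetical, and the standard library's `NormalDist` stands in for the printed z table.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_sign_z(s, n):
    """Continuity-corrected S, standardized with mean 0.5n and sd 0.5*sqrt(n)."""
    return ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))


# Hypothetical counts: 30 of 40 measurements exceed the hypothesized median.
z = large_sample_sign_z(s=30, n=40)
one_tailed_p = 1 - NormalDist().cdf(z)  # upper-tail area beyond z
print(round(z, 3), round(one_tailed_p, 4))
```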

9
13.2 The Mann-Whitney U Test

This test is used to decide whether or not
there is a difference between two samples or,
equivalently, whether or not they come from the
same population.

The Mann-Whitney U test consists of the following
steps:
1. Combine all sample values in an array from the
smallest to the largest, and assign ranks to all
of these values. If two or more sample values are
identical, each is assigned a rank
equal to the mean of the ranks that would
otherwise be assigned.
2. Find the sum of the ranks for each of the samples
(R1 and R2), where N1 and N2 are the respective
sample sizes (for convenience, choose N1 ≤ N2).
3. To test the difference between the rank sums, use
the statistic
$$U = N_1 N_2 + \frac{N_1(N_1 + 1)}{2} - R_1$$
corresponding to sample 1.

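The sampling distribution of U is symmetrical and
has a mean and variance given, respectively, by
the formulas
$$\mu_U = \frac{N_1 N_2}{2}, \qquad \sigma_U^2 = \frac{N_1 N_2 (N_1 + N_2 + 1)}{12}$$
If N1 and N2 are both at least equal to 8, it turns
out that the distribution of U is nearly normal, so that
$$z = \frac{U - \mu_U}{\sigma_U}$$
is approximately normally distributed with mean 0 and variance 1.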

Remark 3
A value corresponding to sample 2 is given by the
statistic
$$U_2 = N_1 N_2 + \frac{N_2(N_2 + 1)}{2} - R_2$$
Writing U1 for the statistic corresponding to
sample 1, the two values are related by
$$U_1 + U_2 = N_1 N_2$$
We also have
$$R_1 + R_2 = \frac{N(N + 1)}{2}$$
where N = N1 + N2.
Remark 4
The statistic U corresponding to sample 1 is the
total number of
times that sample 1 values precede sample 2
values when all sample values are arranged in
increasing order of magnitude. This provides an
alternative counting method for finding U.
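The steps and remarks above can be sketched in a few lines. This assumes the usual rank-sum form U = N1·N2 + N1(N1 + 1)/2 - R1 for sample 1 (and the analogous form for sample 2). The data and function names are hypothetical, and counting a tie as half a precedence is an assumption of this sketch.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U1, U2, R1, R2) computed from the pooled ranks."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, r1, r2


def count_precedences(sample1, sample2):
    """Remark 4: number of times a sample 1 value precedes a sample 2 value."""
    return sum(1.0 if a < b else 0.5 if a == b else 0.0
               for a in sample1 for b in sample2)


sample1 = [18, 22, 25, 31]      # hypothetical data
sample2 = [20, 27, 29, 33, 35]  # hypothetical data
u1, u2, r1, r2 = mann_whitney_u(sample1, sample2)
print(u1, u2)  # 15.0 5.0
```

For these data U1 + U2 = 20 = N1·N2 and R1 + R2 = 45 = N(N + 1)/2, and the counting method returns the same value as U1.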

13
13.3 COMPARING TWO POPULATIONS: THE
WILCOXON RANK SUM TEST FOR INDEPENDENT SAMPLES

Wilcoxon Rank Sum Test
A test of the hypothesis that the probability
distributions associated with the two populations
are equivalent.
Rank Sums
The totals of the ranks for each of the two
samples.

14
13.3.1 Wilcoxon Rank Sum Test: Independent
Samples

ONE-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for
population A is shifted to the right of that for
B.
Test statistic:
The rank sum T associated with the sample with
fewer measurements (if sample sizes are equal,
either rank sum can be used).
Rejection region:
Assuming the smaller sample size is associated
with distribution A (if sample sizes are equal, we
use the rank sum TA), we reject the null
hypothesis if
TA ≥ TU
where TU is the upper value given by Table XII in
Appendix A for the chosen one-tailed α value.

TWO-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for population
A is shifted to the left or to the right of that
for B.
Test statistic:
The rank sum T associated with the sample with
fewer measurements (if sample sizes are equal,
either rank sum can be used).
Rejection region:
T ≤ TL or T ≥ TU
where TL is the lower value given by Table XII in
Appendix A for the chosen two-tailed α value and
TU is the upper value from Table XII.

17
Note: If the one-sided alternative is that the
probability distribution for A is shifted to the
left of B (and TA is the test statistic), we
reject the null hypothesis if TA ≤ TL.

Assumptions:
1. The two samples are random and
independent.
2. The two probability distributions
from which the samples are drawn
are continuous.
Ties:
Assign tied measurements the average of the ranks
they would receive if they were unequal but
occurred in successive order. For example, if the
third-ranked and fourth-ranked measurements are
tied, assign each a rank of
(3 + 4)/2 = 3.5.
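The tie rule and the rank sums can be illustrated with a small helper. This is a sketch with hypothetical data; `average_ranks` is an illustrative name, not something defined in the presentation.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


# Third- and fourth-ranked values tie, so each receives (3 + 4)/2 = 3.5.
print(average_ranks([10, 12, 15, 15, 20]))  # [1.0, 2.0, 3.5, 3.5, 5.0]

sample_a = [3, 5, 5]     # hypothetical sample with fewer measurements
sample_b = [1, 5, 8, 9]  # hypothetical second sample
ranks = average_ranks(sample_a + sample_b)
t_a = sum(ranks[:len(sample_a)])  # rank sum for sample A
t_b = sum(ranks[len(sample_a):])  # rank sum for sample B
print(t_a, t_b)  # 10.0 18.0
```

The value TA would then be compared with the tabulated TL and TU for the chosen α.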

18
13.3.2 Wilcoxon Rank Sum Test: Large
Independent Samples

ONE-TAILED TEST
H0: The two sampled populations have
identical probability distributions.
Ha: The probability distribution for
population A is shifted to the right of that for
B.
Rejection region: z > zα

TWO-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for population
A is shifted to the left or to the right of
that for B.
Rejection region: |z| > zα/2

Test statistic (both cases):
$$z = \frac{T_A - \frac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
where TA is the rank sum for sample A, n1 is the
size of sample A, and n2 is the size of sample B.
Assumptions: n1 ≥ 10 and n2 ≥ 10
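For large samples the rank sum is standardized before it is compared with the z table. The sketch below assumes the standard null mean n1(n1 + n2 + 1)/2 and variance n1·n2(n1 + n2 + 1)/12 for the rank sum of sample A; the rank sum and sample sizes are hypothetical.

```python
from math import sqrt


def rank_sum_z(t_a, n1, n2):
    """Standardize the rank sum of sample A (size n1) under H0."""
    mean_t = n1 * (n1 + n2 + 1) / 2
    sd_t = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (t_a - mean_t) / sd_t


# Hypothetical case: rank sum 130 for sample A, with n1 = n2 = 10.
z = rank_sum_z(t_a=130, n1=10, n2=10)
print(round(z, 3))
```

A one-tailed test at α = 0.05 would compare this z with the tabulated value 1.645.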

20
13.4 COMPARING TWO POPULATIONS: THE WILCOXON
SIGNED RANK TEST FOR THE PAIRED DIFFERENCE
EXPERIMENT

13.4.1 Wilcoxon Signed Rank Test for a
Paired Difference Experiment

ONE-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for
population A is shifted to the right of that for
population B.
Test statistic:
T-, the negative rank sum (we assume the
differences are computed by subtracting each
paired B measurement from the corresponding A
measurement).
Rejection region:
T- ≤ T0, where T0 is found in Table XIII (in
Appendix A) for the one-tailed significance
level α and the number of untied pairs, n.

TWO-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for population
A is shifted to the right or to the left of that
for population B.
Test statistic:
T, the smaller of the positive and negative rank
sums T+ and T-.
Rejection region:
T ≤ T0, where T0 is found in Table XIII (in
Appendix A) for the two-tailed significance
level α and the number of untied pairs, n.

22
Note: If the alternative hypothesis is that the
probability distribution for A is shifted to the
left of B, we use T+ as the test statistic and
reject H0 if T+ ≤ T0.

Assumptions:
1. The sample of differences is randomly
selected from the population of differences.
2. The probability distribution from which the
sample of paired differences is drawn is
continuous.
Ties:
Assign tied absolute differences the average of
the ranks they would receive if they were unequal
but occurred in successive order. For example, if
the third-ranked and fourth-ranked differences are
tied, assign both a rank of (3 + 4)/2 = 3.5.
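The positive and negative rank sums can be computed as sketched below. The paired data and function names are hypothetical. Dropping pairs with a zero difference follows the reference to "untied pairs" in the rejection region, which is read here as an assumption about how such pairs are handled.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def signed_rank_sums(sample_a, sample_b):
    """Return (T+, T-, n) for differences A - B, ignoring zero differences."""
    diffs = [a - b for a, b in zip(sample_a, sample_b) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])  # rank absolute differences
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus, len(diffs)


sample_a = [12, 15, 9, 20, 17, 11]  # hypothetical paired measurements
sample_b = [10, 18, 8, 14, 17, 15]
t_plus, t_minus, n = signed_rank_sums(sample_a, sample_b)
print(t_plus, t_minus, n)  # 8.0 7.0 5
```

For a two-tailed test, T = min(T+, T-) = 7 would be compared with the tabulated T0 for n = 5 untied pairs.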

23
13.4.2 Wilcoxon Signed Rank Test for a Paired
Difference Experiment: Large Sample

ONE-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for
population A is shifted to the right of that for
population B.

TWO-TAILED TEST
H0: The two sampled populations have identical
probability distributions.
Ha: The probability distribution for population
A is shifted to the right or to the left of that
for population B.

24
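Test statistic:
$$z = \frac{T_+ - \frac{n(n + 1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}}}$$
where T+ is the positive rank sum and n is the
number of untied pairs.

Rejection region (one-tailed): z > zα
Rejection region (two-tailed): |z| > zα/2
Assumptions: n ≥ 25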
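A minimal sketch of the large-sample statistic follows, assuming the standard null mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24 of the positive rank sum T+. The value of T+ and the number of pairs are hypothetical.

```python
from math import sqrt


def signed_rank_z(t_plus, n):
    """Standardize the positive rank sum T+ for n untied pairs under H0."""
    mean_t = n * (n + 1) / 4
    sd_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_plus - mean_t) / sd_t


# Hypothetical case: T+ = 240 from n = 25 untied pairs.
z = signed_rank_z(t_plus=240, n=25)
print(round(z, 3))
```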

25
13.5 THE KRUSKAL-WALLIS H-TEST FOR A
COMPLETELY RANDOMIZED DESIGN

13.5.1 The Kruskal-Wallis H Test
This test is for deciding whether or not k
samples come from the same population.
Suppose that we have
k samples of sizes N1, N2, N3, ..., Nk, with
N = total size of all samples (N = N1 + N2 + N3 +
... + Nk).
Suppose further that the data from all the
samples taken together are ranked and that the
sums of the ranks for the k samples are R1, R2,
..., Rk, respectively. The test statistic is
$$H = \frac{12}{N(N + 1)} \sum_{j=1}^{k} \frac{R_j^2}{N_j} - 3(N + 1)$$
The sampling distribution of H is
very nearly a chi-square distribution with k - 1
degrees of freedom, provided that N1, N2, N3, ...,
Nk are all at least 5.
The H test provides a nonparametric method in the
analysis of variance (ANOVA)
for one-way classification, or one-factor
experiments, and generalizations can be made.

13.5.2 The Kruskal-Wallis H-Test for Comparing
p Probability Distributions
H0: The p probability distributions are identical.
Ha: At least two of the p probability
distributions differ in location.

27
Test statistic:
$$H = \frac{12}{n(n + 1)} \sum_{j=1}^{p} \frac{R_j^2}{n_j} - 3(n + 1)$$

where
nj = number of measurements in sample j
Rj = rank sum for sample j, where the rank of
each measurement is computed according to its
relative magnitude in the totality of
data for the p samples
n = total sample size = n1 + n2 + ... + np

28
Rejection region: H > χ²α with (p - 1) degrees
of freedom

Assumptions:
1. The p samples are random and
independent.
2. There are 5 or more measurements in each
sample.
3. The p probability distributions from
which the samples are drawn are continuous.
Ties:
Assign tied measurements the average of the ranks
they would receive if they were unequal but
occurred in successive order.
For example, if the third-ranked and
fourth-ranked measurements are tied, assign both
a rank of (3 + 4)/2 = 3.5. The number of ties should
be small relative to the total number of
observations.
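The H statistic can be sketched as follows, assuming the usual form H = 12/(n(n + 1)) · Σ Rj²/nj - 3(n + 1) with ranks taken over the pooled data. The three samples and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """H statistic from rank sums R_j of the pooled, ranked data."""
    pooled = [x for sample in samples for x in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for sample in samples:
        r_j = sum(ranks[start:start + len(sample)])  # rank sum of sample j
        total += r_j ** 2 / len(sample)
        start += len(sample)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


samples = [                # hypothetical data, 5 measurements per sample
    [27, 31, 35, 38, 40],
    [22, 25, 29, 33, 36],
    [18, 20, 23, 26, 30],
]
h = kruskal_wallis_h(samples)
print(round(h, 2))  # 6.48
```

With p = 3 samples there are 2 degrees of freedom, and the tabulated chi-square value for α = 0.05 is 5.991, so H = 6.48 would fall in the rejection region.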

29
13.6 THE FRIEDMAN Fr-TEST FOR A RANDOMIZED
BLOCK DESIGN

13.6.1 Friedman Fr-Test for a Randomized
Block Design
H0: The probability distributions for the p
treatments are identical.
Ha: At least two of the probability
distributions differ in location.

30
Test statistic:
$$F_r = \frac{12}{bp(p + 1)} \sum_{j=1}^{p} R_j^2 - 3b(p + 1)$$

where
b = number of blocks
p = number of treatments
Rj = rank sum of the jth treatment, where the rank
of each measurement is computed relative
to its position within its own block.
Rejection region: Fr > χ²α with (p - 1) degrees of
freedom

31
Assumptions:
1. The treatments are randomly assigned to
experimental units within the blocks.
2. The measurements can be ranked within the
blocks.
3. The p probability distributions from which the
samples within each block are drawn are
continuous.
Ties:
Assign tied measurements within a block the
average of the ranks they would receive if they
were unequal but occurred in successive order.
For example, if the third-ranked and
fourth-ranked measurements are tied, assign each
a rank of (3 + 4)/2 = 3.5. The number of ties
should be small relative to the total number of
observations.
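The Friedman statistic can be sketched in the same way, with ranks assigned separately within each block. This assumes the usual form Fr = 12/(bp(p + 1)) · Σ Rj² - 3b(p + 1); the block data and function names are hypothetical.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def friedman_fr(blocks):
    """Fr statistic; each row of `blocks` holds one block's p treatment values."""
    b, p = len(blocks), len(blocks[0])
    rank_sums = [0.0] * p
    for block in blocks:
        for j, rank in enumerate(average_ranks(block)):  # rank within the block
            rank_sums[j] += rank
    return 12 / (b * p * (p + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (p + 1)


blocks = [        # hypothetical data: 4 blocks (rows) by 3 treatments (columns)
    [8, 5, 9],
    [7, 6, 10],
    [9, 4, 8],
    [6, 5, 7],
]
fr = friedman_fr(blocks)
print(fr)  # 6.5
```

The treatment rank sums here are 9, 4 and 11, and Fr = 6.5 would be compared with the chi-square value for p - 1 = 2 degrees of freedom.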
32
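13.7 SPEARMAN'S RANK CORRELATION COEFFICIENT

$$r_s = \frac{SS_{uv}}{\sqrt{SS_{uu}\,SS_{vv}}}$$
where
$$SS_{uv} = \sum (u_i - \bar{u})(v_i - \bar{v}), \quad SS_{uu} = \sum (u_i - \bar{u})^2, \quad SS_{vv} = \sum (v_i - \bar{v})^2$$
ui = rank of the ith observation in sample 1
vi = rank of the ith observation in sample 2
n = number of pairs of observations (number of
observations in each sample)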

33
Shortcut formula for rs:
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
where di = ui - vi (difference in the ranks of the
ith observations for samples 1 and 2)
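The sketch below computes rs in two ways: from the shortcut formula rs = 1 - 6Σdi²/(n(n² - 1)), and as the ordinary correlation coefficient of the ranks. The two agree when there are no ties. The data and function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def spearman_shortcut(x, y):
    """r_s from the rank differences d_i = u_i - v_i."""
    u, v = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def spearman_from_ranks(x, y):
    """r_s as the correlation coefficient of the two sets of ranks."""
    u, v = average_ranks(x), average_ranks(y)
    n = len(x)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    ss_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    ss_uu = sum((ui - u_bar) ** 2 for ui in u)
    ss_vv = sum((vi - v_bar) ** 2 for vi in v)
    return ss_uv / sqrt(ss_uu * ss_vv)


x = [10, 20, 30, 40, 50]  # hypothetical sample 1
y = [12, 25, 22, 48, 40]  # hypothetical sample 2
print(round(spearman_shortcut(x, y), 4))    # 0.8
print(round(spearman_from_ranks(x, y), 4))  # 0.8
```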
34
13.7.1 Spearman's Nonparametric Test for Rank
Correlation

ONE-TAILED TEST
H0: ρ = 0
Ha: ρ > 0 (or Ha: ρ < 0)

TWO-TAILED TEST
H0: ρ = 0
Ha: ρ ≠ 0

Test statistic: rs, the sample rank correlation
(see the formula for calculating rs).
35

Rejection region (one-tailed): rs > rs,α
(or rs < -rs,α when Ha: ρs < 0),
where rs,α is the value from
Table XIV corresponding to
the upper-tail area α and n
pairs of observations.

Rejection region (two-tailed): |rs| > rs,α/2,
where rs,α/2 is the value from
Table XIV corresponding to
the upper-tail area α/2 and n
pairs of observations.

36
Assumptions:
1. The sample of experimental units
on which the two variables
are measured is randomly
selected.
2. The probability distributions of
the two variables are
continuous.

Ties:
Assign tied measurements the average of the ranks
they would receive if they were unequal but
occurred in successive order. For example, if the
third-ranked and fourth-ranked measurements are
tied, assign each a rank of (3 + 4)/2 = 3.5. The
number of ties should be small relative to the
total number of observations.

37
13.7.2 Spearman's Rank Correlation (rs)

Spearman's rank correlation measures the
correlation of two variables, X and Y.
When precise values of the variables are
unavailable, the data may be ranked from 1 to N
in order of size, importance, etc.
If X and Y are ranked in such a manner, the
coefficient of rank correlation is given by
$$r_s = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
where
D = the difference between the ranks of
corresponding values of X and Y
N = the number of pairs of values (X, Y) in the data.