Title: Biostatistics in Research Practice: Non-parametric tests
1. Biostatistics in Research Practice
Non-parametric tests
2. What is a non-parametric test?
- Methods of analysis that do not assume a
particular family of distributions for the data.
3. When to use a non-parametric test
- Non-parametric tests are distribution free
- Data may be rank ordered (ordinal data)
- Data may be from small samples
- The variables may have a non-normal distribution (skewed data)
- Outliers may be present
4. Non-parametric v Parametric tests
- Usually only one analysis of a data set is performed, choosing between parametric and non-parametric methods.
- It is usual to use a parametric method unless there is a clear indication that it is not valid.
- It is important to realise that different tests applied to the same data are not expected to give exactly the same answer, but in general two valid methods will give similar answers.
- When the parametric assumptions hold, non-parametric tests are less powerful than the equivalent parametric test (especially in small samples) and will tend to give a less significant (larger) p-value.
5. Dealing with ordinal data
- Non-parametric tests are usually based on order statistics and ranks.
- ORDER STATISTICS: the observations arranged in increasing order of size.
- RANKS: their places in this order.
8. Ordering data
Data       7     9    10    12    12     9    12    11   -13
Ordered  -13     7     9     9    10    11    12    12    12
Ranked     1     2   3.5   3.5     5     6     8     8     8
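The ordering and ranking step above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `average_ranks` is hypothetical, and it assumes the usual convention that tied values share the average of the positions they occupy (the two 9s take positions 3 and 4, so each gets 3.5).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    # Indices of the observations, arranged in increasing order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next ordered value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # average of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


data = [7, 9, 10, 12, 12, 9, 12, 11, -13]   # the data from the slide
ordered = sorted(data)                       # order statistics
ranks = average_ranks(data)                  # rank of each original observation

print(ordered)        # [-13, 7, 9, 9, 10, 11, 12, 12, 12]
print(sorted(ranks))  # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 8.0, 8.0, 8.0]
```

Note that `ranks` is returned in the order of the original data, so it is sorted here only to line up with the "Ordered" row.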
9. Wilcoxon Signed Rank Test
- Non-parametric equivalent for testing or estimating location in a one-sample problem (equivalent to the one-sample t-test) OR for paired samples (equivalent to the paired t-test).
- Assumptions
  - A random sample of n independent observations OR n independent random pairs is taken.
  - The variable of interest is the difference (d).
    - For the one-sample problem, d = Observed value - Hypothesised value.
    - For paired data, d = X - Y, where X is the value at time 1 and Y is the value at time 2.
  - The level of measurement is at least ordinal.
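As a rough sketch of how the signed rank statistic is built from the differences d = X - Y, the Python below ranks the absolute differences and sums the ranks separately for positive and negative d. The scores are made-up illustrative values, the helper `average_ranks` is hypothetical, and dropping zero differences is one common convention that is assumed here rather than taken from the slides.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical anxiety scores for 8 patients (illustrative values only).
pre_op = [8, 6, 7, 9, 5, 7, 6, 8]    # X: value at time 1
post_op = [5, 6, 4, 7, 6, 3, 5, 4]   # Y: value at time 2

# d = X - Y; zero differences are dropped (an assumed convention).
d = [x - y for x, y in zip(pre_op, post_op) if x != y]
n = len(d)

# Rank the absolute differences, then split the ranks by the sign of d.
ranks = average_ranks([abs(v) for v in d])
w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
w_minus = sum(r for r, v in zip(ranks, d) if v < 0)

print(w_plus, w_minus)  # 26.5 1.5
```

The two sums always add up to n(n + 1)/2, which is a useful arithmetic check. A very small value of either sum suggests the differences are not centred on zero; the p-value itself would come from tables or a statistics package.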
10. Examples
- Anxiety levels pre and post operation
- Pain levels pre and post operation
- Yorkshire's fruit and veg consumption vs the recommended 5 a day
- BP pre and post exercise
11. Friedman Test
- In a parametric two-way ANOVA, the assumption that the residuals have a Normal distribution cannot be assessed before fitting the model.
- Sometimes, however, it can be seen from the raw data that the model will not fit well. In particular, wide variation in the standard deviations for each row and column will suggest problems with the parametric two-way ANOVA.
- Therefore, there is a non-parametric equivalent of the two-way ANOVA that can be used for data sets which do not fulfil the assumptions of the parametric method.
- The method, which is sometimes known as Friedman's two-way analysis of variance, is purely a hypothesis test.
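To make the idea concrete, here is a minimal sketch of the Friedman statistic in Python. It assumes the standard textbook form of the statistic without a tie correction: each subject's values are ranked across the k conditions, the ranks are summed per condition (R_j), and the statistic is 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1). The data are invented for illustration and the helper `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical scores: one row per subject, one column per time point
# (for example baseline, week 2, week 12). Illustrative values only.
scores = [
    [10, 8, 5],
    [9, 7, 6],
    [8, 9, 4],
    [7, 6, 5],
    [10, 9, 7],
]
n = len(scores)      # number of subjects (rows)
k = len(scores[0])   # number of conditions (columns)

# Rank within each subject, then add up the ranks for each condition.
rank_sums = [0.0] * k
for row in scores:
    for j, r in enumerate(average_ranks(row)):
        rank_sums[j] += r

# Friedman statistic (no tie correction, an assumption of this sketch).
statistic = 12 / (n * k * (k + 1)) * sum(R ** 2 for R in rank_sums) - 3 * n * (k + 1)

print(rank_sums)            # [14.0, 11.0, 5.0]
print(round(statistic, 3))  # 8.4
```

The statistic is usually compared against a chi-squared distribution with k - 1 degrees of freedom, an approximation that is less reliable for very small n.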
12. Examples
- Time periods
  - Pre op, post op and 12 months
  - Baseline, week 2, week 12
13. Mann-Whitney U test
- Non-parametric equivalent of the two-sample t-test.
- The Mann-Whitney test is used to compare two sets of data from independent groups.
- It is the most commonly used alternative to the independent-samples t-test.
- The values from both samples are combined and then ranked from smallest to largest. The rank of 1 is assigned to the smallest value, 2 to the next smallest and so on. If values are tied, each is given the average of the ranks they would otherwise occupy.
- Assumptions
  - There are two independent random samples (X and Y), of sizes n and m.
  - The variable of interest is a continuous random variable.
  - The two populations differ only with respect to location.
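The combine-and-rank procedure described above can be sketched as follows. This is an illustrative Python example with made-up data; the helper `average_ranks` is hypothetical, and the U statistic is computed with the usual rank-sum formula U_x = R_x - n(n + 1)/2, where R_x is the sum of the ranks belonging to sample X.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical scores for two independent groups (illustrative values only).
x = [12, 15, 11, 18, 14]        # sample X, size n
y = [16, 19, 15, 21, 20, 17]    # sample Y, size m
n, m = len(x), len(y)

# Combine both samples and rank the pooled values.
ranks = average_ranks(x + y)
rank_sum_x = sum(ranks[:n])     # ranks belonging to sample X

# U for each sample; the two always add up to n * m.
u_x = rank_sum_x - n * (n + 1) / 2
u_y = n * m - u_x

print(rank_sum_x, u_x, u_y)  # 18.5 3.5 26.5
```

The smaller of the two U values is the one usually looked up in tables. Here the 15 that appears in both samples is a tie, so each copy receives rank 4.5.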
14. Examples
- Comparing two groups, e.g.
  - Anxiety between men and women
  - Control group and an intervention group
15. Kruskal-Wallis Test
- Just as the one-way analysis of variance is a more general form of the t-test, there is a more general form of the non-parametric Mann-Whitney test.
- The Kruskal-Wallis test is an obvious mathematical extension of the Mann-Whitney test to three or more groups.
- Assumptions
  - There are three or more independent random samples (X1, X2, X3, ..., Xk), of sizes n1, n2, n3, ..., nk.
  - The variable of interest is ordinal, or is a continuous random variable which is non-normal.
  - The populations differ only with respect to location.
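A minimal sketch of the Kruskal-Wallis statistic, to show how it extends the pooled ranking used by the Mann-Whitney test. It assumes the standard form H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1) with no tie correction, where N is the total sample size and R_j is the rank sum of group j. The group names and values are invented for illustration, and the helper `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical satisfaction scores for three independent groups
# (illustrative values only).
groups = {
    "control": [27, 31, 25, 30],
    "intervention_1": [35, 33, 38, 29],
    "intervention_2": [40, 42, 36, 39],
}

# Pool all observations and rank them together.
pooled = [v for values in groups.values() for v in values]
ranks = average_ranks(pooled)
N = len(pooled)

# Sum the pooled ranks within each group.
rank_sums = []
sizes = []
start = 0
for values in groups.values():
    size = len(values)
    rank_sums.append(sum(ranks[start:start + size]))
    sizes.append(size)
    start += size

# Kruskal-Wallis H (no tie correction, an assumption of this sketch).
h = 12 / (N * (N + 1)) * sum(R ** 2 / s for R, s in zip(rank_sums, sizes)) - 3 * (N + 1)

print(rank_sums)    # [12.0, 25.0, 41.0]
print(round(h, 4))  # 8.1154
```

H is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom; with only two groups the procedure leads to the same conclusion as the Mann-Whitney test.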
16. Examples
- Comparing more than 2 groups, e.g.
  - Control group and two intervention groups
  - Satisfaction with procedure
  - Age groups 18-30, 30-50, 50+
  - Social class groups
17. Choosing an appropriate method of analysis
- Number of groups of observations
- Independent or dependent groups of observations
- The type of data
- The distribution of data
- The objective of the analysis
18. Choice of Test
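As a rough summary of the four tests covered in these slides, the choice can be written as a small lookup on the number of groups and whether they are independent. The function name `choose_nonparametric_test` and its parameters are hypothetical, and the sketch deliberately ignores the other criteria listed above (type of data, distribution, objective of the analysis), which still have to be judged separately.

```python
def choose_nonparametric_test(n_groups, independent):
    """Suggest one of the tests from these slides.

    n_groups    -- number of groups of observations (1 = one-sample problem)
    independent -- True for independent groups, False for paired/repeated measures
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        # One sample compared with a hypothesised value.
        return "Wilcoxon signed rank test"
    if n_groups == 2:
        # Two independent groups, or paired measurements on the same subjects.
        return "Mann-Whitney U test" if independent else "Wilcoxon signed rank test"
    # Three or more groups.
    return "Kruskal-Wallis test" if independent else "Friedman test"


print(choose_nonparametric_test(2, independent=True))   # Mann-Whitney U test
print(choose_nonparametric_test(3, independent=False))  # Friedman test
```

For example, anxiety pre and post operation is two dependent sets of observations, so the lookup returns the Wilcoxon signed rank test.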