Title: Lecture 8: Hypothesis formulation and testing (Contd.)
1- Lecture 8: Hypothesis formulation and testing (Contd.)
2- Test for normality
- Normal distribution
- Not normal distribution
3- Non-parametric tests
- Distribution-free tests
- Data are far from normal or do not follow any distribution pattern such as normal, linear, binomial, exponential etc., e.g. number of insects, bacterial count, disease incidence, salary of staff etc.
- Few samples/replications can cause non-normality
- Problems in measurement, e.g. GPA, which does not exactly measure the intelligence of students
- Means, SD, SE or variance do not represent the data
4- Why non-parametric tests?
- Observations are independent of each other
- Scale of measurement is rank (ordinal)
- They have lower power than parametric tests, so they should be applied only when parametric tests are not applicable
- These methods have recently become popular, as distribution-free data are quite common
5- Steps in non-parametric tests: Ranking
- Example 1
- Female heights (cm): 193, 170, 188, 178, 183, 180, 185
- Male heights (cm): 175, 173, 163, 168, 165
- Methods
- Step 1: Sort the data in ascending or descending order
- Step 2: Rank the pooled data from all the groups (this is the basic principle)
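The two ranking steps can be sketched in Python using the heights from Example 1. Descending order (rank 1 for the tallest person) is an assumption here, chosen because it reproduces the rank sum of 30 that the Mann-Whitney example uses. The variable names are illustrative only.

```python
# Heights (cm) from Example 1.
female = [193, 170, 188, 178, 183, 180, 185]
male = [175, 173, 163, 168, 165]

# Step 1: pool both groups and sort in descending order (assumed direction).
pooled = sorted(female + male, reverse=True)

# Step 2: rank the pooled data; rank 1 goes to the largest value.
# There are no ties in this example, so a position lookup is enough.
rank_of = {value: position for position, value in enumerate(pooled, start=1)}

female_ranks = [rank_of[h] for h in female]
male_ranks = [rank_of[h] for h in male]
r_female = sum(female_ranks)  # rank sum of the first group
r_male = sum(male_ranks)      # rank sum of the second group

print("Female ranks:", female_ranks, "sum =", r_female)
print("Male ranks:  ", male_ranks, "sum =", r_male)
```

A quick check on any ranking is that the rank sums of all groups add up to N(N + 1)/2, here 12 x 13 / 2 = 78.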
6- Ranking
- Example 2: Tied ranks
- There are two values of 32; they occupy ranks 3 and 4, therefore each gets the averaged rank 3.5
- Similarly, there are three values of 44 occupying ranks 8, 9 and 10, therefore they all get the mean rank, i.e. 9
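A minimal sketch of tied-rank averaging follows. The data set of Example 2 is not reproduced in these notes, so the `scores` list below is hypothetical: only the two 32s (at ranks 3 and 4) and the three 44s (at ranks 8, 9 and 10) follow the slide, and the other values are fillers. The function name `average_ranks` is also an assumption.

```python
def average_ranks(values):
    """Return ascending ranks for `values`; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) hold ranks start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical data: only the 32s and 44s come from the slide.
scores = [25, 28, 32, 32, 35, 38, 40, 44, 44, 44, 50]
ranks = average_ranks(scores)
for value, rank in zip(scores, ranks):
    print(value, rank)
```

Averaging keeps the total of all ranks equal to N(N + 1)/2, so the test statistics built on rank sums remain valid when ties occur.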
7- Mann-Whitney test (U-test)
- Two groups (k = 2), i.e. similar to the t-test but for non-normal distributions, i.e. non-parametric
- $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$
- Where,
- $n_1$ is the number of samples in the first group
- $n_2$ is the number of samples in the second group
- $R_1$ is the sum of the ranks of the first group
- $R_2$ is the sum of the ranks of the second group
- Here, the assumption is $n_1 > n_2$; if $n_2 > n_1$, the equation should be
- $U = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$
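The two formulas can be written as small Python functions and checked against the rank sums of the two worked examples in these slides (R1 = 30 with n1 = 7, n2 = 5, and R1 = 69.5 with n1 = 9, n2 = 8). The function names are assumptions made for this sketch.

```python
def mann_whitney_u(n1, n2, r1):
    """U = n1*n2 + n1*(n1 + 1)/2 - R1, with R1 the rank sum of group 1."""
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1


def mann_whitney_u_from_group2(n1, n2, r2):
    """Counterpart computed from group 2: n1*n2 + n2*(n2 + 1)/2 - R2."""
    return n1 * n2 + n2 * (n2 + 1) / 2 - r2


# Example 1: n1 = 7, n2 = 5, R1 = 30.
u_example1 = mann_whitney_u(7, 5, 30)

# R2 follows from the total rank sum N(N + 1)/2 with N = 12.
r2_example1 = 12 * 13 / 2 - 30
u_example1_group2 = mann_whitney_u_from_group2(7, 5, r2_example1)

# Example 2: n1 = 9, n2 = 8, R1 = 69.5.
u_example2 = mann_whitney_u(9, 8, 69.5)

print(u_example1, u_example1_group2, u_example2)
```

The two statistics computed from the same data always add up to n1 x n2, which is a convenient arithmetic check. In the worked examples, H0 is rejected when the computed U exceeds the tabulated critical value.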
8- Mann-Whitney test
- Example 1
- H0: Males = females
- $n_1 = 7$ and $n_2 = 5$
- $U = 7 \times 5 + \frac{7 \times 8}{2} - 30 = 33$
- $U_{0.05,\,5,\,7} = 30$ (from table)
- $33 > 30$
- Reject H0
9- Mann-Whitney test
- Example 2: Ordinal data
- H0: Males = females
- $n_1 = 9$ and $n_2 = 8$
- $U = 9 \times 8 + \frac{9 \times 10}{2} - 69.5 = 47.5$
- $U_{0.05,\,8,\,9} = 57$ (from table)
- $47.5 < 57$
- Do not reject H0
- There is no significant difference between the grades obtained by male and female students
10- Wilcoxon's test (paired samples)
- Also called the matched-pairs or signed-rank test
- Analogous to the paired t-test (but with lower power)
- Example: test whether the new breed of goat has longer hind legs compared to the forelegs
11- Example
- H0: Hind leg = foreleg
- Here, $T_+ = 4.5 + 4.5 + 7 + 7 + 9.5 + 7 + 9.5 + 2 = 51$
- $T_- = 3 + 1 = 4$
- $T_{0.05,\,10} = 8$ (from table)
- $T_- = 4 < 8$, so P < 0.05: reject H0 (the hind leg is longer than the foreleg)
- Note: if a difference is zero, it is discarded
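The signed-rank calculation can be sketched as follows. The raw leg measurements are not reproduced in these notes, so the `diffs` list is hypothetical: it was chosen only so that the ranks of its absolute values match the ranks listed above. The function names and the decision rule (reject when the smaller of the two sums is at or below the tabulated value) are assumptions consistent with the example.

```python
def average_ranks(values):
    """Ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank(differences):
    """Return (T_plus, T_minus); zero differences are discarded first."""
    nonzero = [d for d in differences if d != 0]
    ranks = average_ranks([abs(d) for d in nonzero])
    t_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return t_plus, t_minus


# Hypothetical hind leg minus foreleg differences (cm), not the original data.
diffs = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]
t_plus, t_minus = wilcoxon_signed_rank(diffs)

critical_value = 8  # T(0.05, n = 10) as quoted from the table
reject_h0 = min(t_plus, t_minus) <= critical_value
print(t_plus, t_minus, reject_h0)
```

With 10 non-zero differences the two sums must add up to 10 x 11 / 2 = 55, which is a quick way to catch arithmetic slips.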
12- Analysis of Variance (ANOVA)
- (Parametric test)
- Two means are compared with a t-test; comparing more than two means requires ANOVA
- H0: there are no differences among the means
- The comparison depends on the purpose and objective, or on the experimental design
13- Comparisons of five means
[Figure: distributions of five means A, B, C, D and E; axes: Freq. and Values]
14- Experimental designs
- Completely Randomized Design (CRD)
- Randomized Complete Block Design (RCBD)
- Latin Square Design (LSD)
- Factorial Design
- One factor
- Two factors
- Multi-factors
15- Experimental designs
- 1. Completely Randomized Design (CRD)
- Assumptions
- all the experimental units are considered uniform or identical
- treatment allocation to the experimental units is completely random
[Figure: Experimental units]
16- The hypothesis is tested by comparing the variation between treatments with the variation within treatments; the method is therefore called Analysis of Variance (ANOVA)
- If the variation between treatments (treatment effect) is higher than the variation within treatments (i.e. random error), there is a significant difference
- Model: $Y_i = \mu + T_i + R_i$
17- Separation of variation
- $Y_i = \mu + T_i + R_i$, where $T_i$ is the treatment effect and $R_i$ is the random error
- If $T_i > R_i$, the treatment effect is significant
18- Also called a single-factor experiment
- Examples
- Fertilization trials
- - Organic, inorganic and combination
- - 0, 40 and 60 kg N/ha/week etc.
- Crop/vegetable/fruit varieties
- Animal breeds
- Drug efficacy etc.
19- Randomization and layout
- Allocation of the treatments and replications is done by lottery or by using random numbers/a random number table
- Determine the total number of experimental units: $n = t \times r$, e.g. to test 6 varieties with 4 replications, you will need 24 plots
- Assign a plot number to each plot (1 to n)
- Assign treatments to the experimental plots by lottery or by using a random number table
20- Plot numbers
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18 ... 24
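The lottery can be sketched with Python's standard `random` module in place of a random number table. The variety names `V1` to `V6`, the function name `randomize_crd` and the fixed seed are all assumptions made so that the sketch is reproducible.

```python
import random


def randomize_crd(treatments, replications, seed=None):
    """Randomly allocate each treatment to `replications` plots (CRD layout)."""
    rng = random.Random(seed)
    # One label per experimental unit, so n = t x r labels in total.
    labels = [t for t in treatments for _ in range(replications)]
    rng.shuffle(labels)  # plays the role of the lottery
    # Plot numbers run from 1 to n.
    return {plot: label for plot, label in enumerate(labels, start=1)}


varieties = ["V1", "V2", "V3", "V4", "V5", "V6"]  # hypothetical variety names
layout = randomize_crd(varieties, replications=4, seed=42)

# Print the 24 plots in rows of 6, matching the plot numbering above.
for row_start in range(1, 25, 6):
    print("  ".join(f"{p:>2}:{layout[p]}" for p in range(row_start, row_start + 6)))
```

Shuffling the full list of labels guarantees that every variety appears exactly 4 times, while the position of each label is left entirely to chance.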
21- Data analysis
- 1. Group the data by treatments and calculate the treatment totals (T), the grand total (G), the grand mean, the coefficient of variation (c.v.) etc.
- 2. Using the number of treatments (t) and the number of replications (r), determine the degrees of freedom (d.f.) for each source of variation
- 3. Construct an outline/table (next slide) of the analysis of variance
22- ANOVA table of a CRD experiment
- t = number of treatments
- r = number of replicates per treatment
23- 4. Using $X_i$ to represent the measurement of the ith plot, $T_i$ as the total of the ith treatment, and n as the total number of experimental plots, i.e. $n = r \times t$, calculate the correction factor (CF) and the various sums of squares (SS)
- 5. Calculate the mean square (MS) for each source of variation by dividing each SS by its corresponding d.f.
- 6. Calculate the F-value (R.A. Fisher) for testing the significance of the treatment difference ($F = MST/MSE$)
- 7. Enter all the computed values in the ANOVA table
24- 8. Obtain the tabular F values with $f_1$ = treatment d.f. $(t-1)$ and $f_2$ = error d.f. $t(r-1)$, and compare as follows
Statistical inference
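Steps 1 to 7 can be sketched as one small function. The data passed to it below are hypothetical numbers chosen for easy arithmetic, and the function name `crd_anova` and its dictionary keys are assumptions. The error d.f. is computed as n - t, which equals t(r - 1) when every treatment has r replicates.

```python
def crd_anova(groups):
    """One-way (CRD) ANOVA; `groups` holds one list of measurements per treatment."""
    n = sum(len(g) for g in groups)            # total number of plots
    t = len(groups)                            # number of treatments
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n                  # correction factor
    total_ss = sum(x ** 2 for g in groups for x in g) - cf
    treatment_ss = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    error_ss = total_ss - treatment_ss
    df_treatment = t - 1
    df_error = n - t
    mst = treatment_ss / df_treatment          # mean square, treatments
    mse = error_ss / df_error                  # mean square, error
    return {
        "CF": cf,
        "total_SS": total_ss,
        "treatment_SS": treatment_ss,
        "error_SS": error_ss,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
    }


# Hypothetical measurements: 3 treatments x 3 replicates.
result = crd_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in result.items():
    print(f"{name:>13}: {value}")
```

The computed F is then compared with the tabular F for (df_treatment, df_error) degrees of freedom; a computed value above the tabular one indicates a significant treatment effect.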
25- Example
- Four different feeds were tested on 20 pigs. The following were the final weights (kg) of 19 pigs (1 pig died).
- Here, H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
26- Step 1: Calculate the sums of squares
- Correction factor: $C = (\text{Grand total})^2 / n = (1482.2)^2 / 19 = 115{,}627$
- Total SS $= (60.8)^2 + (57.0)^2 + \dots + (90.3)^2 - C = 119{,}982 - 115{,}627 = 4{,}355$
- Treatment SS $= \sum (\text{Treatment total})^2 / n_i - C = (303.1)^2/5 + (346.5)^2/5 + (401.4)^2/4 + (431.2)^2/5 - 115{,}627 = 4{,}226$
- Error SS $=$ Total SS $-$ Treatment SS $= 4{,}355 - 4{,}226 \approx 128$ (using unrounded values)
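The arithmetic of Step 1 can be re-traced from the treatment totals and group sizes quoted above. The individual pig weights are not reproduced in these notes, so the raw sum of squares is taken as the rounded figure 119,982; that shortcut is an assumption of this sketch, and the variable names are illustrative.

```python
# Treatment totals (kg) and number of pigs per feed, as quoted in Step 1.
totals = [303.1, 346.5, 401.4, 431.2]
sizes = [5, 5, 4, 5]

grand_total = sum(totals)  # about 1482.2
n = sum(sizes)             # 19 pigs

# Correction factor C = (grand total)^2 / n.
cf = grand_total ** 2 / n

# Treatment SS = sum of (treatment total)^2 / n_i, minus C.
treatment_ss = sum(T ** 2 / k for T, k in zip(totals, sizes)) - cf

# The raw sum of squared weights is only available as the rounded 119,982.
raw_sum_of_squares = 119982
total_ss = raw_sum_of_squares - cf

error_ss = total_ss - treatment_ss

print(f"C            = {cf:,.1f}")
print(f"Total SS     = {total_ss:,.1f}")
print(f"Treatment SS = {treatment_ss:,.1f}")
print(f"Error SS     = {error_ss:,.1f}")
```

Carrying unrounded intermediate values explains why the error SS comes out near 128 rather than the 129 obtained by subtracting the rounded figures.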
27- Step 2: Prepare an ANOVA table
- Note: numerator d.f. = 3, denominator d.f. = 15
- Reject H0, which means that the treatment (feed) has an effect on pig growth; to compare among the feeds, a test for multiple comparisons is needed
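A sketch of the Step 2 arithmetic, using the rounded sums of squares from Step 1, is given below. The critical value 3.29 for F at the 0.05 level with 3 and 15 d.f. is the usual F-table entry and is an assumption here, since the table itself is not reproduced in these notes. Because the inputs are rounded, the mean squares differ slightly from those obtained with unrounded sums of squares.

```python
# Rounded sums of squares from Step 1 (they do not add up exactly).
treatment_ss = 4226
error_ss = 128
total_ss = 4355

df_treatment = 3   # t - 1, with t = 4 feeds
df_error = 15      # n - t, with n = 19 pigs

mst = treatment_ss / df_treatment  # mean square, treatments
mse = error_ss / df_error          # mean square, error
f_value = mst / mse

f_critical = 3.29  # assumed F-table value for 0.05 with (3, 15) d.f.

rows = [
    ("Treatment", df_treatment, treatment_ss, mst, f_value),
    ("Error", df_error, error_ss, mse, None),
    ("Total", df_treatment + df_error, total_ss, None, None),
]
print(f"{'Source':<10}{'d.f.':>6}{'SS':>8}{'MS':>10}{'F':>9}")
for source, df, ss, ms, f_stat in rows:
    ms_text = f"{ms:10.2f}" if ms is not None else " " * 10
    f_text = f"{f_stat:9.2f}" if f_stat is not None else ""
    print(f"{source:<10}{df:>6}{ss:>8}{ms_text}{f_text}")

print("Reject H0" if f_value > f_critical else "Do not reject H0")
```

The computed F is far above the critical value, which agrees with the decision to reject H0.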
28- If ANOVA shows a significant difference, we need an a posteriori (post hoc) test such as
- 1. Comparison between two means, e.g. control versus others
- - Student's t-test (as before)
- 2. Multiple comparisons or pair-wise comparisons (all the possible combinations are compared simultaneously; ranking is possible)
- - LSD (Least Significant Difference)
- - DMRT (Duncan's Multiple Range Test)
- - Tukey's HSD (Tukey's Honestly Significant Difference test)
- Note: if ANOVA shows no significant difference, multiple range tests are not necessary
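29- 2. Multiple comparisons or pair-wise comparisons
- Calculate the common value for the difference using the pooled variance, such as
- $SE(\bar{X}_1 - \bar{X}_2) = \sqrt{S^2 \left( \frac{1}{N_1} + \frac{1}{N_2} \right)}$
- $= \sqrt{8.557 \left( \frac{1}{5} + \frac{1}{5} \right)} = 1.85$ kg
- $t_{0.05,\,15\ \mathrm{d.f.}} = 2.131$; 95% CI $= 1.85 \times 2.131 = 3.94$ kg
- Reject H0: all means are different
- Results: $\mu_1 < \mu_2 < \mu_4 < \mu_3$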
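The pair-wise comparison can be sketched as below, using the treatment totals and group sizes from the worked example, the pooled variance 8.557 and the tabulated t of 2.131. The slide uses group sizes of 5 and 5 for the common value; applying each pair's actual group sizes (feed 3 has only 4 pigs) is a refinement assumed in this sketch, and the names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Treatment totals (kg) and number of pigs per feed from the worked example.
totals = {1: 303.1, 2: 346.5, 3: 401.4, 4: 431.2}
sizes = {1: 5, 2: 5, 3: 4, 4: 5}
means = {feed: totals[feed] / sizes[feed] for feed in totals}

mse = 8.557      # pooled variance (error mean square)
t_crit = 2.131   # t at 0.05 with 15 d.f., from the table


def lsd(n_a, n_b):
    """Least significant difference for two groups of sizes n_a and n_b."""
    return t_crit * sqrt(mse * (1 / n_a + 1 / n_b))


significant = {}
for a, b in combinations(sorted(means), 2):
    diff = abs(means[a] - means[b])
    threshold = lsd(sizes[a], sizes[b])
    significant[(a, b)] = diff > threshold
    print(f"Feed {a} vs feed {b}: difference = {diff:6.2f} kg, LSD = {threshold:.2f} kg")

ranking = sorted(means, key=means.get)  # feeds ordered from lowest to highest mean
print("Order of means:", ranking)
```

Every pair differs by more than its least significant difference, which matches the conclusion that all four means are different.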
30- 2. Multiple comparisons or post hoc test
[Figure: post hoc test options, annotated "Widely accepted" and "Not suggested"]
31- Homogeneous subsets
- Non-significant means are shown in the same column.
[Figure: output annotated "Widely accepted"]
32- 2. Final result presentation (tabular)
- Table no.: Mean weights of pigs fed with 4 diets during the trial.
- Values with the same superscripts are not significantly different at the 0.05 level
33- 2. Final result presentation (graphical)
- Figure no.: Mean weights (kg ± 95% confidence intervals) of pigs fed with 4 diets during the trial.
[Figure: bars labelled d, c, b, a]
34- CRD ANOVA vs multiple range tests
- Advantages
- It has a high proportion of degrees of freedom, so it is suitable for smaller experiments with fewer experimental units
- It is stronger than multiple range tests, therefore it is done before multiple range tests
- Disadvantages
- If the experimental units are not homogeneous, the experimental error will increase
- It doesn't compare among the means, i.e. it does not locate the differences
35- Some useful websites related to ANOVA
- http://www.physics.csbsju.edu/stats/anova.html
- http://www.psychstat.smsu.edu/introbook/sbk27.htm