Lecture 8: Hypothesis formulation and testing (contd.)


Provided by: stweb

Test for normality
Normal distribution vs. non-normal distribution

Non-parametric tests
Distribution-free tests
Data are far from normal or do not follow any distribution pattern such as normal, binomial or exponential, e.g. number of insects, bacterial count, disease incidence, salary of staff etc.
Few samples/replications can cause non-normality
Problems in measurement, e.g. GPA, which does not exactly measure the intelligence of students
Means, SD, SE or variance do not represent the data well

Why non-parametric tests?

Observations are independent of each other
Scale of measurement is rank (ordinal)
They have lower power than parametric tests, so they should be applied only when parametric tests are not applicable
These methods have recently become popular because data that follow no known distribution are quite common

Steps in non-parametric tests - Ranking
Example 1
Female heights (cm): 193, 170, 188, 178, 183, 180, 185
Male heights (cm): 175, 173, 163, 168, 165
Method
Step 1: Sort the data in ascending or descending order
Step 2: Rank the pooled data from all the groups (this is the basic principle)

Ranking
Example 2: Tied ranks
There are two values of 32, which occupy ranks 3 and 4; each therefore gets the averaged rank 3.5
Similarly, three values of 44 occupy ranks 8, 9 and 10, so they all get the mean rank, i.e. 9
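The tied-rank rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `average_ranks` and the data values are hypothetical, chosen only so that 32 falls on ranks 3 and 4 and 44 on ranks 8 to 10, as in the example above.

```python
def average_ranks(values):
    """Return ascending ranks (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of sorted positions that hold the same value.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end (0-based) occupy ranks pos+1..end+1; use their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical data (not from the slide).
data = [25, 28, 32, 32, 35, 38, 40, 44, 44, 44]
ranks = average_ranks(data)
for value, rank in zip(data, ranks):
    print(value, rank)
```

The two 32s come out with rank 3.5 and the three 44s with rank 9, matching the rule stated above.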

Mann-Whitney test (U-test)
Two groups (k = 2), i.e. the non-parametric counterpart of the t-test, for non-normal data
$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$
Where,
n1 is the number of observations in the first group
n2 is the number of observations in the second group
R1 is the sum of the ranks of the first group
R2 is the sum of the ranks of the second group
Here the assumption is n1 > n2; if n2 > n1, the equation should be
$U = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$

Mann-Whitney test
Example 1
H0: Males = females
n1 = 7 and n2 = 5
U = 7 × 5 + 7 × 8/2 - 30 = 33
U 0.05, 5, 7 = 30 (from table)
33 > 30, so reject H0
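Example 1 can be reproduced from the height data given earlier. The sketch below assumes the pooled data are ranked in descending order (1 = tallest), which is what yields R1 = 30, and it assumes the decision rule is to reject H0 when the larger of the two U values reaches the tabulated value; the variable names are illustrative.

```python
females = [193, 170, 188, 178, 183, 180, 185]  # group 1, n1 = 7
males = [175, 173, 163, 168, 165]              # group 2, n2 = 5

# Rank the pooled data in descending order (1 = tallest).
# There are no ties in this data set, so a simple lookup is enough.
pooled = sorted(females + males, reverse=True)
rank_of = {height: position + 1 for position, height in enumerate(pooled)}

n1, n2 = len(females), len(males)
R1 = sum(rank_of[h] for h in females)  # rank sum of group 1
R2 = sum(rank_of[h] for h in males)    # rank sum of group 2

U = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U_prime = n1 * n2 + n2 * (n2 + 1) / 2 - R2

U_CRITICAL = 30  # U 0.05, 5, 7 as quoted from the table on the slide
# Assumed rule: reject H0 if the larger statistic is at least the table value.
reject_h0 = max(U, U_prime) >= U_CRITICAL

print(f"R1 = {R1}, R2 = {R2}, U = {U}, U' = {U_prime}, reject H0: {reject_h0}")
```

The two statistics always satisfy U + U' = n1 × n2, which is a quick check on the rank sums.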

Mann-Whitney test
Example 2: Ordinal data
H0: Males = females
n1 = 9 and n2 = 8
U = 9 × 8 + 9 × 10/2 - 69.5 = 47.5
U 0.05, 8, 9 = 57 (from table)
47.5 < 57, so do not reject H0
There is no significant difference between the grades obtained by male and female students

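Wilcoxon's test (paired samples)
Also called the matched-pairs or signed-rank test (the name "rank sum" properly belongs to the two-independent-samples test, which is equivalent to the Mann-Whitney test)
Analogous to the paired t-test (but with lower power)
Example: test whether the new breed of goat has longer hindlegs than forelegs.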

Example
H0: Hindleg = foreleg
T+ = 4.5 + 4.5 + 7 + 7 + 9.5 + 7 + 9.5 + 2 = 51
T- = 3 + 1 = 4
T 0.05, 10 = 8 (from table)
T- = 4 < 8, so P < 0.05: reject H0 (the hindleg is longer than the foreleg)
Note: if a difference is zero, it is discarded
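The signed-rank calculation can be sketched as follows. The paired measurements are not given in the transcript, so the differences below are hypothetical values chosen only because their ranks reproduce the rank sums of the example (the ten ranks must total 55, so with T- = 4 the positive ranks sum to 51); the function name `signed_rank_sums` is also illustrative.

```python
def signed_rank_sums(differences):
    """Return (T_plus, T_minus) for paired differences, discarding zeros."""
    nonzero = [d for d in differences if d != 0]
    by_size = sorted(nonzero, key=abs)  # rank on absolute size
    t_plus = t_minus = 0.0
    pos = 0
    while pos < len(by_size):
        # Find the run of differences with the same absolute value.
        end = pos
        while end + 1 < len(by_size) and abs(by_size[end + 1]) == abs(by_size[pos]):
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # tied |d| share the mean rank
        for d in by_size[pos:end + 1]:
            if d > 0:
                t_plus += mean_rank
            else:
                t_minus += mean_rank
        pos = end + 1
    return t_plus, t_minus


# Hypothetical hindleg - foreleg differences (not the lecture's raw data).
diffs = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]
t_plus, t_minus = signed_rank_sums(diffs)

T_CRITICAL = 8  # T 0.05, 10 as quoted from the table on the slide
reject_h0 = min(t_plus, t_minus) <= T_CRITICAL

print(f"T+ = {t_plus}, T- = {t_minus}, reject H0: {reject_h0}")
```

Unlike the U statistic in the Mann-Whitney example, here the smaller rank sum is compared with the table, and H0 is rejected when it is at or below the tabulated value.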
Analysis of Variance (ANOVA)

(Parametric test)
Two means are compared with a t-test; more than two means require ANOVA
H0: there are no differences among the means
Which comparisons are made depends on the purpose and objective of the study and on the experimental design

Comparisons of five means

[Figure: frequency (Freq.) plotted against values for five means, A to E]

Experimental designs
Completely Randomized Design (CRD)
Randomized Complete Block Design (RCBD)
Latin Square Design (LSD)
Factorial Design
One factor
Two factors
Multi-factors

Experimental designs
1. Completely Randomized Design (CRD)
Assumptions
All the experimental units are considered uniform or identical
Treatments are allocated to the experimental units completely at random

[Figure: experimental units]

The hypothesis is tested by comparing the variation between treatments with the variation within treatments; hence the name Analysis of Variance (ANOVA)
If the variation between treatments (treatment effect) is sufficiently higher than the variation within treatments (i.e. random error), there is a significant difference
Model

$Y_i = \mu + T_i + R_i$
Separation of variation
$Y_i = R_i + T_i + \mu$, where $R_i$ are the random errors and $T_i$ the treatment effects
If $T_i > R_i$, the treatment effect is significant

Also called a single-factor experiment
Examples:
Fertilization trials
- Organic, inorganic and combination
- 0, 40 and 60 kg N/ha/week etc.
Crop/vegetable/fruit varieties
Animal breeds
Drug efficacy etc.

Randomization and layout
Allocation of the treatments and replications is done by lottery or by using random numbers or a random number table

Determine the total number of experimental units, n = t × r, e.g. to test 6 varieties with 4 replications, you will need 24 plots
Assign a plot number to each plot (1 to n)
Assign treatments to the experimental plots by lottery or by using a random number table
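The allocation steps can be sketched with Python's standard `random` module standing in for the lottery or random number table. The variety labels V1 to V6, the function name `randomize_crd` and the seed are hypothetical; the seed is only there to make the layout reproducible.

```python
import random
from collections import Counter


def randomize_crd(treatments, replications, seed=None):
    """Assign each treatment to `replications` plots completely at random.

    Returns a dict mapping plot number (1..n) to treatment label.
    """
    rng = random.Random(seed)
    # One label per experimental unit: n = t x r.
    labels = [trt for trt in treatments for _ in range(replications)]
    rng.shuffle(labels)  # the "lottery"
    return {plot: label for plot, label in enumerate(labels, start=1)}


varieties = ["V1", "V2", "V3", "V4", "V5", "V6"]  # hypothetical names
layout = randomize_crd(varieties, replications=4, seed=8)

for plot, variety in layout.items():
    print(plot, variety)

# Each variety should appear on exactly 4 of the 24 plots.
print(Counter(layout.values()))
```

Shuffling the full list of labels, rather than drawing a treatment independently for each plot, guarantees that every treatment gets exactly r replications.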

Randomization

1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
... 24
Data analysis
1. Group the data by treatments and calculate the treatment totals (T), the grand total (G), the grand mean, the coefficient of variation (c.v.) etc.
2. Using the number of treatments (t) and the number of replications (r), determine the degrees of freedom (d.f.) for each source of variation
3. Construct an outline/table of the analysis of variance
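ANOVA table of a CRD experiment
Source of variation   d.f.       SS             MS    F
Treatment             t - 1      Treatment SS   MST   MST/MSE
Error                 t(r - 1)   Error SS       MSE
Total                 tr - 1     Total SS
t = number of treatments; r = number of replicates per treatment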

4. Using Xi to represent the measurement of the ith plot, Ti the total of the ith treatment, and n the total number of experimental plots, i.e. n = r × t, calculate the correction factor (CF) and the various sums of squares (SS)
5. Calculate the mean square (MS) for each source of variation by dividing each SS by its corresponding d.f.
6. Calculate the F value (R.A. Fisher) for testing the significance of the treatment difference (F = MST/MSE)
7. Enter all the computed values in the ANOVA table
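The quantities in steps 4 to 6 can be written out as follows. This is a sketch of the usual textbook forms, assuming equal replication (r plots per treatment) and using the symbols defined in step 4, with G as the grand total; the slide text itself does not spell them out.

```latex
\begin{aligned}
CF &= \frac{G^2}{n} \\
\text{Total SS} &= \sum_{i=1}^{n} X_i^2 - CF \\
\text{Treatment SS} &= \frac{\sum_{i=1}^{t} T_i^2}{r} - CF \\
\text{Error SS} &= \text{Total SS} - \text{Treatment SS} \\
MST &= \frac{\text{Treatment SS}}{t-1}, \qquad
MSE = \frac{\text{Error SS}}{t(r-1)}, \qquad
F = \frac{MST}{MSE}
\end{aligned}
```

With unequal replication, each squared treatment total is divided by its own number of plots instead of a common r, and the error d.f. become n - t.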

8. Obtain the tabular F value with f1 = treatment d.f. (t - 1) and f2 = error d.f. t(r - 1), and compare as follows

Statistical inference
Example: Four different feeds were tested on 20 pigs. The following were the mean final weights (kg) of 19 pigs (1 pig died).
H0: μ1 = μ2 = μ3 = μ4
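Step 1: Calculate the sums of squares
Correction factor: $C = \frac{(\text{Grand total})^2}{n} = \frac{(1482.2)^2}{19} = 115{,}627$
Total SS $= (60.8)^2 + (57.0)^2 + \dots + (90.3)^2 - C = 119{,}982 - 115{,}627 = 4{,}355$
Treatment SS $= \sum \frac{(\text{Treatment total})^2}{n_i} - C = \frac{(303.1)^2}{5} + \frac{(346.5)^2}{5} + \frac{(401.4)^2}{4} + \frac{(431.2)^2}{5} - 115{,}627 = 4{,}226$
Error SS $=$ Total SS $-$ Treatment SS $= 4{,}355 - 4{,}226 = 128$ (computed from the unrounded values)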
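The arithmetic in Step 1 can be checked from the figures quoted above. The individual pig weights are not available in the transcript, so this sketch takes the uncorrected sum of squares (119,982) as given; the variable names are illustrative.

```python
# Figures quoted in Step 1.
treatment_totals = [303.1, 346.5, 401.4, 431.2]  # feed totals (kg)
pigs_per_feed = [5, 5, 4, 5]                      # one group lost a pig
uncorrected_total_ss = 119_982                    # sum of squared weights (rounded)

n = sum(pigs_per_feed)                 # 19 pigs
grand_total = sum(treatment_totals)    # 1482.2 kg

C = grand_total ** 2 / n               # correction factor
total_ss = uncorrected_total_ss - C
# Each squared treatment total is divided by its own group size.
treatment_ss = sum(T ** 2 / k for T, k in zip(treatment_totals, pigs_per_feed)) - C
error_ss = total_ss - treatment_ss

print(f"C = {C:.0f}")
print(f"Total SS = {total_ss:.0f}")
print(f"Treatment SS = {treatment_ss:.0f}")
print(f"Error SS = {error_ss:.0f}")
```

Keeping full precision until the final subtraction is what gives an Error SS of 128 rather than the 129 obtained by subtracting the rounded 4,226 from the rounded 4,355.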
Step 2: Prepare an ANOVA table
Note: numerator d.f. = 3; denominator d.f. = 15
Reject H0, which means that treatment (feed) has an effect on pig growth; to compare among feeds, a test for multiple comparisons is needed
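The entries of the ANOVA table for this example can be sketched as below. Two assumptions are made: the error mean square is taken to be the pooled variance S² = 8.557 quoted in the multiple-comparison calculation of this lecture, and the critical value F 0.05, 3, 15 = 3.29 is read from a standard F table, since it is not quoted in the transcript.

```python
# Values quoted in the lecture.
treatment_ss = 4226   # Treatment SS from Step 1
error_ms = 8.557      # assumed equal to the pooled variance S^2 quoted later
t, n = 4, 19          # 4 feeds, 19 pigs

df_treatment = t - 1  # numerator d.f. = 3
df_error = n - t      # denominator d.f. = 15 (n - t, because replication is unequal)

treatment_ms = treatment_ss / df_treatment
F = treatment_ms / error_ms

# Assumption: tabulated F(0.05; 3, 15) from a standard F table, not from the slide.
F_CRITICAL = 3.29
reject_h0 = F > F_CRITICAL

print(f"d.f. = {df_treatment}, {df_error}")
print(f"MST = {treatment_ms:.1f}, MSE = {error_ms}, F = {F:.1f}")
print(f"Reject H0: {reject_h0}")
```

The computed F is far above the tabulated value, which agrees with the decision to reject H0.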
If ANOVA shows a significant difference, we need an a posteriori test such as
1. Comparison between two means, e.g. control versus others
- Student's t-test (as before)
2. Multiple comparisons or pair-wise comparisons (compare all the possible combinations simultaneously; ranking is possible)
- LSD (Least Significant Difference)
- DMRT (Duncan's Multiple Range Test)
- Tukey's HSD (Tukey's Honestly Significant Difference test)
Note: if ANOVA shows no significant difference, then multiple range tests are not necessary

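2. Multiple comparisons or pair-wise comparisons
Calculate the common value for the difference using the pooled variance, such as
$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{S^2 \left(\frac{1}{N_1} + \frac{1}{N_2}\right)} = \sqrt{8.557 \left(\frac{1}{5} + \frac{1}{5}\right)} = 1.85$ kg
$t_{0.05,\ 15\ \text{d.f.}} = 2.131$; 95% CI $= 1.85 \times 2.131 = 3.94$ kg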
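The pair-wise comparison can be sketched for all six pairs of feeds. The feed means are derived from the treatment totals and group sizes quoted in Step 1, and the feeds are numbered 1 to 4 in the order of those totals, which is an assumption. For pairs that include the group of 4 pigs, the standard error uses 1/4 in place of 1/5, so the least significant difference is slightly larger than 3.94 there.

```python
from itertools import combinations
from math import sqrt

# Treatment totals (kg) and group sizes from Step 1; feed numbering assumed.
totals = {1: 303.1, 2: 346.5, 3: 401.4, 4: 431.2}
counts = {1: 5, 2: 5, 3: 4, 4: 5}

S2 = 8.557      # pooled variance quoted above
T_CRIT = 2.131  # t 0.05, 15 d.f. quoted above

means = {feed: totals[feed] / counts[feed] for feed in totals}

results = {}
for a, b in combinations(sorted(means), 2):
    se = sqrt(S2 * (1 / counts[a] + 1 / counts[b]))  # SE of the difference
    lsd = T_CRIT * se                                # least significant difference
    diff = abs(means[a] - means[b])
    results[(a, b)] = (diff, lsd, diff > lsd)
    print(f"feed {a} vs feed {b}: diff = {diff:.2f}, LSD = {lsd:.2f}, "
          f"significant: {diff > lsd}")

# Feeds ordered from the lowest to the highest mean weight.
ordering = sorted(means, key=means.get)
print("order of means:", ordering)
```

Every difference exceeds its least significant difference, so all four means differ from one another.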

Reject H0: all means are different
Result: μ1 < μ2 < μ4 < μ3

2. Multiple comparisons or post hoc tests

[Figure: post hoc test options, labelled "Widely accepted" and "Not suggested"]
Homogeneous subsets
Non-significant means are shown in the same column.
[Figure label: "Widely accepted"]

2. Final result presentation (tabular)
Table no.: Mean weights of pigs fed with 4 diets during the trial.

Values with the same superscript are not significantly different at the 0.05 level

2. Final result presentation (graphical)
Figure no.: Mean weights (kg ± 95% confidence intervals) of pigs fed with 4 diets during the trial.

[Figure: means labelled d, c, b, a]

CRD ANOVA vs multiple range tests
Advantages
A high proportion of the degrees of freedom goes to the error term, so it is suitable for smaller experiments with fewer experimental units.
It is stronger than multiple range tests, therefore it is done before multiple range tests
Disadvantages
If the experimental units are not homogeneous, there will be an increased experimental error
It does not compare among the means, i.e. it does not locate the differences