Revision - PowerPoint PPT Presentation

About This Presentation

Title:

Revision

Slides: 66

Provided by: jillmo9

Transcript and Presenter's Notes

1
Revision

A hypothesis test is a formal way to determine whether
a difference/change is due to an underlying
difference/change in the population or due to
chance.
Compare means from two independent
groups: independent t-test
Compare whether the mean change (before vs after)
is equal to zero: paired t-test

2
Hypothesis Test (1)
Paired samples t-test: compare the mean of the
differences from paired data with zero.
H0: $\mu_d = 0$ vs H1: $\mu_d \neq 0$
The test is based on data collected from a paired
data set. The statistics of interest are the
mean difference $\bar{d}$ and the SD of the differences $s_d$.
Test statistic:
$t = \dfrac{\bar{d} - 0}{SE(\bar{d})} \sim t_{n-1}$
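A minimal sketch of the paired test statistic, assuming the usual standard error of a mean difference, SE = s_d / sqrt(n), which the slide does not spell out. The function name and the before/after values are hypothetical and only illustrate the arithmetic.

```python
import math
import statistics

def paired_t(before, after):
    """Return (mean difference, SE, t statistic, degrees of freedom)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)      # sample SD of the differences
    se = s_d / math.sqrt(n)            # assumed SE of the mean difference
    t = (d_bar - 0) / se               # hypothesised mean difference is 0
    return d_bar, se, t, n - 1

# Hypothetical before/after measurements on the same five subjects
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]
d_bar, se, t, df = paired_t(before, after)
print(d_bar, round(t, 3), df)
```

The resulting t value would be compared with the t distribution on n - 1 degrees of freedom.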
3
Hypothesis Test (2)
Independent samples t-test: compare the
population means in two independent groups.
H0: $\mu_A = \mu_B$ vs H1: $\mu_A \neq \mu_B$, or equivalently
H0: $\mu_A - \mu_B = 0$ vs H1: $\mu_A - \mu_B \neq 0$
The test is based on data collected from two
independent samples of sizes $n_A$ and $n_B$. The
statistics of interest are the sample
means $\bar{x}_A$ and $\bar{x}_B$ and the SDs $s_A$
and $s_B$.
4
Test statistic:
$t = \dfrac{(\bar{x}_A - \bar{x}_B) - 0}{SE(\bar{x}_A - \bar{x}_B)} \sim t_{n-2}$
with $n = n_A + n_B < 30$.
NB: For both tests, if $n > 30$ the test statistic
can be assumed to be approximately Normally
distributed with mean 0 and standard deviation 1,
N(0,1).
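A rough sketch of the independent samples statistic, assuming the pooled-variance standard error that goes with n - 2 degrees of freedom and the equal-variance assumption. The slide does not show the SE formula, so the pooling step is an assumption, and the two groups below are hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) using a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = ((mean_a - mean_b) - 0) / se   # hypothesised difference is 0
    return t, n_a + n_b - 2

# Hypothetical measurements from two independent groups
group_a = [5, 7, 9]
group_b = [1, 3, 5]
t, df = independent_t(group_a, group_b)
print(round(t, 3), df)
```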
5
Assumptions
The assumption for the paired t-test is that the
differences have an approximately Normal
distribution. The assumptions for the
independent groups t-test are that the samples
come from populations having Normal distributions
with the same variance.
6
Six main tests
7
Hypothesis testing part 2

Gordon Prescott

8
Comparison of means

What if there are more than two groups?
Compare each pair of groups? NO
The appropriate statistical test for when there
are more than two groups is Analysis of Variance
(ANOVA)

9
Research question

Is there a relationship between smoking and body
weight?
Subjects' smoking status is classified as:
smoker
given up smoking
never smoked
We want to compare body weight between each of
the three smoking status categories

10
Null hypothesis

The null hypothesis is that there is no
difference in population means between the groups
The alternative hypothesis is that there is a
difference in population means between the groups

11
Assumptions of ANOVA

The sample data should come from populations
which follow a normal distribution
The variances of these populations should be the
same (homoscedasticity)

12
Comparison of means after ANOVA

ANOVA will only indicate whether the means differ
or not; it will not inform you of where the
differences can be found (e.g. non-smokers'
weight differs from smokers' but non-smokers'
weight does not differ from ex-smokers')
Multiple comparison procedures can be used to
test where the differences lie
e.g. Scheffé's test

13
Descriptive statistics
Three means being compared
14
Levene's test
Check whether the data in the three groups
are equally spread out.
15
ANOVA table
Formal hypothesis test to see whether the three
means are all equal or significantly different to
each other.
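The F ratio reported in an ANOVA table can be sketched by hand as the between-groups mean square divided by the within-groups mean square. The body weights below are hypothetical (they are not the lecture data), and the function name is an illustration only.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F ratio, between-groups df, within-groups df)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical body weights (kg) by smoking status
smokers = [60, 62, 64]
ex_smokers = [66, 68, 70]
never_smoked = [72, 74, 76]
f_ratio, df_between, df_within = one_way_anova_f([smokers, ex_smokers, never_smoked])
print(f_ratio, df_between, df_within)
```

A large F ratio relative to the F distribution on (df_between, df_within) degrees of freedom is evidence against equal population means.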
16
Multiple comparison test
17

SPSS practicals page 91
Task 2c requires ANOVA
Instructions on pages 92-93

18
Recap

What is hypothesis testing?
Theory
In practice
Comparison of two independent means
Comparison of two paired means
Comparison of more than 2 independent means

Independent t-test
Paired t-test
ANOVA
19
IMPORTANT

Open Door Sessions (1.005)
WEDNESDAYS FROM 2pm TO 4pm
Can make other appointments
Student Clinics (1.026)
DAILY FROM 1pm TO 2pm

20
Tomorrow's Practical

SPSS exercise on page 91 of handbook,
SPSS data file as always in
F\data\Public Health\
Tasks 3 and 4 to be completed tomorrow.

21
HOW TO: print handouts of lectures, add
comments in SPSS output, export SPSS output to
Word, read important values from tables.
24
Insert
New Text
25
NB Delete ALL Notes BEFORE Exporting Output to
Word
26
File
Export
Choose where to save your Word document and give
it a name (e.g. output.doc).
27
NB To this new word file you can add more text,
delete and cut existing text, as well as shrink
and expand the graphs.
28
Assignment

Data file: assignment2006.sav
A Word file, icu assignment2006.doc, with
another copy of the instructions, is also in
F\data\Public Health

29
Assignment

Consider how to describe the data
Formulate the research questions you would like
the data to answer
If a statistical test is required, decide what is
the most appropriate test and why
Then go to the data and find the information
required and carry out the statistical tests
Extract important information and include in
write up
Do NOT include large amounts of unedited SPSS
output

30
Deadline

The deadline for the assignment is
12 NOON, MONDAY 20th NOVEMBER
You can hand it in earlier if you wish
You are welcome to discuss this work with other
people
Please ensure that your report is your own work
and is different to the reports of other students
Copied or plagiarised reports will be referred to
the Head of Department

32
Comparing Proportions

Used for summarising qualitative data
Calculated for different categories
The proportions being compared usually come from
a crosstabulation of two categorical variables
Sample proportions may be used to draw inferences
about population proportions
These inferences may be expressed as confidence
intervals or used in hypothesis testing

33
Comparison of proportions from two independent
groups

Is there a difference between the population
proportion of men who are current smokers and
the population proportion of women who are current smokers?
Is there an association between gender and
smoking status?

34
Null and alternative hypothesis

The null hypothesis is that there is no
difference in the population proportions of males
who smoke and females who smoke
(or no association exists between gender and
smoking status in the population)
The alternative hypothesis is that there is a
difference in the population proportions
(an association exists between gender and smoking
status in the population)

35
Chi-squared test for association

Continuity Correction
(tables with 2 rows and 2 columns)
Fisher's Exact Test
(tables with 2 rows and 2 columns)
Pearson Chi-squared Test
(tables that have more than 2 rows or columns)
Chi-squared test for trend
(when at least one of the variables is ordinal)

36
Contingency table
Total of row
Total of column
Grand Total
37
Probability of two independent events

Recall
P(A and B) = P(A) x P(B)
If two events are independent, the result of one
event is not dependent on the result of the other
event

38
Probability of an event

If independent, then
P(male and smoker)
= P(smoker) x P(male)
= (82/272) x (120/272)
= 0.133

Therefore, if smoking and gender are independent
of one another, the probability that a person in
the population will be a male and a smoker is
0.133,
so in a sample of 272 people we would expect to
find
0.133 x 272 = 36.2 male smokers

39
Expected frequencies

If two events are independent of one another,
the expected frequency of events will be equal
to
Expected count
= (row total x column total) / grand total
= 82 x 120 / 272
= 36.2
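The same rule can be applied to every cell of the table at once. This is a small sketch using the totals quoted on the slides (82 smokers, 120 males, 272 people); the remaining totals (190 non-smokers, 152 females) are obtained by subtraction, and the function name is hypothetical.

```python
def expected_counts(row_totals, col_totals):
    """Expected count per cell = row total x column total / grand total."""
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: smoker, non-smoker. Columns: male, female.
expected = expected_counts([82, 272 - 82], [120, 272 - 120])
for row in expected:
    print([round(e, 1) for e in row])
```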

40
Observed frequencies (expected frequencies)
41
Chi-squared test

The test of the null hypothesis is based on the
difference between the observed and expected
frequencies
Under the null hypothesis this test statistic
follows the Chi-squared distribution
The value of the test statistic is then compared
with the appropriate Chi-squared distribution
(first proposed by Pearson)

42
Chi-Squared

Another important distribution related to the
normal is the Chi-squared distribution.
It is used when investigating categorical data.
Large positive values occur with very low
probability.

43
Chi-squared test

The greater the differences between the observed
and expected frequencies, the larger the
Chi-squared statistic and the more evidence that
the two variables are associated.
Comparison of the observed Chi-squared statistic,
with tabulated critical values, will determine
whether the evidence of association is
significant at a given significance level.
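As an illustration, the Pearson statistic is the sum over all cells of (observed - expected)^2 / expected. The transcript does not give the individual cell counts, so the counts below are an assumption: they are back-calculated from the quoted totals (82 smokers, 120 males, 272 people) and the quoted percentages (33.3% of males and 27.6% of females smoke). The P-value shortcut via `math.erfc` holds only for 1 degree of freedom.

```python
import math

# Assumed counts (back-calculated, not shown in the transcript).
# Rows: smoker, non-smoker. Columns: male, female.
observed = [[40, 42], [80, 110]]

def pearson_chi_squared(table):
    """Return (Chi-squared statistic, degrees of freedom) without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

chi2, df = pearson_chi_squared(observed)
# Upper-tail probability for a Chi-squared variable with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 2), df, round(p_value, 2))
```

With 1 degree of freedom the tabulated 5% critical value is 3.84, so a statistic well below that gives no significant evidence of association.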

44
Assumptions of Chi-squared test

When sample sizes are small, the expected
frequencies may be small. For the Chi-squared
test to be valid, no more than 20% of the cells
should have an expected frequency of less than 5
and no cells should have an expected frequency of
less than one.
If this does not hold, the alternative test is
Fisher's Exact test
(note: usually only for 2 x 2 tables)
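The validity rule above can be checked mechanically from a table of expected frequencies. This is a sketch only; the function name is hypothetical, the first table uses the expected counts implied by the slide totals (82, 120, 272), and the second table is made up to show a failing case.

```python
def chi_squared_assumptions_ok(expected):
    """True if at most 20% of cells have expected < 5 and no cell has expected < 1."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5 <= 0.20 and min(cells) >= 1

# Expected counts implied by the slide totals
large_sample = [[36.2, 45.8], [83.8, 106.2]]
# Hypothetical small sample: one cell in four (25%) is below 5
small_sample = [[3.0, 7.0], [6.0, 14.0]]

large_ok = chi_squared_assumptions_ok(large_sample)
small_ok = chi_squared_assumptions_ok(small_sample)
print(large_ok, small_ok)
```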

45
SPSS output
Note that 33.3% of males and 27.6% of females
smoke
46
SPSS output
P-value
47
Hypothesis
48
Confidence intervals

The difference in proportions is approximately 6%
(remember: 33.3% - 27.6%)
The 95% CI for this difference can be obtained and is
6% (-5% to 17%)
Note that the null hypothesised value 0 is
included in the 95% CI
This indicates that the null hypothesis can NOT
be rejected at the 5% significance level
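One way such an interval could be obtained is the large-sample (Wald) formula: difference ± 1.96 x SE. Whether the lecture used exactly this formula is an assumption, and the counts (40 of 120 males, 42 of 152 females) are back-calculated from the quoted percentages and totals rather than read from the slides.

```python
import math

def diff_in_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Return (difference, lower, upper) for p1 - p2 using the Wald interval."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

# Assumed counts: male smokers / males, female smokers / females
diff, lower, upper = diff_in_proportions_ci(40, 120, 42, 152)
print(f"{diff:.1%} ({lower:.1%} to {upper:.1%})")
```

Rounded to whole percentages, this sketch agrees with the interval quoted on the slide.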

49
Larger contingency tables

We have seen that the Chi-squared test can be used
to test for a difference between two proportions
The Chi-squared test can also be applied to
larger contingency tables
Example
Association between smoking (smoker, ex-smoker,
never smoked) and gender

50
Counts (expected frequencies)
51
SPSS output
P-value
52
Chi-squared test for association

Continuity Correction
for 2 x 2 tables
Fisher's Exact Test
if more than 20% of expected values are less
than 5 (calculated for 2 x 2 tables only)
Pearson
for tables that have more than 2 rows or columns.
Mantel-Haenszel Test for trend (Chi-squared test
for trend)
when one of the variables is ordinal

53
Conclusion

Recall
H0: No association exists between gender and
smoking status (i.e. the variables are
independent)
H1: There is an association between gender and
smoking status
There is no evidence to suggest that there is an
association between gender and smoking status

54
Comparison of proportions from two related groups

A matched case-control study was conducted to
investigate risk factors for diarrhoea in
children.
Does endometrial ablation (a conservative
alternative to hysterectomy) have an effect on
the presence of pain symptoms in women?
Discomfort was assessed before and 6 months after
surgery.

55
McNemar test

When the proportions are related (paired) the
Chi-squared test is no longer valid since the
observations in the contingency table are not
independent of one another.
The appropriate statistical test to apply is the
McNemar test

56
McNemar test
57
SPSS output
P-value
58
Confidence intervals

The difference in proportions and the 95% CI is
17% (5% to 28%)
Note that the null hypothesised value of 0 is not
included in the 95% CI
Therefore the null hypothesis can be rejected at
the 5% significance level

59
McNemar test

24 women had the symptom both pre-op and post-op
10 women had no symptom pre-op and still had no
symptom post-op
26 women who had the symptom pre-op no longer had the
symptom post-op, compared to 18 women who did not
have the symptom pre-op but did post-op

60
McNemar test

78 women had an assessment pre- and post-operatively
P = 0.291
No evidence to reject the null hypothesis
Surgical treatment has had no statistically
significant impact on whether the patient has the symptom
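The McNemar test uses only the discordant pairs (26 women who lost the symptom and 18 who gained it). The sketch below uses the Chi-squared form with a continuity correction, which appears to reproduce the quoted P-value; whether SPSS used exactly this variant for these data is an assumption, and the function name is hypothetical.

```python
import math

def mcnemar_continuity_corrected(b, c):
    """McNemar Chi-squared (1 df) with continuity correction on discordant counts b and c."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper-tail probability for a Chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Discordant pairs from the slide: symptom pre-op only (26), symptom post-op only (18)
chi2, p_value = mcnemar_continuity_corrected(26, 18)
print(round(chi2, 3), round(p_value, 3))
```

The 24 and 10 concordant pairs do not enter the calculation, because they carry no information about change.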

P-value
61
Hypothesis testing or estimation?

Quantification of the results by simple estimates
is an essential part of the analysis of data
A single number (P value) cannot convey all the
necessary information; appropriate estimates and
confidence intervals are required as well

62
Example

In a study of patients admitted to an
otolaryngology ward, 140 with nose bleeds were
compared to 113 controls with other conditions.
Patients were interviewed about their alcohol
consumption (McGarry, 1994).

63
Data
64
Discuss the statement made by the authors

The proportion of non-drinkers in the patients
with nose bleeds was similar to that in the
controls (34% vs 35%), but the proportion of
regular drinkers was significantly higher (45% vs
30%, P<0.025, χ² test of proportions)

65
Discussion points

The authors appear to have tested each line of
the 3x2 cross tabulation
3 hypothesis tests using the same data
Increased chance of type I error
They should have done one single Chi-squared test
on all the data
Since the categories of drinking are ordered, an
alternative would have been to do a Chi-squared
test for trend
SPSS does both of these automatically when you
ask for a Chi-squared test
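The increased chance of a type I error can be illustrated with a simplified calculation. It assumes three independent tests, each at the 5% level; tests on the same data are not truly independent, so this is only a rough guide. The Bonferroni threshold is shown as one common adjustment and is not something the slides mention.

```python
alpha = 0.05   # significance level for each individual test
n_tests = 3    # one test per line of the 3x2 table

# Chance of at least one false positive across all tests (independence assumed)
family_wise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni-adjusted threshold for each individual test
bonferroni_alpha = alpha / n_tests

print(round(family_wise_error, 3), round(bonferroni_alpha, 4))
```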