Title: Analyzing Data using SPSS
Testing for Differences

Parametric Tests

t-test
- The t-test is used in a variety of situations involving interval and ratio variables.
- Independent Samples
- Dependent (Paired) Samples
Independent-Samples T Test

- What it does: The Independent Samples T Test compares the mean scores of two groups on a given variable.
- Where to find it: Under the Analyze menu, choose Compare Means, then Independent Samples T Test. Move your dependent variable into the box marked "Test Variable." Move your independent variable into the box marked "Grouping Variable." Click on the box marked "Define Groups" and specify the values of the two groups you wish to compare.
- Assumptions:
  - The dependent variable is normally distributed. You can check for normal distribution with a Q-Q plot.
  - The two groups have approximately equal variance on the dependent variable. You can check this by looking at Levene's Test (see below).
  - The two groups are independent of one another.
- Hypotheses:
  - Null: The means of the two groups are not significantly different.
  - Alternate: The means of the two groups are significantly different.
SPSS Output
- Following is a sample output of an independent
samples T test. We compared the mean blood
pressure of patients who received a new drug
treatment vs. those who received a placebo (a
sugar pill).
- First, we see the descriptive statistics for the
two groups. We see that the mean for the "New
Drug" group is higher than that of the "Placebo"
group. That is, people who received the new drug
have, on average, higher blood pressure than
those who took the placebo.
- Finally, we see the results of the Independent
Samples T Test. Read the TOP line if the
variances are approximately equal. Read the
BOTTOM line if the variances are not equal. Based
on the results of our Levene's test, we know that
we have approximately equal variance, so we will
read the top line.
- Our t value is 3.796.
- We have 10 degrees of freedom.
- There is a significant difference between the two groups (the significance is less than .05).
- Therefore, we can say that there is a significant difference between the New Drug and Placebo groups. People who took the new drug had significantly higher blood pressure than those who took the placebo.
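The same comparison can be sketched outside SPSS with SciPy. The readings below are hypothetical stand-ins, since the actual study data are not listed here; the logic of checking Levene's test and then choosing the appropriate t-test line is the same:

```python
from scipy import stats

# Hypothetical systolic blood-pressure readings (not the actual study data)
new_drug = [140, 138, 150, 148, 135, 144]
placebo  = [120, 118, 125, 130, 122, 119]

# Levene's test checks the equal-variance assumption
lev_stat, lev_p = stats.levene(new_drug, placebo)

# equal_var=True corresponds to reading the TOP line of the SPSS table;
# equal_var=False corresponds to the BOTTOM line
t_stat, p_value = stats.ttest_ind(new_drug, placebo, equal_var=(lev_p > .05))
```

If `p_value` is below .05, the two group means differ significantly, just as in the SPSS output.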
Example: Independent Samples t Test

- A study was conducted to determine the effectiveness of an integrated statistics/experimental methods course as opposed to the traditional method of taking the two courses separately.
- It was hypothesized that students taking the integrated course would conduct better quality research projects than students in the traditional courses as a result of their integrated training.
- H0: There is no difference in students' performance as a result of the integrated versus traditional courses.
- H1: Students taking the integrated course will conduct better quality research projects than students in the traditional courses.
SPSS Output

- Students taking the integrated course conducted better quality research projects than students in the traditional courses.
Exercise 1

- The following data were obtained in an experiment designed to check whether there is a systematic difference in the weights (in grams) obtained with two different scales.

Rock specimen   Scale I   Scale II
1               12.13     12.17
2               17.56     17.61
3                9.33      9.35
4               11.40     11.42
5               28.62     28.61
6               10.25     10.27
7               23.37     23.42
8               16.27     13.26
9               12.40     12.45
10              24.78     24.75
- Use the 0.01 level of significance to test whether the difference between the means of the weights obtained with the two scales is significant.
- H0: There is no significant difference between the means of the weights obtained with the two scales.
- H1: There is a significant difference between the means of the weights obtained with the two scales.
Exercise 2

- The following are the scores for random samples of size ten taken from a large group of trainees instructed by two methods.
- Method 1: teaching machine as well as some personal attention from an instructor
- Method 2: straight teaching-machine instruction

Method 1: 81 71 79 83 76 75 84 90 83 78
Method 2: 69 75 72 69 67 74 70 66 76 72

What can we conclude about the claim that the personal attention of an instructor will improve trainees' scores? Use α = 0.05.
Paired Samples t-test

Paired Samples T Test

- What it does: The Paired Samples T Test compares the means of two variables. It computes the difference between the two variables for each case, and tests to see if the average difference is significantly different from zero.
Paired Samples T Test

- Where to find it: Under the Analyze menu, choose Compare Means, then choose Paired Samples T Test. Click on both variables you wish to compare, then move the pair of selected variables into the Paired Variables box.
Paired Samples T Test

- Assumption: Both variables should be normally distributed. You can check for normal distribution with a Q-Q plot.
Paired Samples T Test

- Hypotheses:
  - Null: There is no significant difference between the means of the two variables.
  - Alternate: There is a significant difference between the means of the two variables.
SPSS Output

- Following is a sample output of a paired samples T
test. We compared the mean test scores before
(pre-test) and after (post-test) the subjects
completed a test preparation course. We want to
see if our test preparation course improved
people's score on the test.
First, we see the descriptive statistics for both variables.

- The post-test mean scores are higher than the pre-test scores.
Next, we see the correlation between the two variables.

- There is a strong positive correlation. People who did well on the pre-test also did well on the post-test.
- Finally, we see the results of the Paired Samples T Test. Remember, this test is based on the difference between the two variables. Under "Paired Differences" we see the descriptive statistics for the difference between the two variables.
To the right of the Paired Differences, we see the t, degrees of freedom, and significance.

- The t value is -2.171.
- We have 11 degrees of freedom.
- Our significance is .053.
- If the significance value is less than .05, there is a significant difference. If the significance value is greater than .05, there is no significant difference.
- Here, we see that the significance value is approaching significance, but it is not a significant difference. There is no difference between pre- and post-test scores. Our test preparation course did not help!
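A paired samples t test of this kind can be sketched with SciPy's `ttest_rel`. The pre/post scores below are hypothetical, used only to show the mechanics:

```python
from scipy import stats

# Hypothetical pre/post test scores for the same five subjects
pre  = [60, 55, 70, 66, 62]
post = [66, 60, 74, 70, 68]

# ttest_rel tests whether the mean of the paired differences (pre - post)
# is significantly different from zero
t_stat, p_value = stats.ttest_rel(pre, post)
```

A negative `t_stat` with a small `p_value` would indicate that post-test scores are significantly higher than pre-test scores.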
Example

- Twenty first-grade children and their parents were selected for a study to determine whether a seminar instructing parents in inductive parenting techniques improves social competency in children. The parents attended the seminar for one month. The children were tested for social competency before the course began and were retested six months after the completion of the course.
Hypothesis

- H0: There is no significant difference between the means of pre- and post-seminar social competency scores.
- In other words, the parenting seminar has no effect on child social competency scores.
- There is a strong positive correlation. Children who did well on the pre-test also did well on the post-test.
- There is a significant difference between pre- and post-test scores. The parenting seminar has an effect on child social competency scores!
Exercise 3

- The table below shows the number of words per minute read by 20 students before and after following a particular method intended to improve reading.

Student   Pre   Post
1         48    57
2         89    102
3         78    81
4         50    61
5         70    74
6         98    100
7         78    83
8         98    86
9         58    67
10        61    71
11        50    64
12        56    62
13        75    87
14        49    62
15        66    62
16        86    90
17        90    84
18        58    62
19        41    40
20        82    77
- Using a 0.05 level of significance, test the claim that the method is effective in improving reading.
Exercise 4

- The table below shows the weights of seven subjects before and after following a particular diet for two months.

Subject   A     B     C     D     E     F     G
After     156   165   196   198   167   199   164
Before    149   156   194   203   153   201   152

- Using a 0.01 level of significance, test the claim that the diet is effective in reducing weight.
One-Way ANOVA

- Similar to a t-test in that it is concerned with differences in means, but the test can be applied to two or more means.
- The test is usually applied to interval and ratio data types, for example differences between two factors (1 and 2).
- The test can be undertaken using the Analyze - Compare Means - One-Way ANOVA menu items, then selecting the appropriate variables.
- You will observe the One-Way ANOVA for factor 1 and factor 2.
Procedure

- 1. You will need one column of group codes labelling which group your data belong to. The codes need to be numerical, but can be labelled with text.
- 2. You will also need a column containing the data points or scores you wish to analyze.
- 3. Select One-Way ANOVA from the Analyze - Compare Means menus.
- 4. Click on your dependent variable (data column) and click on the top arrow so that the selected column appears in the Dependent List box.
- 5. Click on your code column (your condition labels) and click on the bottom arrow so that the selected column appears in the Factor box.
- 6. Click on Post Hoc if you wish to perform post-hoc tests (optional).
- 7. Choose the type of post-hoc test(s) you wish to perform by clicking in the small box next to your choice until a tick appears. Tukey's and Scheffe's tests are commonly used.
- 8. Click on Dunnett to perform a Dunnett's test, which allows you to compare experimental groups with a control group. Choose whether your control category is the first or last code entered in your code column.
- The main output table is labelled ANOVA. The F-ratio of the ANOVA, the degrees of freedom and the significance are all displayed. The top value of the df column is the df of the factor; the bottom value is the df of the error term.
- Tukey's test will also try to find combinations of similar groups or conditions.
- In the score table there will be one column for each pair of conditions that are shown to be 'similar'. The mean of each condition within the pair is given in the appropriate column. The p-value for the difference between the means of each pair of groups is given at the bottom of the appropriate column.
Example: One-Way ANOVA

- We would like to determine whether the scores on a test of aggression are different across 4 groups of children (each with 5 subjects).
- Each child group has been exposed to a differing amount of time watching cartoons depicting toon violence.
At the 0.05 significance level, test the claim that the four groups have the same mean if the following sample results have been obtained.
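A one-way ANOVA like this can be sketched with SciPy's `f_oneway`. The aggression scores below are hypothetical, since the slide's data table is not reproduced here:

```python
from scipy import stats

# Hypothetical aggression scores for four cartoon-exposure groups (5 children each)
group1 = [12, 15, 11, 14, 13]
group2 = [18, 20, 17, 19, 21]
group3 = [25, 23, 26, 24, 27]
group4 = [30, 28, 31, 29, 32]

# f_oneway returns the F-ratio and its significance (p-value)
f_stat, p_value = stats.f_oneway(group1, group2, group3, group4)
```

A `p_value` below .05 would lead us to reject the claim that the four groups have the same mean.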
SPSS Output
Exercise 5

- At the same time each day, a researcher records the temperature in each of three greenhouses. The table shows the temperatures in degrees Fahrenheit recorded for one week.

Greenhouse 1   Greenhouse 2   Greenhouse 3
73             71             61
72             69             63
73             72             62
66             72             61
68             65             60
71             73             62
72             71             59

- Use a 0.05 significance level to test the claim that the average temperature is the same in each greenhouse.
Nonparametric Tests

Sign Test

- A sign test compares the number of positive and negative differences between related conditions.
Procedure

- 1. You should have data in two or more columns - one for each condition tested.
- 2. Select 2 Related Samples from the Analyze - Nonparametric Tests menu.
- 3. Click on the first variable in the pair.
- 4. Click on the second variable in the pair. The names of the variables appear in the Current Selections section of the dialogue box.
- 5. Click on the central selection arrow when you are happy with the variable pair selection. The chosen pair appears in the Test Pair(s) List.
- 6. Make sure the Sign box is ticked and remove the tick from the Wilcoxon box.
Example

- The data in the table on the next slide are matched pairs of heights obtained from a random sample of 12 male statistics students. Each student reported his height, then his height was measured. Use a 0.05 significance level to test the claim that there is no difference between reported height and measured height.
Reported and Measured Heights of Male Statistics Students

Reported height: 68   74   82.25 66.5 69   68   71   70   70   67   68   70
Measured height: 66.8 73.9 74.3  66.1 67.2 67.9 69.4 69.9 68.6 67.9 67.6 68.8

H0: There is no significant difference between reported heights and measured heights.
H1: There is a difference.
Output

Reject H0. There is sufficient evidence to reject the claim that there is no significant difference between the reported and measured heights.
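The sign test can be reproduced by counting the signs of the paired differences and applying a binomial test, sketched here with SciPy using the height data above:

```python
from scipy import stats

reported = [68, 74, 82.25, 66.5, 69, 68, 71, 70, 70, 67, 68, 70]
measured = [66.8, 73.9, 74.3, 66.1, 67.2, 67.9, 69.4, 69.9, 68.6, 67.9, 67.6, 68.8]

diffs = [r - m for r, m in zip(reported, measured)]
n_pos = sum(d > 0 for d in diffs)   # number of positive differences
n = sum(d != 0 for d in diffs)      # ties (zero differences) are dropped

# Under H0 the signs behave like a fair coin: Binomial(n, 0.5)
result = stats.binomtest(n_pos, n, 0.5)
```

With 11 positive and only 1 negative difference, `result.pvalue` falls well below .05, matching the "Reject H0" conclusion above.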
Exercise 6

- Listed here are the right- and left-hand reaction times collected from 14 right-handed subjects. Use a 0.05 significance level to test the claim of no difference between the right-hand and left-hand reaction times.
Right/Left Reaction Times

Right: 191 97  116 165 116 129 171 155 112 102 188 158 121 133
Left:  224 171 191 207 196 165 171 165 140 188 155 219 177 174
Wilcoxon

- The Wilcoxon test is used with two columns of non-parametric related (linked) data.
- Either one person has taken part in two conditions, or paired participants (e.g. brother and sister) have taken part in the same condition.
- This is the non-parametric equivalent of the paired samples t-test.
Procedure

- 1. Put your data in two or more columns, one for each condition tested.
- 2. Select 2 Related Samples from the Analyze - Nonparametric Tests menu.
- 3. Click on the first variable in the pair.
- 4. Click on the second variable in the pair.
- 5. Make sure the Wilcoxon box is ticked.
- The Ranks table produced in the output window summarises the ranking process.
- In the Test Statistics table the Z statistic is the result of the Wilcoxon test.
- The p-value for this statistic is shown below it. This is the two-tailed significance.
Example

- Use the previous data to test the claim that there is no difference between reported heights and measured heights using the Wilcoxon test at the 0.05 significance level.
Output

Reject H0. There is sufficient evidence to reject the claim that there is no difference between reported and measured heights.
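The same Wilcoxon signed-rank test can be sketched with SciPy on the height data:

```python
from scipy import stats

reported = [68, 74, 82.25, 66.5, 69, 68, 71, 70, 70, 67, 68, 70]
measured = [66.8, 73.9, 74.3, 66.1, 67.2, 67.9, 69.4, 69.9, 68.6, 67.9, 67.6, 68.8]

# Wilcoxon signed-rank test on the paired differences (reported - measured)
w_stat, p_value = stats.wilcoxon(reported, measured)
```

As with the sign test, `p_value` falls below .05, so H0 is rejected.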
Mann-Whitney

- The Mann-Whitney test is used with two columns of independent (unrelated) non-parametric data.
- This is the non-parametric equivalent of the independent samples t-test.
Procedure

- 1. Put all of your measured data into one column.
- 2. Make a second column that contains codes to indicate the group from which each value was obtained.
- 3. Select 2 Independent Samples from the Analyze - Nonparametric Tests menu.
- 4. Select the column containing the data you want to analyse and click the top arrow.
- 5. Select the Grouping Variable - the column which contains your group codes - and click the bottom arrow.
- 6. Make sure the Mann-Whitney U option is selected.
- The output is produced in the output window.
- The top table summarises the ranking process.
- The result of the Mann-Whitney test is given at the top of the Test Statistics table.
- The two-tailed significance of the result is given in the same table.
Example

- One study used X-ray computed tomography (CT) to collect data on brain volumes for a group of patients with obsessive-compulsive disorders and a control group of healthy persons. The following data show sample results (in mL) for volumes of the right caudate.
Volumes of the Right Caudate

Obsessive-compulsive patients: 0.308 0.210 0.304 0.344 0.407 0.455 0.287 0.288 0.463 0.334 0.340 0.305
Control group:                 0.519 0.476 0.413 0.429 0.501 0.402 0.349 0.594 0.334 0.483 0.460 0.445
Output
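The Mann-Whitney U test on these two samples can be sketched with SciPy:

```python
from scipy import stats

ocd     = [0.308, 0.210, 0.304, 0.344, 0.407, 0.455,
           0.287, 0.288, 0.463, 0.334, 0.340, 0.305]
control = [0.519, 0.476, 0.413, 0.429, 0.501, 0.402,
           0.349, 0.594, 0.334, 0.483, 0.460, 0.445]

# Two-tailed Mann-Whitney U test for two independent samples
u_stat, p_value = stats.mannwhitneyu(ocd, control, alternative='two-sided')
```

Here `p_value` comes out below .05: the obsessive-compulsive patients' volumes tend to be smaller than the controls'.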
Kruskal-Wallis

- Examines differences between 3 or more independent groups or conditions.
Procedure

- 1. Put all your measured data into one column.
- 2. Make a second column that contains codes to indicate the group from which each value was obtained.
- 3. Select K Independent Samples from the Analyze - Nonparametric Tests menu.
- 4. Select the grouping variable, the column that contains your group codes, then click on the bottom arrow.
- 5. Make sure the Kruskal-Wallis box is checked.
- In the output window the chi-square statistic is shown in the Test Statistics section, as is the p-value.
Example

- We would like to determine whether the scores on a test of Spanish are different across three different methods of learning.
- Method 1: classroom instruction and language laboratory
- Method 2: only classroom instruction
- Method 3: only self-study in language laboratory
The following are the final examination scores of samples of students from the three groups.

Method 1: 94 88 91 74 86 97
Method 2: 85 82 79 84 61 72 80
Method 3: 89 67 72 76 69

At the 0.05 level of significance, test the null hypothesis that the populations sampled are identical.
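The Kruskal-Wallis test on these three samples can be sketched with SciPy:

```python
from scipy import stats

method1 = [94, 88, 91, 74, 86, 97]
method2 = [85, 82, 79, 84, 61, 72, 80]
method3 = [89, 67, 72, 76, 69]

# Kruskal-Wallis H test for three independent groups;
# h_stat is compared against a chi-square distribution with 2 df
h_stat, p_value = stats.kruskal(method1, method2, method3)
```

Here `p_value` falls just below .05, so the null hypothesis of identical populations is rejected at the 0.05 level.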
SPSS Output
Exercise 7

- The following are the miles per gallon which a test driver got in random samples of six tankfuls of each of three kinds of gasoline.

Gasoline 1: 30 15 32 27 24 29
Gasoline 2: 17 28 20 33 32 22
Gasoline 3: 19 23 32 22 18 25

- Test the claim that there is no difference in the true average mileage yield of the three kinds of gasoline (use a 0.05 level of significance).
Testing for Relationships

Pearson's Correlation

- Pearson's correlation is a parametric test for the strength of the relationship between pairs of variables.
- What it does: The Pearson R correlation tells you the magnitude and direction of the association between two variables that are on an interval or ratio scale.
- Where to find it: Under the Analyze menu, choose Correlations. Move the variables you wish to correlate into the "Variables" box. Under "Correlation Coefficients," be sure that the "Pearson" box is checked.
- Assumption: Both variables are normally distributed. You can check for normal distribution with a Q-Q plot.
- Hypotheses:
  - Null: There is no association between the two variables.
  - Alternate: There is an association between the two variables.
SPSS Output

- Following is a sample output of a Pearson R correlation between the Rosenberg Self-Esteem Scale and the Assessing Anxiety Scale.
SPSS creates a correlation matrix of the two variables. All the information we need is in the cell that represents the intersection of the two variables.

SPSS gives us three pieces of information:
- the correlation coefficient
- the significance
- the number of cases (N)
- The correlation coefficient is a number between +1 and -1. This number tells us about the magnitude and direction of the association between two variables.
- The MAGNITUDE is the strength of the correlation. The closer the correlation is to either +1 or -1, the stronger the correlation. If the correlation is 0 or very close to zero, there is no association between the two variables. Here, we have a moderate correlation (r = -.378).
- The DIRECTION of the correlation tells us how the two variables are related. If the correlation is positive, the two variables have a positive relationship (as one increases, the other also increases). If the correlation is negative, the two variables have a negative relationship (as one increases, the other decreases). Here, we have a negative correlation (r = -.378). As self-esteem increases, anxiety decreases.
Example

- The following data were obtained in a study of the relationship between the resistance (ohms) and the failure time (minutes) of certain overloaded resistors.

Resistance:   48 28 33 40 36 39 46 40 30 42 44 48 39 34 47
Failure time: 45 25 39 45 36 35 36 45 34 39 51 41 38 32 45

- Test the null hypothesis that there is no significant correlation between resistance and failure time.
SPSS Output

There is a significant positive correlation between resistance and failure time, indicating that failure time increases as resistance increases.
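The Pearson correlation for these data can be sketched with SciPy:

```python
from scipy import stats

resistance   = [48, 28, 33, 40, 36, 39, 46, 40, 30, 42, 44, 48, 39, 34, 47]
failure_time = [45, 25, 39, 45, 36, 35, 36, 45, 34, 39, 51, 41, 38, 32, 45]

# pearsonr returns the correlation coefficient r and its two-tailed significance
r, p_value = stats.pearsonr(resistance, failure_time)
```

A positive `r` with `p_value` below .05 confirms the significant positive correlation reported above.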
Exercise 8

- An aerobics instructor believes that regular aerobic exercise is related to greater mental acuity, stress reduction, high self-esteem, and greater overall life satisfaction.
- She asked a random sample of 30 adults to fill out a series of questionnaires.
- The results are as follows. Test whether there is a significant correlation between aerobic exercise and high self-esteem.
Subject  Exercise  Self-esteem  Satisfaction  Stress
1        10        25           45            20
2        33        37           40            10
3        9         12           30            13
4        14        32           39            15
5        3         22           27            29
6        12        31           44            22
7        7         30           39            13
8        15        30           40            20
9        3         15           46            25
10       21        34           50            10
11       2         18           29            33
12       20        37           47            5
13       4         19           31            23
14       8         33           38            21
15       0         10           25            30
16       17        35           42            13
17       25        39           40            10
18       2         13           30            27
19       18        35           47            9
20       3         15           28            25
21       27        35           39            7
22       4         17           32            34
23       8         20           34            20
24       10        22           41            15
25       0         14           27            35
26       12        35           35            20
27       5         20           30            23
28       7         29           30            12
29       30        40           48            14
30       14        30           45            15
The Spearman Rho Correlation

The Spearman Rho Correlation

- What it does: The Spearman Rho correlation tells you the magnitude and direction of the association between two variables that are on an ordinal scale, or interval/ratio variables that are not normally distributed.
The Spearman Rho Correlation

- Where to find it: Under the Analyze menu, choose Correlations. Move the variables you wish to correlate into the "Variables" box. Under "Correlation Coefficients," be sure that the "Spearman" box is checked.
The Spearman Rho Correlation

- Assumption: Both variables are NOT normally distributed. You can check for normal distribution with a Q-Q plot. If the variables are normally distributed, use a Pearson R correlation.
The Spearman Rho Correlation

- Hypotheses:
  - Null: There is no association between the two variables.
  - Alternate: There is an association between the two variables.
SPSS Output

- Following is a sample output of a Spearman Rho correlation between the Rosenberg Self-Esteem Scale and the Assessing Anxiety Scale.
- SPSS creates a correlation matrix of the two variables. All the information we need is in the cell that represents the intersection of the two variables.
- SPSS gives us three pieces of information: the correlation coefficient, the significance, and the number of cases (N).
- The correlation coefficient is a number between +1 and -1. This number tells us about the magnitude and direction of the association between two variables.
- The MAGNITUDE is the strength of the correlation. The closer the correlation is to either +1 or -1, the stronger the correlation. If the correlation is 0 or very close to 0, there is no association between the two variables. Here, we have a moderate correlation (r = -.392).
- The DIRECTION of the correlation tells us how the two variables are related. If the correlation is positive, the two variables have a positive relationship (as one increases, the other also increases). If the correlation is negative, the two variables have a negative relationship (as one increases, the other decreases). Here, we have a negative correlation (r = -.392). As self-esteem increases, anxiety decreases.
Example

- The following are the numbers of hours which ten students studied for an examination and the grades which they received.

Hours studied:        9   5   11  13  10  5   18  15  2   8
Grade in examination: 56  44  79  72  70  54  94  85  33  65

Is there any relationship between the number of hours studied and the grade in the examination?
SPSS Output
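The Spearman Rho correlation for these data can be sketched with SciPy:

```python
from scipy import stats

hours  = [9, 5, 11, 13, 10, 5, 18, 15, 2, 8]
grades = [56, 44, 79, 72, 70, 54, 94, 85, 33, 65]

# spearmanr ranks both variables and correlates the ranks
rho, p_value = stats.spearmanr(hours, grades)
```

The resulting `rho` is strongly positive with `p_value` below .05: students who studied more hours tended to earn higher grades.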
Exercise 9

- The following table shows the sales for twelve weeks of a downtown department store, x, and its suburban branch, y.

X: 71 64 67 58 80 63 69 59 76 60 66 55
Y: 49 31 45 24 68 30 40 37 62 22 35 19

- Is there any significant relationship between x and y?
Two-Way Chi-Square from Frequencies

- A chi-square test is a non-parametric test for nominal (frequency) data.
- The test will calculate expected values for each combination of category codes based on the null hypothesis that there is no association between the two variables.
Procedure

- 1. You will need two columns of codes. Each value in each column provides a code for a group or criteria category within the appropriate variable. You should have one row for each combination of category codes.
- 2. You will also need a column giving the frequency with which each combination of codes is observed.
- Before carrying out your chi-square test you first need to tell SPSS that the numbers in your frequency column are indeed frequencies. You do this using Weight Cases.
- 3. Select Weight Cases from the Data menu.
- 4. Click the Weight cases by button.
- 5. Select the column containing your frequencies and click on the across arrow.
- 7. Click Crosstabs from the Analyze - Descriptive Statistics menu.
- 8. Select the first variable and click on the top arrow to move it into the Rows box.
- 9. Select the second variable and click on the middle arrow to move it into the Columns box.
- 10. Click on Statistics to choose to perform a chi-square test on your data.
- 11. Select the chi-square option from the Crosstabs Statistics dialogue box.
- 12. Click on Continue when ready.
- 13. Click on Cells to choose to output the chi-square expected values.
- 14. Select the top left boxes to display both the Observed and the Expected values.
Two-Way Chi-Square from Raw Data

- 1. You will need two columns of codes. Each value in each column provides a code for a group or criteria category within the appropriate variable.
- 2. Click Crosstabs from the Analyze - Descriptive Statistics menu.
- 3. Select the first variable and click on the top arrow to move it into the Rows box.
- 4. Select the second variable and click on the middle arrow to move it into the Columns box.
- 5. Click on Statistics to choose to perform a chi-square test on your data.
- 6. Select the chi-square option from the Crosstabs Statistics dialogue box.
Example

- Suppose we want to investigate whether there is a relationship between the intelligence of employees who have gone through a certain job training program and their subsequent performance on the job.
- A random sample of 50 cases from the files yielded the following results.
Performance

                 Poor   Fair   Good
IQ
Below average    8      8      3
Average          5      10     7
Above average    1      3      5

Test at the 0.01 level of significance whether the job performance of persons who have gone through the training program is independent of their IQ.
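The two-way chi-square test on this contingency table can be sketched with SciPy's `chi2_contingency`:

```python
from scipy import stats

# Observed frequencies: rows = IQ (below average, average, above average),
# columns = job performance (poor, fair, good)
observed = [[8, 8, 3],
            [5, 10, 7],
            [1, 3, 5]]

# chi2_contingency computes the expected counts under independence
# and the chi-square statistic with (rows-1)*(cols-1) = 4 degrees of freedom
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
```

With these counts the p-value exceeds .01, so at the 0.01 level we fail to reject the hypothesis that job performance is independent of IQ.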
Exercise 10

- Suppose that a store carries two different brands, A and B, of a certain type of breakfast cereal. During one week, 44 packages were purchased, with the results shown below.

         Brand A   Brand B
Men      9         6
Women    13        16

- Test the hypothesis that the brand purchased and the sex of the purchaser are independent.