Final review - statistics Spring 03 - PowerPoint PPT Presentation

Provided by: ryokoya

Learn more at: http://www2.hawaii.edu

Transcript and Presenter's Notes

1
Final review - statistics Spring 03

Also, see final review - research design

2
Statistics
Descriptive Statistics
Statistics to summarize and describe the data we
collected
Inferential Statistics
Statistics to make inferences from samples to the
populations
3
Descriptive Statistics
A summary of your data
Measures of Center / Central Tendency
Indicate a central value for the variable
Measures of Dispersion (Variability / Spread)
Indicate how much participants' scores vary from each other
Measures of Association
Indicate how much variables go together
(Shown in tables, graphs, distributions)
4
Measures of Center

Mode
A value with the highest frequency
The most common value
Median
The middle score
Mean
Average
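The three measures of center can be computed with Python's standard `statistics` module. This is only a sketch: the scores below are invented for illustration and do not come from the slides.

```python
from statistics import mean, median, mode

# Hypothetical scores, invented for illustration only
scores = [4, 5, 6, 6, 7, 7, 7, 8, 8, 9, 10]

center_mean = mean(scores)      # arithmetic average
center_median = median(scores)  # middle score once the values are sorted
center_mode = mode(scores)      # value with the highest frequency

print(center_mean, center_median, center_mode)
```

In a symmetric set of scores like this one, all three measures land on the same value.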

5
WHY are LEVELS / SCALE of MEASUREMENT IMPORTANT?

Because you need to match the statistic you use
to the kind of variable you have

6
Measures of Central Tendency, Center
Nominal
Ordinal
Interval/Ratio: Mean
7
Summary
Level of Measurement: Info about differences among values
Nominal: Difference
Ordinal: Order
Interval: Equal Interval
Ratio: Meaningful Zero
8
Why Does Equal Distance Matter?

If the distance between values is equal (as in interval or ratio data), you are able to calculate with the values (add, subtract, multiply, divide)

You can get a mean only for interval/ratio
variables
A wider variety of statistical tests are
available for interval/ratio variables

9
[Figure: distribution over the values 4 5 6 7 8 9 10]
What are the Mean, Median, and Mode for this
distribution?
What is this distribution shape called?
10
Types of Measures of Dispersion (Variability / Spread)

Frequencies / Percentages
Range
The distance between the highest score and the lowest score (highest - lowest)
Standard deviation / Variance

11
Variance / Standard Deviation

Variance (S-squared): an approximate average of the squared deviations from the mean
Standard Deviation (S or SD): the square root of the variance
The larger the variance / SD, the more the scores vary, that is, the more widely they are spread around the mean.
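A short sketch of range, variance, and standard deviation using Python's standard library. The scores are hypothetical. Note that `variance` and `stdev` divide by n - 1 (one reading of the "approximate average" described above), while `pvariance` and `pstdev` divide by n.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical scores, invented for illustration only
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # highest - lowest

sample_var = variance(scores)  # sum of squared deviations divided by n - 1
sample_sd = stdev(scores)      # square root of the sample variance

pop_var = pvariance(scores)    # sum of squared deviations divided by n
pop_sd = pstdev(scores)        # square root of the population variance

print(score_range, sample_var, sample_sd, pop_var, pop_sd)
```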

12
(No Transcript)
13
Measures of Dispersion
Nominal
Ordinal
Interval/Ratio: Standard Deviation, Variance
14
CORRELATION

Co-relation
Two variables tend to go together
Indicates how strongly and in which direction two variables are correlated with each other
Correlation does NOT EQUAL cause

15
SIGN

0 = No systematic relationship

16
Correlation Coefficient
[Figure: scale running from -1 (negative) through 0 to +1 (positive)]
17
SIZE

Ranges from -1 to +1
0 or close to 0 indicates NO relationship
+/- .2 to .4 weak
+/- .4 to .6 moderate
+/- .6 to .8 strong
+/- .8 to .9 very strong
+/- 1.00 perfect
Negative relationships are NOT weaker!
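The size labels can be applied to a Pearson correlation computed by hand, as in the sketch below. The data are hypothetical, and two choices are assumptions rather than something the slide states: values below .2 are labeled "none", and values between .9 and 1.00 are folded into "very strong". The label depends only on the absolute value, which is why negative relationships are not weaker.

```python
from math import isclose, sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def strength(r):
    """Label the size of r; the sign only gives the direction."""
    size = abs(r)
    if isclose(size, 1.0):
        return "perfect"
    if size >= 0.8:
        return "very strong"  # assumption: .9 to 1.00 is folded in here
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "none"             # assumption: below .2 counts as close to 0

# Hypothetical paired scores, invented for illustration only
hours = [1, 2, 3, 4, 5]
grades = [2, 4, 5, 4, 5]

r = pearson_r(hours, grades)
print(round(r, 2), strength(r))
```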

18
Significance Test

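The correlation coefficient also comes with a significance test (p-value)
p = .05: if there were no correlation in the population, a correlation at least this strong would appear by chance in about 5% of samples. Rejecting H0 at this cutoff means a 5% risk of TYPE I Error, or a 95% confidence level
If p < .05, reject H0 and support Ha at the 95% confidence level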

19
Inferential Statistics

Infer characteristics of a population from the
characteristics of the samples.
Hypothesis Testing
Statistical Significance
The Decision Matrix

20
Inferential Statistics
Sample Statistics: X̄ (mean), SD, n
Population Parameters: μ (mean), σ (standard deviation), N
21
Inferential Statistics

Assess: are the sample statistics good indicators of the population parameters?
Differences between 2 groups: did they happen by chance?
What effect do random sampling errors have on our results?

22
Random sampling error

Random sampling error
Difference between the sample characteristics
and the population characteristics caused by
chance

Sampling bias
Difference between the sample characteristics
and the population characteristics
caused by biased (non-random) sampling
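Random sampling error can be seen in a small simulation. The sketch below assumes a made-up population of the numbers 1 to 100 and draws five random samples of 10. Each sample mean misses the population mean by a different amount even though nothing about the sampling is biased.

```python
import random
from statistics import mean

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical population, invented for illustration only
population = list(range(1, 101))
population_mean = mean(population)

errors = []
for _ in range(5):
    sample = random.sample(population, 10)          # random, unbiased sample
    errors.append(mean(sample) - population_mean)   # random sampling error

print(population_mean, errors)
```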

23
Probability

Probability (p) ranges between 0 and 1
p = 1 means that the event would occur in every trial
p = 0 means the event would never occur in any trial
The closer the probability is to 1, the more likely that the event will occur
The closer the probability is to 0, the less likely the event will occur

24
p > .05 means that

Means of two groups fall in the 95% central area of a normal distribution with one population mean

[Figure: normal curve with Mean 1 and Mean 2 inside the central 95% area]
25
p < .05 means that

Means of two groups do NOT fall in the 95% central area of a normal distribution with one population mean, so it is more reasonable to assume that they belong to different populations

[Figure: two normal curves with population means μ1 and μ2]
26
Null Hypothesis

Says IV has no influence on DV
There is no difference between the two variables.
There is no relationship between the two
variables.

27
Null Hypothesis

States there is NO true difference between the groups
If sample statistics show any difference, it is due to random sampling error
Referred to as H0
(Research Hypothesis = Ha)
If you can reject H0, you can support Ha
If you fail to reject H0, you cannot support Ha

28
Be conservative.
What are the chances I would get these results if the null hypothesis is true?
Only if the pattern is highly unlikely (p ≤ .05) do you reject the null hypothesis and support your hypothesis
Since you cannot be 100% sure your conclusion is correct, you take up to a 5% risk.
Your p-value tells you the risk / the probability of making a TYPE I Error

29
True state: Wrong person to marry / Right person to marry
You think it's the wrong person to marry, but it is truly the right person: Type II error
You think it's the right person to marry, but it is truly the wrong person: Type I error
30
True state: Fire / No fire
No alarm when there is a fire: Type II error
Alarm when there is no fire: Type I error
31
True State: H0 (no fire) / Ha (fire)
You decide...
Accept H0 (no alarm): Type II error if there is a fire
Reject H0 (alarm): Type I error if there is no fire
H0 = null hypothesis: there is NO fire
Ha = alternative hypothesis: there IS a FIRE
32
Easy ways to LOSE points

Use the word "prove"
Better to say "support the hypothesis" or "consistent with the hypothesis"
Tentative statements acknowledge the possibility of making a Type 1 or Type 2 error
Use the word "random" incorrectly

33
Significance Test

Significance test examines the probability of TYPE I error (falsely rejecting H0)
Significance test examines how probable it is that the observed difference is caused by random sampling error
Reject the null hypothesis if the probability is < .05 (probability of TYPE I error is smaller than .05)

34
Principle Logic
p < .05
Reject Null Hypothesis (H0)
Support Your Hypothesis (Ha)
35
Logic of Hypothesis Testing
Statistical tests used in hypothesis testing deal
with the probability of a particular event
occurring by chance.
Is the result common or a rare occurrence if
only chance is operating?
A score (or result of a statistical test) is
Significant if score is unlikely to occur on
basis of chance alone.
36
Level of Significance
The Level of Significance is a cutoff point for determining significantly rare or unusual scores.
Scores outside the middle 95% of a distribution are considered Rare when we adopt the standard 5% Level of Significance. This level of significance can be written as p = .05
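For a standard normal distribution, the edge of the middle 95% can be found with `statistics.NormalDist` from the Python standard library. This is a sketch that assumes scores have already been converted to z-scores; the helper name `is_rare` is made up for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05  # the standard 5% level of significance

# z value that leaves 2.5% in each tail, so 95% lies in the middle
cutoff = NormalDist().inv_cdf(1 - ALPHA / 2)

def is_rare(z_score, cutoff=cutoff):
    """True if the z-score falls outside the middle 95% (hypothetical helper)."""
    return abs(z_score) > cutoff

print(round(cutoff, 2), is_rare(2.5), is_rare(1.0))
```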
37
Decision Rules

Reject H0 (accept Ha) when the sample statistic is statistically significant at the chosen p level; otherwise accept H0 (reject Ha).
Possible errors
You reject the Null Hypothesis when in fact it is true: a Type I Error, or Error of Rashness.
You accept the Null Hypothesis when in fact it is false: a Type II Error, or Error of Caution.
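The decision rule and the two possible errors can be written as a small lookup, sketched below. The function names are made up for illustration, and the true state passed to `outcome` is something a researcher never actually knows; it is supplied here only to show which cell of the decision matrix each case falls in.

```python
ALPHA = 0.05  # chosen p level

def decide(p_value, alpha=ALPHA):
    """Apply the decision rule to a p-value (hypothetical helper)."""
    return "reject H0" if p_value < alpha else "accept H0"

def outcome(decision, h0_is_true):
    """Classify a decision against the (normally unknown) true state."""
    if decision == "reject H0":
        return "Type I error" if h0_is_true else "correct"
    return "correct" if h0_is_true else "Type II error"

# Walk through all four cells of the decision matrix
for p_value in (0.03, 0.40):
    for h0_is_true in (True, False):
        decision = decide(p_value)
        print(p_value, h0_is_true, decision, outcome(decision, h0_is_true))
```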

38
True state
Your decision
?
39
Parametric Tests
To compare two groups on Mean Scores use t-test.
For more than 2 groups use Analysis of Variance
(ANOVA)
Nonparametric Tests
Can't get a mean from nominal or ordinal data.
Chi Square tests the difference in Frequency
Distributions of two or more groups.
40
Parametric Tests

Used with data that have a mean score or standard deviation.
t-test, ANOVA, and Pearson's correlation r.
Use a t-test to compare mean differences between two groups (e.g., male/female or married/single).

41
Parametric Tests

Use ANalysis Of VAriance (ANOVA) to compare more than two groups (such as age groups or family income groups) to get probability scores for the overall group differences.
Use post hoc tests to identify which subgroups differ significantly from each other.

42
When comparing two groups on MEAN SCORES use the
t-test.
43
T-test

If p < .05, we conclude that the two groups are drawn from populations with different distributions (reject H0) at the 95% confidence level
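A minimal sketch of an independent-samples t-test, assuming equal variances in the two groups (the pooled-variance form). The ratings are invented for illustration and are not the class data. Instead of computing an exact p-value, the sketch compares the t statistic with 2.101, the two-tailed .05 critical value for 18 degrees of freedom from a standard t table.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical ratings for two groups of 10, invented for illustration only
group_1 = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]
group_2 = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]

t_stat, df = pooled_t(group_1, group_2)

T_CRITICAL = 2.101  # two-tailed .05 cutoff for df = 18, from a standard t table
significant = abs(t_stat) > T_CRITICAL  # True means p < .05, so reject H0

print(round(t_stat, 2), df, significant)
```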

44
Our Research Hypothesis: hair length leads to different perceptions of a person.
The Null Hypothesis: there will be no difference between the pictures.
When comparing two groups on MEAN SCORES use the
t-test.
45
"I think she is one of those people who quickly earns respect."
Short Hair: Mean = 2.2, SD = 1.9, n = 100
Long Hair: Mean = 4.1, SD = 1.8, n = 100
p = .03
Accept Ha: Mean scores come from different distributions.
Accept H0: Mean scores reflect just chance differences from a single distribution.
46
"In my opinion, she is a mature person."
Short Hair: Mean = 1.6, SD = 1.7, n = 100
Long Hair: Mean = 3.6, SD = 1.2, n = 100
p = .01
Accept Ha: Mean scores come from different distributions.
Accept H0: Mean scores reflect just chance differences from a single distribution.
47
"I think we are quite similar to one another."
Short Hair: Mean = 3.7, SD = 1.8, n = 100
Long Hair: Mean = 3.9, SD = 1.5, n = 100
p = .89
Accept Ha: Mean scores come from different distributions.
Accept H0: Mean scores are just chance differences from a single distribution.
48
A nonsignificant result may be caused by a

A. low sample size.
B. very cautious significance level.
C. weak manipulation of independent variables.
D. true null hypothesis.

49
When to use various statistics

Parametric
Interval or ratio data

Non-parametric
Ordinal and nominal data

50
Chi-Square (χ²)

Chi Square tests the difference in frequency
distributions of two or more groups.
Test of significance for two nominal variables, or for a nominal variable and an ordinal variable
Used with a cross tabulation table
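A sketch of the chi-square statistic for a cross tabulation table, computed by hand. The counts are hypothetical. The statistic is compared with 3.841, the .05 critical value for 1 degree of freedom from a standard chi-square table, rather than converted to an exact p-value.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical 2 x 2 cross tabulation: rows are two groups, columns are yes / no
crosstab = [[30, 20],
            [15, 35]]

chi2, df = chi_square(crosstab)

CHI2_CRITICAL = 3.841  # .05 cutoff for df = 1, from a standard chi-square table
significant = chi2 > CHI2_CRITICAL  # True means p < .05, so reject H0

print(round(chi2, 2), df, significant)
```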
