1
Statistics II: An Overview of Statistics
2
Outline for Statistics II Lecture
SPSS syntax: some examples.
The normal distribution curve.
Sampling distribution.
Hypothesis testing.
Type I and Type II errors.
Linking z to alpha and hypothesis testing.
Bivariate measures of association.
Bivariate regression/correlation.
3
The Normal Distribution
4
The standard normal distribution: a bell-shaped, symmetrical distribution having a mean of 0 and a standard deviation of 1.
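As a quick numeric check of that definition, here is a minimal sketch (not part of the original slides) that assumes SciPy is available; it confirms the mean and standard deviation and shows the familiar areas under the curve.

# Quick numeric illustration of the standard normal curve; assumes SciPy.
from scipy.stats import norm

# Mean 0 and standard deviation 1 are the defaults for norm.
print(norm.mean(), norm.std())           # 0.0 1.0
# About 68% of the area lies within one standard deviation of the mean,
# and about 95% within roughly two.
print(norm.cdf(1) - norm.cdf(-1))        # ~0.6827
print(norm.cdf(1.96) - norm.cdf(-1.96))  # ~0.9500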
(Slides 5-13: no transcript)
14
Z scores. A z score (or standard score) is a transformed score expressed as a deviation from an expected value that has a standard deviation as its unit of measurement; it is a standard score belonging to the standard normal distribution.

z = (Y - µ) / s
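A minimal sketch of this transformation in Python (the raw score, mean, and standard deviation below are invented purely for illustration):

# Sketch of the z-score transformation z = (Y - mu) / s.
def z_score(y, mu, s):
    """Express a raw score y as a deviation from mu in units of s."""
    return (y - mu) / s

print(z_score(120, 100, 15))  # 1.33...: 120 is about 1.33 SDs above the mean
print(z_score(85, 100, 15))   # -1.0: one SD below the mean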
15
Sampling Distribution
(Slides 16-18: no transcript)
19
That is, the spread of the sampling distribution depends on the sample size n and on the spread of the population distribution. As the sample size n increases, the standard error σ/√n decreases: the denominator √n grows as n increases, whereas the numerator is the population standard deviation σ, a constant that does not depend on the value of n.
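A hedged numeric sketch of this point, assuming the usual standard-error formula σ/√n (the value σ = 10 is an arbitrary choice, not from the lecture):

# The standard error sigma / sqrt(n) shrinks as n grows, with sigma fixed.
import math

sigma = 10.0
for n in (25, 100, 400):
    se = sigma / math.sqrt(n)
    print(f"n = {n:4d}  standard error = {se:.2f}")
# n =   25  standard error = 2.00
# n =  100  standard error = 1.00
# n =  400  standard error = 0.50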
20
Central Limit Theorem: for random sampling, as the sample size n grows, the sampling distribution of the sample mean approaches a normal distribution. The approximate normality of the sampling distribution applies no matter what the shape of the population distribution.
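The following simulation sketch (not from the lecture; it assumes NumPy and uses an arbitrary skewed exponential population) illustrates the theorem: means of repeated samples behave approximately normally even though the population does not.

# Simulation sketch of the Central Limit Theorem; settings are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 10_000
means = rng.exponential(scale=2.0, size=(reps, n)).mean(axis=1)

print(means.mean())  # near the population mean, 2.0
print(means.std())   # near sigma / sqrt(n) = 2.0 / sqrt(50) ~ 0.28
# About 95% of the sample means fall within 1.96 standard errors of the
# center, as a normal shape would predict.
print(np.mean(np.abs(means - means.mean()) <= 1.96 * means.std()))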
(Slides 21-22: no transcript)
23
Hypothesis Testing
24
Steps of a Statistical Significance Test.
1. Assumptions: type of data, form of population, method of sampling, sample size.
2. Hypotheses: null hypothesis, H0 (parameter value for no effect); alternative hypothesis, Ha (alternative parameter values).
3. Test statistic: compares the point estimate to the null-hypothesized parameter value.
4. P-value: weight of evidence about H0; a smaller P is more contradictory to H0.
25
5. Conclusion: report the P-value and make a formal decision.
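The slides stop short of a worked example; here is a hedged sketch of the five steps for a one-sample t test in Python (the data, the null value mu0 = 100, and the use of SciPy are all illustrative assumptions, not the lecture's own example):

# One-sample t test walked through the five steps above; data are invented.
from scipy.stats import ttest_1samp

sample = [102, 98, 110, 105, 97, 112, 108, 101, 99, 107]  # step 1: the data
mu0 = 100                                    # step 2: H0 says the mean is 100

t_stat, p_value = ttest_1samp(sample, mu0)   # step 3: test statistic
print(f"t = {t_stat:.2f}, P = {p_value:.3f}")  # step 4: P-value

alpha = 0.05                                 # significance level
print("reject H0" if p_value <= alpha else "fail to reject H0")  # step 5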
26
Alpha or significance levels. The α-level is a number such that one rejects H0 if the P-value is less than or equal to it. The α-level is also called the significance level of the test. The most common α-levels are .05 and .01.
27
Type I and Type II Errors. A Type I error occurs when H0 is rejected, even though it is true. A Type II error occurs when H0 is not rejected, even though it is false.
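A small simulation sketch of the Type I error idea, assuming NumPy and SciPy and arbitrary settings: when H0 is actually true, a test run at α = .05 rejects a true H0 in roughly 5% of samples.

# Estimate the Type I error rate by repeatedly testing a true H0.
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(1)
alpha, rejections, reps = 0.05, 0, 2000
for _ in range(reps):
    sample = rng.normal(loc=100, scale=15, size=30)  # H0 (mean = 100) is true
    _, p = ttest_1samp(sample, 100)
    rejections += (p <= alpha)

print(rejections / reps)  # close to 0.05, the Type I error rate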
(Slides 28-29: no transcript)
30
Bivariate Statistics
31
PROPORTIONAL REDUCTION IN ERROR (PRE). All good measures of association use a proportional reduction in error (PRE) approach. The PRE family of statistics is based on comparing the errors made in predicting the dependent variable with knowledge of the independent variable to the errors made without information about the independent variable. In other words, PRE measures indicate how much knowing the values of the independent variable (first variable) increases our ability to accurately predict the dependent variable (second variable).
32
PRE statistic = (error without decision rule - error with decision rule) / (error without decision rule)
33
Another way of stating this is:

PRE value = (E1 - E2) / E1

where E1 = the number of errors made by the first prediction method, and E2 = the number of errors made by the second prediction method.
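A minimal sketch of this formula in Python (the error counts are invented for illustration):

# Proportional reduction in error: (E1 - E2) / E1.
def pre(e1, e2):
    """E1 = errors made without the independent variable,
    E2 = errors made when the independent variable is used."""
    return (e1 - e2) / e1

print(pre(200, 100))  # 0.50: knowing X cuts prediction errors in half
print(pre(200, 150))  # 0.25: a weaker association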
34
PRE measures are more versatile and more informative than chi-square-based measures. All PRE measures are normed: they use a standardized scale on which 0 means no association and 1 means perfect association. Any value between these extremes indicates the relative degree of association in a ratio-comparison sense. For example, a PRE measure of .50 represents an association twice as strong as one with a PRE value of .25. The number of cases, the table size, and the variables being measured do not interfere with this interpretation.
35
Chi Square. The chi-square test examines whether two nominal variables are associated. It is NOT a PRE measure. The chi-square test is based on a comparison between the frequencies that are observed in the cells of a cross-classification table and those that we would expect to observe if the null hypothesis were true. The hypotheses for the chi-square test are:
H0: the variables are statistically independent.
Ha: the variables are statistically dependent.
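A hedged example of the test on a made-up 2x2 cross-classification table, assuming SciPy's chi2_contingency is available (the counts are invented):

# Chi-square test of independence on an observed frequency table.
from scipy.stats import chi2_contingency

observed = [[30, 10],
            [20, 40]]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, P = {p:.4f}")
# A small P leads us to reject H0 that the two variables are independent.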
36
Goodman and Kruskal's Gamma (G). A measure of association for data grouped in ordered categories. G is a PRE measure. G compares two methods of prediction: first, it randomly predicts all untied pairs to be either in agreement or in disagreement; second, it predicts all untied pairs to be of the same type. Agreement or disagreement is determined by the direction of the bivariate distribution: for a positive pattern we expect untied pairs to be in agreement; for a negative pattern we expect untied pairs to be in disagreement.
37
Pa: we find the number of agreement pairs by multiplying the frequency for each cell by the sum of the frequencies from all cells that are both to the right of and below it. Pd: found by multiplying the frequency for each cell in the table by the sum of the frequencies from all cells that are both to the left of and below it.
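Putting the two counting rules together, here is a sketch in Python; the table is invented, rows and columns are assumed to be ordered from low (top/left) to high (bottom/right), and the final step uses the standard definition G = (Pa - Pd) / (Pa + Pd), which the transcript itself does not state explicitly:

# Gamma from an ordered cross-tabulation, following the counting rule above.
def gamma(table):
    rows, cols = len(table), len(table[0])
    pa = pd = 0
    for i in range(rows):
        for j in range(cols):
            # Agreement pairs: cells below and to the right of (i, j).
            pa += table[i][j] * sum(table[k][l]
                                    for k in range(i + 1, rows)
                                    for l in range(j + 1, cols))
            # Disagreement pairs: cells below and to the left of (i, j).
            pd += table[i][j] * sum(table[k][l]
                                    for k in range(i + 1, rows)
                                    for l in range(j))
    return (pa - pd) / (pa + pd)  # standard definition of G

print(gamma([[20, 5],
             [5, 20]]))  # ~0.88: a strong positive association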
38
Bivariate Regression and Correlation
39
BIVARIATE REGRESSION AND CORRELATION
WHY AND WHEN TO USE REGRESSION/CORRELATION?
WHAT DOES REGRESSION/CORRELATION MEAN?
40
You should be able to interpret:
The least squares equation.
R2 and adjusted R2.
F and significance.
The unstandardized regression coefficient.
The standardized regression coefficient.
t and significance.
The 95% confidence interval.
A graph of the regression line.
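The lecture's own examples use SPSS; as an illustrative stand-in, this sketch produces the same pieces of output with statsmodels on made-up data (the data, seed, and choice of library are assumptions, not the lecture's example):

# Bivariate regression output pieces with statsmodels OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 3.0 + 2.0 * x + rng.normal(size=100)   # invented linear relationship

model = sm.OLS(y, sm.add_constant(x)).fit()
print(model.params)                        # least squares intercept and slope
print(model.rsquared, model.rsquared_adj)  # R2 and adjusted R2
print(model.fvalue, model.f_pvalue)        # F and its significance
print(model.tvalues, model.pvalues)        # t and significance per coefficient
print(model.conf_int())                    # 95% confidence intervals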
41
ASSUMPTIONS UNDERLYING REGRESSION/CORRELATION
NORMALITY OF VARIANCE IN Y FOR EACH VALUE OF X: for any fixed value of the independent variable X, the distribution of the dependent variable Y is normal.
NORMALITY OF VARIANCE FOR THE ERROR TERM: the error term is normally distributed. (Many authors argue that this is more important than normality in the distribution of Y.)
THE INDEPENDENT VARIABLE IS UNCORRELATED WITH THE ERROR TERM.
42
ASSUMPTIONS UNDERLYING REGRESSION/CORRELATION (Continued)
HOMOSCEDASTICITY: it is assumed that the variance of Y is equal for each fixed value of X.
LINEARITY: the relationship between X and Y is linear.
INDEPENDENCE: the Y values are statistically independent of each other.
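A brief sketch of how two of these assumptions might be checked on residuals (the data, the split at x = 5, and the use of SciPy's Shapiro-Wilk test are all illustrative assumptions, not part of the lecture):

# Check residual normality and rough homoscedasticity after a simple fit.
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=80)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=80)

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

print(shapiro(residuals))            # normality test for the error term
print(np.std(residuals[x < 5]),      # similar spread in the lower and upper
      np.std(residuals[x >= 5]))     # halves of X suggests homoscedasticity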
(Slides 43-62: no transcript)