Title: Inferential Data Analysis
Chapter 15
- Inferential Data Analysis
Inferential Statistics
- Inferential statistics are a crucial part of scientific research because these techniques are used to test hypotheses.
Uses for Inferential Statistics
- Statistics are used to determine differences between experimental and control groups in experimental research.
- Statistics are also used in descriptive research when comparisons are made between different groups.
- These statistics enable the researcher to evaluate the effects of an independent variable on a dependent variable.
Sampling Error
- Remember when we talked about sampling error?
- Parameters: characteristics of a population
- Statistics: characteristics of a sample
- Differences between a sample statistic and a population parameter arise because the sample is not perfectly representative of the population.
- So a difference among the sample means may be a real difference, or it may be due to sampling error.
- The researcher needs a standard procedure to follow in making such a decision.
Hypothesis-testing procedure
Hypothesis testing
- Hypothesis testing (significance testing) is used to determine whether what you observed in the sample provides enough evidence to believe that there is a difference in the population.
- In other words, the sample difference is judged to be either statistically significant or not statistically significant.
Hypothesis Testing
- The research hypothesis is transformed into a statistical or null hypothesis.
- This is done so that statistical tests can be employed to determine whether the findings are statistically significant or can be attributed to chance.
- The results of the statistical test enable the researcher to accept or reject the null hypothesis.
More Hypothesis Testing
- The purpose of the statistical test is to evaluate the null hypothesis at a specified level of probability.
- For instance, testing the difference in the mean values between two groups at the .05 level asks: do the values of the dependent variable differ significantly (p < .05), so that these differences would not be attributable to chance more than 5 times in 100?
Level of Significance
- Rejecting the null hypothesis at the .05 alpha (α) level suggests a 95% probability that the difference between the two variables is real and not the result of chance.
- Type I Error: rejecting a null hypothesis when it is really true.
- The probability of making a Type I error is equal to α.
- Type II Error: accepting (not rejecting) a null hypothesis when it is false.
- The probability of making a Type II error is equal to β.
Hypothesis Testing Procedures
- State the hypothesis (H0), and then select the probability level (alpha).
- The researcher sets a significance level to indicate the maximum risk she is willing to take of making an error when concluding that there is a difference attributable to the research situation and not to chance.
- Commonly used: α = 0.05 or 0.01.
- Next, decide whether you are going to use a one-tailed or a two-tailed test.
- In a one-tailed test, the 5% area of rejection is either at the upper end or the lower end of the curve.
To reject the null hypothesis, the tail used for the rejection region should cover the extreme end of the distribution.
The t or z scores that are rejected are those falling in the shaded rejection region (here, the extreme positive values).
- In a two-tailed test, the 5% area of rejection is split between the upper and lower tails of the curve; the null hypothesis is nondirectional.
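As a minimal sketch of the one-tailed vs. two-tailed choice, the example below uses SciPy's independent-samples t-test on two invented groups of scores; the `alternative` argument (available in SciPy 1.6+) selects whether the rejection region sits in one tail or is split between both.

```python
from scipy import stats

# Hypothetical scores for two groups (illustration only)
group_a = [22, 25, 27, 30, 28, 26]
group_b = [20, 21, 24, 23, 22, 25]

# Two-tailed test: rejection region split between both tails
t_two, p_two = stats.ttest_ind(group_a, group_b, alternative="two-sided")

# One-tailed test: rejection region entirely in the upper tail
# (H1: mean of group_a is greater than mean of group_b)
t_one, p_one = stats.ttest_ind(group_a, group_b, alternative="greater")

print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")
```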
- Next, you need to determine whether a parametric test or a nonparametric test should be used.
- One of the most common mistakes is running a parametric test when the data are not normally distributed.
Parametric and Nonparametric Tests
- Parametric
- Assume the data are normally distributed.
- Assume that the dependent variables are continuous and measured on an interval or ratio scale.
- Parametric tests are powerful tools that make maximum use of the information available in the data (e.g., mean, standard deviation).
Normal curve
- The normal curve is a statistical model that is used to visualize data, interpret distributions of scores, and make predictions and probability statements.
- The mean, median, and mode are identical and make up the vertical midpoint.
- About 95% of the area lies within ±2 SD of the mean.
- Nonparametric
- Make no assumptions about the population under investigation.
- Can be used with nominal or ordinal data.
- Can be used for very small samples, or for large samples that do not fit parametric test assumptions.
- The major drawback of these tests is that they are less powerful than their parametric analogs.
- Increased chance of a Type II error: they are less sensitive to small differences and less able to detect that such differences might be statistically significant.
- Check
- Are the data normally distributed?
- Are the population variances of the groups approximately equal?
- Are the dependent variables continuous and measured on an absolute, interval, or ratio scale?
- Is the sample size large or small?
- If you're not sure, there are a number of statistical tests to check whether your data meet parametric assumptions. These are called tests for normality or goodness-of-fit tests.
- e.g., Kuiper's goodness of fit
- Watson's goodness of fit
- Lilliefors test for normality
- Kolmogorov-Smirnov goodness of fit
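As an illustration, a quick normality check can be run with SciPy; this is only a sketch using made-up data. Note that estimating the normal parameters from the same sample (as done here for the Kolmogorov-Smirnov test) is exactly what the Lilliefors correction, available in statsmodels, is designed to address.

```python
import numpy as np
from scipy import stats

# Hypothetical sample (illustration only)
data = np.array([4.1, 5.0, 5.2, 4.8, 6.1, 5.5, 4.9, 5.3, 5.7, 4.6])

# Shapiro-Wilk test for normality
sw_stat, sw_p = stats.shapiro(data)

# Kolmogorov-Smirnov goodness-of-fit test against a normal distribution
# with parameters estimated from the sample
ks_stat, ks_p = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))

print(f"Shapiro-Wilk p = {sw_p:.3f}, K-S p = {ks_p:.3f}")
# p > alpha: no evidence against normality, so a parametric test is reasonable
```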
To reiterate yesterday's class
- Hypothesis-testing procedure
- State the hypothesis
- Select the probability level (α)
- Decide if the data are normal or not normal.
- Consult the statistical table.
Parametric Tests
t-tests
- Characteristics of t-tests
- requires interval or ratio level scores
- used to compare two mean scores
- easy to compute
- pretty good small sample statistic
Types of t-test
- One-Group t-test
- t-test between a constant and a sample mean
- Independent Groups t-test
- compares mean scores on two independent samples
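A minimal sketch of both t-test types with SciPy, using invented numbers purely for illustration (the reference value of 1000 mg is a hypothetical comparison constant):

```python
from scipy import stats

# Hypothetical daily calcium intakes (mg) for two independent groups
freshmen = [820, 900, 760, 880, 840, 910]
seniors = [780, 700, 820, 750, 760, 730]

# One-group t-test: compare a sample mean against a constant
t1, p1 = stats.ttest_1samp(freshmen, popmean=1000)

# Independent groups t-test: compare the means of two separate samples
t2, p2 = stats.ttest_ind(freshmen, seniors)

print(f"one-group: t = {t1:.2f}, p = {p1:.3f}")
print(f"independent groups: t = {t2:.2f}, p = {p2:.3f}")
```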
Total and saturated fat intake by year of study.
Total and saturated fat intake as affected by major and year of study.
- Dependent Groups or Paired (Correlated) t-test
- compares two mean scores from a repeated measures or matched pairs design
- most common situation is the comparison of pretest with posttest scores from the same sample
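A sketch of the paired case with SciPy, assuming hypothetical pretest and posttest scores from the same subjects:

```python
from scipy import stats

# Hypothetical pretest and posttest scores from the same subjects
pretest = [12, 15, 11, 14, 13, 16, 12, 15]
posttest = [14, 17, 13, 15, 16, 18, 13, 17]

# Dependent (paired) t-test: each pretest score is paired with the
# posttest score from the same subject
t, p = stats.ttest_rel(pretest, posttest)
print(f"paired t = {t:.2f}, p = {p:.3f}")
```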
Analysis of Variance (ANOVA)
- Analysis of Variance compares two or more means to determine whether there are statistically significant differences among them.
- Compares the amount of variation between the groups with the variation within the groups.
- If ANOVA determines that differences exist, the data are then further tested with a post hoc test.
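A one-way ANOVA can be sketched with SciPy's `f_oneway`; the three groups below are hypothetical:

```python
from scipy import stats

# Hypothetical outcome scores for three treatment groups
low = [5.1, 4.8, 5.5, 5.0, 4.9]
medium = [5.9, 6.1, 5.7, 6.3, 6.0]
high = [6.8, 7.1, 6.5, 7.0, 6.9]

# One-way ANOVA: compares between-group variation with within-group variation
f_stat, p = stats.f_oneway(low, medium, high)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
# If p < alpha, at least one group mean differs; follow up with a post hoc test
```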
- Examples of post hoc tests
- Duncan's multiple range test
- Student-Newman-Keuls test (SNK)
- Tukey's Honestly Significant Difference (HSD)
- Scheffé's test
- These tests determine how the individual means differ.
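For example, Tukey's HSD can follow a significant ANOVA; the sketch below reuses the hypothetical groups from the ANOVA example and assumes a reasonably recent SciPy release (where `scipy.stats.tukey_hsd` is available). statsmodels' `pairwise_tukeyhsd` is an alternative.

```python
from scipy import stats

# Same hypothetical groups as the ANOVA sketch above
low = [5.1, 4.8, 5.5, 5.0, 4.9]
medium = [5.9, 6.1, 5.7, 6.3, 6.0]
high = [6.8, 7.1, 6.5, 7.0, 6.9]

# Tukey's HSD: pairwise comparisons showing which individual means differ
res = stats.tukey_hsd(low, medium, high)
print(res)  # table of pairwise differences, confidence intervals, and p values
```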
One-way ANOVA
- An extension of the independent groups t-test that may be used for evaluating differences among two or more groups.
Repeated Measures ANOVA
- Each subject is measured on two or more occasions
- a.k.a. within-subjects design.
- Example: three Ca dosages, each measured 5 times.
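A repeated measures ANOVA can be sketched with statsmodels' `AnovaRM`; the long-format data frame below (subjects, time points, and scores) is entirely hypothetical:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: 4 subjects each measured at 3 time points
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "time": ["t1", "t2", "t3"] * 4,
    "score": [5.0, 5.4, 5.9, 4.8, 5.1, 5.6, 5.2, 5.5, 6.0, 4.9, 5.3, 5.8],
})

# Repeated measures (within-subjects) ANOVA on the time factor
res = AnovaRM(df, depvar="score", subject="subject", within=["time"]).fit()
print(res)
```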
Random Blocks ANOVA
- An extension of the matched pairs t-test when there are three or more groups, or the same as the matched pairs t-test when there are two groups.
- Participants similar in terms of a variable are placed together in a block and then randomly assigned to treatment groups; the blocking variable is another source of variation that you want to eliminate.
- Example: three doses of Ca given to subjects from different ethnic groups. Assign subjects from each ethnic group to each treatment.
Factorial ANOVA
- This is an extension of the one-way ANOVA for testing the effects of two or more independent variables as well as interaction effects.
- Two-way ANOVA (e.g., 3 x 2 ANOVA)
- Three-way ANOVA (e.g., 3 x 3 x 2 ANOVA)
- Example: 3 doses of Ca (1st variable) and sex of the subjects (2nd variable).
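A two-way factorial ANOVA of the dose-by-sex kind described above can be sketched with statsmodels' formula interface; the data frame, the outcome name `bd`, and all values are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: 3 Ca doses x 2 sexes, outcome = bone density score (bd)
df = pd.DataFrame({
    "dose": ["low", "low", "med", "med", "high", "high"] * 2,
    "sex": ["F"] * 6 + ["M"] * 6,
    "bd": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4],
})

# Two-way (3 x 2) factorial ANOVA with an interaction term
model = ols("bd ~ C(dose) * C(sex)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects of dose, sex, and dose:sex
```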
Multivariate Analysis of Variance (MANOVA)
- MANOVA determines whether differences exist between two or more treatments considering several variables at the same time.
- Replaces repeating an ANOVA on each variable.
- Determines whether treatments differ on balance, considering all variables at once (e.g., blood chemistry).
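A MANOVA can be sketched with statsmodels; the grouping factor, the two blood-chemistry outcomes, and all numbers below are invented for illustration:

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Hypothetical data: one grouping factor, two blood-chemistry outcomes
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "glucose": [90, 95, 92, 100, 104, 99, 110, 108, 112],
    "calcium": [9.1, 9.3, 9.0, 9.5, 9.6, 9.4, 9.8, 9.9, 10.0],
})

# MANOVA: tests whether the groups differ on both outcomes considered together
mv = MANOVA.from_formula("glucose + calcium ~ group", data=df)
print(mv.mv_test())  # Wilks' lambda, Pillai's trace, etc.
```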
Nonparametric Tests
The Chi-Square Test
- Used to test the distribution of observations in a sample among classes.
- Determines whether a single sample is different from a hypothetical distribution (e.g., a normal distribution), or whether two or more distributions differ from each other.
- Example: incidence of obesity (on a 1-5 scale; counts) in men and women.
Single Sample Chi-Square
- a.k.a. one-way chi-square or goodness-of-fit chi-square
- Used to test the hypothesis that the collected data (observed scores) fit an expected distribution.
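A sketch of the goodness-of-fit chi-square with SciPy, using hypothetical observed counts and an assumed expected distribution of equal frequencies:

```python
from scipy import stats

# Hypothetical observed counts across four categories
observed = [18, 22, 30, 10]

# Expected counts under the hypothesized distribution (here, equal frequencies)
expected = [20, 20, 20, 20]

# Goodness-of-fit chi-square: do the observed counts fit the expected pattern?
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```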
Independent Groups Chi-Square
- a.k.a. two-way chi-square or contingency table chi-square
- Used to test whether there is a significant relationship (association) between two nominally scaled variables.
- In this test we compare two or more patterns of frequencies to see if they are independent of each other.
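A contingency-table chi-square sketch with SciPy; the 2 x 3 table of counts below (sex by obesity category) is hypothetical:

```python
from scipy import stats

# Hypothetical 2 x 3 contingency table: sex (rows) by obesity category (columns)
table = [
    [20, 30, 10],  # men
    [25, 20, 15],  # women
]

# Contingency-table chi-square: are the two nominal variables independent?
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```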
The Mann-Whitney U-Test
- The nonparametric equivalent of the independent groups t-test; it requires no assumptions about the nature of the data.
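A minimal sketch with SciPy, assuming two invented groups of ordinal or non-normal scores:

```python
from scipy import stats

# Hypothetical scores (ordinal or non-normal) for two independent groups
group_a = [3, 5, 4, 6, 7, 5, 4]
group_b = [2, 3, 2, 4, 3, 5, 2]

# Mann-Whitney U test: nonparametric comparison of two independent samples
u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u:.1f}, p = {p:.3f}")
```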
The Wilcoxon Matched-Pairs Signed Rank Test
- A nonparametric test equivalent to the t-test for paired samples.
- Applied when each sample from one treatment is matched with a sample from the other treatment (by subject age, for example).
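A sketch with SciPy, assuming hypothetical matched measurements under two treatments:

```python
from scipy import stats

# Hypothetical matched measurements under two treatments (same or matched subjects)
treatment_1 = [10, 12, 9, 14, 11, 13, 10, 12]
treatment_2 = [12, 13, 11, 15, 12, 15, 11, 14]

# Wilcoxon signed-rank test: nonparametric analog of the paired t-test
w, p = stats.wilcoxon(treatment_1, treatment_2)
print(f"W = {w:.1f}, p = {p:.3f}")
```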
The Kruskal-Wallis Test
- The nonparametric equivalent of a one-way analysis of variance.
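A sketch with SciPy, assuming three invented groups of non-normal scores:

```python
from scipy import stats

# Hypothetical non-normal scores for three independent groups
g1 = [3, 4, 2, 5, 4]
g2 = [6, 5, 7, 6, 8]
g3 = [9, 8, 10, 9, 7]

# Kruskal-Wallis H test: nonparametric analog of the one-way ANOVA
h, p = stats.kruskal(g1, g2, g3)
print(f"H = {h:.2f}, p = {p:.4f}")
```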
Finally
- Conduct the statistical test.
- Accept or reject the null hypothesis.
- If the p value for the statistical test is less than the alpha level, the null hypothesis is rejected.
- If the opposite is true, the null hypothesis is accepted.
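The decision rule itself is a one-line comparison; the p value below is a hypothetical result standing in for the output of any of the tests sketched above:

```python
# Decision rule: reject H0 when the p value is below alpha
alpha = 0.05
p_value = 0.031  # hypothetical result from a statistical test

if p_value < alpha:
    print("Reject the null hypothesis: the difference is statistically significant.")
else:
    print("Fail to reject the null hypothesis: the difference may be due to chance.")
```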
Hypothesis Testing Errors
- Remember, the researcher sets a significance level to indicate the maximum risk she is willing to take of making an error when concluding that there is a difference attributable to the research situation and not to chance.
- Commonly used: α = 0.05 or 0.01.
- Hypothesis testing decisions are made without direct knowledge of the true circumstance in the population. As a result, the researcher's decision may or may not be correct.
Possible Decisions in Statistical Significance Testing