Data Analysis - PowerPoint PPT Presentation

About This Presentation

Title:

Data Analysis

Slides: 24

Provided by: DonnaH2

Learn more at: http://zimmer.csufresno.edu

Transcript and Presenter's Notes

Title: Data Analysis

1
Data Analysis

Statistics

2
Levels of Measurement

Nominal: Categorical data with no implied rankings
among the categories. Also includes written
observations and written responses from
qualitative interviews or open-ended survey
questions.
Ordinal: Categorical data with implied rankings,
or data obtained through respondent ranking of
categories. In some cases, a ranking process may
be set up for a particular variable.
Interval: No fixed zero point. Data is
numerical, not categorical. Rank order among
values is explicit, with an equal distance
between points in the data set (for example,
-2, -1, 0, 1, 2).
Ratio: Fixed zero point; otherwise the same as
interval.

3
In general, the type of data can be inferred using
the following criteria

Any categorical data is either nominal or
ordinal.
All qualitative data is nominal.
All scores on standardized scales are either
interval or ratio. (Note: almost all the scales
we use in social work, except IQ scores, are
ratio.)
The level of measurement determines which
statistical methods we can use.

4
In some cases, we can convert a variable into
another level of measurement

We can change a variable from ratio to either
ordinal or nominal

5
Converting Data (Use Recode in SPSS)
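
The slide points to Recode in SPSS. The same idea can be sketched in Python, as an illustration only: a ratio-level age in years is collapsed into ordinal categories. The function name `recode_age`, the cut points, and the sample ages are assumptions made up for this example, not part of the original presentation.

```python
def recode_age(age):
    """Collapse a ratio-level age (in years) into an ordinal category.

    The cut points below are hypothetical; choose ones that fit your study.
    """
    if age < 18:
        return "under 18"
    if age < 40:
        return "18-39"
    if age < 65:
        return "40-64"
    return "65 and over"


# Hypothetical sample of ages reported by respondents.
ages = [16, 25, 41, 70, 39]
age_groups = [recode_age(a) for a in ages]
print(age_groups)
```

Note that the conversion only runs one way: the original ages cannot be recovered from the categories.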
6
Advantages of using ratio data

We can convert it to another level of data; we
can't do this with nominal data.
People can simply write down information about
how they fit a particular attribute (age,
income).
We have more statistical options with ratio data.
Most of the inferential statistics covered here
require that the dependent variable be ratio.

7
Primary types of data analysis are

Qualitative
Descriptive (Used to describe the distribution of
a single variable or the relationship between two
nominal variables: means, frequencies,
cross-tabulation)
Inferential (Used to establish relationships
among variables; assumes random sampling and a
normal distribution)
Nonparametric (Used to test relationships among
variables for small samples or data sets that are
not normally distributed)

8
Much of what you will use in your research will
be descriptive statistics.

For example, the most basic type of descriptive
statistic is the frequency. Frequencies are the
number of times a specific value, or data within a
specific category, occurs.
Most often we convert frequencies to percentages.
The formula is f/n, where f = frequency and n = the
total number of values in a data set. For
example, if the age 25 occurs 5 times in a
data set of 50, then 5/50 = 10%.
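
The f/n calculation above can be sketched in a few lines of Python. The helper name `percentages` and the data are hypothetical; the data set is built so that age 25 occurs 5 times out of 50, as in the slide's example.

```python
from collections import Counter


def percentages(values):
    """Return each distinct value's share of the data set, as a percentage."""
    counts = Counter(values)          # f for each distinct value
    n = len(values)                   # total number of values
    return {value: 100 * f / n for value, f in counts.items()}


# Hypothetical data: age 25 occurs 5 times in a data set of 50.
ages = [25] * 5 + [30] * 45
pct = percentages(ages)
print(pct[25])  # 10.0
```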

9
Examples of use of frequency data

40% of respondents are male.
The mean level of income was 35,000.
40% of all female voters cast their vote for
Arnold compared to 52% of the male voters.
Note: the other descriptive statistic we use is
the standard deviation. It describes the degree
to which data points vary from the mean of a
distribution. In a research article, you will see
the standard deviation included with the mean.

10
Application of Standard Deviation (SD)

Mean income was 35,000 with SD = 5,000
M = 23,000, SD = 500
This is interpreted as there being less
variability in income among members of the second
data set. That is, scores are grouped more tightly
around the mean.
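
As a rough illustration of this comparison, the sketch below computes the mean and sample standard deviation for two small income lists using Python's standard `statistics` module. The income values are invented for the example (chosen only so that the means are 35,000 and 23,000); they are not data from the presentation.

```python
from statistics import mean, stdev

# Hypothetical incomes; the first group is spread out, the second is tight.
group_a = [30000, 35000, 40000, 28000, 42000]
group_b = [22500, 23000, 23500, 22800, 23200]

for name, incomes in (("A", group_a), ("B", group_b)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"Group {name}: M = {mean(incomes):,.0f}, SD = {stdev(incomes):,.0f}")

# A smaller SD means scores are grouped more tightly around the mean.
tighter = "B" if stdev(group_b) < stdev(group_a) else "A"
print("Less variability in group", tighter)
```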

11
Normal Distribution

Mean = median = mode
Bell-shaped curve
50% of scores fall below and 50% fall above the
mean.
Data set can be assessed in terms of how much
data falls within one, two or three standard
deviations from the mean.
Is unimodal, although other distributions
may be bimodal or trimodal.
Theoretically, at least, inferential statistics
may only be used when a set of scores conforms to
a normal distribution. However, this assumption
is often violated.
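
For a theoretical normal distribution, the share of scores within one, two, or three standard deviations of the mean can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption for this example.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

A real data set can be compared against these proportions as an informal check on how closely it follows a normal distribution.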

12
Frequencies are used in almost all types of data
analysis. Frequency tables can be formatted in a
variety of ways. (Some analyses add valid and
cumulative percent)
13
We can also use tables to determine if there is a
relationship between two nominal variables,
although we cannot assess the strength of the
relationship. This is called a cross-tabulation.
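
A cross-tabulation simply counts how many cases fall into each combination of categories. A minimal Python sketch, with made-up variables (`gender`, `vote`) and a hypothetical helper name `crosstab`:

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(rows, cols))


# Hypothetical responses from six people.
gender = ["F", "F", "M", "M", "F", "M"]
vote = ["yes", "no", "yes", "yes", "no", "no"]

table = crosstab(gender, vote)
for g in ("F", "M"):
    print(g, {v: table[(g, v)] for v in ("yes", "no")})
```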
14
Categories in both Qualitative and Quantitative Analysis must be

Mutually exclusive (no overlap)
Exhaustive (all possible categories should be
included)

15
Cross-tabulation is the basis for chi-square.
Chi-square

Tests whether the relationship between
the two variables in the table is statistically
significant.
Is not technically a parametric inferential
statistic (it does not require a normal
distribution) but is often grouped with
inferential statistics.
Usually requires a random sample, although data
collected from everyone in a population group is
usually considered sufficient for a chi-square
analysis.
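
To show how chi-square is built from a cross-tabulation, here is a sketch of the Pearson chi-square statistic computed by hand (no continuity correction). The function name `chi_square` and the observed counts are assumptions for illustration. The expected count for each cell is (row total x column total) / n.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table.

    `observed` is a list of rows, each a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2 x 2 table: rows are groups, columns are yes/no answers.
observed = [[30, 20],
            [20, 30]]
stat, df = chi_square(observed)
print(stat, df)  # 4.0 1
```

With 1 degree of freedom, the critical value at the .05 level is 3.841, so a statistic of 4.0 would be significant at that level.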

16
Means can also be used to make comparisons among
groups.
17
You may use means on your project

If your variables include ratio data
If you want to compare groups on a ratio variable
If you want to summarize scores on a standardized
instrument or a Likert scale

18
Some inferential statistics look at the strength
of the relationship between mean scores on
ratio-level variables and membership in a
particular demographic group

T-tests (two-group comparisons)
Analysis of variance (compares three or more
groups)
Answers the question: Is the difference in means
between the two (or more) groups large enough to
be statistically significant?
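
As an illustration of the two-group case, the sketch below computes an independent-samples t statistic using the pooled-variance (Student's) formula, which assumes the two groups have similar variances. The function name `independent_t` and the scores are hypothetical; in practice SPSS or a statistics library would also report the p value.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance of the two groups.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical scores for two groups of five people each.
group_1 = [1, 2, 3, 4, 5]
group_2 = [3, 4, 5, 6, 7]
t, df = independent_t(group_1, group_2)
print(round(t, 2), df)  # -2.0 8
```

The sign of t only reflects which group was listed first; the size of t relative to its degrees of freedom is what determines statistical significance.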

19
We also use correlations to measure the strength
of a relationship between two variables.
Correlations can only be used

To assess the strength of the relationship
between two ratio-level variables.
To measure associations rather than cause and
effect relationships.
With data sets in which there are 30 or more
observations.
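
The correlation coefficient itself can be sketched as follows. This is the standard Pearson formula written out by hand; the function name `pearson_r` and the tiny data sets are assumptions for illustration (they are far smaller than the 30 observations recommended above).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one perfectly positive and one perfectly negative pair.
r_pos = pearson_r([1, 2, 3], [2, 4, 6])
r_neg = pearson_r([1, 2, 3], [6, 4, 2])
print(round(r_pos, 2), round(r_neg, 2))  # 1.0 -1.0
```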

20
Inferential statistics commonly used include

Independent t-test: compares two groups on one
variable. (Test statistic: t)
Paired-samples t-test: compares ratio-level
scores on pre- and post-test data. (Test
statistic: t)
ANOVA: compares three or more groups on ratio
data. (Test statistic: F)
Correlation: measures the association between
two ratio-level variables. (Test statistic: r)
Regression analysis: the dependent variable is
ratio; the model can include more than one
independent variable (these can be a combination
of ratio, ordinal, and nominal data). (Test
statistic is R^2, F, or partial correlation
coefficients)

21
Inferential statistics require that we assess the
probability that there is actually a
relationship between two variables.

State the research and null hypotheses.
State the degree to which we will risk being
wrong about whether or not a relationship
actually exists between two variables (level of
significance, usually .10 or lower).
Choose an appropriate statistical test and
compute it.
Compare the probability level on your computer
printout to the level of significance. If the p
value is lower than your level of significance,
then reject the null hypothesis. If the p value is
higher than the level of significance, accept
(that is, fail to reject) the null hypothesis.

22
For example

There is a positive relationship between scores
on the self-esteem scale and depression. Level of
significance is .05. r = .75, p = .01. Reject the
Null Hypothesis and accept the Research
Hypothesis.
Women will have higher test scores than men.
Level of significance is .10. t = .30, p = .60.
Accept the Null Hypothesis and reject the
Research Hypothesis.
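
The decision rule used in these two examples can be written as a one-line comparison. The function name `decide` is hypothetical, and the wording follows the slides; many textbooks say "fail to reject" rather than "accept" the null hypothesis.

```python
def decide(p_value, alpha):
    """Compare a p value with the chosen level of significance (alpha)."""
    if p_value < alpha:
        return "reject the null hypothesis"
    # Many textbooks phrase this as "fail to reject the null hypothesis".
    return "accept the null hypothesis"


# The two examples from the slide.
print(decide(p_value=0.01, alpha=0.05))  # reject the null hypothesis
print(decide(p_value=0.60, alpha=0.10))  # accept the null hypothesis
```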

23
Other info

Chi-square is interpreted in the same way as
inferential statistics.
Most statistics books contain tables that let you
determine p values if you calculate test
statistics by hand.
SPSS printouts always contain p values for
inferential statistics.
Theoretical assumptions are often violated in
research articles.
Sample size affects whether a relationship between
two or more variables is large enough to be
statistically significant.
Relationships between two variables can be either
positive or negative. Strong positive relationships
are close to +1.00 and strong negative
relationships are close to -1.00.