APPENDIX B Data Preparation and Univariate Statistics - PowerPoint PPT Presentation

Provided by: Psych229
1
APPENDIX B: Data Preparation and Univariate Statistics
  • How are computers used in data collection and analysis?
  • How are collected data prepared for statistical analysis?
  • How are missing data treated in statistical analyses?
  • When is it appropriate to delete data before they are analyzed?
  • What are descriptive statistics and inferential statistics?
  • What determines how well the data in a sample can be used to predict population parameters?

2
Preparing Data for Analysis
  • Collecting the data

1. Ask participants to fill out a questionnaire.
2. Ask participants to enter their responses via keyboard into a computer.
  • Analyzing the data

1. SPSS contains a spreadsheet data editor, an output editor, and a syntax editor.
2. SPSS contains subprograms that compute statistical analyses such as frequency distributions, descriptive statistics, ANOVA, correlation, and regression.
  • Entering the data into the computer

1. Use coding systems: label the variables.
2. Keep notes: you will forget which variable name refers to which data.
3. Save and back up the data.
4. Check and clean the data.
3
Missing Data
  • The respondent decided not to answer a question, either because it was inappropriate or for personal reasons.

1. Think carefully about whether all questions are appropriate.
2. Save respondents from embarrassing situations.
  • The respondent forgot to answer a question or missed an entire page of the questionnaire.

1. Test the research procedure before you carry it out.
2. Check the respondents' answers before they leave.
  • The research requires respondents to participate on more than one occasion.

This creates an attrition problem: some respondents do not return for the later sessions.
4
Deleting and Retaining Data
  • When do we delete variables?

When the reliability analysis indicates that the variable did not measure the same thing that the other variables measured.
  • When do we delete responses?

When a respondent gave a very extreme score (an outlier).
  • When do we delete participants?

When a respondent did not understand the instructions or was not able to perform the task.
  • How do we trim the data?

By removing scores that are more than 3 standard deviations above or below the variable's mean.
  • When do we transform the data?

When items need to be reverse-scored, or when the data are skewed.
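The trimming rule and reverse-scoring described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 1-to-7 response scale, and the sample scores are assumptions made for the example, not part of the original slides.

```python
from statistics import mean, pstdev

def flag_outliers(scores, cutoff=3.0):
    """Return the scores lying more than `cutoff` standard deviations from the mean."""
    m = mean(scores)
    sd = pstdev(scores)  # SD computed with N in the denominator, as in these slides
    if sd == 0:
        return []        # all scores identical, so nothing can be extreme
    return [x for x in scores if abs(x - m) / sd > cutoff]

def reverse_score(x, low=1, high=7):
    """Reverse-score a response on an assumed low-to-high rating scale."""
    return low + high - x

# Hypothetical data: nineteen responses of 4 and one stray value of 40.
scores = [4] * 19 + [40]
print(flag_outliers(scores))   # [40]
print(reverse_score(2))        # 6 on a 1-to-7 scale
```

With very small samples no score can lie more than 3 standard deviations from the mean, so a rule like this only becomes useful once the sample is reasonably large.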
5
Conducting Statistical Analysis
Descriptive Statistics
A statistical approach in which the researcher summarizes the pattern of scores observed on a measured variable.
(Diagram: Your Data, Analysis)
Inferential Statistics
A statistical approach in which the researcher uses the pattern of scores observed in a sample of respondents to draw inferences about the total population.
(Diagram: Your Data, Analysis, Population)
6
Summation Notation
Sample data
$X_1 = 6,\ X_2 = 5,\ X_3 = 2,\ X_4 = 7,\ X_5 = 3$
$\sum_{i=1}^{N} X_i = 6 + 5 + 2 + 7 + 3 = 23$
The summation starts at $i = 1$ and runs to $N$ (in this case, $N = 5$).
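As a small illustration, the summation can be written as an explicit loop over the index from 1 to N. The variable names are chosen for this sketch only.

```python
scores = [6, 5, 2, 7, 3]        # X_1 ... X_5 from the sample data
N = len(scores)

total = 0
for i in range(1, N + 1):       # i runs from 1 to N, as in the notation
    total += scores[i - 1]      # Python lists are 0-indexed, hence i - 1

print(total)                    # 23
```

In practice the built-in `sum(scores)` gives the same result.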
7
Rounding
The APA Publication Manual generally suggests rounding presented figures (both descriptive and inferential statistics) to two decimal places.
$\pi = 3.14159265\ldots \rightarrow 3.14$
$1.732\ldots \rightarrow 1.73$
$p = 0.0041\ldots \rightarrow .004$
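The rounding examples above can be reproduced with Python's string formatting. The helper names `apa_round` and `apa_p` are made up for this sketch, and the three-decimal p value with no leading zero simply mirrors the slide's own example.

```python
def apa_round(value, places=2):
    """Format a statistic to a fixed number of decimal places."""
    return f"{value:.{places}f}"

def apa_p(p, places=3):
    """Format a p value and drop the leading zero, as in the slide's '.004'."""
    text = f"{p:.{places}f}"
    return text[1:] if text.startswith("0.") else text

print(apa_round(3.14159265))   # 3.14
print(apa_round(1.732))        # 1.73
print(apa_p(0.0041))           # .004
```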
8
Computing Descriptive Statistics
Frequency Distribution
A table that indicates how many, and in most cases what percentage, of the individuals in the sample fall into each of a set of categories. It can be displayed as a bar chart, grouped frequency distribution, histogram, frequency curve, or stem-and-leaf plot.
Central Tendency
The point in the distribution around which the
data are centered. (e.g. mean, median, mode)
Dispersion
The extent to which the scores are all tightly
clustered around the central tendency (e.g.
range, variance, standard deviation)
9
Frequency Distribution
$X_1 = 6,\ X_2 = 5,\ X_3 = 2,\ X_4 = 7,\ X_5 = 3,\ X_6 = 4,\ X_7 = 6,\ X_8 = 2,\ X_9 = 1,\ X_{10} = 8$
(Figures: Bar Chart, Histogram, Frequency Curve)
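A frequency distribution for the ten scores can be tabulated with the standard library. This is a minimal sketch; the table layout and the text bar chart are choices made for the example.

```python
from collections import Counter

data = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]   # the ten sample scores
counts = Counter(data)
n = len(data)

# One row per distinct score: (score, frequency, percentage of the sample).
table = [(value, counts[value], 100 * counts[value] / n) for value in sorted(counts)]

print("Score  Freq  Percent")
for value, freq, pct in table:
    # The run of '#' characters serves as a crude text bar chart.
    print(f"{value:>5}  {freq:>4}  {pct:>6.1f}  {'#' * freq}")
```

The scores 2 and 6 each appear twice (20% of the sample); every other score appears once.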
10
Central Tendency
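The Mean (average): the sum of all of the scores divided by the sample size.
Sample Data
$X_1 = 6,\ X_2 = 5,\ X_3 = 2,\ X_4 = 7,\ X_5 = 3,\ X_6 = 4,\ X_7 = 6,\ X_8 = 2,\ X_9 = 1,\ X_{10} = 8$
$\bar{X} = \frac{\sum X}{N} = \frac{44}{10} = 4.4$
The Median: the score at which half of the observations are greater and half are smaller.
1, 2, 2, 3, 4, 5, 6, 6, 7, 8
With an even number of scores, the median is the average of the two middle scores: $(4 + 5) / 2 = 4.5$
The Mode: the most frequently occurring value in a variable.
1, 2, 2, 3, 4, 5, 6, 6, 7, 8
Here 2 and 6 each occur twice, so the distribution has two modes.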
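The three measures of central tendency can be checked with Python's `statistics` module (`multimode` requires Python 3.8 or later). This is a quick sketch using the sample data from this slide.

```python
from statistics import mean, median, multimode

data = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]

print(mean(data))                # 4.4
print(median(data))              # 4.5
print(sorted(multimode(data)))   # [2, 6]
```

`multimode` is used instead of `mode` because this sample has two equally frequent values.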
11
Dispersion
The Range
The distance between the largest (maximum) and the smallest (minimum) observed values of the variable.
The Variance ($s^2$)
The sum of squares, $SS = \sum (X_i - \bar{X})^2$, divided by $N$: $s^2 = \frac{SS}{N}$
The Standard Deviation ($s$)
The square root of the variance: $s = \sqrt{s^2}$
12
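The Variance and the Standard Deviation
Mean Deviation Scores
$\bar{X} = 4.4$
$X_1 = 6,\ X_2 = 5,\ X_3 = 2,\ X_4 = 7,\ X_5 = 3,\ X_6 = 4,\ X_7 = 6,\ X_8 = 2,\ X_9 = 1,\ X_{10} = 8$
$(6 - 4.4) + (5 - 4.4) + (2 - 4.4) + (7 - 4.4) + (3 - 4.4) + (4 - 4.4) + (6 - 4.4) + (2 - 4.4) + (1 - 4.4) + (8 - 4.4) = 0$
The mean deviation scores always sum to zero, which is why they are squared before being summed.
(Figure: number line, 1 to 8)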
13
Sum of Squares
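$(6 - 4.4)^2 = 2.56$
$(5 - 4.4)^2 = 0.36$
$(2 - 4.4)^2 = 5.76$
$(7 - 4.4)^2 = 6.76$
$(3 - 4.4)^2 = 1.96$
$(4 - 4.4)^2 = 0.16$
$(6 - 4.4)^2 = 2.56$
$(2 - 4.4)^2 = 5.76$
$(1 - 4.4)^2 = 11.56$
$(8 - 4.4)^2 = 12.96$
$SS = \sum (X - \bar{X})^2 = 50.4$
$SS = \sum X^2 - \frac{(\sum X)^2}{N} = 244 - \frac{44^2}{10} = 50.4$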
14
Variance and Standard Deviation
Variance
$s^2 = \frac{SS}{N} = \frac{50.4}{10} = 5.04$
Standard Deviation (SD)
$s = \sqrt{s^2} = \sqrt{5.04} \approx 2.24$
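The calculations on the last few slides can be verified step by step. The sketch below computes the sum of squares both from the deviation scores and from the computational formula, then the variance and standard deviation with N in the denominator, as the slides do. Variable names are chosen for illustration.

```python
from math import sqrt

data = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]
n = len(data)
m = sum(data) / n                       # mean, 4.4

# Definitional formula: sum of squared deviations from the mean.
deviations = [x - m for x in data]
ss_definitional = sum(d ** 2 for d in deviations)

# Computational formula: sum of X squared minus (sum of X) squared over N.
ss_computational = sum(x ** 2 for x in data) - sum(data) ** 2 / n

variance = ss_definitional / n          # SS / N
sd = sqrt(variance)

print(round(ss_definitional, 2))        # 50.4
print(round(ss_computational, 2))       # 50.4
print(round(variance, 2))               # 5.04
print(round(sd, 2))                     # 2.24
```

The same variance and standard deviation are returned by `statistics.pvariance` and `statistics.pstdev`, which also divide by N.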
15
Standard Score (Z score)
The distance of a score from the mean of the variable, expressed in standard deviation units: $Z = \frac{X - \bar{X}}{s}$
Standard scores make it possible to compare two scores that come from distributions with different means and different standard deviations (SD).
Taro received a score of 80 on a test on which the average was 50 and the standard deviation was 15. Susan received a score of 75 on a test on which the average was 60 and the standard deviation was 10.
$Z_{Taro} = \frac{80 - 50}{15} = 2.0$
$Z_{Susan} = \frac{75 - 60}{10} = 1.5$
Relative to the others who took the same test, Taro scored higher than Susan.
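The two standard scores can be computed with a one-line helper. The function name `z_score` is an assumption made for this sketch.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

z_taro = z_score(80, mean=50, sd=15)
z_susan = z_score(75, mean=60, sd=10)

print(z_taro)    # 2.0
print(z_susan)   # 1.5
```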
16
Standard Normal Distribution
The hypothetical population distribution of standard scores when the original scores are normally distributed.
$\mu = 0,\ \sigma = 1$
Percentage of scores falling in each region:
$-1 < Z < 0$, or $0 < Z < 1$: 34.13%
$-2 < Z < -1$, or $1 < Z < 2$: 13.59%
$-3 < Z < -2$, or $2 < Z < 3$: 2.15%
$Z < -3$, or $3 < Z$: 0.13%
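The areas under the standard normal curve can be approximated with `statistics.NormalDist` (Python 3.8 or later). This is a sketch for checking the figures, and the helper `area_percent` is a name made up for the example. The computed value for the band between 2 and 3 comes out at about 2.14%; the slide's 2.15% appears to be the commonly tabled figure, adjusted so that the four bands sum to 50%.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_percent(low, high):
    """Percentage of the standard normal distribution between two Z values."""
    return 100 * (std_normal.cdf(high) - std_normal.cdf(low))

print(round(area_percent(0, 1), 2))   # 34.13
print(round(area_percent(1, 2), 2))   # 13.59
print(round(area_percent(2, 3), 2))   # 2.14

# Everything above Z = 3 (one tail).
upper_tail = 100 * (1 - std_normal.cdf(3))
print(round(upper_tail, 2))           # 0.13
```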
17
Working with Inferential Statistics
Example. A researcher wants to estimate the average GPA of all of the psychology majors at UM.
Population
Mean of the population: $\mu = ?$
Standard deviation of the population: $\sigma = ?$
Sample: descriptive statistics of 100 students
Mean of the sample: $\bar{X} = 3.40$
Standard deviation of the sample: $s = 2.23$
18
Unbiased Estimator
The sample mean ($\bar{X}$) is an unbiased estimator of the population mean $\mu$. The sample standard deviation ($s$), however, is not an unbiased estimator of the population standard deviation $\sigma$.
How can we estimate $\sigma$ using the sample standard deviation?
Divide the sum of squares by $N - 1$ rather than by $N$:
$s = \sqrt{\frac{SS}{N - 1}}$
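The difference between dividing by N and dividing by N - 1 can be seen with the `statistics` module, which offers both versions. The sketch reuses the ten-score sample from the earlier slides. Strictly speaking, it is the N - 1 variance that is unbiased; its square root is the conventional estimate of the population standard deviation.

```python
from math import sqrt
from statistics import pstdev, stdev

data = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]   # SS = 50.4, N = 10

sd_n = pstdev(data)          # sqrt(SS / N): describes this sample only
sd_n_minus_1 = stdev(data)   # sqrt(SS / (N - 1)): estimate for the population

print(round(sd_n, 3))           # 2.245
print(round(sd_n_minus_1, 3))   # 2.366
```

The N - 1 version is always the larger of the two, and the gap shrinks as the sample size grows.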
19
The Standard Error
If we took all possible samples of $N = 100$ from a given population, the resulting distribution of the sample means would have a mean of $\mu$.
The sample means would be normally distributed, with a standard deviation known as the standard error of the mean (or simply the standard error).
The standard error is symbolized as $s_{\bar{X}}$:
$s_{\bar{X}} = \frac{s}{\sqrt{N}}$
For the GPA example, $s_{\bar{X}} = \frac{2.23}{\sqrt{100}} \approx .22$
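The idea of a sampling distribution can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the population is taken to be normal with a made-up mean and standard deviation, and 2,000 random samples stand in for "all possible samples".

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)           # fixed seed so the sketch is repeatable

MU, SIGMA = 3.0, 0.5     # hypothetical population mean and standard deviation
N = 100                  # size of each sample
N_SAMPLES = 2000         # number of samples drawn

# Draw many samples and record the mean of each one.
sample_means = [
    fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

empirical_se = pstdev(sample_means)   # spread of the sample means
theoretical_se = SIGMA / sqrt(N)      # sigma / sqrt(N) = 0.05

print(f"mean of sample means: {fmean(sample_means):.3f}")
print(f"empirical SE:   {empirical_se:.4f}")
print(f"theoretical SE: {theoretical_se:.4f}")
```

The mean of the sample means should land close to the population mean, and their standard deviation close to the population standard deviation divided by the square root of N.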

20
Confidence Intervals
The range of scores within which the population mean is likely to fall.
The exact width of the confidence interval is determined with a statistic known as Student's t.
Example. Now, we sampled 100 students.
Degrees of freedom: $df = 100 - 1 = 99$
If we set alpha = .05,
the appropriate t value is 1.99 (see Table C, Appendix E).
Lower limit: $\bar{X} - t(s_{\bar{X}}) = 3.40 - 1.99(.22) = 2.96$
Upper limit: $\bar{X} + t(s_{\bar{X}}) = 3.40 + 1.99(.22) = 3.84$
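The confidence limits can be recomputed from the sample statistics given in the slides. This is a minimal sketch: the function name is made up, and the critical t value of 1.99 is passed in as a constant taken from the slide rather than looked up, since Python's standard library has no t distribution.

```python
from math import sqrt

def confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) limits: mean plus or minus t times the standard error."""
    standard_error = sample_sd / sqrt(n)
    margin = t_critical * standard_error
    return sample_mean - margin, sample_mean + margin

# Values from the GPA example: mean 3.40, s = 2.23, N = 100, t = 1.99.
lower, upper = confidence_interval(3.40, 2.23, 100, 1.99)
print(f"{lower:.2f} to {upper:.2f}")   # 2.96 to 3.84
```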