Title: Introductory Workshop SPSS
1 Introductory Workshop SPSS
- CSU Bakersfield
- April 17, 2009
2 Social Science Research and Instructional Council
(SSRIC)
- Discipline council for the social sciences made
up of representatives from each campus in the
CSU. List of campus representatives can be found
at http://www.ssric.org/reps
- Promotes use of data analysis in research and
teaching
- Website is at http://www.ssric.org
3 Social Science Data Bases
- The SSRIC helps maintain and promote the use of
the social science data bases in the CSU
- Data bases include
  - Inter-university Consortium for Political and
    Social Research (ICPSR)
  - The Field Institute
  - The Roper Center for Public Opinion Research
4 Agenda for the Introductory SPSS Workshop
- Overview of SPSS
- A brief tour
- Creating your own SPSS data file or
opening a data file you got somewhere else
- Transforming data
  - Recode
  - Compute
  - Select If
- Univariate analysis
  - Frequencies
  - Descriptives
  - Explore
- A look ahead at the intermediate workshop
5 Overview of SPSS
- SPSS is a statistical package for beginning,
intermediate, and advanced data analysis
- Other statistical packages include SAS and Stata
- Online statistical packages that don't require
site licenses include SDA
6 Text: SPSS for Windows Version 16: A Basic
Tutorial
- Authors: Linda Fiddler (Bakersfield), Laura Hecht
(Bakersfield), Ed Nelson (Fresno), Elizabeth
Nelson (Fresno), Jim Ross (Bakersfield)
- Available from McGraw-Hill Custom Publishing.
Call 800-338-3987 to order. Request ISBN
0-07-353833-7
- Available on the web at
http://www.ssric.org/trd/spss16. The data set for
this workshop can be downloaded at this site
7 SPSS Files and Extensions
- Portable file -- .por
- Data file -- .sav
- Output file -- .spo
- Syntax file -- .sps
8 Opening SPSS
- Go to Start and find SPSS for Windows
- Click on SPSS 16.0 or 17.0 for Windows to open
- You'll need to update your SPSS license every
year (or your school technician will do it for
you)
9 A Brief Tour of SPSS (see ch. 1 in text)
- Frequencies -- Analyze/Descriptive
Statistics/Frequencies
  - Select ABANY and move it to the big box and click
    on OK
- Crosstabs -- Analyze/Descriptive
Statistics/Crosstabs
  - Move ABANY to the Row box
  - Move SEX to the Column box
  - Click on Cells and select Column percents
  - Click on OK
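
To make the idea of column percents concrete, here is a minimal Python sketch of the arithmetic behind a crosstab of ABANY by SEX. It is not SPSS output or SPSS syntax. The cases are made up for illustration, and the function name `column_percents` is hypothetical.

```python
from collections import Counter

# Hypothetical (made-up) cases: (ABANY answer, SEX) pairs, not real survey data.
cases = [
    ("yes", "male"), ("no", "male"), ("yes", "male"), ("no", "male"),
    ("yes", "female"), ("no", "female"), ("no", "female"), ("no", "female"),
    ("yes", "female"),
]

def column_percents(pairs):
    """Return {column: {row: percent}} so that each column sums to 100."""
    cell_counts = Counter(pairs)                      # cases in each (row, column) cell
    column_totals = Counter(col for _, col in pairs)  # cases in each column
    table = {}
    for (row, col), n in cell_counts.items():
        table.setdefault(col, {})[row] = 100.0 * n / column_totals[col]
    return table

table = column_percents(cases)
for col in sorted(table):
    for row in sorted(table[col]):
        print(f"{col:>6} {row:>3}: {table[col][row]:.1f}%")
```

Because the percents are computed within each column (SEX), you compare across the row to see whether men and women answer differently.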
10 A Brief Tour Continued
- Comparing means -- Analyze/Compare Means/Means
- Move AGEKDBRN and EDUC into the Dependent List
box
- Move SEX to the Independent List box
- Click on OK
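
As a rough illustration of what comparing means does, the Python sketch below computes the mean of a dependent variable separately for each category of an independent variable. The data values and the function name `means_by_group` are made up for this example. It is not SPSS syntax.

```python
from statistics import mean

# Hypothetical cases: (SEX, EDUC in years). Values are made up.
cases = [
    ("male", 12), ("male", 16), ("male", 14),
    ("female", 12), ("female", 18),
]

def means_by_group(pairs):
    """Group the dependent values by category, then average each group."""
    groups = {}
    for group, value in pairs:
        groups.setdefault(group, []).append(value)
    return {group: mean(values) for group, values in groups.items()}

group_means = means_by_group(cases)
for group, value in sorted(group_means.items()):
    print(group, value)
```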
11 A Brief Tour Continued
- Correlations
  - Analyze/Correlate/Bivariate
  - Move EDUC, MAEDUC, and PAEDUC into the
    Variables box
  - Click on OK
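
For readers who want to see the arithmetic behind a bivariate correlation, here is a small Python sketch of the Pearson correlation coefficient. The data are made up, the function name `pearson_r` is hypothetical, and the sketch assumes that cases with missing values have already been dropped.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical years of schooling for respondents (EDUC) and fathers (PAEDUC).
educ = [12, 16, 14, 10, 18, 12]
paeduc = [10, 16, 12, 8, 12, 12]

r = pearson_r(educ, paeduc)
print(round(r, 3))
```

A value near 1 means respondents with more education tend to have fathers with more education. A value near 0 means little or no linear relationship.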
12 A Brief Tour Continued
- Scatterplots
  - Graphs/Legacy Dialogs/Scatter/Dot
  - Click on Simple Scatter and then on Define
  - Move EDUC into the Y Axis box
  - Move PAEDUC into the X Axis box
  - Click on OK
13 Creating Your Own SPSS Data File (see ch. 2 in
text)
- Involves creating
  - Variable names
  - Variable labels
  - Value labels
  - Missing values
14 Creating a Data File in SPSS
- Questions (see p. 11)
  - Age
  - Sex
  - Religious preference
  - Type of marriage preferred
  - Opinion on abortion (7 different questions)
15 Basic Steps in Creating a Data File
- Assign identification number to each case
- Assign each variable a variable name and an
extended variable label
- Each variable will have a set of values. Assign
each value an extended value label
- If a variable has missing information, decide
which values will be used as the missing values
16 Variable Names
- Traditionally variable names had to be 8
characters or fewer, start with a letter, and
contain no embedded blanks
- Now they can be longer than 8 characters, but
we'll stick with names of 8 or fewer characters
- Names can contain some special characters, but
not all such characters. So we only use hyphens
(-) as special characters in names
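
The three traditional naming rules can be written as a short check. This is only a Python sketch of those three rules (8 characters or fewer, starts with a letter, no embedded blanks). The function name `is_traditional_name` is made up, and the sketch does not check which special characters SPSS accepts.

```python
def is_traditional_name(name):
    """True if the name follows the three traditional rules listed above."""
    return (
        1 <= len(name) <= 8      # 8 characters or fewer (and not empty)
        and name[0].isalpha()    # starts with a letter
        and " " not in name      # no embedded blanks
    )

for candidate in ["AGE", "ABORTION", "1AGE", "MY AGE", "INCOME2006"]:
    print(candidate, is_traditional_name(candidate))
```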
17 Variable Names
- Age is named AGE
- Sex is named SEX
- Religious preference is named REL
- Political orientation is named C-L
- Preferred marriage is named MG
- There are seven abortion variables and they are
named ABD, ABN, ABH, ABP, ABR, ABS, ABA
18 Entering the Information for a Data File
- You already have SPSS open
- Click on File/New/Data
- You should see a blank data screen that looks
like a spreadsheet
- At the bottom are two tabs called Data View and
Variable View. Click on Variable View
19 Defining the Variables
- Enter the variable names in the Name column
in the order you want them
- Enter the variable labels in the Label column
- Enter the value labels in the Values column.
To do this you will need to click in the
appropriate cell and then click in the little
gray box on the right
- Enter the missing values in the Missing column.
To do this you will need to click in the
appropriate cell and then click in the little
gray box on the right
20 Adding in the Data
- Now that you have defined the variables, click on
the tab at the bottom called Data View and
enter the data into the appropriate cells. The
data are on p. 18 of the text
- Once you have entered the data, go back and check
to make sure you didn't make any data entry
errors
- Congratulations! You created an SPSS data file.
You could also enter the data using a
spreadsheet like Excel
21 Saving the Data File
- Now you want to save your data file
- Click on Save As. The default is to save it as
an SPSS data file with .sav as the extension
- Give it a file name and indicate where you want
to save it on your hard drive, floppy disk,
or flash drive
22 Opening an Existing File You Got Somewhere Else
- Often you will want to open a data set that you
got from someplace else such as
  - ICPSR
  - Field Institute
  - Roper Center
- These files will usually be in the form of a
  - SPSS portable file (.por)
  - SPSS data file (.sav)
  - Raw data file with an SPSS syntax file (.sps)
  - Raw data file without a syntax file
23 Opening a Portable File
- Click on the open yellow folder to open a new
file
- Change file type to .por
- Browse to where the portable file you want to
open is located and double click on that file
24 Opening an SPSS Data File
- Click on the open yellow folder to open a new
file
- Change file type to .sav
- Browse to where the data file you want to open is
located and double click on that file
- We're going to use the data set that comes with
the text, gss06a.sav. You can download it from
the web site that has the text --
http://www.ssric.org/tr/onlinetextbooks. Look
for the text "Right click here to download
GSS06A."
25 Opening a Raw Data File with an SPSS Syntax File
- Sometimes you will need to open a raw data file
(ASCII or text) and there will be an accompanying
SPSS syntax file
- You will need to modify the File Handle and
Save Outfile commands
- See http://www.ssric.org/files/ASCII_to_SPSS.pdf
and
http://www.icpsr.umich.edu/cocoon/ICPSR/FAQ/0062.xml
for more information
- You may need help doing this. Feel free to
contact me for help
26 Opening a Raw Data File Without an SPSS Syntax
File
- If you don't have an SPSS syntax file you will
have to use the codebook that came with the data
and create your own syntax file
- You may need help doing this. Feel free to
contact me for help
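
To show what a codebook contributes when reading a raw data file, here is a hedged Python sketch that slices fixed-column text into variables. It illustrates the idea only and is not SPSS syntax. The codebook layout, the variable columns, and the data lines are all hypothetical.

```python
# Hypothetical codebook: variable name -> (first column, last column),
# 1-based and inclusive, the way codebooks usually list column locations.
codebook = {"ID": (1, 3), "AGE": (4, 5), "SEX": (6, 6)}

# Hypothetical raw data lines, one case per line (made up for illustration).
raw_lines = ["001341", "002272", "003  1"]

def parse_line(line, layout):
    """Cut one line of raw data into variables using the column layout."""
    record = {}
    for name, (first, last) in layout.items():
        text = line[first - 1:last].strip()
        # Blank columns are treated as missing in this sketch.
        record[name] = int(text) if text else None
    return record

records = [parse_line(line, codebook) for line in raw_lines]
for record in records:
    print(record)
```

A syntax file does the same job inside SPSS: it tells the program which columns hold which variable.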
27 What's Next?
- Now you know how to create an SPSS data file and
how to open an existing SPSS portable or data
file
- Next we'll learn how to transform variables
28 Transforming Data (see ch. 3 in text)
- We can transform variables by recoding, which
means combining categories on an existing
variable into fewer categories
- We can transform variables by creating new
variables out of existing variables
- We can select particular cases and analyze only
these cases
- We can do other things, like weighting cases, that
we're not going to talk about in this workshop.
29 Recoding Variables
- Recoding into different variables
- Recoding into the same variable
- We recommend recoding into different variables
and not using the into same variable option
30 Recoding into Different Variables
- Click on Transform and then on Recode and
then on Into Different Variables
- Select the variable you want to recode
- Start by giving the new variable a new name and
assigning a variable label to the new variable.
Click on Change
31 Recoding AGE into AGE1
- Recode AGE into four categories and give it the
name of AGE1
- Click on Old and New Values
- Use Range (fourth option down) to recode as
follows. Remember to click on Add after
entering each recode
  - 18 to 29 = 1
  - 30 to 49 = 2
  - 50 to 69 = 3
  - 70 to 89 = 4
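
The logic of this recode can be sketched in a few lines of Python. This is an illustration of what the four ranges do to each value of AGE, not SPSS syntax. The names `AGE_RANGES` and `recode_age` are made up, and the sketch assumes that a value outside every range is left missing in the new variable.

```python
# Each tuple is (lowest age, highest age, new value), matching the ranges above.
AGE_RANGES = [(18, 29, 1), (30, 49, 2), (50, 69, 3), (70, 89, 4)]

def recode_age(age):
    """Return the AGE1 category for one value of AGE, or None if missing."""
    if age is None:
        return None
    for low, high, new_value in AGE_RANGES:
        if low <= age <= high:
            return new_value
    return None  # assumption: values outside every range stay missing

ages = [18, 29, 30, 47, 69, 70, 89, None]
age1 = [recode_age(age) for age in ages]
print(age1)
```

Keeping AGE and writing the result to AGE1 mirrors the recommendation to recode into a different variable: the original values stay available for checking.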
32 Recoding Options
- When you click on Old and New Values there will
be seven options
- For most recoding you will only have to use two
of these options
- The first option from the top allows you to
recode a single value into a new value
- The fourth option from the top allows you to
recode a range of values from X to Y into a new
value
33 Assign Value Labels to the Four Categories of
AGE1
- Go into Variable View
- Find the variable AGE1 (should be at the bottom
of the list of variables)
- Click in the Values column and then click on
the small gray box
- Enter the value labels
- Click on OK
34 Exercises for Recoding
- INCOME06 is total family income. Do a frequency
distribution to see what it looks like before
recoding
- Recode into 4 categories and call this new
variable INCOME1. Use the following categories:
under 20K, 20K to under 40K, 40K to under
60K, and 60K and over
- Add the value labels
- Run a frequency distribution for INCOME1 and
check to make sure that you recoded it correctly
by comparing the unrecoded and recoded frequency
distributions
35 More Exercises for Recoding
- Now recode INCOME06 again and call the new
variable INCOME2
- This time use 8 categories: under 10K, 10K to
under 20K, 20K to under 30K, 30K to under
40K, 40K to under 50K, 50K to under 60K,
60K to under 75K, and 75K and over
- Add the value labels
- Run a frequency distribution for INCOME2 and
check to make sure that you recoded it correctly
by comparing the unrecoded and recoded frequency
distributions
36 Creating a New Variable with Compute
- Let's create a new variable and call it ABORTION,
which is the sum of the seven abortion variables
- Click on Transform and then on Compute
- Enter the new variable name (ABORTION) into the
Target Variable box
- Enter the formula for this new variable into the
Numeric Expression box
- Click on OK
37 Dealing with Missing Data
- If there are missing data for any of these
variables (ABANY to ABSINGLE), the new variable
ABORTION will be assigned a system missing value
- What do we do if we want to allow no more than
two missing values?
- Let's compute the mean value: divide the sum
of the abortion values by the number of variables
with valid information
- But let's allow only two variables with missing
values
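
The behavior described here, where one missing value makes the whole sum missing, can be illustrated with a short Python sketch. The function name `sum_or_missing` and the example answers are made up. `None` stands in for a missing value.

```python
def sum_or_missing(values):
    """Add the values, but return None if any one of them is missing."""
    if any(value is None for value in values):
        return None
    return sum(values)

# Hypothetical answers to the seven abortion questions for two respondents.
complete_case = [1, 2, 1, 1, 2, 1, 2]
case_with_gap = [1, 2, None, 1, 2, 1, 2]

print(sum_or_missing(complete_case))   # all seven answers are valid
print(sum_or_missing(case_with_gap))   # one answer is missing
```

This is why a plain sum can lose many cases: a respondent who skipped one question out of seven drops out of the new variable entirely.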
38 Dealing with Missing Data Continued
- Click on Reset to erase what is currently in
the Compute Variable box
- Click on Statistical in the Function Group
box
- Then double click on Mean in the Functions and
Special Variables box
- In the Target Variable box, enter the name of
the new variable. Let's call it ABORMEAN
- In the Numeric Expression box, you should see
MEAN(?,?)
39 Dealing with Missing Data Continued
- Replace the ?,? with the variables you want to
include so it reads
MEAN(abany,abdefect,abhlth,abnomore,abpoor,abrape,absingle)
- Insert .5 following MEAN so it reads MEAN.5.
This indicates that you want to have at least
five variables with valid information
- Click on OK
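
The rule behind MEAN.5 (average the valid values, but only when at least five are valid) can be sketched in Python as follows. This is an illustration of the rule, not SPSS syntax. The function name `mean_min_valid` and the example answers are made up.

```python
def mean_min_valid(values, min_valid):
    """Mean of the non-missing values, or None if too few are valid."""
    valid = [value for value in values if value is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)

# Hypothetical answers to the seven abortion questions.
two_missing = [1, 2, 1, None, 2, 1, None]          # five valid answers
three_missing = [1, None, None, None, 2, 1, 2]     # only four valid answers

print(mean_min_valid(two_missing, 5))
print(mean_min_valid(three_missing, 5))
```

With seven variables, requiring five valid answers is the same as allowing no more than two missing values.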
40 Exercises for Compute
- There are five variables that measure tolerance
for letting someone speak in your community who
may have different views than your own: SPKATH,
SPKCOM, SPKHOMO, SPKMIL, and SPKRAC
- For each of these variables, 1 means they would
allow such a person to speak and 2 means they
would not allow it
41 Exercises for Compute Continued
- Create a new variable (call it SPEAK) which is
the sum of these five variables
- Run a frequency distribution for SPEAK
- What do the values in this new variable tell us?
42 More Exercises for Compute
- Now let's create a variable called SPKMEAN which
allows for one of the five variables (SPKATH to
SPKRAC) to be missing
- What happens if there is more than one variable
with a missing value?
- How does SPSS calculate the new variable if there
is only one variable with a missing value?
43 Using Select Cases to Select Specific Cases for
Analysis
- Let's select only Protestants for further
analysis
- Click on Data and then on Select Cases
- Click on If condition is satisfied and then on
the If button below it
- Select the variable RELIG and move it into the
box on the right
- In this box, enter the expression relig = 1
- Click on Continue and on OK
44 Using Select Cases Continued
- Now let's select Protestants who are under 35
years old
- Enter the expression relig = 1 as you did
before.
- Use & for and. Enter age < 35 so the
expression reads relig = 1 & age < 35
- Click on OK
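
The selection rule (Protestant and under 35) can be illustrated with a short Python sketch. The cases are made up, the function name `is_selected` is hypothetical, and the sketch assumes that a case with a missing age does not satisfy the condition.

```python
# Hypothetical cases; relig 1 stands for Protestant, None marks a missing value.
cases = [
    {"id": 1, "relig": 1, "age": 28},
    {"id": 2, "relig": 1, "age": 52},
    {"id": 3, "relig": 2, "age": 30},
    {"id": 4, "relig": 1, "age": None},
]

def is_selected(case):
    """True only when both parts of the condition are met."""
    return (
        case["relig"] == 1
        and case["age"] is not None   # assumption: missing age is not selected
        and case["age"] < 35
    )

selected = [case for case in cases if is_selected(case)]
print([case["id"] for case in selected])
```

Both conditions must be true for a case to be kept, which is what "and" means in the expression.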
45 Exercises for Select If
- Select all males (1 on the variable SEX) and do a
frequency distribution for the variable FEAR
(afraid to walk alone at night in the
neighborhood)
- Now select all females (2 on the variable SEX)
and run a frequency distribution for FEAR
- Are males or females more fearful of walking
alone at night?
46 More Exercises for Select If
- Now let's select males under age 35 and run a
frequency distribution for FEAR
- Do the same thing for females under 35
- Are males or females under 35 more fearful of
walking alone at night?
47 Important Note on Using Select Cases
- When you are finished using Select Cases and
want to revert to using all the cases, be sure to
click on Data/Select Cases and select All
cases. Then click on OK
- If you don't do this, you will continue to use
only those cases you last selected
48 Univariate Analysis
- Now that we know how to open existing files and
transform variables, we're ready to begin
analyzing data
- Univariate analysis refers to analyzing variables
one at a time
49 Types of Univariate Analysis Procedures (see
ch. 4 in text)
- Frequencies
- Descriptives
- Explore
50 Frequencies
- Go to Analyze/Descriptive Statistics/Frequencies
- Select ABANY and AGE and click on OK
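
A frequency distribution is a count of how many cases fall in each category, usually shown with percents. The Python sketch below illustrates that arithmetic with made-up answers. The function name `frequency_table` is hypothetical, and the split between percent of all cases and percent of valid cases is shown as an assumption about how such tables are commonly laid out.

```python
from collections import Counter

# Hypothetical ABANY answers: 1 = yes, 2 = no, None = missing.
responses = [1, 2, 1, 1, None, 2, 1, None, 2, 1]

def frequency_table(values):
    """Return rows of (value, count, percent of all cases, percent of valid cases)."""
    counts = Counter(value for value in values if value is not None)
    n_total = len(values)
    n_valid = sum(counts.values())
    rows = []
    for value in sorted(counts):
        n = counts[value]
        rows.append((value, n, 100.0 * n / n_total, 100.0 * n / n_valid))
    return rows

for value, n, percent, valid_percent in frequency_table(responses):
    print(f"{value}: n={n} percent={percent:.1f} valid percent={valid_percent:.1f}")
```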
51 Bar Charts
- Bar charts -- click on Analyze/Descriptive
Statistics/Frequencies
- Click on Charts
- Select Bar Charts and click on Continue and
then on OK
- Do you think bar charts are appropriate for both
ABANY and AGE?
52 Histograms
- Click on Analyze/Descriptive
Statistics/Frequencies
- Click on Charts
- Select Histograms and click on Continue and
then on OK
- Do you think histograms are appropriate for both
ABANY and AGE?
- Which do you think is the most appropriate chart
(bar chart or histogram) for ABANY and for AGE?
53 Statistics
- Click on Analyze/Descriptive
Statistics/Frequencies
- Click on Statistics
- Select the statistics you want and click on
Continue and then on OK
54 Exercises for Frequencies
- There are seven variables dealing with abortion:
ABANY, ABDEFECT, ABHLTH, ABNOMORE, ABPOOR, ABRAPE,
and ABSINGLE
- Run a frequency distribution for each variable
- Get a bar chart for each variable
- Compare and contrast how people answered these
seven questions
55 More Exercises for Frequencies
- Run the frequency distribution for AGE
- Get a histogram for AGE
- Compute the following statistics for AGE
  - Mean
  - Median
  - Standard deviation
  - Percentiles: 25th, 50th, and 75th
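
For a sense of what these statistics measure, here is a Python sketch that computes them for a handful of made-up ages using the standard library. Percentiles can be calculated in more than one way, so the values from `statistics.quantiles` are an assumption for illustration and may not match SPSS output exactly.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical ages (made up for illustration).
ages = [20, 30, 40, 50, 60, 70, 80]

age_mean = mean(ages)
age_median = median(ages)
age_sd = stdev(ages)                  # sample standard deviation (divides by n - 1)
p25, p50, p75 = quantiles(ages, n=4)  # cut points for the 25th, 50th, 75th percentiles

print("mean:", age_mean)
print("median:", age_median)
print("standard deviation:", round(age_sd, 2))
print("percentiles:", p25, p50, p75)
```

Note that the 50th percentile and the median are the same number.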
56 Descriptives
- Click on Analyze/Descriptive
Statistics/Descriptives
- Select AGE and EDUC
- Click on Options and select the statistics you
want and then click on Continue and OK
57 Exercises for Descriptives
- Use Descriptives to compute the following
statistics for AGE
  - Mean
  - Standard deviation
  - Variance
  - Skewness
  - Kurtosis
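
Variance, skewness, and kurtosis are less familiar than the mean, so a small Python sketch of one common way to compute them may help. It uses the sample-adjusted formulas found in many statistical packages. Treating these as the formulas behind the Descriptives output is an assumption, and the data and function names are made up.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)

def skewness(values):
    """Sample-adjusted skewness; 0 for a perfectly symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sample_variance(values))
    z3 = sum(((x - m) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * z3

def kurtosis(values):
    """Sample-adjusted excess kurtosis; about 0 for a normal distribution."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sample_variance(values))
    z4 = sum(((x - m) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

ages = [20, 30, 40, 50, 60]  # hypothetical, evenly spaced and symmetric
print(sample_variance(ages), round(skewness(ages), 3), round(kurtosis(ages), 3))
```

Positive skewness means a longer tail toward high values. Negative kurtosis, as in this evenly spread example, means a flatter distribution than the normal curve.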
58 More Exercises for Descriptives
- Use Descriptives to compute the mean for EDUC,
MAEDUC, PAEDUC
- Who has the most education: respondents or their
parents?
- Who has the most education: mothers or fathers?
59 Explore
- Click on Analyze/Descriptive Statistics/Explore
- Select EDUC and put it in the Dependent List
- In the Display box on the lower left, click on
Both
- Click on OK
60 Selecting Statistics for Explore
- Click on Analyze/Descriptive Statistics/Explore
- Click on Statistics and select the statistics
you want
- Click on Continue and then OK
61 Selecting Plots for Explore
- Click on Plots
- Select the plots you want
- Click on Continue and then OK
62 Exercises for Explore
- Use Explore to get the following statistics and
plots for the variables EDUC, PAEDUC, and MAEDUC
  - Descriptives
  - Outliers
  - Stem-and-leaf plot
  - Histogram
  - Boxplot
- First select Factor levels together and run it
- Then select Dependents together and run it
again
- What's the difference?
63 Intermediate Workshop for SPSS
- In the next workshop we'll look at different
types of statistical analysis you can do in SPSS
  - Cross tabulations (ch. 5)
  - Comparing means (ch. 6)
  - Correlation and regression (ch. 7)
  - Multivariate analysis (ch. 8)
    - Cross tabulations
    - Multiple regression
  - Presenting your data: charts and tables (ch. 9)