SAS Basics - PowerPoint PPT Presentation

Slides: 27
Provided by: sati81
Learn more at: http://www.afn.org
Tags: sas | basics | tot

Transcript and Presenter's Notes


1
SAS Basics
2
Windows
Program Editor: Write and edit all your statements
here.
3
Windows (continued)
Log: Watch this for any errors in the program as it
runs.
4
Windows (continued)
  • Output
  • Will automatically pop in front when there is
    output.
  • Does not need to occupy screen space during
    program editing.

5
File Organization
  • Create subfolders in your Project folder for:
  • Data: contains SAS datasets, with a .sd2 extension.
  • Formats: contains the compiled version of formats,
    a file with a .sc2 extension. Formats are used for
    building classes of variables for looking at
    frequencies.
  • Output: save output files here. These are text
    files with a .lst extension.
  • Programs: all programs are text files with a .sas
    extension.

6
Creating a dataset
  • Internal Data
  • DATA datasetname;
  • INPUT name $ sex $ age;
  • CARDS;
  • John M 23
  • Betty F 33
  • Joe M 50
  • ;
  • RUN;

7
Creating a dataset
  • External Data
  • DATA datasetname;
  • INFILE 'c:\folder\subfolder\file.txt';
  • INPUT name $ sex $ age;
  • RUN;

8
Creating from an existing one
  • DATA save.data2 (KEEP=age income);
  • SET save.data1;
  • RUN;
  • DATA save.data2;
  • SET save.data1;
  • DROP age;
  • TAX = income*0.28;
  • RUN;

9
Permanent Data Sets
  • LIBNAME save 'c:\project\data';
  • DATA save.data1;
  • X = 25;
  • Y = X*2;
  • RUN;
  • Note that save is merely a name you make up to
    point to a location where you wish to save the
    dataset called data1. (It will be saved as
    data1.sd2.)
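To illustrate that the libref is only a pointer, here is a minimal sketch of a later SAS session reaching the same dataset through a different made-up name. The libref proj is hypothetical; the folder is the one used above.

```sas
/* Hypothetical libref "proj" pointing at the same folder as "save" */
LIBNAME proj 'c:\project\data';

/* The dataset stored earlier as save.data1 is now reachable as proj.data1 */
PROC PRINT DATA=proj.data1;
RUN;
```

The dataset on disk is unchanged; only the name used to reach it differs between sessions.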

10
What's in my SAS dataset?
  • PROC CONTENTS DATA=save.data1;
  • RUN;
  • PROC CONTENTS DATA=save.data1 POSITION;
  • RUN;
  • With the POSITION option, the variable list is
    printed sorted alphabetically, followed by a
    second list sorted by position (the sequence in
    which the variables actually exist in the file).

11
Viewing file contents
  • PROC PRINT DATA=save.data1; RUN;
  • PROC PRINT DATA=save.data1 (OBS=5);
  • VAR name age;
  • RUN;
  • PROC PRINT DATA=save.data1 (OBS=12);
  • VAR age -- income;
  • RUN;

12
Frequencies/Crosstabs
  • PROC FREQ DATA=save.data1;
  • TABLES age income trades;
  • RUN;
  • PROC FREQ DATA=save.data1;
  • TABLES age*sex;
  • RUN;

13
Scatter Plot
  • PROC PLOT DATA=save.data1;
  • PLOT Y*X;
  • RUN;

14
Creating a Format Library
  • PROC FORMAT LIBRARY=LIBRARY;
  • VALUE BG
  • 0 = 'BAD'
  • 1 = 'GOOD'
  • -1 = 'MISSING';
  • VALUE TWO
  • -1 = 'MISSING'
  • -2 = 'NO RECORD'
  • -3 = 'INQS. ONLY'
  • -4 = 'PR ONLY'
  • 0 = '0' 1 = '1' 1<-HIGH = '2';
  • RUN;

15
Applying a format to a variable
  • PROC DATASETS LIBRARY=save;
  • MODIFY data1;
  • FORMAT trades ten.;
  • RUN;
  • QUIT;
  • This applies the format called ten to the
    variable trades. A subsequent PROC FREQ step
    for trades will show the format applied. Note
    that ten must already exist in the format library
    for this to work.
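The slides do not show how ten is defined, so the following is only a sketch of what creating it beforehand might look like. The folder path, the cutoffs, and the labels are all assumptions chosen for illustration.

```sas
/* Assumed location of the Formats subfolder; adjust to your project */
LIBNAME LIBRARY 'c:\project\formats';

PROC FORMAT LIBRARY=LIBRARY;
  /* Hypothetical grouping of trades: negatives, under ten, ten or more */
  VALUE ten
    LOW -< 0  = 'MISSING'
    0 -< 10   = 'UNDER 10'
    10 - HIGH = '10 OR MORE';
RUN;
```

Once a format like this is stored in the catalog, the PROC DATASETS step above can attach it to trades.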

16
Applying a format Method 2
  • DATA save.data2;
  • SET save.data1;
  • FORMAT
  • trades bktrds ten.
  • totbal mileage.;
  • RUN;
  • This is another way to apply formats when
    creating a new dataset (data2) from a previous
    one (data1) that has unformatted variables.

17
Random Selection of Obs.
  • DATA save.new;
  • SET save.old;
  • Random1 = RANUNI(254987)*100;
  • IF Random1 gt 50 THEN OUTPUT;
  • RUN;
  • QUIT;
  • The function RANUNI requires a seed number and
    produces random values between 0 and 1. Here they
    are multiplied by 100 and stored under the
    variable name Random1 (you can choose any name).
    The above program will create new.sd2, with about
    half the observations of old.sd2, randomly chosen.
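The same idea extends to other sampling fractions. As a sketch, assuming a sample of roughly 10% is wanted and using a hypothetical output name save.sample10, the random value can be compared directly against the fraction without storing it:

```sas
DATA save.sample10;            /* hypothetical output dataset name */
  SET save.old;
  /* Subsetting IF: keep an observation only when the draw is below 0.10 */
  IF RANUNI(254987) < 0.10;
RUN;
```

Because the selection is random, the resulting count will be close to, but usually not exactly, one tenth of the observations in old.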

18
Sorting and Merging Datasets
  • PROC SORT DATA=save.junk;
  • BY Age Income;
  • RUN;
  • PROC SORT DATA=save.junk OUT=save.neat;
  • BY acctnum;
  • RUN;
  • PROC SORT DATA=save.junk NODUPKEY;
  • BY something;
  • RUN;

19
Sorting and Merging Datasets
  • PROC SORT DATA=save.one;
  • BY Acctnum; RUN;
  • PROC SORT DATA=save.two;
  • BY Acctnum; RUN;
  • DATA save.three;
  • MERGE save.one save.two;
  • BY Acctnum;
  • RUN;

20
Sorting and Merging Datasets
  • DATA save.three;
  • MERGE save.one (IN=a) save.two;
  • BY Acctnum;
  • IF a;
  • RUN;
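The IN= dataset option creates a temporary flag that is 1 when the current observation has a contribution from that dataset, so the program above keeps every account found in save.one. As a sketch of a variation, assuming only accounts present in both files are wanted, a second flag (here called b, a name made up for this example) can be added:

```sas
DATA save.three;
  MERGE save.one (IN=a) save.two (IN=b);
  BY Acctnum;
  /* Keep an observation only when both datasets contributed to it */
  IF a AND b;
RUN;
```

Both inputs still need to be sorted by Acctnum first, as on the previous slide. The flags a and b are not written to the output dataset.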

21
Using Arrays
  • DATA save.new;
  • SET save.old;
  • ARRAY vitamin(6) a b c d e k;
  • DO i = 1 TO 6;
  • IF vitamin(i) = -5 THEN vitamin(i) = .;
  • END;
  • RUN;
  • This assumes you have 6 variables called a, b, c,
    d, e, and k in save.old. This program will
    modify all 6 such that any instance of a -5 value
    is converted to a missing value.
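A slightly more general version of the same loop is sketched below. It is a variation, not the original slide's code: it lets SAS count the array elements with DIM and drops the loop counter so that i is not stored in the new dataset.

```sas
DATA save.new;
  SET save.old;
  ARRAY vitamin(*) a b c d e k;      /* (*) lets SAS count the elements */
  DO i = 1 TO DIM(vitamin);          /* DIM returns 6 for this array */
    IF vitamin(i) = -5 THEN vitamin(i) = .;
  END;
  DROP i;                            /* keep the counter out of save.new */
RUN;
```

With this form, adding or removing a variable in the ARRAY statement does not require changing the DO statement.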

22
Simple Correlations
  • PROC CORR DATA=save.relative;
  • VAR tvhours study;
  • RUN;
  • PROC CORR DATA=save.relative;
  • VAR tvhours study;
  • WITH Score;
  • RUN;

23
Run Regression Analysis
  • Runs the regression and stores the estimates in a
    dataset called estfile.
  • Proc reg data=save.treg2 corr outest=estfile;
  • bgscore: model good =
  • trades01
  • trades02
  • ageavg01
  • ageavg02 / selection=none;
  • run;
  • Quit;

24
Score the data
  • Score the data in treg1 and save the output in
    save.scrdata.
  • Proc score data=save.treg1 score=estfile
    out=save.scrdata
  • type=parms;
  • var trades01
  • trades02
  • ageavg01
  • ageavg02;
  • Run;
  • Quit;

25
Format bgscore
  • Format the bgscore variable in the new
    save.scrdata file. Find or create a format from
    the format.sas file to apply to the bgscore
    variable.
  • Proc datasets library=save;
  • Modify scrdata;
  • Format bgscore insert_format_here.;
  • Run;
  • Quit;

26
Creating Dummy Variables
  • %MACRO DUMMY(VAR, FIRST, LAST, TOT);
  • IF (&FIRST <= &VAR <= &LAST) THEN &VAR.&TOT = 1;
  • ELSE &VAR.&TOT = 0;
  • LABEL &VAR.&TOT = "&VAR &FIRST - &LAST";
  • %MEND DUMMY;
  • data save.testreg2;
  • set save.testreg;
  • %Dummy(AGEOTD, 0, 78, 1)
  • %Dummy(AGEOTD, 96, 119, 2)
  • %Dummy(AGEOTD, 120, 143, 3)
  • %Dummy(AGEOTD, 144, 179, 4)
  • %Dummy(AGEOTD, 180, 99999999, 5)
  • Run;
  • Quit;
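One way to check the result is sketched below. It assumes the macro calls created five variables named AGEOTD1 through AGEOTD5 (the VAR and TOT arguments joined together) in save.testreg2.

```sas
/* Set before running the DATA step to see in the Log the statements
   that each macro call generates */
OPTIONS MPRINT;

/* Each dummy variable should take only the values 0 and 1 */
PROC FREQ DATA=save.testreg2;
  TABLES AGEOTD1-AGEOTD5;
RUN;
```

The frequency counts for the value 1 show how many observations fall in each AGEOTD range.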