Title: SAS Programming:File Merging and Manipulation
1SAS ProgrammingFile Merging and Manipulation
2Reading External Files (review)
data barf create
the dataset BARF infile s\mysas\Table7.1'
open the data file TABLE7.1 input tr p
a input the
variables run
run program
This SAS program read the data from Table 7.1,
The file Table7.1 was a text file
3Additional Data
- Suppose you collect or acquire additional data
- Need to add this data to existing SAS data set
- Need to combine the data sets
- Can do this many ways
4Methods of Combining Data
- Concatenation
- Match-Merge
5Week 1
Week 1
Week 53
Week 2
Week 2
Week 54
...
...
...
Week 52
Week 52
Week 78
Week 53
Week 54
...
In SAS, this is called Concatenation
Week 78
6Concatenation
- Depends on the nature of the records in each file
- Old and new have the same variables and format
- Old and new have the different variables or
format
7Simplest Case Same Record Layout
data barf1 create
new dataset(BARF1) infile s\mysas\Table7.1'
open the data file TABLE7.1 input tr
p a input the
variables run
run program data barf2
create new dataset(BARF2) infile
s\mysas\Table8.4' open the data
file TABLE8.4 input tr p a
input the variables run
run program data barf3
create dataset BARF3
set barf1 barf2
concatenate proc print
print BARF3 run
run program
81993 X1
1993 Y1
In SAS, this is called Match Merging
9Match Merging
- Uses a BY statement
- Observations are matched according to the
variable specified on the BY statement - Variable specified in BY statement must appear in
all data sets - Before merging, both data sets must be sorted on
the BY variable
10How to Match Merge
- Use DATA statement with MERGE option and BY
option - After Sorting
- DATA new-SAS-data-set
- MERGE old-SAS-data-set1 old-SAS-data-set2
- BY variable
- RUN
11Match Merge Example
- Suppose you have three SAS data files, each with
the following variables - exams.sd2 (SSN EXAM1 EXAM2 EXAM3)
- homework.sd2 (SSN HW1 HW2 HW3)
- paper.sd2 (SSN PAPGR)
- Each record contains observations for one student
enrolled in a class, ECON 100. - Variables are SSN (Social Security Number) ,
EXAMx (scores on 3 exams), HWx (grades on 3
homeworks), PAPGR (grade on paper). - Data files are in s\mysas
12Program for Example
- Merge three SAS data sets into one, Match Merge
on SSN
LIBREF DAT1 s\mysas PROC SORT
DATADAT1.EXAMS BY SSN PROC SORT
DATADAT1.HOMEWORK BY SSN PROC SORT
DATADAT1.PAPER BY SSN RUN DATA
DAT1.ECON100 MERGE DAT1.EXAMS DAT1.HOMEWORK
DAT1.PAPER BY SSN RUN