Title:

Processing Data Iteratively

Description:

Chapter 13: Processing Data Iteratively. Section 13.1: DO-Loop Processing. Objectives: Describe iterative DO loops. Use DO loops to generate data. Use DO loops to ...

Slides: 170
Provided by: KathyK91
Transcript and Presenter's Notes

Title: Processing Data Iteratively


1
Chapter 13
  • Processing Data Iteratively

2
Section 13.1
  • DO-Loop Processing

3
Objectives
  • Describe iterative DO loops.
  • Use DO loops to generate data.
  • Use DO loops to eliminate redundant code.
  • Use DO-loop processing to conditionally execute
    code.

4
DO-Loop Processing
  • Statements within a DO loop execute for a
    specific number of iterations or until a specific
    condition stops the loop.

DATA statement;
   <additional SAS statements>
   DO statement;
      iterated SAS statements
   END statement;
   <additional SAS statements>
RUN statement;
5
DO-Loop Processing
  • You can use DO loops to
  • perform repetitive calculations
  • generate data
  • eliminate redundant code
  • execute SAS code conditionally.

6
Repetitive Coding
  • Compare the interest for yearly versus quarterly
    compounding on a $50,000 investment made for one
    year at 7.5 percent interest.
  • How much money will a person accrue in each
    situation?

7
Repetitive Coding
data compound;
   Amount=50000;
   Rate=.075;
   Yearly=Amount*Rate;
   Quarterly+((Quarterly+Amount)*Rate/4);
   Quarterly+((Quarterly+Amount)*Rate/4);
   Quarterly+((Quarterly+Amount)*Rate/4);
   Quarterly+((Quarterly+Amount)*Rate/4);
run;

8
Repetitive Coding
proc print data=compound noobs;
run;

PROC PRINT Output
Amount     Rate    Yearly    Quarterly
 50000    0.075      3750      3856.79

What if you want to determine the quarterly
compounded interest after a period of 20 years
(80 quarters)?
9
DO Loop Processing
data compound(drop=i);
   Amount=50000;
   Rate=.075;
   Yearly=Amount*Rate;
   do i=1 to 4;
      Quarterly+((Quarterly+Amount)*Rate/4);
   end;
run;
10
The Iterative DO Statement
  • The iterative DO statement executes statements
    between DO and END statements repetitively, based
    on the value of an index variable.

DO index-variable=specification-1 <, ...specification-n>;
   <additional SAS statements>
END;
  • index-variable
  • names a variable whose value governs execution of
    the DO loop.
  • Index-variable is required.
  • Unless dropped, it is included in the data
    set that is being created.

specification-1 ... specification-n represents a
range of values or a list of specific values.
Each specification denotes an expression or a series
of expressions. The iterative DO statement
requires at least one specification argument.
11
The Iterative DO Statement
DO index-variable=start TO stop <BY increment>;
  • start specifies the initial value of the index
    variable.
  • stop specifies the ending value of the index
    variable.
  • increment optionally specifies a positive or
    negative number to control the incrementing of
    index-variable. If increment is not specified,
    the index variable is increased by 1.

12
The Iterative DO Statement
DO index-variable=start TO stop <BY increment>;
  • The values of start, stop, and increment
  • must be numbers or expressions that yield numbers
  • are established before executing the loop.
  • Any changes to the values of stop or increment
    made within the DO loop do not affect the number
    of iterations.
  • Avoid changing the value of the index variable
    within the DO loop to prevent an infinite loop.
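The point about stop and increment being fixed before the loop starts can be checked with a small sketch. The variable names Stop and i are illustrative only and do not come from the course data.

```sas
/* Sketch: changing Stop inside the loop does not change the     */
/* number of iterations, because SAS evaluates start, stop, and  */
/* increment before the first iteration.                         */
data _null_;
   Stop=3;
   do i=1 to Stop;
      Stop=10;                 /* no effect on the iteration count */
      put 'Iteration ' i;
   end;
   put 'After the loop: ' i= Stop=;
run;
```

The log should show three iterations, with i equal to 4 and Stop equal to 10 after the loop.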

13
The Iterative DO Statement
  • What are the values of each of the four index
    variables?

DO statement               Values of the index variable       Out of range
do i=1 to 12;              1 2 3 4 5 6 7 8 9 10 11 12         13
do j=2 to 10 by 2;         2 4 6 8 10                         12
do k=14 to 2 by -2;        14 12 10 8 6 4 2                   0
do m=3.6 to 3.8 by .05;    3.60 3.65 3.70 3.75 3.80           3.85
...
14
The Iterative DO Statement
DO index-variable=item-1 <, ...item-n>;
  • item-1 through item-n can be either all numeric
    or all character constants, or they can be
    variables.
  • Enclose character constants in quotation marks.
  • The DO loop is executed once for each value in
    the list.
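A minimal sketch of the item-list form, assuming nothing beyond the syntax above. The loop body only writes the current value to the log.

```sas
/* Sketch: the loop runs once for each item in the list. */
data _null_;
   do Month='JAN','FEB','MAR';
      put Month=;
   end;
run;
```

The log should contain one line for each of the three month values.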

15
The Iterative DO Statement
  • How many times will each DO loop execute?

do Month='JAN','FEB','MAR';         3 times
do Fib=1,2,3,5,8,13,21;             7 times
do i=Var1,Var2,Var3;                3 times
do j=BeginDate to Today() by 7;     Unknown. The number of iterations depends on
                                    the values of BeginDate and Today().
do k=Test1-Test50;                  One time. A single value of k is determined by
                                    subtracting Test50 from Test1.
...
16
DO Loop Logic
  • DO loops iterate within the normal looping
    process of the DATA step.

[Flowchart, built up over slides 16 to 18. Box labels: Initialize PDV; Execute READ statement; Execute program statements; Execute statements in the loop; INDEX = INDEX + increment; Execute additional program statements; Output observation to the SAS data set. Decision branches are labeled YES and NO.]
19
Performing Repetitive Calculations
  • On January 1 of each year, $5,000 is invested in
    an account. Determine the value of the account
    after three years based on a constant annual
    interest rate of 7.5 percent.

data invest;
   do Year=2001 to 2003;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;
20
Repetitive Calculations: Compilation and Execution (slides 20 to 36)
data invest;
   do Year=2001 to 2003;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

Initialize PDV to missing. Capital is the accumulator of a sum statement, so it starts at 0.

Year    Is Year out of range?    Capital+5000                    Capital+(Capital*.075)
2001    No                       0 + 5000 = 5000                 5000 + (5000 * .075) = 5375
2002    No                       5375 + 5000 = 10375             10375 + (10375 * .075) = 11153.13
2003    No                       11153.13 + 5000 = 16153.13      16153.13 + (16153.13 * .075) = 17364.61
2004    Yes                      (loop ends)
38
Performing Repetitive Calculations
proc print data=invest noobs;
run;

PROC PRINT Output
Year     Capital
2004    17364.61

Why is Year 2004? Year was incremented to
2004, which was then out of range. SAS reached the
automatic output at the end of the DATA step and
wrote Year=2004 to the data set.
39
Exercise
  • This exercise reinforces the concepts discussed
    previously.

40
Exercises
Leonardo Fibonacci, who was born in the 12th
century, studied a sequence of numbers with a
rule for determining the next number in the
sequence: 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

The Rule: Take a number and add it to the number in
front of it to get the next number.
0+1=1, 1+1=2, 1+2=3, 2+3=5

You want to know the 31st Fibonacci value.
Hint: 1 is both the first and second Fibonacci number.

Create three variables:
  • Number: the actual Fibonacci number
  • PriorNumber: the previous number in the sequence
    (initially set to 0)
  • Fib: the index value of your DO loop.
41
Exercises
Output
The SAS System

 Number
1346269
42
Exercises
data fibonacci;
   drop Fib PriorNumber;
   Number=1;
   PriorNumber=0;
   do Fib=1 to 30;
      Number=Number+PriorNumber;
      PriorNumber=Number-PriorNumber;
   end;
run;

proc print data=fibonacci noobs;
run;
43
Performing Repetitive Calculations
  • What if you want to see one row for each year?

data invest;
   do Year=2001 to 2003;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

proc print data=invest noobs;
run;

Year     Capital
2004    17364.61
44
Performing Repetitive Calculations
Generate a separate observation for each year.

data invest;
   do Year=2001 to 2003;
      Capital+5000;
      Capital+(Capital*.075);
      output;
   end;
run;

proc print data=invest noobs;
run;

The OUTPUT statement within the DO loop writes
one observation for each of the three iterations
of the DO loop.
45
Performing Repetitive Calculations
  • PROC PRINT Output

Year     Capital
2001     5375.00
2002    11153.13
2003    17364.61

Why is the value of Year not equal to 2004 in the
last observation?
46
Performing Repetitive Calculations
Generate a separate observation for each year.

data invest;
   do Year=2001 to 2003;
      Capital+5000;
      Capital+(Capital*.075);
      output;
   end;
run;

The OUTPUT statement within the DO loop writes
one observation for each of the three iterations
of the DO loop.
Because there is an OUTPUT statement, SAS does
not execute an automatic output at the RUN
statement.
47
Reducing Redundant Code
  • Recall the example that forecasts the growth of
    each division of an airline.
  • Partial Listing of prog2.growth

Division    NumEmps    Increase
APTOPS          205       0.075
FINACE          198       0.040
FLTOPS          187       0.080
48
A Forecasting Application (Review)
data forecast;
   set prog2.growth(rename=(NumEmps=NewTotal));
   Year=1;
   NewTotal=NewTotal*(1+Increase);
   output;
   Year=2;
   NewTotal=NewTotal*(1+Increase);
   output;
   Year=3;
   NewTotal=NewTotal*(1+Increase);
   output;
run;

What if you want to forecast growth over the next
30 years?
49
Reducing Redundant Code
  • Use a DO loop to eliminate the redundant code in
    the previous example.

data forecast;
   set prog2.growth(rename=(NumEmps=NewTotal));
   do Year=1 to 3;
      NewTotal=NewTotal*(1+Increase);
      output;
   end;
run;
50
Reducing Redundant Code
proc print data=forecast noobs;
run;

Partial PROC PRINT Output
Division    NewTotal    Increase    Year
APTOPS        220.38       0.075       1
APTOPS        236.90       0.075       2
APTOPS        254.67       0.075       3
FINACE        205.92       0.040       1
51
Exercise
  • This exercise reinforces the concepts discussed
    previously.

52
Exercises
Obs     Number
  1          1
  2          1
  3          2
  4          3
  5          5
  6          8
  7         13
  8         21
  9         34
. . .
 27     196418
 28     317811
 29     514229
 30     832040
 31    1346269

Modify the program from the previous Fibonacci
exercise. You want to include one row for each
Fibonacci number.
53
Exercises
data fibonacci;
   drop Fib PriorNumber;
   Number=1;
   PriorNumber=0;
   output;
   do Fib=1 to 30;
      Number=Number+PriorNumber;
      PriorNumber=Number-PriorNumber;
      output;
   end;
run;

proc print data=fibonacci;
run;
54
Conditional Iterative Processing
  • What if you want to forecast the number of years
    that it would take for the size of the Airport
    Operations Division to exceed 300 people?
  • You can use DO WHILE and DO UNTIL statements to
    stop the loop when a condition is met rather than
    when the index variable exceeds a specific value.
  • To avoid infinite loops, be sure that the
    condition specified will be met.

55
The DO WHILE Statement
  • The DO WHILE statement executes statements in a
    DO loop while a condition is true.
  • The expression is evaluated at the top of the
    loop.
  • The statements in the loop never execute if the
    expression is initially false.

DO WHILE (expression);
   <additional SAS statements>
END;

The parentheses around the expression are
required.
56
The DO UNTIL Statement
  • The DO UNTIL statement executes statements in a
    DO loop until a condition is true.
  • The expression is evaluated at the bottom of the
    loop.
  • The statements in the loop are executed at least
    once.

DO UNTIL (expression);
   <additional SAS statements>
END;

The parentheses around the expression are
required.
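The difference between top-of-loop and bottom-of-loop evaluation can be seen in a short sketch. The variable x and the threshold of 5 are illustrative assumptions, chosen so that both conditions are already decided before either loop starts.

```sas
/* Sketch: x starts at 10, so (x<5) is false and (x>=5) is true. */
data _null_;
   x=10;
   do while (x<5);        /* checked at the top: body never runs     */
      put 'WHILE body ran: ' x=;
      x+1;
   end;
   do until (x>=5);       /* checked at the bottom: body runs once   */
      put 'UNTIL body ran: ' x=;
      x+1;
   end;
run;
```

Only the UNTIL message should appear in the log, and it should appear once.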
57
Conditional Iterative Processing
  • Determine the number of years that it would take
    for an account to exceed $1,000,000 if $5,000
    were invested annually at 7.5 percent.

58
Conditional Iterative Processing
data invest;
   do until(Capital>1000000);
      Year+1;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

proc print data=invest noobs;
run;

59
Conditional Iterative Processing
  • PROC PRINT Output

   Capital    Year
1047355.91      38

How can you generate the same result with a DO
WHILE statement?
60
Conditional Iterative Processing
data invest;
   do while(Capital<=1000000);
      Year+1;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

proc print data=invest noobs;
run;

61
Conditional Iterative Processing
  • PROC PRINT Output

   Capital    Year
1047355.91      38
62
Exercise
  • This exercise reinforces the concepts discussed
    previously.

63
Exercises
Obs       Number
  1            1
  2            1
  3            2
  4            3
  5            5
  6            8
  7           13
  8           21
  9           34
. . .
 36     14930352
 37     24157817
 38     39088169
 39     63245986
 40    102334155

Modify the program from the previous Fibonacci
exercise. You want to have all the Fibonacci
numbers up to 100,000,000.
64
Exercises
data fibonacci;
   drop PriorNumber;
   Number=1;
   PriorNumber=0;
   output;
   do until (Number gt 100000000);
      Number=Number+PriorNumber;
      PriorNumber=Number-PriorNumber;
      output;
   end;
run;

proc print data=fibonacci;
run;
65
Exercises
If you do not want to output the row that exceeds
100,000,000, modify the program.

data fibonacci;
   drop PriorNumber;
   Number=1;
   PriorNumber=0;
   output;
   do until (Number gt 100000000);
      Number=Number+PriorNumber;
      PriorNumber=Number-PriorNumber;
      if Number lt 100000000 then output;
   end;
run;

proc print data=fibonacci;
run;
67
Conditional Iterative Processing
What is the result of this DO loop?

data invest;
   do until(Capital=1000000);
      Year+1;
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

proc print data=invest noobs;
run;

The result is an infinite loop because Capital
never equals exactly 1,000,000. Use > instead of =.
68
The Iterative DO Statement with a Conditional
Clause
  • You can combine DO WHILE and DO UNTIL statements
    with the iterative DO statement.

DO index-variable=start TO stop <BY increment>
   WHILE | UNTIL (expression);
   <additional SAS statements>
END;

This is one method of avoiding an infinite
loop in DO WHILE or DO UNTIL statements.
69
The Iterative DO Statement with a Conditional
Clause
  • In a DO WHILE statement, the conditional clause
    is checked after the index variable is
    incremented.
  • In a DO UNTIL statement, the conditional clause
    is checked before the index variable is
    incremented.
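A small sketch can make the timing of the two checks visible. The variables i, j, x, and y and the limit of 3 are illustrative assumptions; x and y are sum-statement accumulators, so they start at 0.

```sas
/* Sketch: compare the index value left behind by WHILE and UNTIL. */
data _null_;
   do i=1 to 10 while (x<3);    /* checked after i is incremented  */
      x+1;
   end;
   put 'WHILE clause: ' i= x=;

   do j=1 to 10 until (y>=3);   /* checked before j is incremented */
      y+1;
   end;
   put 'UNTIL clause: ' j= y=;
run;
```

Both loop bodies execute three times, but the log should show i=4 for the WHILE form and j=3 for the UNTIL form.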

70
The Iterative DO Statement with a Conditional
Clause
  • Determine the return of the account again.
  • Stop the loop if 25 years is reached or more than
    $250,000 is accumulated.

71
The Iterative DO Statement with a Conditional
Clause
data invest;
   do Year=1 to 25 until(Capital>250000);
      Capital+5000;
      Capital+(Capital*.075);
   end;
run;

proc print data=invest noobs;
run;

The loop will stop when Year=26 or when Capital
exceeds $250,000.
73
The Iterative DO Statement with a Conditional
Clause
  • PROC PRINT Output

Year      Capital
  21    255594.86

What stopped the DO loop?
The expression (Capital>250000)
74
Exercise
  • This exercise reinforces the concepts discussed
    previously.

75
Exercises
Obs       Number
  1            1
  2            1
  3            2
  4            3
  5            5
  6            8
  7           13
  8           21
  9           34
. . .
 41    165580141
 42    267914296
 43    433494437
 44    701408733

Modify the program from the previous Fibonacci
exercise. You want to have all the Fibonacci
numbers up to 1,000,000,000 or to stop after 44
numbers, whichever comes first.
76
Exercises
data fibonacci;
   drop Fib PriorNumber;
   Number=1;
   PriorNumber=0;
   output;
   do Fib=1 to 43 while (Number lt 1000000000);
      Number=Number+PriorNumber;
      PriorNumber=Number-PriorNumber;
      output;
   end;
run;

proc print data=fibonacci;
run;
77
Nested DO Loops
  • Nested DO loops are loops within loops.
  • When you nest DO loops,
  • use different index variables for each loop
  • be certain that each DO statement has a
    corresponding END statement.
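A minimal sketch of the two nesting rules, using hypothetical index variables Row and Col and a hypothetical output data set named grid.

```sas
/* Sketch: each loop has its own index variable and its own END. */
data grid;
   do Row=1 to 2;
      do Col=1 to 3;
         output;          /* 2 x 3 = 6 observations */
      end;                /* closes the Col loop    */
   end;                   /* closes the Row loop    */
run;
```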

78
Nested DO Loops
  • Create one observation per year for five years
    and show the earnings if you invest $5,000 per
    year with 7.5 percent annual interest compounded
    quarterly.

79
Nested DO Loops
data invest(drop=Quarter);
   do Year=1 to 5;
      Capital+5000;
      do Quarter=1 to 4;
         Capital+(Capital*(.075/4));
      end;
      output;
   end;
run;

proc print data=invest noobs;
run;

The Year loop executes 5x. The Quarter loop executes 4x for each year.
81
Nested DO Loops
  • PROC PRINT Output

Year     Capital
   1     5385.68
   2    11186.79
   3    17435.37
   4    24165.94
   5    31415.68

How can you generate one observation for each
quarterly amount?
Move the OUTPUT statement inside the Quarter
loop.
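One way the suggested change might look is sketched here. Keeping Quarter in the output data set (no DROP= option) is an assumption made so that each row can be identified.

```sas
/* Sketch: OUTPUT inside the Quarter loop writes 5 x 4 = 20 rows. */
data invest;
   do Year=1 to 5;
      Capital+5000;
      do Quarter=1 to 4;
         Capital+(Capital*(.075/4));
         output;
      end;
   end;
run;

proc print data=invest noobs;
run;
```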
82
Nested DO Loops
  • Compare the final results of investing $5,000 a
    year for five years in three different banks that
    compound quarterly. Assume that each bank has a
    fixed interest rate.
  • prog2.Banks

Name                            Rate
Calhoun Bank and Trust        0.0718
State Savings Bank            0.0721
National Savings and Trust    0.0728
83
Nested DO Loops
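data invest(drop=Quarter Year);
   set prog2.banks;
   Capital=0;
   do Year=1 to 5;
      Capital+5000;
      do Quarter=1 to 4;
         Capital+(Capital*(Rate/4));
      end;
   end;
run;

The DATA step iterates 3x (once per bank), the Year loop 5x, and the Quarter loop 4x.
This program is similar to the previous program. It adds the SET statement and the
Capital=0 reset, uses Rate in place of .075, and has no OUTPUT statement.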
...
87
Nested DO Loops
proc print data=invest noobs;
run;

  • PROC PRINT Output

Name                            Rate     Capital
Calhoun Bank and Trust        0.0718    31106.73
State Savings Bank            0.0721    31135.55
National Savings and Trust    0.0728    31202.91
88
Exercise
  • This exercise reinforces the concepts discussed
    previously.

89
Exercises
Several students are participating in one of
three organized trips with their classmates. The
Savings data set contains the student names, the
trip they want to take, and the deposit they made
for the trip. Students have three more months to
make payments before the final payment is due.
Each payment is 20 percent of the total cost of
the trip.

Costs
Ski trip           $700
Beach trip         $600
Washington D.C.    $900

Create a data set called SavingsPlan. The
data set will contain four rows for each student,
one row for each month. Create the following
variables:
  • TotalCost: the cost of the trip, depending on
    the student's choice
  • Month: numeric index variable for the DO loop
  • TotalPaid: initially the value of the deposit, but
    each month it is increased by 20 percent of the
    total cost. (Notice that TotalPaid should not be
    more than TotalCost.)
  • FinalPayment: the final payment due the fourth
    month

Create a listing report by student ID.
90
Exercises: Partial Output

--------------------------- StudentID=1005 ---------------------------
Name               TripChoice         Deposit   TotalCost   TotalPaid   Month   FinalPayment
Chaz Richardson    Beach                 $235        $600        $355       1              .
Chaz Richardson    Beach                 $235        $600        $475       2              .
Chaz Richardson    Beach                 $235        $600        $595       3              .
Chaz Richardson    Beach                 $235        $600        $600       4             $5

--------------------------- StudentID=1154 ---------------------------
Name               TripChoice         Deposit   TotalCost   TotalPaid   Month   FinalPayment
Barbara Muir       Washington D.C.        $35        $900        $215       1              .
Barbara Muir       Washington D.C.        $35        $900        $395       2              .
Barbara Muir       Washington D.C.        $35        $900        $575       3              .
Barbara Muir       Washington D.C.        $35        $900        $900       4           $325

--------------------------- StudentID=1155 ---------------------------
Name               TripChoice         Deposit   TotalCost   TotalPaid   Month   FinalPayment
Angel Reisman      Washington D.C.        $95        $900        $275       1              .
Angel Reisman      Washington D.C.        $95        $900        $455       2              .
Angel Reisman      Washington D.C.        $95        $900        $635       3              .
Angel Reisman      Washington D.C.        $95        $900        $900       4           $265
91
Exercises
data savingsplan;
   set prog2.savings;
   if tripchoice='Beach' then TotalCost=600;
   else if tripchoice='Washington D.C.' then TotalCost=900;
   else if tripchoice='Skiing' then TotalCost=700;
   TotalPaid=Deposit;
   do Month=1 to 3 while (TotalPaid lt TotalCost);
      TotalPaid=TotalPaid+(TotalCost*.2);
      if TotalPaid gt TotalCost then TotalPaid=TotalCost;
      output;
   end;
   FinalPayment=TotalCost-TotalPaid;
   TotalPaid=TotalCost;
   output;
   format Deposit TotalPaid TotalCost FinalPayment dollar5.;
run;

proc print data=savingsplan noobs;
   by studentid;
run;
92
Exercise Section 13.1
  • This exercise reinforces the concepts discussed
    previously.

93
Section 13.2
  • SAS Array Processing

94
Objectives
  • Describe the concepts of SAS arrays.
  • Use SAS arrays to perform repetitive
    calculations.

95
Performing Repetitive Calculations
  • Employees contribute an amount to charity every
    quarter. The SAS data set prog2.donate contains
    contribution data for each employee.
  • The employer supplements each contribution by 25
    percent.
  • Calculate each employee's quarterly contribution
    including the company supplement.
  • Partial Listing of prog2.donate

ID        Qtr1    Qtr2    Qtr3    Qtr4
E00224      12      33      22       .
E00367      35      48      40      30
96
Performing Repetitive Calculations
data charity;
   set prog2.donate;
   Qtr1=Qtr1*1.25;
   Qtr2=Qtr2*1.25;
   Qtr3=Qtr3*1.25;
   Qtr4=Qtr4*1.25;
run;

proc print data=charity noobs;
run;

97
Performing Repetitive Calculations
  • Partial PROC PRINT Output

ID         Qtr1     Qtr2      Qtr3      Qtr4
E00224    15.00    41.25     27.50         .
E00367    43.75    60.00     50.00     37.50
E00441        .    78.75    111.25    112.50
E00587    20.00    23.75     37.50     36.25
E00598     5.00    10.00      7.50      1.25
98
Performing Repetitive Calculations
  • What if you want to similarly modify 52 weeks of
    data stored in Week1 through Week52?

data charity;
   set prog2.donate;
   Qtr1=Qtr1*1.25;
   Qtr2=Qtr2*1.25;
   Qtr3=Qtr3*1.25;
   Qtr4=Qtr4*1.25;
run;

proc print data=charity noobs;
run;

A DO loop should work, but you are using a
different variable for each calculation.
99
Performing Repetitive Calculations
  • What if you want to similarly modify 52 weeks of
    data stored in Week1 through Week52?

data charity;
   set prog2.donate;
   do i=1 to 52;
      <Var>=<Var>*1.25;
   end;
run;

proc print data=charity;
run;

You can substitute the variable name using a SAS
array and write the DO loop.
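As a sketch of where this is heading, the 52-week case might be written as follows. The data set work.weekly and its contents are hypothetical and are created here only so that the example runs on its own. The ARRAY statement is described in the slides that follow.

```sas
/* Hypothetical input: one observation with Week1-Week52 = 1..52. */
data work.weekly(drop=i);
   array Week{52};                 /* creates Week1-Week52 */
   do i=1 to 52;
      Week{i}=i;
   end;
run;

/* Sketch: one array reference replaces 52 assignment statements. */
data work.weekly_adj(drop=i);
   set work.weekly;
   array Wk{52} Week1-Week52;
   do i=1 to 52;
      Wk{i}=Wk{i}*1.25;
   end;
run;
```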
100
Array Processing
  • You can use arrays to simplify programs that
  • perform repetitive calculations
  • create many variables with the same attributes
  • read data
  • compare variables
  • perform a table lookup
  • rotate SAS data sets by making variables into
    observations or observations into variables.

101
What Is a SAS Array?
  • A SAS array is a group of variables whose order
    is important.
  • It is not a data structure, but a name given to a
    group of variables.
  • A SAS array
  • is a temporary grouping of SAS variables that are
    arranged in a particular order
  • is identified by an array name
  • exists only for the duration of the current DATA
    step
  • is not a variable.

102
What Is a SAS Array?
  • SAS arrays are different from arrays in many
    other programming languages.
  • In SAS, an array is not a data structure. It is
    simply a convenient way of temporarily
    identifying a group of variables under one name.

103
What Is a SAS Array?
  • Each value in an array is
  • called an element
  • identified by a subscript that represents the
    position of the element in the array.
  • When you use an array reference, the
    corresponding value is substituted for the
    reference.

104
What Is a SAS Array?




...
105
The ARRAY Statement
  • The ARRAY statement defines the elements in an
    array.
  • These elements will be processed as a group.
  • You refer to elements of the array by the array
    name and subscript.

ARRAY array-name {subscript} <$> <length>
      <array-elements> <(initial-value-list)>;
106
The ARRAY Statement
array-name specifies the name of the array.
{subscript} describes the number and arrangement of elements in the array by using an asterisk, a number, or a range of numbers. Subscript is enclosed in braces ({ }), brackets ([ ]), or parentheses (( )). Subscript often has the form {dimension-size(s)}. {dimension-size(s)} indicates either the number of elements in a one-dimensional array or the number of elements in each dimension of a multidimensional array.
107
The ARRAY Statement
$ indicates that the elements in the array are character elements. The dollar sign is not necessary if the elements in the array were previously defined as character elements.
length specifies the length of elements in the array that were not previously assigned a length.
array-elements names the elements that make up the array. Array elements can be listed in any order.
(initial-value-list) gives initial values for the corresponding elements in the array. The values for elements can be numbers or character strings. You must enclose all character strings in quotation marks.
108
The ARRAY Statement
  • The ARRAY statement
  • must contain all numeric or all character
    elements. It is a group of variables of one type.
  • must be used to define an array before the array
    name can be referenced.
  • creates variables if they do not already exist in
    the PDV.
  • is a compile-time statement.
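A short sketch of two of these points: an ARRAY statement creating variables that do not yet exist, and an array holding only one type. The names Score and Code and the initial values are illustrative assumptions.

```sas
/* Sketch: neither array lists existing variables, so SAS creates */
/* Score1-Score3 (numeric) and Code1-Code2 (character, length 4). */
data _null_;
   array Score{3};
   array Code{2} $ 4 ('AB','CD');
   Score{2}=90;
   put Score1= Score2= Score3= Code1= Code2=;
run;
```

The log should show Score1 and Score3 as missing, Score2 as 90, and the two character values from the initial value list.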

109
Defining an Array
  • Write an ARRAY statement that defines the four
    quarterly contribution variables as elements of
    an array.

array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;

[Diagram: PDV with variables ID, Qtr1, Qtr2, Qtr3, and Qtr4]
...
110
Defining an Array
  • Write an ARRAY statement that defines the four
    quarterly contribution variables as elements of
    an array.

array Contrib{4} Qtr1 - Qtr4;

[Diagram: PDV with variables ID, Qtr1, Qtr2, Qtr3, and Qtr4]
...
111
Defining an Array
  • Variables that are elements of an array need not
    have similar, related, or numbered names.

array Contrib2{4} Q1 Qrtr2 ThrdQ Qtr4;

[Diagram: PDV with variables ID, Q1, Qrtr2, ThrdQ, and Qtr4]
...
112
Defining an Array
  • Variables that are elements of an array need not
    have similar, related, or numbered names.

array Contrib2{4} Q1 -- Qtr4;

[Diagram: PDV with variables ID, Q1, Qrtr2, ThrdQ, and Qtr4]
113
Processing an Array
  • Array processing often occurs within DO loops. An
    iterative DO loop that processes an array has the
    following form:

DO index-variable=1 TO number-of-elements-in-array;
   additional SAS statements using
   array-name{index-variable}...
END;

  • To execute the loop as many times as there are
    elements in the array, specify that the values of
    index-variable range from 1 to
    number-of-elements-in-array.
114
Processing an Array
  • You must tell SAS which variable in the array to
    use in each iteration of the loop.
  • You can write programming statements so that the
    index variable of the DO loop is the subscript of
    the array reference (for example,
    array-name{index-variable}).
  • When the value of the index variable changes, the
    subscript of the array reference (and therefore
    the variable that is referenced) also changes.

115
Processing an Array
  • To process particular elements of an array,
    specify those elements as the range of the
    iterative DO statement.
  • By default, SAS includes index-variable in the
    output data set.
  • Use a DROP statement or the DROP data set option
    to prevent the index-variable from being written
    to your output data set.
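A sketch of processing only particular elements, assuming the prog2.donate data set used throughout this section. Restricting the supplement to the last two quarters and the output name charity_h2 are illustrative choices, not part of the course example.

```sas
/* Sketch: only Contrib{3} and Contrib{4} (Qtr3 and Qtr4) change. */
data charity_h2(drop=Qtr);           /* DROP= removes the index variable */
   set prog2.donate;
   array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;
   do Qtr=3 to 4;
      Contrib{Qtr}=Contrib{Qtr}*1.25;
   end;
run;
```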

116
Processing an Array
array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;
do Qtr=1 to 4;
   Contrib{Qtr}=Contrib{Qtr}*1.25;
end;
118
Performing Repetitive Calculations (slides 118 to 122)
data charity(drop=Qtr);
   set prog2.donate;
   array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;
   do Qtr=1 to 4;
      Contrib{Qtr}=Contrib{Qtr}*1.25;
   end;
run;

When     Array reference                  Equivalent statement
Qtr=1    Contrib{1}=Contrib{1}*1.25;      Qtr1=Qtr1*1.25;
Qtr=2    Contrib{2}=Contrib{2}*1.25;      Qtr2=Qtr2*1.25;
Qtr=3    Contrib{3}=Contrib{3}*1.25;      Qtr3=Qtr3*1.25;
Qtr=4    Contrib{4}=Contrib{4}*1.25;      Qtr4=Qtr4*1.25;
123
Performing Repetitive Calculations
proc print data=charity noobs;
run;

  • Partial PROC PRINT Output

ID         Qtr1     Qtr2      Qtr3      Qtr4
E00224    15.00    41.25     27.50         .
E00367    43.75    60.00     50.00     37.50
E00441        .    78.75    111.25    112.50
E00587    20.00    23.75     37.50     36.25
E00598     5.00    10.00      7.50      1.25
124
Exercise Section 13.2
  • This exercise reinforces the concepts discussed
    previously.
  • Hint Use variable lists when creating your array.

125
Section 13.3
  • Using SAS Arrays

126
Objectives
  • Use SAS arrays to create new variables.
  • Use SAS arrays to perform a table lookup.
  • Use SAS arrays to rotate a SAS data set.

128
Creating Variables with Arrays
  • Calculate the percentage that each quarter's
    contribution represents of the employee's total
    annual contribution.
  • Base the percentage only on the employee's actual
    contribution and ignore the company
    contributions.
  • Partial Listing of prog2.donate

ID        Qtr1    Qtr2    Qtr3    Qtr4
E00224      12      33      22       .
E00367      35      48      40      30
129
Creating Variables with Arrays
data percent(drop=Qtr);
   set prog2.donate;
   Total=sum(of Qtr1-Qtr4);
   array Contrib{4} Qtr1-Qtr4;
   array Percent{4};
   do Qtr=1 to 4;
      Percent{Qtr}=Contrib{Qtr}/Total;
   end;
run;

The second ARRAY statement creates four numeric
variables: Percent1, Percent2, Percent3, and
Percent4.
130
Creating Variables with Arrays
proc print data=percent noobs;
   var ID Percent1-Percent4;
   format Percent1-Percent4 percent6.;
run;

Partial PROC PRINT Output
ID        Percent1    Percent2    Percent3    Percent4
E00224       18%         49%         33%          .
E00367       23%         31%         26%         20%
E00441         .         26%         37%         37%
E00587       17%         20%         32%         31%
E00598       21%         42%         32%          5%
131
Creating Variables with Arrays
  • The PERCENTw.d format
  • multiplies values by 100
  • formats them with the BESTw.d format
  • adds a percent sign (%) to the end of the
    formatted value
  • encloses negative values in parentheses
  • allows room for a percent sign and parentheses,
    even if the value is not negative.
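A quick way to see these rules is to write a positive and a negative proportion to the log with the same format. The values below are illustrative only.

```sas
/* Sketch: PERCENT6. applied to a positive and a negative value. */
data _null_;
   Up=0.1789;
   Down=-0.05;
   put Up percent6.;
   put Down percent6.;
run;
```

The first value should appear as a whole percentage with a percent sign, and the second should appear enclosed in parentheses.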

132
Creating Variables with Arrays
  • Calculate the difference in each employee's
    actual contribution from one quarter to the next.

Partial Listing of prog2.donate
ID        Qtr1    Qtr2    Qtr3    Qtr4
E00224      12      33      22       .
E00367      35      48      40      30

[Diagram: the first difference is between Qtr1 and Qtr2, the second between Qtr2 and Qtr3, and the third between Qtr3 and Qtr4.]
133
Creating Variables with Arrays
data change(dropi) set
prog2.donate array
Contrib4 Qtr1-Qtr4 array Diff3
do i1 to 3
DiffiContribi1-Contribi
end run
...
134
Creating Variables with Arrays
When i=1, the assignment in the loop resolves to
   Diff{1}=Contrib{2}-Contrib{1};
which is equivalent to
   Diff1=Qtr2-Qtr1;
135
Creating Variables with Arrays
When i=2, the assignment in the loop resolves to
   Diff{2}=Contrib{3}-Contrib{2};
which is equivalent to
   Diff2=Qtr3-Qtr2;
136
Creating Variables with Arrays
When i=3, the assignment in the loop resolves to
   Diff{3}=Contrib{4}-Contrib{3};
which is equivalent to
   Diff3=Qtr4-Qtr3;
137
Creating Variables with Arrays
proc print data=change noobs;
   var ID Diff1-Diff3;
run;

Partial PROC PRINT Output

ID      Diff1  Diff2  Diff3
E00224     21    -11      .
E00367     13     -8    -10
E00441      .     26      1
E00587      3     11     -1
E00598      4     -2     -5
138
Assigning Initial Values
  • Determine the difference between employee
    contributions and last year's average quarterly
    goals of 10, 15, 5, and 10 per employee.

data compare(drop=i Goal1-Goal4);
   set prog2.donate;
   array Contrib{4} Qtr1-Qtr4;
   array Diff{4};
   array Goal{4} Goal1-Goal4 (10,15,5,10);
   do i=1 to 4;
      Diff{i}=Contrib{i}-Goal{i};
   end;
run;
139
Assigning Initial Values
proc print data=compare noobs;
   var ID Diff1 Diff2 Diff3 Diff4;
run;

Partial PROC PRINT Output

ID      Diff1  Diff2  Diff3  Diff4
E00224      2     18     17      .
E00367     25     33     35     20
E00441      .     48     84     80
E00587      6      4     25     19
E00598     -6     -7      1     -9
140
Assigning Initial Values
  • Elements and values are matched by position.
  • If there are more array elements than initial
    values, the remaining array elements are assigned
    missing values and SAS issues a warning.
  • You can separate the values in the initial value
    list with either a comma or a blank space.
  • Initial values are retained until a new value is
    assigned to the array element.
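
The rules above can be sketched as follows. This is an illustration only: the Bonus array and its elements are hypothetical names, and the step reads no input data. The first array separates its initial values with blanks, and the second supplies fewer values than elements.

```sas
data _null_;
   /* Blanks instead of commas between the initial values */
   array Goal{4} Goal1-Goal4 (10 15 5 10);
   /* Hypothetical array with only two initial values:       */
   /* Bonus3 and Bonus4 start as missing and SAS writes a    */
   /* warning to the log.                                    */
   array Bonus{4} Bonus1-Bonus4 (10,15);
   put Goal1= Goal2= Goal3= Goal4=;
   put Bonus1= Bonus2= Bonus3= Bonus4=;
run;
```

The log should list the four Goal values matched by position, followed by Bonus1 and Bonus2 with their initial values and Bonus3 and Bonus4 as missing.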

141
Compile
data compare(drop=Qtr Goal1-Goal4);
   set prog2.donate;
   array Contrib{4} Qtr1-Qtr4;
   array Diff{4};
   array Goal{4} Goal1-Goal4 (10,15,5,10);
   do Qtr=1 to 4;
      Diff{Qtr}=Contrib{Qtr}-Goal{Qtr};
   end;
run;

Partial Listing of prog2.donate

ID      Qtr1  Qtr2  Qtr3  Qtr4
E00224    12    33    22     .
E00367    35    48    40    30

PDV
143
PDV
ID  QTR1  QTR2  QTR3  QTR4
144
PDV
DIFF3  DIFF4
145
PDV
DIFF3  DIFF4  GOAL1  GOAL2  GOAL3  GOAL4
146
The elements in the Goal array (Goal1, Goal2,
Goal3, and Goal4) are created in the PDV and are
used to calculate the values of Diff1, Diff2,
Diff3, and Diff4. They are subsequently
excluded from the output data set compare by
the DROP= data set option.

PDV
DIFF3  DIFF4  GOAL1  GOAL2  GOAL3  GOAL4  QTR
148
Performing a Table Lookup
  • You can use the keyword _TEMPORARY_ instead of
    specifying variable names when you create an
    array to define temporary array elements.

data compare(drop=Qtr);
   set prog2.donate;
   array Contrib{4} Qtr1-Qtr4;
   array Diff{4};
   array Goal{4} _temporary_ (10,15,5,10);
   do Qtr=1 to 4;
      Diff{Qtr}=Contrib{Qtr}-Goal{Qtr};
   end;
run;
149
Performing a Table Lookup
  • Arrays of temporary elements are useful when the
    only purpose for creating an array is to perform
    a calculation.
  • To preserve the result of the calculation, assign
    it to a variable.
  • Temporary data elements do not appear in the
    output data set.
  • Temporary data element values are always
    automatically retained.
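
A small sketch of these points, assuming the same four goal values used earlier in this section. The variable GoalForQtr is a hypothetical name introduced only to show how a looked-up value is preserved by assigning it to a variable.

```sas
data _null_;
   /* Lookup table held in temporary elements: no Goal1-Goal4 */
   /* variables are created in the PDV or the output data set */
   array Goal{4} _temporary_ (10,15,5,10);
   do Qtr=1 to 4;
      /* Assign the looked-up value to a variable to keep it */
      GoalForQtr=Goal{Qtr};
      put Qtr= GoalForQtr=;
   end;
run;
```

Temporary elements cannot be referenced by name, only through the array, which is one reason to copy a value into a variable when it is needed in the output.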

150
Performing a Table Lookup
proc print data=compare noobs;
   var ID Diff1 Diff2 Diff3 Diff4;
run;

Partial PROC PRINT Output

ID      Diff1  Diff2  Diff3  Diff4
E00224      2     18     17      .
E00367     25     33     35     20
E00441      .     48     84     80
E00587      6      4     25     19
E00598     -6     -7      1     -9
152
Rotating a SAS Data Set
  • You can use array processing to rotate, or
    transpose, a SAS data set. When a data set is
    rotated, the values of an observation in the
    input data set become values of a variable in the
    output data set.
  • The TRANSPOSE procedure is also used to create an
    output data set by restructuring the values in a
    SAS data set, transposing selected variables into
    observations.

ID      Qtr1  Qtr2  Qtr3  Qtr4
E00224    12    33    22     .
E00367    35    48    40    30
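
For comparison, the TRANSPOSE procedure mentioned above could be applied roughly as follows. This is a sketch under stated assumptions: the data set names donate and rotate_t and the column name Quarter are hypothetical, and the two input rows are copied from the partial listing instead of reading prog2.donate.

```sas
/* Two rows copied from the partial listing of prog2.donate */
data donate;
   input ID $ Qtr1-Qtr4;
   datalines;
E00224 12 33 22 .
E00367 35 48 40 30
;
run;

/* BY-group processing expects the data sorted by ID */
proc sort data=donate;
   by ID;
run;

/* One output row per ID and quarter; COL1 is renamed to Amount */
proc transpose data=donate
               out=rotate_t(rename=(col1=Amount))
               name=Quarter;
   by ID;
   var Qtr1-Qtr4;
run;

proc print data=rotate_t noobs;
run;
```

Unlike the array approach, this version stores the source variable name (Qtr1 through Qtr4) as a character value in Quarter, not a numeric quarter number.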
153
Rotating a SAS Data Set
Partial Listing of rotate

ID      Qtr  Amount
E00224    1      12
E00224    2      33
E00224    3      22
E00224    4       .
E00367    1      35
E00367    2      48
E00367    3      40
E00367    4      30
154
Rotating a SAS Data Set
data rotate(drop=Qtr1-Qtr4);
   set prog2.donate;
   array Contrib{4} Qtr1-Qtr4;
   do Qtr=1 to 4;
      Amount=Contrib{Qtr};
      output;
   end;
run;
155
Execute
data rotate(drop=Qtr1-Qtr4);
   set prog2.donate;
   array Contrib{4} Qtr1-Qtr4;
   do Qtr=1 to 4;
      Amount=Contrib{Qtr};
      output;
   end;
run;

PDV
156
Initialize PDV to missing.

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
           .     .     .     .    .       .
157
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    .       .
158
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    .       .
159
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    1       .
160
Amount=Contrib{1};
Amount changes from missing (.) to 12.

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    1      12
161
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    1      12
162
Amount=Contrib{2};
Amount changes from 12 to 33.

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    2      33
163
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    2      33
164
Amount=Contrib{3};
Amount changes from 33 to 22.

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    3      22
165
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    3      22
166
Amount=Contrib{4};
Amount changes from 22 to missing (.).

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    4       .
167
PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    4       .
168
Implicit return. Continue processing
observations from prog2.donate.

PDV
ID      QTR1  QTR2  QTR3  QTR4  QTR  AMOUNT
E00224    12    33    22     .    5       .
169
Rotating a SAS Data Set
proc print data=rotate noobs;
run;

Partial PROC PRINT Output

ID      Qtr  Amount
E00224    1      12
E00224    2      33
E00224    3      22
E00224    4       .
E00367    1      35
E00367    2      48
E00367    3      40
E00367    4      30