Title: Multiprocessing with SAS Software Now
1. Multiprocessing with SAS Software Now
Bill Fehlner, Kathleen Wong, Kifah Mansour
SAS Toronto
2. Multiprocessing
- Purpose: complete a job sooner
- Benchmarks: it really works!
- Hidden sources of computing power
- MP CONNECT functionality
- Examples: how hard is it?
- How to learn more
3. Complete a job sooner
- The job consumes the same total amount of resources, but
it finishes sooner.
4. The Purpose of Multiprocessing
- Complete a job in less total elapsed time.
- Increase usage of available CPUs.
5. The Practice of Multiprocessing
- In years past, life was simple. Each computer had
one central processing unit (CPU) and was not
connected to another computer.
- SAS was created as a single-threaded application,
which means the program executes in a top-down
approach on the one processor.
6. The Practice of Multiprocessing
- Today a computer can have multiple processors, or
be part of a network.
- Version 8 and previous versions of SAS software
are still single-threaded applications, but they now
have the ability, through MP CONNECT, to take
advantage of additional processors for different
steps.
7. The Practice of Multiprocessing
- Divide the application into sub-units of work that can run at the same time.
FREQ Procedure
TABULATE Procedure
UNIVARIATE Procedure
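A minimal sketch of how three independent procedures like these might be run side by side with MP CONNECT on one machine. The launch command in SASCMD= and the use of SASHELP.CLASS as input are assumptions made for illustration; they are not taken from the slides.

```sas
/* Assumption: "sas" starts a SAS session on this machine.       */
/* Replace it with the launch command used at your site.         */
options sascmd="sas -nosyntaxcheck" autosignon=yes;

/* Each RSUBMIT with WAIT=NO returns control immediately,        */
/* so the three procedures run in separate sessions at once.     */
rsubmit task1 wait=no;
  proc freq data=sashelp.class;
    tables sex;
  run;
endrsubmit;

rsubmit task2 wait=no;
  proc tabulate data=sashelp.class;
    class sex;
    var height;
    table sex, height*mean;
  run;
endrsubmit;

rsubmit task3 wait=no;
  proc univariate data=sashelp.class;
    var weight;
  run;
endrsubmit;

/* Block until all three tasks have finished. */
waitfor _all_ task1 task2 task3;

/* Retrieve the spooled log and output of each task. */
rget task1;
rget task2;
rget task3;

signoff task1;
signoff task2;
signoff task3;
```

Because no task reads another task's results, the order in which they finish does not matter; only the final WAITFOR synchronizes them.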
8. What if not every task is independent?
Extract Oracle Data
Merge Data
Read/Summarize SAS Data Set
Read/Summarize Raw Data File
[Figure: the four tasks above plotted against elapsed time, starting at 0]
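One way a dependency of this kind could be expressed is to run the independent summaries asynchronously and start the merge only after WAITFOR reports that they are done. The sketch below assumes a directory that every session can reach (the path /tmp/mpdemo is hypothetical and must already exist) and uses SASHELP.CLASS in place of the Oracle, SAS, and raw-file inputs named on the slide.

```sas
/* Assumption: "sas" starts a SAS session on this machine. */
options sascmd="sas -nosyntaxcheck" autosignon=yes;

/* Hypothetical shared location; each session has its own WORK  */
/* library, so results are exchanged through a permanent one.   */
libname shared "/tmp/mpdemo";

rsubmit sum_a wait=no;
  libname shared "/tmp/mpdemo";
  proc summary data=sashelp.class nway;
    class sex;
    var height;
    output out=shared.sum_height(drop=_type_ _freq_) mean=avg_height;
  run;
endrsubmit;

rsubmit sum_b wait=no;
  libname shared "/tmp/mpdemo";
  proc summary data=sashelp.class nway;
    class sex;
    var weight;
    output out=shared.sum_weight(drop=_type_ _freq_) mean=avg_weight;
  run;
endrsubmit;

/* The merge depends on both summaries, so wait for both. */
waitfor _all_ sum_a sum_b;

/* Dependent step: runs in the parent session after the wait.   */
/* PROC SUMMARY output is ordered by the CLASS variable, so the  */
/* BY-merge needs no extra sort here.                            */
data shared.merged;
  merge shared.sum_height shared.sum_weight;
  by sex;
run;

signoff sum_a;
signoff sum_b;
```

The elapsed time is then roughly that of the longest independent task plus the merge, rather than the sum of all steps.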
9. Benchmarks
- SUN Solaris
- IBM RS/6000
- HP 9000
10. Benchmark: SUN Solaris
- Solaris 7 with twelve 400 MHz UltraSPARC
processors
- 14 SAS steps operating on one table, including
SORT, SUMMARY, FREQ
- 480 MB per data set
11. Benchmark: SUN Solaris
12. Benchmark: HP 9000
- Six 240 MHz PA8200 processors
- Transform 6 data sets, interleave the data,
generate reports and write a text file.
- 650 MB per data set.
13. Benchmark: HP 9000
14. Benchmark: RS/6000
- Four 332 MHz processors
- Score 3 data sets, generate statistics, and
output three text files containing scores.
- 900 MB per data set.
15. Benchmark: RS/6000
16. Hidden Sources of Computing Power
- Underutilized servers
- Workstations after hours
17. Hidden Sources of Computing Power
18. Hidden Sources of Computing Power
19. MP CONNECT functionality
- Start and finish asynchronous sessions
- Communicate between sessions
- Process return codes
- Implement program dependencies
- Manage the SAS sessions
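The functions listed above might look like the following in practice. This is a sketch only: the launch command in SASCMD=, the task name, the empty DATA step, and the macro variable name rc1 are all chosen for illustration.

```sas
/* Assumption: "sas" starts a SAS session on this machine. */
options sascmd="sas -nosyntaxcheck" autosignon=yes;

/* Start an asynchronous session. SYSRPUTSYNC=YES makes macro    */
/* variables set by %SYSRPUT available in the parent session as  */
/* soon as the %SYSRPUT statement executes.                       */
rsubmit task1 wait=no sysrputsync=yes;
  data _null_;
  run;
  /* Communicate between sessions: send the step's return code   */
  /* back to the parent as macro variable rc1 (hypothetical name). */
  %sysrput rc1=&syserr;
endrsubmit;

/* Manage the sessions: list every task and its current state. */
listtask _all_;

/* Implement a dependency: do not continue until task1 is done. */
waitfor task1;

/* Process the return code in the parent session. */
%put NOTE: task1 finished with SYSERR=&rc1;

/* Finish the asynchronous session. */
signoff task1;
```

A parent program could test rc1 in macro logic before submitting the steps that depend on task1.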
20. SAS/CONNECT Monitor Window
The list of tasks is dynamically updated as
new tasks start, and the Status field changes
from Running to Complete, as appropriate.
21. NOTIFY=YES Option
- Displays a message window that indicates the
completion of a task.
22. Examples
- Scaling up: concurrent sessions on one machine
- Scaling out: concurrent sessions on distributed
processors
23. Scaling up: one machine
- Statements required
- What's new in MP CONNECT
- Options . . .
  - Autosignon
  - Sascmd
- Rsubmit taskname . . .
  - Wait=no
  - Persist
- Waitfor . . . .
- Signoff taskname
24. Scaling up: one machine
[Figure: tasks plotted against elapsed time, starting at 0. Tasks shown: Task 1 (convert and sort a text file), Task 2 (convert and sort a text file), Interleave Data.]
25. Scaling up: one machine
options sascmd='/bin/SAS/V8/sas.exe -nosyntaxcheck'
        autosignon=yes;
rsubmit task1 wait=no sysrputsync=yes;
  data sample1;
    length type1 $12 region $13;
    . . .
26. Scaling out: distributed processing
- Statements required
- Filename rlink . . . (3 different remote systems)
- %let taskname . . .
- Signon taskname
- Rsubmit taskname . . .
  - Wait=no
- Waitfor . . . .
- Signoff taskname
27. Asynchronous Execution on Multiple Machines
[Figure: diagram with two machines, each labeled "Server (Remote)"]
28. Scaling out: distributed processing
filename rlink '!sasroot\connect\saslink\tcpunix.scr';
%let task1=bcom1;
signon task1;   /* unix host */
rsubmit task1 wait=no;
  data sample1;
    length type1 $12 region $13;
    . . .
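The slide shows only the opening statements. A fuller sketch of the same pattern across two remote hosts could look like the following. The second host name (hostb.example.com), the COMAMID= setting, and the PROC MEANS steps are assumptions added for illustration; the sign-on script normally prompts for a user ID and password on each host.

```sas
/* Assumption: TCP/IP is the communications access method. */
options comamid=tcp;

/* Sign-on script, as on the slide. */
filename rlink '!sasroot\connect\saslink\tcpunix.scr';

/* Each macro variable names the host for one remote session. */
%let task1=bcom1;               /* host from the slide        */
%let task2=hostb.example.com;   /* hypothetical second host   */

signon task1;
signon task2;

/* WAIT=NO lets both hosts work at the same time. */
rsubmit task1 wait=no;
  proc means data=sashelp.class;
    var height;
  run;
endrsubmit;

rsubmit task2 wait=no;
  proc means data=sashelp.class;
    var weight;
  run;
endrsubmit;

/* Synchronize, then pull back each session's log and output. */
waitfor _all_ task1 task2;
rget task1;
rget task2;

signoff task1;
signoff task2;
```

Apart from the FILENAME, %LET, and SIGNON statements, the program has the same shape as the single-machine case.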
29. How to learn more
- SUGI articles: www.sas.com/usergroups/sugi/proceedings/
  - Cheryl Garner, SUGI 25, paper 16
  - John E. Bentley, SUGI 26, paper 269
  - John E. Bentley, SUGI 27, paper 107
30. How to learn more
- SAS courses: www.sas.com/service/edu/cantrain/
  - Multiprocessing with SAS Software
  - SAS Macro Language
  - Optimizing SAS Programs
31. Questions?