Title: An%20Engineering%20Approach%20to%20Performance
1An Engineering Approach to Performance
- Define Performance Metrics
- Measure Performance
- Analyze Results
- Develop Cost x Performance Alternatives
- prototype
- modeling
- Assess Alternatives
- Implement Best Alternative
2Business in the Internet Age(Business Week, June
22, 1998 numbers in US billion)
3Caution Signs Along the Road(Gross Sager,
Business Week, June 22, 1998, p. 166.)
There will be jolts and delays along the way for
electronic commerce congestion is the most
obvious challenge.
4Electronic Commerce online sales are soaring
- IT and electronic commerce can be expected to
drive economic growth for many years to come. - The Emerging Digital Economy,
- US Dept. of Commerce, 1998.
5What are people saying about Web performance
- Tripods Web site is our business. If its not
fast and reliable, there goes our business., - Don Zereski, Tripods vice-president of
Technology (Internet World) - Computer shuts down Amazon.com book sales. The
site went down at about 10 a.m. and stayed out of
service until 10 p.m. - The Seattle Times, 01/08/98
6What are people saying about Web performance
- Sites have been concentrating on the right
content. Now, more of them -- specially
e-commerce sites -- realize that performance is
crucial in attracting and retaining online
customers. - Gene Shklar, Keynote, The New York Times, 8/8/98
7What are people saying about Web performance
- Capacity is King.
- Mike Krupit, Vice President of Technology,
CDnow, - 06/01/98
- Being able to manage hit storms on commerce
sites requires more than just buying more
plumbing. - Harry Fenik, vice president of technology,
- Zona Research, LANTimes, 6/22/98
8 9Objectives
Performance Analysis
Analysis Computer System
Performance Analyst
Mathematician Computer Systems Person
10You Will learn
- Specifying performance requirements
- Evaluating design alternatives
- Comparing two or more systems
- Determining the optimal value of a parameter
(system tuning) - Finding the performance bottleneck (bottleneck
identification)
11You Will learn (contd)
- Characterizing the load on the system (workload
characterization) - Determining the number and size of components
(capacity planning) - Predicting the performance at future loads
(forecasting)
12Performance Analysis Objectives Involve
- Procurement
- Improvement
- Capacity Planning
- Design
13Performance Improvement Procedure
14Basic Terms
- System Any collection of hardware, software, and
firmware - Metrics The criteria used to evaluate the
performance of the system components - Workloads The requests made by the users of the
system
15Examples of Performance Indexes
- External Indexes
- Turnaround Time
- Response Time
- Through Put
- Capacity
- Availability
- Reliability
16Examples of Performance Indexes
- Internal Indexes
- CPU Utilization
- Overlap of Activities
- Multiprog. Stretch Factor
- Multiprog. Level
- Paging Rate
- Reaction time
17Example I
- What performance metrics should be used to
compare the performance of the following systems. - 1. Two disk drivers?
- 2. Two transaction-processing systems?
- 3. Two packet-retransmission algorithms?
18Example II
- Which type of monitor (software or hardware)
would be more suitable for measuring each of the
following quantities - 1. Number of Instructions executed by a
processor? - 2. Degree of multiprogramming on a timesharing
system? - 3. Response time of packets on a network?
19Example III
- The number of packets lost on two links was
measured for four file size as shown below
Which link is better?
20Example IV
- In order to compare the performance of two cache
replacement algorithms - 1. What type of simulation model should be used?
- 2. How long should the simulation be run?
- 3. What can be done to get the same accuracy
with a shorter run? - 4. How can one decide if the random-number
generator in the simulation is a good generator?
21Example V
- The average response time of a database system
is three seconds. During a one-minute
observation interval, the idle time on a system
was ten seconds. Using a queueing model for the
system, determine the following
22Example V (contd)
- 1. System utilization
- 2. Average service time per query
- 3. Number of queries completed during the
observation interval - 4. Average number of jobs in the system
- 5. Probability of number of jobs in the system
being greater than 1 - 6. 90-percentile response time
- 7. 90-percentile waiting time
23Sample Exercise
- 4.1 The measured throughput in queries / sec for
two database systems on two different workloads
is as follows
Compare the performance of the two systems and
show that a. System A is better b. System B is
better
24The Bank Problem
- The board of directors requires answers to
several questions that are posed to the IS
facility manager. - Environment
- 3 Fully automated branch offices all located in
the same city - ATMs are available at 10 locations through out
the city - 2 mainframes (One used for on-line processing
the other for batch)
25The Bank Problem (contd)
- 24 tellers in the 3 branch offices
- Each teller serves an average of 20 customers /
hr during peak times otherwise, 12 customers /
hr - Each customer generates 2 one-line transactions
on the average - Thus, during peak times, IS facility receives an
average of 960 (24 x 20 x 2) teller originated
transaction / hr
26The Bank Problem (contd)
- ATMs are active 24 hrs / day. During peak
volume, each ATM serves an average of 15
customers / hr. Each customer generates an
average of 1.2 transactions - Thus, during the peak period for ATMs, IS
facility receives an average of 180 (10 x 15 x
1.2) ATM transactions / hr Otherwise ATM
transaction rate is about 50 of this
27The Bank Problem (contd)
- Measured average response time
- Tellers 1.23 seconds during peak hrs
- 3 seconds are acceptable
- ATM 1.02 seconds during peak hr
- 4 seconds are acceptable
28The Bank Problem (contd)
- Board of directors requires answers to several
questions - Will the current central IS facility allow for
the expected growth of the bank while maintaining
the average response time figures at the tellers
and at the ATMs within the stated acceptable
limits? - If not, when will it be necessary to upgrade the
current IS environment? - Of several possible upgrades (e.g. adding more
disks, adding memory, replacing the central
processor boards), which represents the best
cost-performance trade-off? - Should the data processing facility of the bank
remain centralized, or should the bank consider a
distributed processing alternative?
29Response time vs. load for the car rental C/S
system example
30Capacity Planning Concept
- Capacity planning is the determination of the
predicted time in the future when system
saturation is going to occur, and of the most
cost-effective way of delaying system saturation
as long as possible
31Questions
- Why is saturation occurring?
- In which parts of the system (CPU, disks, memory,
queues) will a transaction or job be spending
most of its execution time at the saturation
points? - Which are the best cost-effective alternatives
for avoiding (or, at least, delaying) saturation?
32Capacity planning situation
33Cap. Planning I/O Vars.
34Common Mistakes in Capacity Planning
- 1. No Goals
- 2. Biased Goals
- 3. Unsystematic Approach
- 4. Analysis Without Understanding the Problem
- 5. Incorrect Performance Metrics
- 6. Unrepresentative Workload
- 7. Wrong Evaluation Technique
35Common Mistakes in Capacity Planning (contd)
- 8. Overlook Important Parameters
- 9. Ignore Significant Factors
- 10. Inappropriate Experimental Design
- 11. Inappropriate Level of Detail
- 12. No Analysis
- 13. Erroneous Analysis
- 14. No Sensitivity Analysis
36Common Mistakes in Capacity Planning (contd)
- 15. Ignoring Errors in Input
- 16. Improper Treatment of Outliers
- 17. Assuming No Change in the Future
- 18. Ignoring Variability
- 19. Too Complex Analysis
- 20. Improper Presentation of Results
- 21. Ignoring Social Aspects
- 22. Omitting Assumptions and Limitations
37Capacity Planning of Steps
- Instrument the system
- Monitor system usage
- Characterize workload
- Predict Performance under different alternatives
- Select the lowest cost, highest performance
alternative
38CPU Utilization Example
39Future Predicted CPU Utilization
40Systematic Approach to P. Eval
- 1. State Goals and Define the System
- 2. List Services and Outcomes
- 3. Select Metrics
- 4. List Parameters
- 5. Select Factors to Study
- 6. Select Evaluation Technique
- 7. Select Workload
- 8. Design Experiments
- 9. Analyze and Interpret Data
- 10. Present Results
- 11. Repeat
41Case Study Remote Pipes vs RPC
42Case Study (contd)
- Services Small data transfer or large data
transfer - Metrics
- No errors and failures. Correct operation only
- Rate, Time, Resource per service
- Resource Client, Server, Network
43Case Study (contd)
- This leads to
- 1. Elapsed time per call
- 2. Maximum call rate per unit of time, or
equivalently, the time required to complete a
block of n successive calls - 3. Local CPU time per call
- 4. Remote CPU time per call
- 5. Number of bytes sent on the link per call
44Case Study (contd)
- Parameters
- System Parameters
- 1. Speed of the local CPU
- 2. Speed of the remote CPU
- 3. Speed of the network
- 4. Operating system overhead for interfacing with
the channels - 5. Operating system overhead for interfacing with
the networks - 6. Reliability of the network affecting the
number of retransmissions required
45Case Study (contd)
- Parameters (contd)
- Workload parameters
- 1. Time between successive calls
- 2. Number and size of the call parameters
- 3. Number and size of the results
- 4. Type of channel
- 5. Other loads on the local and remote CPUs
- 6. Other loads on the network
46Case Study (contd)
- Factors
- 1. Type of channel Remote pipes and remote
procedure calls - 2. Size of the network short distance and long
distance - 3. Size of the call parameters small and large
- 4. Number n of consecutive calls 1, 2, 4, 8, 16,
32, , 512, and 1024
47Case Study (contd)
- Note
- Fixed type of CPUs and operating systems
- Ignore retransmissions due to network error
- Measure under no other load on the hosts and the
network
48Case Study (contd)
- Evaluation Technique
- Prototypes implemented ? Measurements
- Use analytical modeling for validation
- Workload
- Synthetic program generating the specified types
of channel requests - Null channel requests ? Resource used in
monitoring and logging
49Case Study (contd)
- Experimental Design
- A full factorial experimental design with
- 23 x 11 88 experiments will be used
- Data Analysis
- Analysis of Variance (ANOVA) for the first three
factors - Regression for number n of successive calls
- Data Presentation
- The final results will be plotted as a function
of the block size n
50Exercises
- 1. From published literature, select an article
or a report that presents results of a
performance evaluation study. Make a list of
good and bad points of the study. What would you
do different, if you wore asked to repeat the
study.
51Exercises (contd)
- 2. Choose a system for performance study.
Briefly describe the system and list - a. Services
- b. Performance metrics
- c. System parameters
- d. Workload parameters
- e. Factors and their ranges
- f. Evaluation technique
- g. Workload
- Justify your choices.
Suggestion Each student should select a
different system such as a network, database,
processor, and so on and then present the
solution to the class.
52Selecting an Evaluation Technique
In all cases, results may be misleading or wrong