Performance Engineering - PowerPoint PPT Presentation

Performance Engineering: Measurement and Statistics
Prof. Jerry Breecher

Provided by: JerryBr7
Learn more at: http://web.cs.wpi.edu
1
Performance Engineering
MEASUREMENT AND STATISTICS
  • Prof. Jerry Breecher

2
Measurement and Statistics
  • To get you in the mood for some measuring, statistics, and estimating, here are some quotations with the right flavour:
  • "Figures don't lie, but liars figure." Mark Twain
  • "There are three kinds of untruths: lies, damn lies, and statistics." Mark Twain
  • The following are from "Policy Paradox and Political Reason" by Deborah Stone.
  • "Numerals hide all the difficult choices that go into a measurement."
  • "Certain kinds of numbers, big ones, numbers with decimal points, ones not multiples of ten, seemingly advertise the prowess of the measurer."
  • "How accurate a number is depends on the cost of acquiring it and on how important it is."

3
Measurement and Statistics
  • "Numbers are a form of poetry. Symbols are
    another."
  •  
  • "No number is innocent, for it is impossible to
    count without making categorization."
  •  
  • "Every number is a political statement about
    where to draw the line."
  •  
  • "The first number you measure becomes the status
    quo."

4
Measurement and Statistics
  • Purpose
  • This section is about the methodology of measurement: what goes into designing an experiment, gathering some numbers, interpreting the results, and presenting those results to management in a way that allows them to make the necessary decisions.
  • Warm-up Experiment
  • Divide into teams and measure the length of an object in the classroom. To do so you will need to make team decisions about tools, techniques, and reporting metrics.
  • Upon completion, discuss what can be learned from this experiment.

5
Measurement and Statistics
  • FUNDAMENTAL QUESTIONS ABOUT MEASUREMENT
  • What kind of accuracy can you expect from a computer (or any other) measurement?
  • When you make a measurement, can you believe the result? How sure are you of the result?
  • How should you state the result of an experiment? How do you reflect your belief in its accuracy?
  • Can one number represent the performance of a product?
  • When have you measured enough?
  • Figures don't lie, but liars figure. How do you extrapolate from what you know to what you'd like to know?

6
Measurement and Statistics
  • FUNDAMENTAL QUESTIONS ABOUT MEASUREMENT
  • How do you know what tools to use?
  • Is everything in a computer measurable?
  • How do you know what to measure?
  • Should you always know the result of a measurement before you make it?
  • How do you figure out dependencies: how does one variable depend on another?
  • So after all this talk about the details of measurement, how do you actually design an experiment?

7
Measurement and Statistics
  • 1. What kind of accuracy can you expect from a computer (or any other) measurement?
  • Associated questions are:
  • What are some sources of uncertainty when measuring a computer and its software?
  • Is a computer deterministic? (What is the meaning of deterministic? Do a detour on predictable, deterministic, stochastic and chaotic.)
  • What are the pros and cons of taking all the variation out of an environment? Repeatability vs. believability.
  • Here are some factors that lead to experimental variation:
  • System/Component/Molecule/Atom: how granular is the measurement?
  • Background activity.
  • End effects and incomplete cycle effects. Measurement error.
  • Randomness doesn't mean equality (stochastic process).
  • Example: Travelling around a Monopoly board.
  • Randomness from resource contention (stochastic process).

8
Measurement and Statistics
  • 1. What kind of accuracy can you expect from a computer (or any other) measurement?
  • Here are some factors that lead to experimental variation (continued):
  • Changing hardware.
  • Example: Variations in fullness of a disk, CPU boards, interrupt traffic.
  • Tool granularity.
  • Example: Our experiment in class.
  • Example: You write a program that measures time in seconds. What percentage accuracy can you get from your experiment?
  • Example: You want to measure the time required to execute a routine and have available a system call named get_time_of_day. get_time_of_day returns time in units of 1/65535 seconds (about 16 microseconds). The time required to execute the get_time_of_day routine itself is 100 microseconds. What is the shortest routine that can be measured with this tool? How would you do it?
  • Bottom Line: Never believe a real system number to better than 5 - 10%. Artificial numbers can sometimes be repeated to 1 - 2%, but are susceptible to spurious factors.
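
One way to approach the get_time_of_day question is to time many back-to-back executions as a single interval. The sketch below is only an illustration: it assumes the uncertainty of one timed interval is roughly one clock tick plus the cost of one timer call, and the function names and the 5% target are hypothetical choices, not part of the original exercise.

```python
import math


def min_total_time_us(tick_us, call_overhead_us, rel_error):
    """Shortest timed interval (microseconds) whose uncertainty stays within
    rel_error, assuming the uncertainty is about one clock tick plus the cost
    of one timer call (an assumption of this sketch)."""
    return (tick_us + call_overhead_us) / rel_error


def repetitions_needed(routine_us, tick_us=16.0, call_overhead_us=100.0,
                       rel_error=0.05):
    """How many back-to-back executions to time as one interval."""
    total = min_total_time_us(tick_us, call_overhead_us, rel_error)
    return max(1, math.ceil(total / routine_us))


if __name__ == "__main__":
    # With a 16 us tick, 100 us call overhead, and a 5% target, the timed
    # interval should be at least (16 + 100) / 0.05 microseconds long.
    print(min_total_time_us(16.0, 100.0, 0.05))
    # A hypothetical 7 us routine therefore has to be repeated many times.
    print(repetitions_needed(7.0))
```

Under these assumptions a routine shorter than the required interval is measured by looping over it and dividing the elapsed time by the loop count.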

9
Measurement and Statistics
  • 2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Suppose you make several determinations of some measure. If you can answer yes to the following questions, then you can have some faith in your measurement:
  • Can you explain why the numbers vary? (Handwaving isn't allowed here, but statistics may be a valid answer.)
  • If variations are greater than 10%, can you figure out what's causing the variation and could you eliminate it if time allowed?
  • If the granularity of your tool is greater than the measurement variations, is that acceptable? (Your granularity then becomes your uncertainty.)
  • But How Much Do You Trust It?
  • To answer this we need a brief digression into some math.
  • Suppose we've taken a number of measurements $x_1, x_2, \ldots, x_n$.

10
Measurement and Statistics
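  • 2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Then the mean and standard deviation are

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

$$\sigma^2 = \text{variance} = SD^2$$

where the $x_i$ are the $n$ measurements. The first form of the Standard Deviation is the form of the underlying data. The second form is that of the measured data. They are the same for an infinite amount of data and close enough for a large set of numbers.
NOTE: Use of these equations assumes that the measurements are independent of each other.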
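
As a small illustration of the two forms of the standard deviation, the sketch below uses Python's standard `statistics` module on a made-up set of measurements (the data values are hypothetical). `pstdev` divides by n (the underlying-data form) and `stdev` divides by n - 1 (the measured-data form).

```python
import statistics

# Hypothetical measurements, chosen so the arithmetic is easy to follow.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(samples)             # sum / n
sd_underlying = statistics.pstdev(samples)  # divides by n
sd_measured = statistics.stdev(samples)     # divides by n - 1
variance_measured = statistics.variance(samples)  # equals sd_measured ** 2

print(mean, sd_underlying, sd_measured, variance_measured)
```

For a small sample the two standard deviations differ noticeably; the gap shrinks as the number of measurements grows.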
11
Measurement and Statistics
2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Confidence Intervals
  • We'd like to say: "I'm p% sure that, with n samples, the actual value is within d of the mean of the measurements." In this section, we develop simple ways to be able to make that statement.
  • Example of Standard Deviations using Normal Distributions
  • By quoting the standard deviation of a measurement, we say we're 68% sure the true mean is within a standard deviation of the measured mean. Unfortunately, that 68% depends on having a large number of samples. For smaller numbers, the percentage will change.

[Figure: Normal distribution showing mean and variance.]
12
Measurement and Statistics
2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Distributions: Student-T
  • Both the normal and Student-T distributions represent how random data should be found. The difference lies in how many samples are taken: the Normal Distribution assumes a very large (like infinite) number of samples, while the Student-T is for n (less than infinite) samples. As you see in the examples on subsequent pages, n is used as part of the confidence calculation.

[Figure: T distribution showing dependence only on number of samples.]
The derivation of the t-distribution was first published in 1908 by William Sealy Gosset, while he worked at a Guinness Brewery in Dublin. He was not allowed to publish under his own name, so the paper was written under the pseudonym Student. The t-test and the associated theory became well-known through the work of R.A. Fisher, who called the distribution "Student's distribution".
13
Measurement and Statistics
2. When you make a measurement, can you believe the result? How sure are you of the result?
  • The Burns Co. is now making laptop computers in its Shelbyville plant. Mr. Burns is too cheap to wreck too many computers in a test, so he's letting his QA guru, Homer, smash five of them. Homer is to record from how high in the air he can drop each laptop on the floor before it won't work anymore.
  • Mr. Burns wants laptops that can survive a fall from his height of five feet, two inches. The t-test will tell us if we can accept that the average breaking point for a Burns laptop is greater than 5'2", given what we know about the sample.
  • Let's say the five computers broke at drops of:
  • 4 feet, 8 inches
  • 5 feet, 1 inch
  • 2 feet, 3 inches
  • 6 feet, 10 inches
  • 7 feet, 1 inch

14
Measurement and Statistics
2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Using the formula

$$t = \frac{(\text{avg. of sample}) - (\text{presumed avg. of larger pop.})}{(\text{st. dev. of sample}) \,/\, \sqrt{\text{sample size}}}$$

  • we get an average breaking height of 62.2 inches, St Dev of 23.4, and a t-score of 0.0191.
  • Let's go to the t-score table. There we find the t-value for four degrees of freedom and a 90-percent confidence interval (that's p = .05, since taking .05 off each side of the bell curve leaves us with .90 in the middle). That value is 2.13.
  • Since the value we calculated is less than the table's t-value, we cannot accept the assumption that all Burns laptops together have an average breaking drop of over 62 inches, even though our sample's average came in (just) over that.
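
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the drop heights are the slide's five values converted to inches, the critical value 2.13 is the one quoted from the t-table, and the variable names are illustrative only.

```python
import math
import statistics

# The five breaking heights from the example, converted to inches.
drops_in = [56, 61, 27, 82, 85]
target_in = 62          # Mr. Burns' height, 5'2"
T_CRITICAL = 2.13       # table value quoted above (4 degrees of freedom, p = .05)

mean = statistics.mean(drops_in)
sd = statistics.stdev(drops_in)   # sample standard deviation (divides by n - 1)
t_score = (mean - target_in) / (sd / math.sqrt(len(drops_in)))

print(f"mean={mean:.1f} in, sd={sd:.1f} in, t={t_score:.4f}")
print("exceeds table value:", t_score > T_CRITICAL)
```

The computed t-score is far below the table value, which is why the sample cannot support the claim.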

15
Measurement and Statistics
  • 2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Example: Use of Student-T
  • As part of our ongoing regression test package, monitoring the performance of PRODUCT X, we run tests that tickle a number of code paths. In this table, higher numbers are better: they represent the number of transactions completed (they are throughput).
  • RESULTS

Model -->               110     120     130       140       150       160
Product X, Version A    3.25    6.34    9.37 (a)  11.8 (b)  14.3      16.6 (c)
Product X, Version B    3.20    6.30    9.22 (d)  11.8 (e)  14.4 (f)  16.8

  • Here are the raw numbers which went into making up the averages indicated above:

a       b       c       d       e       f
9.36    11.76   16.59   9.21    11.83   14.40
9.37    11.80   16.59   9.22    11.82   14.29
9.38    11.79   16.58   9.20    11.85   14.43
9.35    11.77   16.63   9.22    11.82   14.36

16
Measurement and Statistics
  • 2. When you make a measurement, can you believe the result? How sure are you of the result?
  • Example: Use of Student-T
  • Let's work through in detail the numbers in "f". We find the
  • mean = (14.40 + 14.29 + 14.43 + 14.36 + 14.44) / 5 = 14.38
  • SD = SQRT( (.02² + .09² + .05² + .02² + .06²) / 4 ) = SQRT( 0.00375 ) = 0.061
  • σ² = variance = SD² = 0.00375
  • Suppose we want to find the confidence interval for 95% confidence. With 5 samples, we have n - 1 = 4 degrees of freedom. Read the table for t(0.975) (there's 2.5% UNconfidence on each side of the curve), giving 2.78.
  • d = t × SQRT( σ² / n ) = 2.78 × SQRT( 0.00375 / 5 ) = 2.78 × 0.027 = 0.075
  • The number is 14.38 ± 0.075 with 95% confidence. (How should you round off this number to accurately reflect your confidence?)
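
The same calculation can be sketched in Python. The five "f" values and the t-value 2.78 are taken from the example; the function name is illustrative. Because the code carries full precision rather than hand-rounded deviations, the half-width comes out near 0.076 instead of the slide's 0.075.

```python
import math
import statistics

# The five raw measurements for column "f" used in the worked example.
f_samples = [14.40, 14.29, 14.43, 14.36, 14.44]
T_VALUE = 2.78  # t(0.975) for 4 degrees of freedom, as read from the table


def confidence_interval(samples, t_value):
    """Return (mean, half_width) so the interval is mean +/- half_width."""
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = t_value * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


mean, d = confidence_interval(f_samples, T_VALUE)
print(f"{mean:.2f} +/- {d:.3f} with 95% confidence")
```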

17
Measurement and Statistics
  • Example of Normal Distribution
  • Suppose we've been making measurements as shown in the first column of the table below. By inserting those numbers in Excel, the spreadsheet will calculate all kinds of things for us automatically.

Measurements (sorted): 1.9, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.1, 3.2, 3.2, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 4.1, 4.1, etc.

From Excel's functions:
Mean                  3.90
Standard Deviation    0.95

From Tools -> Data Analysis -> Descriptive Statistics (output column headed 1.9):
Mean                  3.961290323
Standard Error        0.159749631
Median                3.9
Mode                  2.8
Standard Deviation    0.889448301
Sample Variance       0.79111828
Kurtosis              -0.674508941
Skewness              0.419790238
Range                 3.2
Minimum               2.7
Maximum               5.9
95% Confidence        0.32
(Note: Excel has eliminated the outlying value.)

18
Measurement and Statistics
  • 2. When you make a measurement, can you believe
    the result? How sure are you of the result?
  • COMPARING TWO SETS OF MEASUREMENTS
  • You've just measured the performance of the latest release of your product. The numbers are better than they were when you measured them on the last release. But what does "better" mean? How do you show that, of two sets of numbers with lots of uncertainty in each, one set really is better than the other?
  • First of all, here's the easy way. With your two sets, calculate their means and their confidence intervals (the confidence you use is up to you). Visually plot these results as shown in the three examples below.

[Figure: three example plots, labelled A, B, and C.]
  • A. Here the confidence intervals don't overlap. The results are different from each other.
  • The results are such that the mean of one set is within the confidence interval of the other set. The two sets are NOT different.
  • The confidence intervals overlap but the means are not inside the CI of the other set. Need to do a more complex test.
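
The three visual cases can also be written down as a small decision function. This is a sketch of the rule of thumb described here, not a statistical test in its own right; the function name and the returned labels are hypothetical.

```python
def compare_intervals(mean_a, half_a, mean_b, half_b):
    """Classify two results given as mean +/- half-width confidence intervals."""
    lo_a, hi_a = mean_a - half_a, mean_a + half_a
    lo_b, hi_b = mean_b - half_b, mean_b + half_b

    # Case 1: the intervals do not overlap at all.
    if hi_a < lo_b or hi_b < lo_a:
        return "different"
    # Case 2: the mean of one set lies inside the other set's interval.
    if lo_b <= mean_a <= hi_b or lo_a <= mean_b <= hi_a:
        return "not different"
    # Case 3: intervals overlap, but neither mean is inside the other interval.
    return "needs t-test"


if __name__ == "__main__":
    print(compare_intervals(10, 1, 13, 1))   # no overlap
    print(compare_intervals(10, 2, 11, 2))   # a mean inside the other interval
    print(compare_intervals(10, 2, 13, 2))   # overlap only at the edges
```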
19
Measurement and Statistics
  • 2. When you make a measurement, can you believe
    the result? How sure are you of the result?
  • COMPARING TWO SETS OF MEASUREMENTS
  • In essence this is a way to combine the confidences for the two data sets so as to determine the confidence in the difference between the two sets. This is called a t-test.
  • Excel can do a t-test as shown in the data below.

Data Set 1    Data Set 2
5.36          19.12
16.57         3.52
0.62          3.38
1.41          2.50
0.64          3.60
7.26          1.74
5.31          5.64        <-- Average: AVERAGE(A3:A8)
6.16          6.64        <-- Standard Deviation: STDEV(A3:A8)
0.465703                  <-- Result of the t-test, TTEST(A3:A8,B3:B8,1,1), says there is a 46% chance these are from the same distribution

So for these sets of data, the answer is inconclusive. We can't tell if there's a significant difference between the data sets.
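
For readers without Excel, the averages and the t statistic behind this example can be reproduced with the standard library. This sketch assumes the twelve values pair up row by row into two sets of six (a pairing that reproduces the quoted averages and standard deviations) and that the final argument of TTEST selects a paired test. It stops at the t statistic; converting it to a probability still needs a t-table or a statistics package.

```python
import math
import statistics

# Values from the example, assumed to pair up row by row.
set1 = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
set2 = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]


def paired_t(a, b):
    """t statistic for paired samples: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


t_stat = paired_t(set1, set2)
print(f"averages: {statistics.mean(set1):.2f}, {statistics.mean(set2):.2f}")
print(f"std devs: {statistics.stdev(set1):.2f}, {statistics.stdev(set2):.2f}")
print(f"paired t statistic: {t_stat:.4f}")
```

A t statistic this close to zero is consistent with the inconclusive result reported above.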
20
Measurement and Statistics
  • 2. When you make a measurement, can you believe
    the result? How sure are you of the result?
  • CHECKING A SERIES OF VALUES
  • We'd like to know if a series of values matches a predicted distribution. In other words, we have a theory of what an experiment should give; do the results in fact match the theory? Chi-Squared tables are available for this purpose.
  • Calculate Chi-Squared:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = Observed and E = Expected.
21
Measurement and Statistics
  • 2. When you make a measurement, can you believe
    the result? How sure are you of the result?
  • CHECKING A SERIES OF VALUES
  • Example
  • Suppose a random number generator is invoked 200 times and produces values shown in this table:

Range        Number of Values
0.0 - 0.1    23
0.1 - 0.2    22
0.2 - 0.3    19
0.3 - 0.4    15
0.4 - 0.5    22
0.5 - 0.6    21
0.6 - 0.7    20
0.7 - 0.8    16
0.8 - 0.9    21
0.9 - 1.0    21

Plugging this into the equation, with E = 20 values expected in each of the ten ranges, gives

$$\chi^2 = \sum \frac{(O - 20)^2}{20} = \frac{62}{20} = 3.1$$

There are nine degrees of freedom. Using the chi-squared distribution table at this same website, look along the 9-degree row and find that 3.1 is between 3.325 (0.050) and 2.700 (0.025), interpolated as approximately 0.040.
We can reject the hypothesis that the results are the same with a probability of about 4%. Conversely, we can be 96% sure the distribution is uniform.
Exercise: Do this same calculation using the Chi Squared Function in Excel.
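
The chi-squared sum for this example is short enough to check directly. The sketch below uses the ten observed counts from the table and assumes, as the example does, that a uniform generator should put the same number of values in each range.

```python
# Observed counts for the ten ranges 0.0-0.1 through 0.9-1.0.
observed = [23, 22, 19, 15, 22, 21, 20, 16, 21, 21]

# A uniform generator is expected to spread the values evenly over the bins.
expected = sum(observed) / len(observed)

chi_squared = sum((o - expected) ** 2 / expected for o in observed)
degrees_of_freedom = len(observed) - 1

print(f"expected per bin = {expected}")
print(f"chi-squared = {chi_squared:.2f} with {degrees_of_freedom} degrees of freedom")
```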
22
Measurement and Statistics
  • 3. How should you state the result of an
    experiment? How do you reflect your belief in
    its accuracy?

Pat has developed a new product, "rabbit", whose performance she wishes to determine. There is special interest in comparing the new product, rabbit, to the old product, turtle, since the product was rewritten for performance reasons. (Pat had used Performance Engineering techniques and thus knew that rabbit was "about twice as fast" as turtle.) The measurements showed:

Performance Comparisons
Product    Transactions/second    Seconds/transaction    Seconds to process transaction
Turtle     30                     0.0333                 3
Rabbit     60                     0.0166                 1

Which of the following statements reflect the performance comparison of rabbit and turtle?
  o Rabbit is 100% faster than turtle.
  o Rabbit is twice as fast as turtle.
  o Rabbit takes 1/2 as long as turtle.
  o Rabbit takes 1/3 as long as turtle.
  o Rabbit takes 100% less time than turtle.
  o Rabbit takes 200% less time than turtle.
  o Turtle is 50% as fast as rabbit.
  o Turtle is 50% slower than rabbit.
  o Turtle takes 200% longer than rabbit.
  o Turtle takes 300% longer than rabbit.
23
Measurement and Statistics
  • 3. How should you state the result of an
    experiment? How do you reflect your belief in
    its accuracy?
  • The guiding principle in stating a result is to keep it simple.
  • State the accuracy using the same methods we've just discussed. Use Means, Standard Deviations, and Confidence Intervals.
  • Include only the number of decimal places that reflects the accuracy of your answer. Avoid things like 7.365 with a standard deviation of 2.
  • It goes without saying that reflecting your belief in the accuracy presupposes you've done the experiment correctly. Some simple guidelines:
  • In my experience, you always do the experiment wrong the first five times. Through experience you learn to look critically at your result to see if it makes sense. If not, then you go figure out what went wrong. Usually it's some parameter that wasn't controlled.
  • Only vary one parameter at a time.
  • Watch out for interactions between parameters: changing one parameter can result in some other parameter changing as well.
  • Don't do too many or too few experiments.
  • Get someone else to check your results. By the time you finish a measurement you have too much invested in it and are very likely to miss something obvious.

24
Measurement and Statistics
  • 4. Can one number represent the performance of a
    product?

Answer: No, but you'll be asked to do it anyway.
Preparation for this section: some definitions.
Mean or Expected Value: The probability-weighted average of the values.
Median: That value for which there's an equal probability of being above it and below it.
Mode: The most likely value. The value with the highest probability.
[Figure labels: Mode, Median, Mean]
25
Measurement and Statistics
  • 4. Can one number represent the performance of a
    product?

Example: The Performance Group at the XYZ Corporation has developed a synthetic workload that they feel reflects the kind of computer work done by XYZ's "typical" customer. This workload is composed of various programs driven by a remote terminal emulator (RTE). The RTE can both initiate programs and log when the programs complete.
This workload was run last week with results shown in the table.

Results of XYZ Corp Performance Benchmark
Transaction Type              Time to complete transaction
Edit a file                   14 sec
Compile and link a file       143 sec
Run compiled program          17 sec
200 disk reads                6 sec
1000 process reschedules      3 sec
100 physical page faults      10 sec
Send and receive mail         57 sec
TOTAL TIME                    250 sec

NOTE: Because all these programs are started simultaneously, there is contention for resources.
The time reported to management was 250 seconds.
26
Measurement and Statistics
  • 4. Can one number represent the performance of a
    product?
  • Example
  •  
  • Questions
  •  
  • Is this a good performance indicator?
  • If yes, then sit and relax a few minutes.
  • If no, how would you express the results of these
    tests? How might you revamp the tests?
  • What guidelines can be derived for producing
    one-number performance metrics?

27
Measurement and Statistics
  • 5. When have you measured enough?
  • This is really two questions:
  • When have you measured enough to get the accuracy of answer that management expects at this time?
  • This is a matter of setting the correct expectations before you start. Many times the answer is in response to a "what if" question, and you can get the appropriate accuracy in one hour. Other times you'll need weeks of design/setup/measurement/analysis to get the expected accuracy.
  • NOTE: Only a small amount of the total experimental time is in the measurement. Most time goes for design and elimination of unwanted factors. So this question could be stated as "How complicated should an experiment be?"
  • When have you measured enough to get the degree of accuracy you expected for the experiment?
  • You can use the confidence measures we discussed before. In essence, the confidence interval is $\bar{x} \pm z\,s/\sqrt{n}$.

28
Measurement and Statistics
  • 5. When have you measured enough?

The relationship between the number of required samples and experimental parameters is

$$n = \left(\frac{100\,z\,s}{r\,\bar{x}}\right)^2$$

Here
n = number of samples required
z = the number of standard deviations for the desired confidence
s = standard deviation
r = the desired accuracy in percent
$\bar{x}$ = the mean of the measurement

NOTE: See that the more accuracy you want (smaller r), the more measurements you need.
NOTE: If your numbers all come out the same, stop. Measurement uncertainty is not the largest part of the error in your metric.
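
A minimal sketch of this sample-size relationship, assuming the common form n = (100 z s / (r x̄))², with r expressed as a percentage of the mean. The function name and the example numbers are hypothetical.

```python
import math


def samples_required(z, s, r_percent, mean):
    """Number of samples needed so the confidence interval half-width is
    within r_percent of the mean, rounded up to a whole measurement."""
    return math.ceil((100.0 * z * s / (r_percent * mean)) ** 2)


if __name__ == "__main__":
    # Hypothetical pilot run: mean 20 s, standard deviation 5 s,
    # z = 1.96 (about 95% confidence for a large sample).
    print(samples_required(z=1.96, s=5.0, r_percent=5.0, mean=20.0))
    # Relaxing the accuracy target to 10% cuts the count to about a quarter.
    print(samples_required(z=1.96, s=5.0, r_percent=10.0, mean=20.0))
```

Because r appears squared in the denominator, halving the allowed error roughly quadruples the number of measurements.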
29
Measurement and Statistics
  • 6. Figures don't lie, but liars figure. How do
    you extrapolate from what you know to what you'd
    like to know?
  • Often we need a result that is unmeasurable, or
    would require eons to determine. Is it legal to
    guess?
  •  
  • Answer
  • Sure - as long as you also estimate the
    uncertainty of your guess.
  •  
  •  
  • Here are a few practice situations that will help
    you improve your powers of estimation. Remember,
    there is no RIGHT answer.
  •  
  • Estimate how many people will come to this class next week. More important than the answer are the assumptions you use for your answer.
  • Approximately how many cars were in the parking lot outside this building when you came in tonight? How many are there now?
  • What is the probability that you will be killed in a car accident?
  • I recently saw a lawn service truck that had printed on its side "Over 7 trillion blades cut." Is this a reasonable claim for them to make?

30
Measurement and Statistics
  • 6. Figures don't lie, but liars figure. How do
    you extrapolate from what you know to what you'd
    like to know?

5. Here is a comic strip version of an approximation problem. It contains a model, and then an estimation of the required parameters in the model.
6. But be careful: sometimes the model doesn't work.
31
Measurement and Statistics
  • 7. How do you know what tools to use?
  • We'll do a lot more on tools later, but for right
    now, the best answer is to measure the simplest
    way possible.
  • Usually tools are easier to come by than
    environments.
  • Make sure the granularity of the tool is finer than the required uncertainty.

32
Measurement and Statistics
  • 8. Is everything in a computer measurable?
  • Some electrical signals may not be available.
  • The place to make a measurement may be in code not under your control.
  • We have a very poor sense of typical/normal. We
    don't know what our users typically do with the
    machine.
  • The measurement may perturb the system and
    destroy what we wanted to know.
  • Available measurements may not relate to what I
    want to know. For instance, which disk blocks
    are being accessed by each of the processes on a
    system.

33
Measurement and Statistics
  • 9. How do you know what to measure?
  • This is the hardest question of all. To know
    what to measure you must have a picture or model
    of your product. Most of the rest of this course
    will deal with various kinds of pictures.
  • Often an adequate model is a causal one: first procedure A executes, this causes hardware B to produce an effect, then interrupt code handles the hardware result, etc.
  • Things to keep in mind include:
  • Interaction between variables: do you expect a change in X to produce a change in Y? You should have a guess as to the result before you make the measurement.
  • Changing one variable at a time, and measuring it at 10 different values, can be extremely wasteful and time consuming.
  • Change only the variables that matter. If you don't know, try changing something, just once, and see what happens.
  • Example: You wish to design an experiment that will measure the time required to execute a program on various Intel processors. What parameters would you need to vary to try different processors and configurations? DESIGN THE TESTS TO BE RUN.

34
Measurement and Statistics
  • 10. Should you always know the results of a
    measurement before you make it?
  • You should always have a guess so you can tell if your result is way off. That guess should be the result of a model/theory of how the mechanism you are measuring is working.

35
Measurement and Statistics
  • 11. How do you figure out dependencies: how does one variable depend on another?

This whole topic is something called linear regression. It says that if you can plot two variables, x and y, and there's a simple relationship between the variables, then you can define the dependency between them.
[Figure panels: Good SIMPLE Model; Good Complicated Model; BAD Model]
A linear regression means that we can fit a curve of the form y = a + bx. The quality of the fit (error) can be defined as the sum of the y distances between the fitting curve and the experimental data.
36
Measurement and Statistics
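  • 11. How do you figure out dependencies: how does one variable depend on another?

So the best fit is defined to be the curve that minimizes the sum of errors squared,

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a - b x_i\right)^2$$

with the constraint that

$$\sum_{i=1}^{n} e_i = 0$$

When you solve this, you can immediately determine the values of a and b from the experimental data:

$$b = \frac{\sum x_i y_i - n\,\bar{x}\,\bar{y}}{\sum x_i^2 - n\,\bar{x}^2}$$

and

$$a = \bar{y} - b\,\bar{x}$$

with

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$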
37
Measurement and Statistics
  • 11. How do you figure out dependencies: how does one variable depend on another?

Let's use as an example the following pairs of data: (14,2), (16,5), (27,7), (42,9), (39,10), (50,13), (83,20). We COULD use the equation above to determine a and b.
Or, Excel can be used in the same way and gives the same results.
The least-squares equation for these pairs is Y = -0.0083 + 0.2438 X.
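
As a cross-check that does not depend on a spreadsheet, the least-squares coefficients for a line y = a + bx can be computed directly from the sums of the data. The sketch below uses the seven pairs as listed; the function name is illustrative. For these pairs it gives a slope of about 0.244 and an intercept very close to zero.

```python
# The seven (x, y) pairs from the example.
pairs = [(14, 2), (16, 5), (27, 7), (42, 9), (39, 10), (50, 13), (83, 20)]


def least_squares(points):
    """Return (a, b) for the best-fit line y = a + b*x."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = least_squares(pairs)
print(f"y = {a:.4f} + {b:.4f} x")
```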
38
Measurement and Statistics
Also, if you know what you're doing, you can use Tools → Data Analysis → Regression and Excel will give you all kinds of statistics evaluating the goodness of fit of the straight line. (Note that you may need to use Tools → Options to bring in the analysis tools.)
If the model you're expecting isn't a straight line, then you'll need to do more sophisticated analysis, but the method follows in the same way as we've just done.
39
Measurement and Statistics
  • 12. So after all this talk about the details of
    measurement, how do you actually design an
    experiment?

We're going to follow through these steps and recommend that you use them in your experiments. (These are originally due to Jain.)
  • 1. State Goals and Define The System
  • a. What is it you hope to accomplish? Why is it worth doing?
  • b. What is the hardware and software (the system) that you will use to achieve these goals?
  • 2. List Services and Outcomes
  • a. For the system you've chosen, what are the services provided? For instance, if you're studying a disk subsystem, it can absorb data (write) or present you with data (read) or give an error.
  • b. By outcomes here are meant very high level statements. The outcome of a disk read is DATA. It's not a performance or quantifiable answer expected here.
  • 3. Select Metrics
  • a. What are the criteria you want to use to compare performance? This is still not a quantifiable value, but simply what it is you will measure. This could be a speed metric, or an accuracy metric.

40
Measurement and Statistics
  • 4. List Parameters
  • a. What parameters affect performance? If you're measuring disks, then the model of disk determines its seek time, its rotational latency, etc. This is a system parameter.
  • b. The kind of test you use, determined by the workload you use, can also define parameters. These might be requested IOs per second, random or sequential blocks, etc.
  • 5. Select Factors to Study
  • a. A factor is a parameter that you vary.
  • b. So, for the parameters you've just listed, all of which you COULD vary, which ones will you actually modify during the course of the experiment?
  • 6. Select Evaluation Technique
  • a. You could do this experiment by modeling. You would mathematically represent the system under study and modify parameters in this model.
  • b. You could do this experiment by simulation. You would write a program that represented the system. Again you could modify parameters and look at results.
  • c. You could do this experiment by measurement. Here you have a real system, drive it with some kind of workload, and get the results.
  • d. In practice, in industry, only measurements are valued. It's generally cheaper to use the real system than it is to build a mathematical or simulated system.
41
Measurement and Statistics
  • 7. Select Workload
  • a. How will you drive the system under test?
  • b. It depends on the Evaluation Technique. With a simulation you may have collected some data that you can feed into your program.
  • c. For a measurement evaluation, you will have some kind of software that drives the system you're testing. You will need to find a workload that tickles the parameter of interest to you.
  • 8. Design Experiments
  • a. What experiments will you do to collect the data you want?
  • b. This means selecting the actual values to be used as factors. If one of your factors is the type/model of disk, then how many different disks will you use?
42
Measurement and Statistics
  • 9. Make A Guess What The Result Will Be
  • a. Many people take a measurement and say "Oh, that must be right." The best way to be able to make that statement is to have understood what should happen and then either get what you expected or not.
  • b. If you get what's expected, then you can be confident that:
      • You understand a picture of how the system is working.
      • You did your measurements correctly.
  • c. If you DON'T get what's expected, then you can be confident that:
      • You didn't understand the system and so you need to form a new picture.
      • You did the measurement wrong; there's some experimental error.
  • 10. Conduct the Measurement, Analyze and Interpret Data
  • a. Now actually do the measurement, simulation, or whatever you've designed.
  • b. It's rare that you just get a number and you're all done.
  • c. There is always interpretation to be done:
      • What does the data mean?
      • Is this the result I would expect?
  • d. There are always statistics to be done:
      • Is the data valid?
      • What is the uncertainty in the measurements?
43
Measurement and Statistics
  • 11. Figure Out What You Want To Talk About
  • a. Know your audience. Are they management types (who want only an overview) or are they technical people (who want all the details)? Proper targeting is important!
  • b. Choose from all the data you have those pieces that are most relevant. Don't forget to make it interesting!
  • 12. Present Final Results
  • a. As you know, in the real world, it's not what you do, it's what others think you do.
  • b. Presentation is everything.
44
Measurement and Statistics
BONUS: There are various terms and definitions we never got around to formally defining. Here they are.
Definitions of Measured Data: These are some basic terms to define so we have a common lingo.
Independent Events: Two events are independent if there's no way that the occurrence of the first event can have anything to do with the second event.
Random Variate: A variable that can take on one of a particular set of values with a specified probability.
Cumulative Distribution Function: The CDF maps a given value a to the probability that the variable has a value equal to or less than a.

$$F(a) = P(x \le a)$$

Probability Density Function: The derivative of the CDF.

$$f(x) = \frac{dF(x)}{dx}$$

Gives the probability of x being in the interval (x1, x2):

$$P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\,dx$$
45
Measurement and Statistics
Definitions of Measured Data: These are some basic terms to define so we have a common lingo.
Probability Mass Function: The equivalent of the PDF but used for discrete variables.
Mean or Expected Value:

$$\mu = E(x) = \sum_{i} p_i x_i = \int x\,f(x)\,dx$$

Variance: A measure of the deviation of the values from the mean.

$$\sigma^2 = E\left[(x - \mu)^2\right]$$

Standard Deviation: This is another measure of the deviation of values. Represented by σ, the square root of the variance.
46
Measurement and Statistics
Definitions of Measured Data
Covariance: Given two random variables x and y with means μx and μy, their covariance is

$$\mathrm{Cov}(x, y) = E\left[(x - \mu_x)(y - \mu_y)\right] = E(xy) - E(x)\,E(y)$$

For independent variables, the covariance is 0.
Correlation Coefficient: Another measure of how two variables are interdependent.

$$\rho_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y}$$

Median: That value for which there's an equal probability of being above it and below it.
Mode: The most likely value. The value with the highest probability.
Normal Distribution: The most commonly used distribution. The sum of a large number of independent observations from any distribution tends toward a normal distribution.
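
To make the covariance and correlation definitions concrete, here is a minimal sketch that computes both from paired data using the standard definitions (population form, dividing by n). The function names and the sample data are hypothetical.

```python
import math


def covariance(xs, ys):
    """Population covariance: the mean of (x - mean_x) * (y - mean_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


if __name__ == "__main__":
    xs = [1, 2, 3, 4]
    ys = [2, 4, 6, 8]          # perfectly linear in xs
    print(covariance(xs, ys))  # positive: the variables rise together
    print(correlation(xs, ys))
    print(correlation(xs, list(reversed(ys))))  # perfectly anti-correlated
```

The correlation coefficient always lies between -1 and 1, which makes it easier to compare across data sets than the raw covariance.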