Performance Prediction Using Program Similarity
  • Aashish Phansalkar
  • Lizy K. John

The University of Texas at Austin
2
Outline
  • Motivation and Objectives
  • Methodology
  • Experimental results
  • Conclusion
  • Future work

3
Motivation (1): Simulation is costly
  • A computer architect or a designer has to
    simulate multiple customer applications
  • Simulations take very long due to the complexity
    of modern microprocessor designs

4
Motivation (2): Making a decision based on
benchmark scores
  • Customers often use benchmarks to make a decision
    about buying computer systems
  • The application programs they use most often may
    not be part of the benchmark suite
  • Customers can use benchmarks as representatives
    of their application programs
  • Predict performance of their application based on
    the already available performance data of
    benchmarks

5
Objective
  • A quantitative method to estimate performance
    without running cycle-accurate simulation
  • Use the knowledge of similarity between a
    customer's application program and known
    benchmark programs to develop a quantitative
    approach to predict performance

7
Overview
[Diagram: a customer application (new case) is
compared with a repository of benchmarks (known
cases); the measured similarity is used to obtain
the predicted performance]
8
Program characterization
  • Instruction mix
    • Percentage of different types of instructions,
      e.g. percentage of memory references,
      percentage of branch instructions
  • Control flow
    • Taken branches
    • Forward branches
    • Forward taken branches
    • Basic block size (number of instructions
      between two branches)
  • Register dependency distance
  • Data and instruction temporal locality of program
  • Data and instruction spatial locality of program

9

Register dependency distance
  • ADD R1, R3, R4
  • MUL R5, R3, R2
  • ADD R5, R3, R6
  • LD R4, (R8)
  • SUB R8, R2, R1

Read-after-write dependency distance = 4
(R1 is written by the first ADD and read by the
SUB four instructions later)
Measure the distribution of dependency distances
over the following ranges: 1, 2, 3-4, 5-8, 9-16,
17-32, and greater than 32. The normalized count
for each range of dependency distance forms a
metric.
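
The counting rule above can be sketched in Python. This is a minimal illustration, not the authors' tool: the `(destination, sources)` encoding of instructions and the helper names are assumptions, only the distance to the most recent writer of each source register is recorded, and the bins follow the ranges listed on the slide.

```python
from collections import Counter

# Upper bounds of the ranges 1, 2, 3-4, 5-8, 9-16, 17-32; larger distances fall in ">32".
BIN_UPPER = [1, 2, 4, 8, 16, 32]
BIN_LABELS = ["1", "2", "3-4", "5-8", "9-16", "17-32", ">32"]

def raw_distances(trace):
    """Return read-after-write distances for a list of (dest, sources) tuples."""
    last_write = {}   # register -> index of the most recent instruction that wrote it
    distances = []
    for i, (dest, sources) in enumerate(trace):
        for reg in sources:
            if reg in last_write:
                distances.append(i - last_write[reg])
        if dest is not None:
            last_write[dest] = i
    return distances

def distance_histogram(distances):
    """Normalized count of distances per range; each range is one metric."""
    counts = Counter()
    for d in distances:
        for upper, label in zip(BIN_UPPER, BIN_LABELS):
            if d <= upper:
                counts[label] += 1
                break
        else:
            counts[">32"] += 1
    total = len(distances)
    return {label: (counts[label] / total if total else 0.0) for label in BIN_LABELS}

# The five-instruction example from the slide.
trace = [
    ("R1", ["R3", "R4"]),  # ADD R1, R3, R4
    ("R5", ["R3", "R2"]),  # MUL R5, R3, R2
    ("R5", ["R3", "R6"]),  # ADD R5, R3, R6
    ("R4", ["R8"]),        # LD  R4, (R8)
    ("R8", ["R2", "R1"]),  # SUB R8, R2, R1
]
distances = raw_distances(trace)
histogram = distance_histogram(distances)
```

For the slide's example the only read-after-write dependency is on R1, so `distances` holds the single value 4 and the whole histogram mass falls in the 3-4 range.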
10
Data and instruction temporal locality
  • Memory reuse distance
  • Address trace: 2004, 2008, 4008, 2000, 1080,
    2004, 4008

Reuse distance of 2004 = 4
Reuse distance of 4008 = 3
  • Computing reuse distance for a trace of byte
    addresses is very computation- and
    space-intensive
  • Instead, reuse distance is computed for blocks
    of 16, 64, 256, and 4096 bytes
  • Temporal locality metric (tlocality) = weighted
    average reuse distance
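
A small Python sketch of the reuse-distance computation follows. It is an illustration under stated assumptions: the distance is counted as the number of accesses between two consecutive accesses to the same block, which reproduces the values 4 and 3 in the example, and the weights in the average are taken to be how often each distance occurs, since the slide does not spell out the weighting. The function names are hypothetical.

```python
def reuse_distances(addresses, block_size=1):
    """Reuse distance of every repeated access at the given block granularity.

    ASSUMPTION: the distance is the number of accesses between two
    consecutive accesses to the same block.
    """
    last_seen = {}   # block number -> index of its most recent access
    distances = []
    for i, addr in enumerate(addresses):
        block = addr // block_size
        if block in last_seen:
            distances.append(i - last_seen[block] - 1)
        last_seen[block] = i
    return distances

def tlocality(addresses, block_size):
    """Average reuse distance; each distance is weighted by how often it occurs."""
    distances = reuse_distances(addresses, block_size)
    return sum(distances) / len(distances) if distances else 0.0

# The address trace from the slide.
trace = [2004, 2008, 4008, 2000, 1080, 2004, 4008]
byte_level = reuse_distances(trace)              # distances for 2004 and 4008
block16 = reuse_distances(trace, block_size=16)  # same trace, 16-byte blocks
```

At 16-byte granularity the addresses 2000, 2004, and 2008 fall into the same block, so more accesses count as reuses and the distances shrink.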

11
Data and instruction spatial locality
  • Spatial locality metrics are derived from the
    temporal locality metrics
  • As the block size increases, programs with good
    spatial locality show lower tlocality values
  • Spatial locality metrics:
    • tlocality64 / tlocality16
    • tlocality256 / tlocality16
    • tlocality4096 / tlocality16
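
The three ratios can be computed as in the sketch below. The dictionary input format, the function name, and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def spatial_locality(tloc):
    """Ratios of tlocality at larger block sizes to tlocality at 16 bytes.

    `tloc` maps block size in bytes -> tlocality value (hypothetical format).
    Assumes tloc[16] is nonzero.
    """
    base = tloc[16]
    return {size: tloc[size] / base for size in (64, 256, 4096)}

# Made-up tlocality values, for illustration only.
example = {16: 40.0, 64: 20.0, 256: 10.0, 4096: 5.0}
ratios = spatial_locality(example)
```

A program with good spatial locality yields ratios well below 1, because its tlocality drops quickly as the block size grows.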

12
Methodology Overview
  • Inputs: microarchitecture-independent metrics
    for the known benchmarks and for the customer
    application
  • Measure program similarity to obtain similarity
    information
  • Predict the target metric for the new
    application (2 methods)
  • Output: predicted value of the target metric
13
Measuring Similarity (1)
  • Distance between two programs in the workload
    space is the measure of their similarity
  • We assume that similarity between two programs is
    inversely proportional to the Euclidean distance
    between them

14
Measuring similarity (2)
  • The workload space is made of many workload
    characteristics, so its dimensionality is very
    high
  • Inherent characteristics are highly correlated
  • Euclidean distance measured using these
    characteristics will be biased
  • A property captured by two correlated variables
    is counted twice in the distance, whereas a
    property captured by one independent variable
    is counted once
  • Use Principal Components Analysis (PCA) to
    obtain uncorrelated dimensions
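
The bias from correlated characteristics can be seen in a tiny Python example. The feature vectors are made up, and the duplicated column is an extreme case (perfect correlation) chosen only to make the effect visible.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two programs described by two independent characteristics (made-up values).
p = [1.0, 0.0]
q = [0.0, 1.0]
d_independent = euclidean(p, q)         # each characteristic counted once

# Add a third characteristic that is a perfect copy of the first one.
p_dup = p + [p[0]]
q_dup = q + [q[0]]
d_correlated = euclidean(p_dup, q_dup)  # first characteristic now counted twice
```

In the second case the first characteristic contributes two of the three unit terms in the squared distance. Re-expressing the data in uncorrelated principal components before taking distances avoids this double counting.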

15
Method 1: Predicting performance using weights
  • Compute the distance from program X to each
    benchmark program, dx1, dx2, dx3, ..., dxn, in
    the PC space
  • Calculate weights w1, w2, ..., wn

[Diagram: user program X linked to the benchmarks
by weights w1, w2, ...]
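
One way to turn the distances into a prediction is sketched below. The weighting formula is an assumption, since the slide text is cut off: each weight is taken proportional to 1/distance, in line with the earlier statement that similarity is inversely proportional to distance, and the prediction is the weighted average of the benchmarks' known values. The function name and all numbers are hypothetical.

```python
import math

def predict_with_weights(x, benchmarks, known_values):
    """Weighted prediction of a target metric for program x.

    x            : coordinates of the user program in the PC space
    benchmarks   : list of benchmark coordinates in the same space
    known_values : known target metric (e.g. speedup) of each benchmark
    ASSUMPTION: weight_i is proportional to 1 / distance_i, weights sum to 1.
    """
    distances = [math.dist(x, b) for b in benchmarks]
    # A benchmark that coincides with x determines the prediction on its own.
    for d, value in zip(distances, known_values):
        if d == 0.0:
            return value
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    weights = [v / total for v in inverse]
    return sum(w * value for w, value in zip(weights, known_values))

# Made-up coordinates and speedups, for illustration only.
benchmarks = [[1.0, 0.0], [0.0, 3.0]]
speedups = [2.0, 4.0]
predicted = predict_with_weights([0.0, 0.0], benchmarks, speedups)
```

Here the distances are 1 and 3, so the nearer benchmark receives weight 0.75 and the farther one 0.25.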
16
Method 2: Predicting performance using clustering
  • Measure all the inherent characteristics for the
    benchmarks and user program X
  • Cluster all the programs based on the inherent
    characteristics and find optimal clusters

[Diagram: user program X grouped into a cluster
with similar benchmarks]
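
A rough Python sketch of the clustering approach follows. Several parts are assumptions not stated on the slide: plain k-means stands in for the clustering algorithm, the number of clusters `k` is supplied by the caller rather than optimized, and the prediction is taken as the mean target metric of the benchmarks that share a cluster with X. All names and numbers are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def predict_with_clustering(x, benchmarks, known_values, k):
    """Cluster the benchmarks together with x, then average x's cluster mates.

    ASSUMPTION: the prediction is the mean target metric of the benchmarks
    in x's cluster; returns None if x ends up alone in its cluster.
    """
    labels = kmeans(benchmarks + [x], k)
    x_label = labels[-1]
    mates = [v for v, lab in zip(known_values, labels[:-1]) if lab == x_label]
    return sum(mates) / len(mates) if mates else None

# Made-up coordinates and miss rates, for illustration only.
benchmarks = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
miss_rates = [0.02, 0.10, 0.04, 0.12]
predicted = predict_with_clustering([1.0, 0.0], benchmarks, miss_rates, k=2)
```

In this toy input, X lands in the cluster of the first and third benchmarks, so the prediction is the average of their two miss rates.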
18
Experiments
  • Used integer programs from the SPEC CPU2000
    suite to demonstrate the use of Method 1 and
    Method 2
  • Prediction of speedup
    • Used all the workload characteristics to form
      the workload space
  • Prediction of cache miss rates
    • Used only the data locality characteristics to
      form the workload space

19

Predicting speedup (1)
  • Experiment: Predict performance (speedup) of
    bzip2 using benchmarks from the SPEC CPU2000
    suite
  • Assume that bzip2 is the customer application
  • Performance of SPEC CPU2000 benchmarks is known

[Figure: speedup for each benchmark program on a
machine, from the scores reported on the SPEC
website]
20
Predicting speedup (2)
Method 1: Predicting speedup using weights
Machine name: SGI-Altix 3000 (1500 MHz, Itanium 2)
21
Predicting speedup (3)
Method 1: Predicting speedup using weights
[Figure: error in predicted speedup for 50
different machines]
22
Predicting speedup (4)
Method 2: Predicting speedup using clustering
The average error in predicting the speedup over
all machines for bzip2 is 20.29
23
Prediction of data cache miss rates (1)
Method 1: Using weights for prediction
Note: Assume every program to be a customer
application, one at a time
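
The "one at a time" evaluation in the note can be sketched as a leave-one-out loop. The error definition (relative error), the `predict` callback signature, and the placeholder predictor are assumptions for illustration; any prediction function with the same signature could be plugged in.

```python
def leave_one_out_errors(names, features, values, predict):
    """Treat each program in turn as the customer application.

    predict(x, other_features, other_values) is any prediction function.
    ASSUMPTION: the error is |predicted - actual| / actual.
    """
    errors = {}
    for i, name in enumerate(names):
        others_f = features[:i] + features[i + 1:]
        others_v = values[:i] + values[i + 1:]
        predicted = predict(features[i], others_f, others_v)
        errors[name] = abs(predicted - values[i]) / values[i]
    return errors

def mean_predictor(x, other_features, other_values):
    """Placeholder predictor: ignores similarity and returns the plain mean."""
    return sum(other_values) / len(other_values)

# Made-up program names, coordinates, and miss rates, for illustration only.
names = ["progA", "progB", "progC"]
features = [[0.0], [1.0], [2.0]]
miss_rates = [0.10, 0.20, 0.30]
errors = leave_one_out_errors(names, features, miss_rates, mean_predictor)
```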
24
Prediction of data cache miss rates (2)
Method 2: Using clustering for prediction
Note: Assume every program to be a customer
application, one at a time
26
Conclusion
  • Demonstrated two simple methods to predict
    performance
  • Used SPEC CPU2000 as an example to predict
    performance.
  • The accuracy of prediction depends on two
    factors:
    • How well the workload characteristics
      correlate to performance
    • Whether there is a program similar to the
      customer application in the repository of
      known programs

27
Future Work
  • Two main items on the to-do list:
    • Add more programs to the repository and
      validate the results
    • Calibrate the measure of similarity (distance)
      in the workload space to the error in the
      target metric space

28
  • Thank you !!