Lect 16: Benchmarks and Performance Metrics
1
Lect 16: Benchmarks and Performance Metrics
2
Measurement Tools
  • Benchmarks, Traces, Mixes
  • Cost, delay, area, power estimation
  • Simulation (many levels)
  • ISA, RT, Gate, Circuit
  • Queuing Theory
  • Rules of Thumb
  • Fundamental Laws

3
Marketing Metrics
  • MIPS = Instruction Count / (Time × 10^6) = Clock Rate / (CPI × 10^6)
  • Machines with different instruction sets?
  • Programs with different instruction mixes?
  • Dynamic frequency of instructions
  • Uncorrelated with performance
  • MFLOPS = FP Operations / (Time × 10^6)
  • Machine dependent
  • Often not where time is spent

Normalized MFLOPS operation weights: add, sub, compare, mult = 1; divide, sqrt = 4; exp, sin, ... = 8
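As a sketch of how normalized MFLOPS is computed from these weights (the operation counts and time below are made up for illustration, not taken from the slides):

```python
# Normalized MFLOPS using the weights above:
# add/sub/compare/mult = 1, divide/sqrt = 4, exp/sin/... = 8.
WEIGHTS = {"add": 1, "sub": 1, "compare": 1, "mult": 1,
           "divide": 4, "sqrt": 4, "exp": 8, "sin": 8}

def normalized_mflops(op_counts, time_sec):
    """op_counts maps FP operation name -> dynamic execution count."""
    weighted_ops = sum(WEIGHTS[op] * n for op, n in op_counts.items())
    return weighted_ops / (time_sec * 1e6)

# Hypothetical workload: 8M adds, 1M divides, 0.5M exps in 2 seconds
print(normalized_mflops({"add": 8e6, "divide": 1e6, "exp": 0.5e6}, 2.0))  # 8.0
```

The weighting credits a divide or sqrt as four "normalized" operations and an exp or sin as eight, so machines are not penalized for workloads heavy in expensive operations.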
4
Fallacies and Pitfalls
  • MIPS is an accurate measure for comparing
    performance among computers
  • dependent on the instruction set
  • varies between programs on the same computer
  • can vary inversely with performance
  • MFLOPS is a consistent and useful measure of
    performance
  • dependent on the machine and on the program
  • not applicable outside floating-point code
  • the set of floating-point operations is not consistent across machines
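The "can vary inversely with performance" pitfall can be shown with a toy calculation (all numbers hypothetical): a machine using software floating-point emulation executes many fast integer instructions, so its MIPS rating rises even as the program runs slower.

```python
def mips(instruction_count, time_sec):
    # MIPS = instruction count / (execution time * 10^6)
    return instruction_count / (time_sec * 1e6)

# Same program, hypothetical dynamic counts and times:
hw_fp = mips(instruction_count=50e6, time_sec=1.0)    # hardware FP: 50 MIPS
sw_fp = mips(instruction_count=300e6, time_sec=3.0)   # software FP: 100 MIPS

# The software-FP run shows twice the MIPS yet takes three times as long:
# the higher MIPS number corresponds to worse performance.
```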

5
Programs to Evaluate Processor Performance
  • (Toy) Benchmarks
  • 10-100 line program
  • e.g. sieve, puzzle, quicksort
  • Synthetic Benchmarks
  • Attempt to match average frequencies of real
    workloads
  • e.g., Whetstone, Dhrystone
  • Kernels
  • Time critical excerpts of real programs
  • e.g., Livermore loops
  • Real programs
  • e.g., gcc, spice

6
Types of Benchmarks
  • Architectural
  • Synthetic mixes WHETSTONE, DHRYSTONE, ...
  • Algorithmic
  • LINPACK
  • Kernels
  • Self-contained sub-programs (e.g., PDE solvers) without input/output
  • Production
  • Working code for a significant problem
  • PERFECT and SPEC
  • Workload

7
Levels of Benchmark Specification
  • Problem Statement
  • Algorithm and code production
  • Reflects the implementer's effort and skill more than the system's capability
  • Solution Method
  • NASA Ames
  • Reflects the implementer's effort and skill more than the system's capability
  • Source Language Code
  • Every system performs the same operations
  • the necessary baseline from which to measure the effectiveness of smart compiler options

8
Benchmarking Games
  • Differing configurations used to run the same
    workload on two systems
  • Compiler wired to optimize the workload
  • Workload arbitrarily picked
  • Very small benchmarks used
  • Benchmarks manually translated to optimize
    performance

9
Common Benchmarking Mistakes
  • Only average behavior represented in test
    workload
  • Not ensuring same initial conditions
  • Benchmark engineering
  • particular optimization
  • different compilers or preprocessors
  • runtime libraries

10
Benchmarks
  • DHRYSTONE
  • A synthetic benchmark
  • Non-numeric system-type programming
  • Contains fewer loops, simpler calculations, and more if statements
  • C code
  • LINPACK
  • Argonne National Lab
  • Solution of linear equations in FORTRAN
    environment
  • Specified at the solution-method and source-code levels
  • Vectorised processors
  • SPEC
  • Standard Performance Evaluation Corp.
  • Non-profit group of computer vendors, systems
    integrators, universities, research
    organizations, publishers and consultants
    throughout the world
  • http://www.specbench.org

11
SPEC
  • Groups
  • Open Systems Group (OSG)
  • CPU committee
  • SFS committee: file server benchmarks
  • SDM committee: multi-user Unix commands benchmarks
  • High Performance Group (HPG)
  • SMP, workstation clusters, DSM, vector processors, ...
  • Graphics Performance Characterization Group (GPC)
  • What metrics can be measured?
  • CINT95 and CFP95
  • "C" prefix: component-level benchmarks
  • measure the performance of the processor, the memory architecture, and the compiler
  • I/O, networking, and graphics are not measured by CINT95 and CFP95
  • "S" prefix: system-level benchmarks

12
SPEC: System Performance Evaluation Cooperative
  • First Round 1989
  • 10 programs yielding a single number
  • Second Round 1992
  • SpecInt92 (6 integer programs) and SpecFP92 (14 floating-point programs)
  • Reference machine: VAX-11/780
  • Third Round 1995
  • Single flag setting for all programs; new set of programs ("benchmarks useful for 3 years")
  • non-baseline and baseline measurements
  • Reference machine: SPARCstation 10 Model 40
  • Fourth Round 1998
  • Under development

13
SPEC First Round
  • One program spent 99% of its time in a single line of code
  • A new compiler front end could improve performance dramatically

14
CPU95
  • CINT95
  • 099.go An internationally ranked go-playing
    program.
  • 124.m88ksim A chip simulator for the Motorola
    88100 microprocessor.
  • 126.gcc Based on the GNU C compiler version
    2.5.3.
  • 129.compress An in-memory version of the common
    UNIX utility.
  • 130.li Xlisp interpreter.
  • 132.ijpeg Image compression/decompression on
    in-memory images.
  • 134.perl An interpreter for the Perl language.
  • 147.vortex An object oriented database.
  • CFP95
  • 101.tomcatv Vectorized mesh generation.
  • 102.swim Shallow water equations.
  • 103.su2cor Monte-Carlo method.
  • 104.hydro2d Navier-Stokes equations.
  • 107.mgrid 3d potential field.
  • 110.applu Partial differential equations.
  • 125.turb3d Turbulence modeling.
  • 141.apsi Weather prediction.
  • 145.fpppp From Gaussian series of quantum
    chemistry benchmarks.

15
CINT95 (written in C)
  • SPECint95
  • The geometric mean of eight normalized ratios
    (one for each integer benchmark) when compiled
    with aggressive optimization for each benchmark.
  • SPECint_base95
  • The geometric mean of eight normalized ratios
    when compiled with conservative optimization for
    each benchmark.
  • SPECint_rate95
  • The geometric mean of eight normalized throughput
    ratios when compiled with aggressive optimization
    for each benchmark.
  • SPECint_rate_base95
  • The geometric mean of eight normalized throughput
    ratios when compiled with conservative
    optimization for each benchmark.
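All four metrics above are geometric means of normalized ratios (reference time divided by measured time). A minimal sketch, with hypothetical reference and measured times:

```python
import math

def spec_ratio_geomean(ref_times, measured_times):
    """Geometric mean of per-benchmark ratios (reference time /
    measured time), as in the SPECint95-style metrics above."""
    ratios = [ref / t for ref, t in zip(ref_times, measured_times)]
    return math.prod(ratios) ** (1 / len(ratios))

# Hypothetical two-benchmark suite: the test machine runs each
# benchmark twice as fast as the reference machine.
print(spec_ratio_geomean([100.0, 200.0], [50.0, 100.0]))  # 2.0
```

The base vs. non-base variants differ only in which compiled binaries (conservative vs. aggressive optimization) supply the measured times; the aggregation is identical.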

16
CFP95 (written in FORTRAN)
  • SPECfp95
  • The geometric mean of 10 normalized ratios (one
    for each floating point benchmark) when compiled
    with aggressive optimization for each benchmark.
  • SPECfp_base95
  • The geometric mean of 10 normalized ratios when
    compiled with conservative optimization for each
    benchmark.
  • SPECfp_rate95
  • The geometric mean of 10 normalized throughput
    ratios when compiled with aggressive optimization
    for each benchmark.
  • SPECfp_rate_base95
  • The geometric mean of 10 normalized throughput
    ratios when compiled with conservative
    optimization for each benchmark.

17
The Pros and Cons of Geometric Means
  • Independent of the running times of the
    individual programs
  • Independent of the reference machines
  • Do not predict execution time
  • Encourage focusing attention on the benchmarks where performance is easiest to improve
  • 2 sec → 1 sec counts the same as 10000 sec → 5000 sec
  • invites "cracking" and benchmark engineering
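The 2 sec → 1 sec vs. 10000 sec → 5000 sec point can be checked directly: halving any one benchmark's time multiplies the geometric mean by the same factor, regardless of how much absolute time is saved (the two times below are the slide's example; the reference times are assumed equal to the baseline times for simplicity).

```python
import math

def geomean(xs):
    # Geometric mean: nth root of the product of n values
    return math.prod(xs) ** (1 / len(xs))

# Two-benchmark suite with baseline times 2 sec and 10000 sec;
# ratios are reference time / measured time, so baseline ratios are 1.0.
improve_small = geomean([2.0 / 1.0, 10000.0 / 10000.0])   # 2 sec -> 1 sec
improve_big = geomean([2.0 / 2.0, 10000.0 / 5000.0])      # 10000 sec -> 5000 sec

# Both improvements raise the geometric mean by the same factor (sqrt 2),
# even though one saves 1 second of wall-clock time and the other 5000.
```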