1
On the Foundations of Artificial Workload
  • Domenico Ferrari

Presentation by Hari Rangarajan
2
Outline of presentation
  • Introduction of workloads
  • Workload design methods
  • Problems of design
  • Artificial Interactive Workload Design
  • Conclusions

3
What is a Workload
  • Suppose we have a system S; we need a number to
    quantify how well it performs

[Figure: Workload (jobs, I/O requests) -> System S (CPU, disk controller) -> Output; Performance indices = F(System, Workload)]
  • Performance indices are usually
    (response time, throughput, resource utilisation),
    expressed as a vector P
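
As a rough illustration of P = F(System, Workload), the sketch below pushes a tiny workload through a single FIFO server and returns the three indices named above. The `Job` class, the `performance_indices` function and the sample numbers are assumptions made for this example; they are not from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: float  # arrival time in seconds (hypothetical value)
    service: float  # service demand in seconds (hypothetical value)


def performance_indices(jobs):
    """Run the jobs through one FIFO server and return the vector P."""
    clock = 0.0           # time at which the server next becomes free
    busy = 0.0            # total time the server spends serving jobs
    total_response = 0.0  # sum of (completion time - arrival time)
    for job in sorted(jobs, key=lambda j: j.arrival):
        start = max(clock, job.arrival)  # wait if the server is busy
        clock = start + job.service
        busy += job.service
        total_response += clock - job.arrival
    observed = clock - min(j.arrival for j in jobs)  # observation period
    return {
        "mean_response_time": total_response / len(jobs),
        "throughput": len(jobs) / observed,
        "utilisation": busy / observed,
    }


workload = [Job(0.0, 2.0), Job(1.0, 1.0), Job(5.0, 2.0)]
P = performance_indices(workload)
```

Changing either the system (here, the single server) or the workload (the job list) changes P, which is why a workload model is needed before performance can be evaluated.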

4
Why Model a Workload
  • System S can be abstracted as an analytic or
    simulation model.
  • System performance needs to be analysed, which
    needs a workload, and hence a model of the
    workload
  • The system can also be real, in which case the
    workload model can be run on it and results
    obtained; this is called the measurement approach

5
Measurement Approach
  • Applicable only to real, measurable workloads
  • Natural workload
      • A sample of a real workload, chosen according
        to a criterion, e.g. time of day
  • Artificial workload
      • Synthetic benchmarks and scripts
  • Must be executable on the systems

6
Artificial Workloads
  • Pros
      • More reproducible
      • Can model future workloads
      • Easier to port
  • Cons
      • More expensive to build
      • Potentially less accurate
      • Needs time on the target system

7
Generic Design method
  • How to construct an executable model of a real,
    measurable workload
  • Identify the basic components of the workload, e.g.
    job, interaction, command
  • Parametrize components by
      • Physical resources: CPUs, main memory etc.
      • Logical resources: language processors, editors
        etc.
      • Functional resources: compiling, editing etc.
  • Component = F(physical resources, logical resources,
    functional resources)

8
Generic Design Method (contd..)
  • Measurement of the real workload while it executes
    on the system
  • Statistical analysis
      • Analyse parameter distributions, transforming
        measurements within limits
      • Sampling: to reduce processing time and storage
      • Static analysis: classification and partitioning
        of workload components
      • Clustering, principal component analysis: reduce
        the given data set into classes of homogeneous
        components

9
Generic Design Methods (contd ..)
  • Statistical analysis
  • Dynamic analysis
      • Properties of the time series must be considered;
        this is essential when the time-varying
        characteristics of the workload are important.
      • Numerical fitting, statistical analysis of
        non-stationary series of events, and stochastic
        process modelling are used.

10
Design Method Summary
  • The values of the parameters are measured for each
    component, making a tuple.
  • Statistical techniques (clustering and sampling) are
    applied to the tuples, reducing the number of tuples
    representing the components
  • Each remaining tuple is then replaced by a workload
    component characterised by that tuple; together
    these components constitute the workload model
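
The three steps above can be sketched as follows. The slides name clustering only in general terms; plain k-means, the two-parameter tuples (CPU seconds, I/O operations), the starting centroids and the use of the cluster centroid as the representative are all assumptions chosen for illustration.

```python
def kmeans(points, centroids, iterations=20):
    """Plain k-means on tuples of equal length (illustrative only)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each tuple to the nearest centroid (squared distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Step 1: one measured tuple per workload component (hypothetical values).
# Features are left unscaled here; real data would usually be normalised.
tuples = [(0.1, 5), (0.2, 6), (0.15, 4), (3.0, 40), (2.8, 42), (3.2, 38)]

# Step 2: reduce the tuples to a few homogeneous classes.
centroids, clusters = kmeans(tuples, centroids=[tuples[0], tuples[3]])

# Step 3: the model is one representative per class plus its weight.
model = [(c, len(members) / len(tuples)) for c, members in zip(centroids, clusters)]
```

Here six measured components collapse into two representatives, each standing for half of the original workload.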

11
Representativeness of Model
  • How accurate is the workload model?
  • W is the real workload, P is its performance
    index
  • W' is the artificial workload, P' is its
    performance index
  • The model is accurate if P' = P

[Figure: W applied to system S yields P; W' applied to system S yields P']
12
Problems in design methods
  • Parameters represent resource demands at various
    levels of abstraction.
  • Which resources should be included in the
    characterisation?
  • The necessity and sufficiency of the parameters are
    not known

13
Problems (contd)
  • The accuracy of a workload model is not defined.
  • No metric is available to qualify or quantify it
  • Statistical techniques are applied to a population
    of tuples that contain no temporal information,
    so dynamic behaviour is ignored.

14
Design of Artificial Interactive workload
  • Interactive system with m users
  • Product Form solution
  • Performance Indices can be obtained by solving
    the network
  • Ignore the dynamics of the workload

[Figure: m users (1, 2, ..., m) connected to a central subsystem of N-1 stations; users type a sequence of commands]
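
One standard way to "solve the network" for a product-form closed model is exact Mean Value Analysis (MVA). The slides do not say which solution method is used, so the sketch below is an assumption: a single customer class, fixed-rate queueing stations in the central subsystem, and the m user terminals modelled as a pure delay with a mean think time. The function name and the numeric values are hypothetical.

```python
def mva(demands, think_time, users):
    """Exact single-class MVA for an interactive closed network.

    demands    -- mean service demand per interaction at each station (s)
    think_time -- mean user think time at the terminals (s)
    users      -- number of interactive users, m >= 1
    """
    if users < 1:
        raise ValueError("users must be at least 1")
    queue = [0.0] * len(demands)  # mean queue lengths with 0 users
    for n in range(1, users + 1):
        # An arriving customer sees the queue left by n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    utilisation = [throughput * d for d in demands]
    return throughput, response, utilisation


# Hypothetical system: CPU demand 0.5 s, disk demand 0.3 s, 5 s think time.
X, R, U = mva([0.5, 0.3], think_time=5.0, users=10)
```

The returned values correspond to the performance indices of the earlier slides: throughput, response time and per-station utilisation.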
15
Modelling the system
  • Identify the basic component of the system: job,
    command or interaction
  • Measurement

16
User Behaviour Graph
  • Each state represents a command
  • Users type a sequence of commands with a defined
    probability

[Figure: user behaviour graph. Labels: Dormant (user is not in the system), Login, Logout of system, Quit]
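
A user behaviour graph of this kind can be treated as a Markov chain over command types. The sketch below is a minimal illustration: the command names and branching probabilities are invented for the example, and computing visit ratios by repeated multiplication is just one simple way to do it.

```python
import random

# Hypothetical user behaviour graph: state -> {next state: probability}.
# "dormant" means the user is not in the system; leaving it is a login.
UBG = {
    "dormant": {"edit": 1.0},
    "edit": {"edit": 0.2, "compile": 0.6, "dormant": 0.2},
    "compile": {"run": 0.5, "edit": 0.5},
    "run": {"edit": 0.7, "dormant": 0.3},
}


def visit_ratios(graph, iterations=1000):
    """Relative visit frequency of each state (normalised to sum to 1)."""
    ratios = {state: 1.0 / len(graph) for state in graph}
    for _ in range(iterations):
        nxt = {state: 0.0 for state in graph}
        for state, branches in graph.items():
            for target, prob in branches.items():
                nxt[target] += ratios[state] * prob
        ratios = nxt
    return ratios


def sample_session(graph, start="dormant", rng=None):
    """Generate the commands typed between one login and the next logout."""
    rng = rng or random.Random(0)
    state, commands = start, []
    while True:
        targets = list(graph[state])
        weights = [graph[state][t] for t in targets]
        state = rng.choices(targets, weights=weights)[0]
        if state == start:  # user returned to the dormant state
            return commands
        commands.append(state)


ratios = visit_ratios(UBG)
session = sample_session(UBG)
```

`sample_session` shows how the graph drives an executable artificial workload, while `visit_ratios` gives the quantities the later theorems refer to.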
17
Reflecting the graph in the model
  • There are R different command types in the graph
  • Model R different classes of customers with a
    defined probability of changing classes
    (executing jobs)
  • A user can change class only when moving from a
    station in the central subsystem to station 1.

18
Illustration
[Figure: m users (1, 2, ..., m) and a central subsystem of N-1 stations; users change between Class/Command a and Class/Command b with a branching probability]
19
Building on the model
  • Next step: generic design methods reduce the
    number of command types of the workload
    proportionally, ignoring sequential links
  • How valid is this assumption?
  • Theorem 1: the equilibrium state probabilities of
    the queueing network are invariant to any change in
    the user behaviour graph which does not modify
    the visit ratios of the command types.
  • This proves the assumption is valid

20
Implication of Theorem 1
  • Replace the user behaviour graph by an equivalent one
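
One way to build an equivalent graph, in the sense of having the same visit ratios, is to drop the sequential links entirely and draw every next command independently with probability equal to its visit ratio. The sketch below checks that this construction preserves the visit ratios; the 3x3 transition matrix is hypothetical, and the code illustrates only the premise of Theorem 1, not the theorem itself.

```python
def stationary(matrix, iterations=2000):
    """Visit ratios of a transition matrix, normalised to sum to 1."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical user behaviour graph over three command types.
original = [
    [0.1, 0.9, 0.0],
    [0.0, 0.2, 0.8],
    [0.5, 0.0, 0.5],
]
ratios = stationary(original)

# Equivalent graph: every row equals the visit-ratio vector, so the next
# command no longer depends on the current one (no sequential links).
equivalent = [list(ratios) for _ in original]
equivalent_ratios = stationary(equivalent)
```

Both graphs visit each command type with the same relative frequency even though their sequencing differs completely.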

21
Comments on Theorem 1
  • Workload models built by this method cannot be
    implemented in an arbitrary way
  • The model will be accurate performance-wise if it
    simulates the same number of users as the
    workload, using the original or an equivalent graph
  • Problem: does this probabilistic graph
    satisfactorily map the behaviour of all real
    users in a system?
  • The model loses accuracy if we change the number
    of users or the behaviour at each node

22
Static analysis on Workload Model
  • Apply a clustering technique
  • Each command type in the user behaviour graph (UBG)
    is characterised by distributions of service times,
    branching probabilities and distributions of
    terminal times.
  • Map them into the state space and group clouds of
    neighbouring points into clusters
  • Clustering reduces the state space

23
Clustering on the UBG
[Figure: nine classes of commands in the UBG clustered into four superclasses]
24
Does clustering affect validity of model
  • Define global performance indices: mean
    throughput rate, mean response time,
    utilisations, mean queue lengths and waiting
    times
  • Theorem 2 (validating clustering): the values of
    all global performance indices of the queueing
    network are invariant with respect to
    aggregations of classes with identical demands if
    each superclass has a visit ratio in the UBG
    equal to the sum of the visit ratios of its
    members, and each non-aggregated class retains
    its previous ratio
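
The aggregation rule in Theorem 2 can be sketched as below. The class names, visit ratios and per-station demands are hypothetical. The final check confirms only that the visit-ratio-weighted demand at each station is unchanged by the aggregation; it is a sanity check on the rule, not a proof of the theorem.

```python
# Hypothetical command classes: visit ratio in the UBG and mean service
# demand (seconds) at each of two stations in the central subsystem.
classes = {
    "ls": {"visit_ratio": 0.30, "demand": (0.02, 0.01)},
    "cat": {"visit_ratio": 0.20, "demand": (0.02, 0.01)},
    "cc": {"visit_ratio": 0.50, "demand": (1.50, 0.40)},
}


def aggregate(classes):
    """Merge classes with identical demands into superclasses.

    A superclass gets the sum of its members' visit ratios; a class
    with unique demands keeps its previous visit ratio.
    """
    groups = {}
    for name, info in classes.items():
        group = groups.setdefault(info["demand"], {"members": [], "visit_ratio": 0.0})
        group["members"].append(name)
        group["visit_ratio"] += info["visit_ratio"]
    return {
        "+".join(g["members"]): {"visit_ratio": g["visit_ratio"], "demand": demand}
        for demand, g in groups.items()
    }


def weighted_demand(classes):
    """Visit-ratio-weighted mean demand at each station."""
    stations = len(next(iter(classes.values()))["demand"])
    return [
        sum(c["visit_ratio"] * c["demand"][k] for c in classes.values())
        for k in range(stations)
    ]


superclasses = aggregate(classes)
before = weighted_demand(classes)
after = weighted_demand(superclasses)
```

In this example `ls` and `cat` merge into one superclass with visit ratio 0.5, while `cc` is left as it was.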

25
Theorem 2 - Observations
  • Clustering can produce very accurate results,
    provided the number of users and their behaviour
    remain unchanged from the original, unclustered graph
  • One representative per cluster is enough
  • Clustering is better than other reduction methods
    (based on the study)

26
Insights on modelling user workloads
  • Suppose the workload is described by a collection
    of disjoint user behaviour graphs
  • Performance-oriented model accuracy can be
    obtained only if each graph is dealt with
    separately

[Figure: user behaviour graphs of customer types A, B, C and D]
27
Summary of design
Artificial Workload Model
  • This artificial workload model is able to emulate
    the characteristics of the original workload
    in accordance with the set of performance criteria
    we are interested in

28
Conclusions
  • Problem 1: choosing the parameters which have a
    significant effect on performance
  • Solution
      • Characterise the workload with the resources that
        are explicitly specified in the queueing model
      • The more resources the queueing model can take
        into account for performance, the more resources
        the workload characterisation can include
      • E.g. in our case: number of users, the command
        types

29
Conclusions (contd..)
  • Problem 2: accuracy of the workload model
  • Consider the global performance indices which we
    are interested in
  • The theorems prove the accuracy of the final
    workload model, as long as it is
    representative of the original user behaviour
    graph
  • Static analysis techniques like sampling and
    clustering are valid (in our case)

30
Conclusions (contd..)
  • Problem 3: ignoring the dynamics of the workload
    when doing static analysis
  • Dynamics need not always be considered
      • System performance indices do not depend on the
        order of execution of commands (in our case)
  • Dynamics should be considered when
      • The order of execution is important (scheduling)
  • The solution cannot be applied in that case
      • It violates the steady-state assumption
      • It may not satisfy the product-form queueing model