Title: KPC-Toolbox: Simple Yet Effective Trace Fitting Using Markovian Arrival Processes
1. KPC-Toolbox: Simple Yet Effective Trace Fitting Using Markovian Arrival Processes
College of William and Mary, Department of Computer Science, Williamsburg, Virginia
- Giuliano Casale
- Eddy Z. Zhang
- Evgenia Smirni
- casale,eddy,esmirni_at_cs.wm.edu
- Speaker: Giuliano Casale
QEST 2008, St. Malo, France
2. Burstiness in Measured Workloads
[Figure: example traces; labels: Hyper-Exponential, Independent, Temporal dependence]
3. Goals of this work
- Fitting traces with temporal dependence is harder but useful
- Burstiness is often associated with large performance slowdown
- Analytical modeling of bursty phenomena is often based on Markov-modulated processes (e.g., IPPs, MMPPs, MAPs)
- MAPs are tractable and general, but hard to fit!
- Open challenges
- How do we fit large MAPs?
- Which statistical descriptors matter the most in MAP fitting?
- Tools: how do we fit MAPs automatically?
4. Outline
- How do we fit large MAPs?
- Kronecker Product Composition (KPC) method
- Which statistical descriptors matter the most in MAP fitting?
- Sensitivity analysis of queueing models
- Tools: how do we fit automatically?
- KPC-Toolbox algorithmic implementation
- Practical implementation of KPC fitting
- Automatic MAP order selection
5. Markovian Arrival Process (MAP)
- Exponential sojourn times in each state (Markov process)
- Descriptors are simple algebraic functions of D0 and D1
- D0: background transitions
- D1: job arrivals (tagged transitions)
- Three-state example (State 1, State 2, State 3):
D0 =
[ -S...    0      s13  ]
[   0    -S...    s23  ]
[  s31     0     -S... ]
D1 =
[  0    m12     0  ]
[  0     0     m23 ]
[  0     0      0  ]
- Moments (mean, CV, ...)
- Joint moments (ACF, correlations)
- Embedded DTMC (temporal dependence descriptor)
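The statement that MAP descriptors are simple algebraic functions of D0 and D1 can be illustrated with a short sketch. It uses the standard textbook formulas for the embedded chain P = (-D0)^-1 D1, the moments E[X^k] = k! pi (-D0)^-k 1, and the joint moment E[X_0 X_k] = pi (-D0)^-1 P^k (-D0)^-1 1. The function name `map_descriptors` and the numeric rates are illustrative assumptions; only the sparsity pattern of the matrices follows the slide.

```python
import numpy as np

def map_descriptors(D0, D1, lag=1):
    """Return (mean, SCV, lag-`lag` autocorrelation) of the MAP inter-arrival times."""
    D0 = np.asarray(D0, dtype=float)
    D1 = np.asarray(D1, dtype=float)
    n = D0.shape[0]
    ones = np.ones(n)
    M = np.linalg.inv(-D0)            # (-D0)^-1
    P = M @ D1                        # embedded DTMC at arrival instants
    # Stationary vector of P: solve pi P = pi together with pi 1 = 1.
    A = np.vstack([P.T - np.eye(n), ones])
    b = np.append(np.zeros(n), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    m1 = pi @ M @ ones                # E[X]
    m2 = 2.0 * pi @ M @ M @ ones      # E[X^2]
    var = m2 - m1 ** 2
    joint = pi @ M @ np.linalg.matrix_power(P, lag) @ M @ ones  # E[X_0 X_lag]
    return m1, var / m1 ** 2, (joint - m1 ** 2) / var

# Hypothetical rates (not from the talk) placed in the slide's sparsity pattern.
s13, s23, s31 = 1.0, 0.5, 2.0
m12, m23 = 3.0, 0.2
D1 = np.array([[0, m12, 0], [0, 0, m23], [0, 0, 0]])
D0 = np.array([[0, 0, s13], [0, 0, s23], [s31, 0, 0]])
D0 = D0 - np.diag((D0 + D1).sum(axis=1))   # diagonal makes each row of D0 + D1 sum to zero

mean, scv, acf1 = map_descriptors(D0, D1)
print(mean, scv, acf1)
```

As a sanity check, a Poisson process written as a MAP(1), `map_descriptors([[-2.0]], [[2.0]])`, gives mean 0.5, SCV 1 and zero autocorrelation.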
6. Complexity of MAP Fitting
- Easy: from (D0, D1) to the descriptors E[X], E[X^2], E[X_i X_{i+j}], ...
- Hard: from the descriptors back to (D0, D1)
- E.g., how to impose E[X^5]? Fifth-order nonlinear equation
- Mathematical constraints to obtain well-formed MAPs
- Sign constraints on entries of D0 and D1
- A MAP(n) has limited degrees of freedom [TelekHorvath07]
- Moment and correlation values may be infeasible
- E.g., MAP(2) autocorrelation is always smaller than 0.5
- MAP fitting is often intractable by exact moment/correlation matching
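The well-formedness constraints mentioned above can be checked mechanically. The sketch below tests the usual conditions for a MAP: non-negative D1, non-negative off-diagonal entries of D0, a negative diagonal of D0, and rows of D0 + D1 summing to zero. The helper name `is_valid_map` and the tolerance are assumptions made for illustration, and the sketch does not test irreducibility.

```python
import numpy as np

def is_valid_map(D0, D1, tol=1e-9):
    """Check the sign and row-sum constraints of a MAP representation (D0, D1)."""
    D0 = np.asarray(D0, dtype=float)
    D1 = np.asarray(D1, dtype=float)
    off_diag = D0 - np.diag(np.diag(D0))
    return bool(
        np.all(D1 >= -tol)                   # arrival rates are non-negative
        and np.all(off_diag >= -tol)         # background rates are non-negative
        and np.all(np.diag(D0) < 0)          # every state has a finite sojourn time
        and np.allclose((D0 + D1).sum(axis=1), 0.0, atol=tol)  # D0 + D1 is a generator
    )

print(is_valid_map([[-3.0, 1.0], [0.0, -2.0]], [[2.0, 0.0], [1.0, 1.0]]))   # well-formed
print(is_valid_map([[-3.0, -1.0], [0.0, -2.0]], [[4.0, 0.0], [1.0, 1.0]]))  # negative rate in D0
```

A fitting procedure that solves moment equations directly can return matrices that fail this check, which is one reason exact matching is often infeasible.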
7. Example of Naive Fitting
- Pragmatic approach: match a MAP(2) by exact formulas
- Example: Bellcore August-89 trace
[Figure: queueing results; labels: Simulated Trace/M/1 queue, Solve MAP(2)/M/1 queue]
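A trace-driven single-server FIFO queue such as the Trace/M/1 queue can be simulated with the Lindley recursion W[n] = max(0, W[n-1] + S[n-1] - A[n]). The sketch below is only one possible way to do this: the function name, the synthetic stand-in trace, and the service rate are assumptions, and the talk does not say how its simulation was implemented.

```python
import numpy as np

def waiting_times(interarrivals, services):
    """Lindley recursion for a FIFO single-server queue.

    interarrivals[n] is the time between the arrivals of jobs n-1 and n
    (interarrivals[0] is ignored); services[n] is the service time of job n.
    """
    a = np.asarray(interarrivals, dtype=float)
    s = np.asarray(services, dtype=float)
    w = np.zeros(len(a))
    for n in range(1, len(a)):
        w[n] = max(0.0, w[n - 1] + s[n - 1] - a[n])
    return w

# Hypothetical usage: a synthetic stand-in for a measured trace, exponential service.
rng = np.random.default_rng(0)
trace = rng.exponential(1.0, size=10_000)            # placeholder inter-arrival times
service = rng.exponential(0.5, size=len(trace))      # M service at 50% utilization
w = waiting_times(trace, service)
print(w.mean())
```

Replacing the placeholder `trace` with measured inter-arrival times gives the trace-driven queue; the queue-length distribution can then be derived from the same sample path.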
8. How to Fit Large MAPs?
9. Kronecker Product Composition (KPC)
- Method to obtain large MAPs with predefined properties
- Composition by Kronecker products
- KPC properties: composition of statistics (moments, correlations)
- KPC is a divide-and-conquer approach to MAP fitting
- E.g., the KPC process has mean 1 if MAPa has mean 2 and MAPb has mean 1/2
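The mean example above can be reproduced numerically. The sketch assumes the composition D0 = -(D0a ⊗ D0b), D1 = D1a ⊗ D1b with D0a diagonal, which is how the KPC literature commonly defines the operation; the slide itself does not spell the formula out. The two MAP(2)s and the helper names are hypothetical.

```python
import numpy as np

def map_mean(D0, D1):
    """Mean inter-arrival time, pi (-D0)^-1 1, with pi stationary for the embedded chain."""
    n = D0.shape[0]
    M = np.linalg.inv(-D0)
    P = M @ D1
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(pi @ M @ np.ones(n))

def rescale(D0, D1, target_mean):
    """Scale all rates so that the mean inter-arrival time equals target_mean."""
    c = map_mean(D0, D1) / target_mean
    return c * D0, c * D1

def kpc(D0a, D1a, D0b, D1b):
    """Assumed KPC definition: D0 = -(D0a kron D0b), D1 = D1a kron D1b."""
    return -np.kron(D0a, D0b), np.kron(D1a, D1b)

# Hypothetical MAP(2)s; MAPa has a diagonal D0 so that the composed process is well-formed.
D0a = np.array([[-1.0, 0.0], [0.0, -10.0]])
D1a = np.array([[0.9, 0.1], [1.0, 9.0]])
D0b = np.array([[-2.0, 1.0], [0.0, -3.0]])
D1b = np.array([[1.0, 0.0], [1.0, 2.0]])

D0a, D1a = rescale(D0a, D1a, 2.0)    # MAPa with mean 2
D0b, D1b = rescale(D0b, D1b, 0.5)    # MAPb with mean 1/2
D0k, D1k = kpc(D0a, D1a, D0b, D1b)   # MAP(4)
print(map_mean(D0k, D1k))            # product of the two means
```

Under this definition the means multiply, and higher moments compose as E[X^k] = E[Xa^k] E[Xb^k] / k!, which is what makes it possible to fit small MAPs separately and then combine them.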
10. KPC Divide-and-Conquer Fitting
- A MAP(2) has 4 degrees of freedom (3 moments + ACF lag-1)
- Divide-and-conquer (D&C) KPC fitting
- Fit a collection of MAP(2)s by exact fitting formulas
- Choose the moments and ACF of the MAP(2)s to impose the desired moments and correlations in the final MAP
- Problem: what do we want to impose in the final MAP?
- Which moments?
- Which correlations?
[Figure: KPC composition chain. Two MAP(2)s with 4 degrees of freedom each are composed into a MAP(4) with 8 degrees of freedom; composing it with a third MAP(2) gives a MAP(8), the KPC process, with 12 degrees of freedom.]
11. Which descriptors matter?
12. Sensitivity Analysis Methodology
- Performance of MAP/M/1: buffer overflow probability
- Sensitivity w.r.t. maximum queue-length (overflow prob. < 10^-8)
- Step 1: MAP(2)/M/1 sensitivity analysis
- Changes of MEAN, SCV, SKEW, ACF lag-1
- Step 2: validation using MAP(4)/M/1 sensitivity analysis
- Higher-order descriptors: SKEW (fixed SCV), higher-order moments, higher-order correlations
13. Sensitivity to Higher-Order Moments
- SKEW impact always strong, SCV impact sometimes strong
[Figure: autocorrelated MAP(2) with SCV=10, SKEW=5; panels: Smaller SCV, Larger SKEW; axis labels: Maximum Queue-Length, Heavier Load]
14. Conclusions on MAP(2) Sensitivity
- SKEW impact can be much larger than SCV impact
- Change of SKEW is an unclear metric
- higher-order moments (tail)
- higher-order correlations
- MAP(2)/M/1 without correlations shows low impact of SKEW
- Likely to be an effect of higher-order correlations
- Are higher-order correlations critical performance drivers?
15. MAP(4) Sensitivity
- We use KPC to generate two MAP(4)s
- Higher-order moments and ACF identical
- Higher-order correlations very different
- One MAP(4) has much larger temporal dependence than the other
[Figure: Maximum Queue-Length for the two MAP(4)s; labels: Larger Dependence, Smaller Dependence]
16. Sensitivity Analysis Conclusions
- Moment matching is indeed important, but
- Fitting higher-order correlations has priority over fitting higher-order moments
- Our general proposal for MAP fitting focuses on correlations
- Use the KPC methodology to fit large MAPs
- Fit three moments to capture the trace distribution
- Mostly focus on fitting the ACF and higher-order correlations
KPC-Toolbox Algorithmic Implementation
17. How to fit automatically? (KPC-Toolbox)
18. KPC-Toolbox Design
- Input: trace
- Extract statistics: moments, ACF, higher-order (HO) correlations
- Order selection: MAP size N
- Nonlinear optimization: fits log2 N MAP(2)s (log2 N is the number of MAP(2)s), with a randomized initial point
- KPC: composes the MAP(2)s into the final MAP(N)
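The statistics-extraction stage can be sketched with plain sample estimators for the mean, SCV, skewness and autocorrelation function of a trace. This is an illustration only: the function name `trace_statistics`, the default number of lags and the choice of biased estimators are assumptions, not a description of the toolbox code.

```python
import numpy as np

def trace_statistics(trace, max_lag=10):
    """Sample mean, SCV, skewness and ACF (lags 1..max_lag) of a trace."""
    x = np.asarray(trace, dtype=float)
    mean = x.mean()
    var = x.var()
    centered = x - mean
    scv = var / mean ** 2                          # squared coefficient of variation
    skew = np.mean(centered ** 3) / var ** 1.5     # third standardized moment
    acf = np.array([
        np.mean(centered[:-k] * centered[k:]) / var
        for k in range(1, max_lag + 1)
    ])
    return {"mean": mean, "scv": scv, "skew": skew, "acf": acf}

# Hypothetical usage on an alternating trace, which is strongly negatively correlated at lag 1.
stats = trace_statistics(np.tile([1.0, 3.0], 500), max_lag=2)
print(stats)
```

For a measured trace, `max_lag` would typically be much larger, since the point of the method is to capture long-range temporal dependence.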
19. MAP Order Selection
- Bayesian Information Criterion (BIC)
- Popular for ARIMA model order selection
- MAP(n) property: n autocorrelation coefficients are always related by a linear equation
- BIC order selection
- Linear regression model on estimated ACF coefficients
- BIC value assesses the cost of model size
[Figure: candidate orders MAP(8), MAP(16), MAP(32)]
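One way to read the order-selection idea is as follows: regress each ACF coefficient on the preceding ones, and score each candidate order with the Gaussian-regression form of the BIC, m ln(RSS/m) + p ln(m). The sketch below follows that reading. The exact regression and BIC variant used by the toolbox are not given in the slides, so the functions, the residual floor, and the synthetic ACF are all assumptions.

```python
import numpy as np

def bic_for_order(acf, order, max_order):
    """BIC of the linear model rho_k = sum_i a_i * rho_{k-i} with `order` coefficients."""
    acf = np.asarray(acf, dtype=float)
    rows = range(max_order, len(acf))     # same regression rows for every candidate
    X = np.array([[acf[k - i] for i in range(1, order + 1)] for k in rows])
    y = acf[max_order:]
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = float(np.sum((y - X @ coef) ** 2))
    m = len(y)
    sigma2 = max(rss / m, 1e-12)          # floor (an assumption) avoids log(0) on exact fits
    return m * np.log(sigma2) + order * np.log(m)

def select_order(acf, candidates):
    """Return the candidate order with the smallest BIC, plus all scores."""
    max_order = max(candidates)
    scores = {p: bic_for_order(acf, p, max_order) for p in candidates}
    return min(scores, key=scores.get), scores

# Synthetic ACF made of two geometric terms, so it obeys a two-coefficient linear recurrence.
lags = np.arange(1, 41)
acf = 0.3 * 0.9 ** lags + 0.2 * 0.5 ** lags
best, scores = select_order(acf, candidates=[1, 2, 3, 4])
print(best, scores)
```

On estimated ACF coefficients from a real trace the residuals never vanish, so the penalty term `order * np.log(m)` is what stops the criterion from always preferring the largest model.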
20. Nonlinear Optimization
- Step 1: match autocorrelations and SCV
- Returns only the SCV and ACF of the MAP(2)s
- (D0, D1) description not yet generated
- Step 2: nonlinear least squares
- Assign MEAN and SKEW of the MAP(2)s
- We impose constraints on (D0, D1) feasibility
- Objective function seeks the best bi-correlation matching
- Correlations: correlation of two samples (e.g., E[X0 X1])
- Bi-correlations: correlation of three samples (e.g., E[X0 X1 X2])
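The two- and three-sample quantities named above can be estimated from a trace by averaging products of samples at fixed offsets. The sketch below does exactly that; the function name `joint_moment` and the offset convention are illustrative assumptions.

```python
import numpy as np

def joint_moment(trace, offsets):
    """Estimate E[prod_j X_{i + offsets[j]}] by averaging over all valid positions i."""
    x = np.asarray(trace, dtype=float)
    n = len(x) - max(offsets)             # number of positions where every offset is in range
    prod = np.ones(n)
    for o in offsets:
        prod *= x[o:o + n]
    return prod.mean()

x = [1.0, 2.0, 3.0, 4.0]
corr = joint_moment(x, (0, 1))            # E[X0 X1]: mean of 1*2, 2*3, 3*4
bicorr = joint_moment(x, (0, 1, 2))       # E[X0 X1 X2]: mean of 1*2*3, 2*3*4
print(corr, bicorr)
```

A least-squares objective for the second optimization step could then compare such trace estimates with the corresponding joint moments of the candidate MAP over a grid of offsets.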
21. Tool Validation
22. BIC Order Selection
- Bellcore Aug-89 and Seagate Web traces
- MAP(16) and MAP(32) are often the best cost-accuracy trade-off
- Manual fitting: we had the best results with MAP(16)
23. Networking trace
- Bellcore Aug-89 queueing results
24. Disk drive trace
- Seagate Web queueing results
25. Conclusions
- How do we fit large MAPs?
- Kronecker Product Composition (KPC)
- Which statistical descriptors matter the most in MAP fitting?
- Bet on higher-order correlations (e.g., bi-correlations)
- Tools: how do we fit automatically?
- BIC order selection
- We automatically select descriptors
- Optimization-based search
- Supported by NSF grants ITR-0428330 and CNS-0720699
26. http://www.cs.wm.edu/MAPQN/kpctoolbox.html