Title: Towards Workflow-Driven Database System Workload Modeling
1Towards Workflow-Driven Database System Workload
Modeling
- Du Naiqiao, Ye Xiaojun and Wang Jianmin
2Outline
- Introduction
- Workload model
- Model with Markov process
- Model with Petri-net
- Conclusion
- Why we need workflow workload?
- How to model it?
3Two main questions
- Why we need workflow workload?
- Workflow patterns exist in real systems.
- Workflow workload will affect the performance.
- How to model it?
- Markov process
- Petri net
4Introduction-motivation
- Existing benchmarks
- Transactions in TPC-C and TPC-App are treated as independent; only the mixing ratio is kept.
- TPC-E: dependence between transactions
- Intuitive example: an e-commerce system
- search -> select -> buy
- Data mining area: customer access patterns exist in OLTP systems.
- Will such patterns affect database performance? How?
5Introduction-to do what
- Relationships between transactions should be considered in OLTP performance tests.
- The relationship here is different from the one in TPC-E.
- Weak relationship: a random transaction sequence that complies with some kind of access pattern.
- Strong relationship: the sequence of transactions is fully determined.
- Target
- Model the relationship between the transactions
- Keep the transaction mixing ratio (traditional benchmark requirement)
- Do both at the same time.
6Introduction-Related work(1)
- Menasce et al., A Methodology for Workload Characterization of E-commerce Sites, ACM E-Commerce Conf 99
- The Customer Behavior Model Graph (CBMG) describes the behavior of groups of customers who exhibit similar navigational patterns.
- Shows the steps required to obtain parameters (the arrival rate of session initiation requests and the average server-side think time).
- Proposes a clustering algorithm (k-means) to characterize workloads of e-commerce sites in terms of CBMGs.
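The clustering idea above can be sketched as follows. This is a minimal illustration, not the procedure from Menasce et al.: the state names, the sample sessions, the use of raw transition counts as features, and the choice of seeding centroids with the first k sessions are all assumptions made for the example.

```python
# Hypothetical navigation states; a real site would have its own set.
STATES = ["search", "select", "buy"]

def transition_vector(session):
    """Flatten a session's transition counts into a vector of length n*n."""
    n = len(STATES)
    index = {s: i for i, s in enumerate(STATES)}
    vec = [0.0] * (n * n)
    for a, b in zip(session, session[1:]):
        vec[index[a] * n + index[b]] += 1.0
    return vec

def squared_distance(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up sessions: two that mostly search, two that go on to buy.
sessions = [
    ["search", "search", "search", "select"],
    ["search", "select", "buy"],
    ["search", "search", "search", "search"],
    ["search", "select", "buy", "search", "select", "buy"],
]
points = [transition_vector(s) for s in sessions]
labels, centroids = kmeans(points, k=2)
```

Each centroid holds average transition counts for one group of sessions; normalizing its rows would give transition probabilities for that group's graph.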
7Introduction-Related work(2)
8Introduction-Related work(3)
- Yao et al., Mining and Modeling Database User Access Patterns, ISMIS 2006
- Kth-order Markov process to model database users' behavior.
- Data mining techniques to mine database users' access patterns
- Pruned Markov models: the states and state transitions are pruned according to the state support threshold (i.e., the minimal support value for a node) and the confidence threshold (i.e., the minimal confidence value for an edge), so the number of states and the number of state transitions are reduced.
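A minimal sketch of pruning by support and confidence is given below. It uses a first-order model rather than a Kth-order one, and the definitions of support (a state's share of all observed requests) and confidence (an edge's share of its source state's outgoing transitions) are assumptions for illustration; Yao et al. may define them differently. The function name and sample sessions are hypothetical.

```python
from collections import Counter

def pruned_markov_model(sessions, min_support, min_confidence):
    """Return the states and edges that survive both thresholds."""
    state_counts = Counter()
    edge_counts = Counter()
    out_counts = Counter()
    for session in sessions:
        state_counts.update(session)
        for a, b in zip(session, session[1:]):
            edge_counts[(a, b)] += 1
            out_counts[a] += 1

    total = sum(state_counts.values())
    # Keep states whose relative frequency reaches the support threshold.
    states = {s for s, c in state_counts.items() if c / total >= min_support}

    edges = {}
    for (a, b), c in edge_counts.items():
        if a in states and b in states:
            confidence = c / out_counts[a]
            # Keep edges whose confidence reaches the confidence threshold.
            if confidence >= min_confidence:
                edges[(a, b)] = confidence
    return states, edges

# Made-up sessions for illustration.
sessions = [
    ["search", "select", "buy"],
    ["search", "select"],
    ["search", "browse", "search"],
]
states, edges = pruned_markov_model(sessions, min_support=0.2, min_confidence=0.5)
```

With these thresholds the rarely visited states and the edges attached to them drop out, which is the reduction in model size the slide describes.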
9Introduction-Related work(4)
10Two main questions
- Why we need workflow workload?
- ✓ Workflow patterns exist in real systems.
- Workflow workload will affect the performance.
- How to model it?
- Markov process
- Petri net
11Mapping between workflow workload and transaction
workload
12Model with Markov process(1)
- A transaction can be activated by the controller or by other transactions.
- Every transaction is a state of a first-order Markov process.
- State space $S = \{1, 2, \ldots, n\}$
- Matrix $P = (p_{ij})$, where $p_{ij}$ is the probability that transaction j is activated directly after transaction i
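The Markov model above can be sketched in a few lines. The transaction names and the matrix values are made up for illustration and are not taken from the paper. The sketch also computes the stationary distribution of the chain, which for an ergodic chain gives the long-run share of each transaction and so connects the model to the mixing ratio.

```python
import random

# Hypothetical transaction types and transition matrix (illustrative values only).
TRANSACTIONS = ["search", "select", "buy"]
P = [
    [0.6, 0.3, 0.1],  # probabilities of the next transaction after "search"
    [0.3, 0.4, 0.3],  # after "select"
    [0.5, 0.2, 0.3],  # after "buy"
]

def next_transaction(i, rng):
    """Sample the next state j with probability P[i][j]."""
    return rng.choices(range(len(P)), weights=P[i])[0]

def generate_sequence(start, length, rng):
    """Walk the chain, returning `length` state indices beginning with `start`."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(next_transaction(seq[-1], rng))
    return seq

def stationary_distribution(matrix, iterations=1000):
    """Approximate the vector pi with pi = pi * P by power iteration."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi

rng = random.Random(0)
seq = generate_sequence(0, 20, rng)
mix = stationary_distribution(P)
print([TRANSACTIONS[s] for s in seq])
print(dict(zip(TRANSACTIONS, mix)))
```

Going the other way, from a required mixing ratio to a matrix whose stationary distribution matches it, is a constraint on P rather than a direct computation.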
13Model with Markov process(2)
14Model with Markov process(3)
15Two main questions
- Why we need workflow workload?
- ✓ Workflow patterns exist in real systems.
- Workflow workload will affect the performance.
- How to model it?
- ✓ Markov process
- Petri net
16Model with Petri-net(1)
- Model TPC-C or TPC-App with Petri-net
- $P_a + P_b + P_c + P_d = 1$
- One additional condition: end or not
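One possible reading of this slide is sketched below: each step picks one of four transaction types independently according to the mixing ratio, and an extra choice decides whether the session ends. The ratio values, the end probability `P_END`, and the interpretation of "end or not" as a per-step stop decision are all assumptions, not details from the paper.

```python
import random

# Hypothetical mixing ratio for four transaction types; the values sum to 1.
MIX = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
P_END = 0.2  # hypothetical probability of ending after each transaction

def generate_session(rng):
    """Draw transactions independently until the end condition fires."""
    names = list(MIX)
    weights = list(MIX.values())
    session = []
    while True:
        session.append(rng.choices(names, weights=weights)[0])
        if rng.random() < P_END:
            return session

rng = random.Random(42)
sessions = [generate_session(rng) for _ in range(5000)]
total = sum(len(s) for s in sessions)
# Observed share of each transaction type across all sessions.
observed = {t: sum(s.count(t) for s in sessions) / total for t in MIX}
print(observed)
```

Because every draw is independent, the observed shares approach the configured ratio regardless of where sessions end, which is the behavior of the traditional benchmarks described earlier.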
17Model with Petri-net(2)
A, B, C and D can be replaced by subflows. The
benefit of using a Petri net is the ability to
model parallel situations.
[Figure: Petri net with transitions labeled a, c, b, d]
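To illustrate what modeling parallel situations means, here is a minimal Petri net token game. The net layout is hypothetical, since the slide's figure is not available: transition a forks into two branches, b and c run in parallel, and d joins them. The class and place names are made up for the example.

```python
class PetriNet:
    """Minimal place/transition net with arc weight 1 per listed place."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place -> token count
        self.transitions = {}         # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self):
        """Transitions whose input places all hold enough tokens."""
        return [t for t, (ins, _) in self.transitions.items()
                if all(self.marking.get(p, 0) >= ins.count(p) for p in ins)]

    def fire(self, name):
        if name not in self.enabled():
            raise ValueError(f"transition {name} is not enabled")
        ins, outs = self.transitions[name]
        for p in ins:
            self.marking[p] -= 1
        for p in outs:
            self.marking[p] = self.marking.get(p, 0) + 1

# Hypothetical layout: a forks, b and c run in parallel, d joins.
net = PetriNet({"start": 1})
net.add_transition("a", ["start"], ["p_b", "p_c"])
net.add_transition("b", ["p_b"], ["q_b"])
net.add_transition("c", ["p_c"], ["q_c"])
net.add_transition("d", ["q_b", "q_c"], ["end"])

net.fire("a")
parallel = net.enabled()  # both b and c are enabled at the same time
net.fire("c")             # either order is allowed
net.fire("b")
net.fire("d")             # d needs tokens from both branches
```

A first-order Markov chain has one current state at a time, so it cannot express that b and c are both pending; the two tokens in the marking can.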
18Model with Petri-net(3)
Define (symbol missing in source) as the expected
number of times T1 occurs in flow A.
19Model with Petri-net(4)
20Model with Petri-net(5)
21Two main questions
- Why we need workflow workload?
- ✓ Workflow patterns exist in real systems.
- Workflow workload will affect the performance.
- ✓ How to model it?
- ✓ Markov process
- ✓ Petri net
22One question remains
- How will the workflow workload (customer access patterns) affect the performance?
- Experiment with TPC-App
- As the effect of the web server has not been determined, the results are not included in this paper. However, we believe the web server is not the bottleneck.
23Two main questions
- ✓ Why we need workflow workload?
- ✓ Workflow patterns exist in real systems.
- ✓ Workflow workload will affect the performance.
- ✓ How to model it?
- ✓ Markov process
- ✓ Petri net
24Conclusion
- Transaction relationships should be considered.
- Two different methods are used to model the relationship between transactions in OLTP systems.
- The mixing ratio for transactions is studied, which will be beneficial for designing or modifying OLTP performance test benchmarks.
- Future work
- More work on how customer access patterns affect database performance.
- Analysis of the parameter dependence between related transactions
- More complicated models, e.g. modeling the transaction relationships with additional Petri-net patterns.