1. Pangea: An Eager Database Replication Middleware Guaranteeing Snapshot Isolation without Modification of Database Servers
- 2009.8.27
- The University of Tokyo
- Takeshi MISHIMA and Hiroshi NAKAMURA
2. Introduction
- Several database replication approaches have been proposed
- Is database replication a solved problem?
- No! Few approaches are cost-effective
- Most database replication approaches need to modify database servers
- The code is large and complex, so modifying it is too expensive
[Figure labels: replication functionalities; database servers (replicas)]
4. Disadvantages of existing lazy replication middlewares
[Figure: one master and two slaves, with reads Ri, Rj and writes Wi, Wj]
5. Disadvantages of existing eager replication middlewares
[Figure: writes Wi and Wk]
6. Our proposal
- Pangea: a new eager replication middleware
- Guarantees snapshot isolation (SI)
- Provides tuple-level concurrency control
Assumption: Wi and Wj conflict; Wk does not conflict
[Figure: writes Wi, Wj, and Wk, each shown at three replicas]
7. Challenge 1: creation of the same snapshot
- Database servers do not have the functionality to create the same snapshot in a synchronized fashion.
8. Recap: snapshot creation
- When is a snapshot created?
- When is a database changed?
[Figure: transaction Ti with operations Ri, Wi, Ci, and Ti's snapshot]
9. Recap: snapshot creation
- When is a snapshot created?
  - When the first operation is executed
- When is a database changed?
  - When the transaction commits
[Figure: transaction Ti with operations Ri, Wi, Ci, and Ti's snapshot]
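
The two answers above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `Database` and `Transaction` are hypothetical names, the whole database is copied to form a snapshot, and write conflicts are ignored.

```python
class Database:
    """Toy versioned store: each commit appends a new committed version."""

    def __init__(self, data):
        self.versions = [dict(data)]  # versions[-1] is the latest committed state


class Transaction:
    def __init__(self, db):
        self.db = db
        self.snapshot = None  # NOT created when the transaction starts
        self.writes = {}      # private until commit

    def _ensure_snapshot(self):
        # The snapshot is taken when the first operation is executed.
        if self.snapshot is None:
            self.snapshot = dict(self.db.versions[-1])

    def read(self, key):
        self._ensure_snapshot()
        if key in self.writes:
            return self.writes[key]
        return self.snapshot.get(key)

    def write(self, key, value):
        self._ensure_snapshot()
        self.writes[key] = value

    def commit(self):
        # The database changes only when the transaction commits.
        new_version = dict(self.db.versions[-1])
        new_version.update(self.writes)
        self.db.versions.append(new_version)
```

In this model, a transaction that has already read a value keeps seeing it even after another transaction commits a newer one, while a transaction whose first operation comes after that commit sees the new value.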
10. Recap: a version of a snapshot
[Figure: timeline showing database versions D1 and D2, and a snapshot]
11. Algorithm 1: the same snapshot creation
[Figure: commits Ci, Cj, Cm and first operation Fk at the replicas, each answered with an ack]
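
One way to read the mutual exclusion in Algorithm 1 is sketched below. It is not the authors' implementation: `Replica`, `Middleware`, and the single global lock are assumptions made for illustration. Each first operation or commit is sent to every replica and acknowledged before the next one starts, so all replicas see these operations in the same order.

```python
import threading


class Replica:
    """Hypothetical replica that records what each snapshot can see."""

    def __init__(self):
        self.committed = []  # commit order observed by this replica
        self.snapshots = {}  # txn id -> commits visible to its snapshot

    def first_operation(self, txn):
        # The snapshot reflects exactly the commits applied so far.
        self.snapshots[txn] = tuple(self.committed)
        return "ack"

    def commit(self, txn):
        self.committed.append(txn)
        return "ack"


class Middleware:
    """Assumed design: one lock serializes first operations and commits."""

    def __init__(self, replicas):
        self.replicas = replicas
        self._lock = threading.Lock()

    def _broadcast(self, op, txn):
        with self._lock:
            # Wait for every replica's ack before releasing the lock.
            acks = [getattr(replica, op)(txn) for replica in self.replicas]
            if any(ack != "ack" for ack in acks):
                raise RuntimeError("replica did not acknowledge " + op)

    def first_operation(self, txn):
        self._broadcast("first_operation", txn)

    def commit(self, txn):
        self._broadcast("commit", txn)
```

Whatever order concurrent clients arrive in, every replica ends up with the same commit order and the same snapshot contents for a given transaction.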
12. Challenge 2: execution of non-conflicting and conflicting writes
- To provide high performance while keeping consistency
  - Conflicting write operations
    - Executed serially in the same order (to keep consistency)
  - Non-conflicting write operations
    - Executed concurrently (to provide higher performance)
- ex. Conflict or non-conflict?
  - UPDATE table SET a = 1 WHERE b = 1
  - UPDATE table SET a = 2 WHERE c = 2
-> No existing eager replication middleware can distinguish them, so all writes must be executed serially.
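
The example above cannot be classified from the statement text alone: the two updates conflict only if some row satisfies both `b = 1` and `c = 2`. The sketch below makes this concrete with Python's `sqlite3` module. The table name `t`, the `id` column, and the helper `conflicting_rows` are assumptions for illustration (`table` itself is a reserved word in SQL).

```python
import sqlite3


def conflicting_rows(rows):
    """Return ids of rows that both example UPDATEs would modify."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)"
    )
    conn.executemany("INSERT INTO t (id, a, b, c) VALUES (?, ?, ?, ?)", rows)
    # Rows touched by: UPDATE t SET a = 1 WHERE b = 1
    hit_by_first = {row[0] for row in conn.execute("SELECT id FROM t WHERE b = 1")}
    # Rows touched by: UPDATE t SET a = 2 WHERE c = 2
    hit_by_second = {row[0] for row in conn.execute("SELECT id FROM t WHERE c = 2")}
    conn.close()
    return hit_by_first & hit_by_second
```

With rows `(1, 0, 1, 0)` and `(2, 0, 0, 2)` the intersection is empty and the writes do not conflict. With a single row `(1, 0, 1, 2)` both statements update row 1, so they conflict.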
13. Recap: the first updater wins rule
Assumption: Wi and Wj conflict
[Figure: conflicting writes Wj and Wi, an ack, and the execution order Wj, Wi]
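
The rule can be modeled with per-tuple write locks. This is a minimal sketch, not a database implementation; `TupleLocks` and its methods are hypothetical names. The first writer of a tuple is acknowledged at once, and a later conflicting writer blocks until the first transaction finishes. If the first transaction commits, the waiter aborts. If it aborts, the waiter proceeds.

```python
class TupleLocks:
    def __init__(self):
        self.owner = {}    # tuple key -> transaction holding the write lock
        self.waiting = {}  # tuple key -> transactions blocked on that tuple

    def write(self, txn, key):
        holder = self.owner.get(key)
        if holder is None or holder == txn:
            self.owner[key] = txn
            return "ack"  # first updater: the write executes immediately
        self.waiting.setdefault(key, []).append(txn)
        return "blocked"  # conflicting write waits for the holder

    def finish(self, txn, committed):
        """Release txn's locks and return the outcome for each waiter."""
        outcomes = {}
        for key in [k for k, o in self.owner.items() if o == txn]:
            del self.owner[key]
            waiters = self.waiting.pop(key, [])
            if committed:
                # The first updater won, so every waiter must abort.
                for waiter in waiters:
                    outcomes[waiter] = "aborted"
            elif waiters:
                # The holder aborted, so the next waiter becomes the updater.
                self.owner[key] = waiters[0]
                outcomes[waiters[0]] = "ack"
                if waiters[1:]:
                    self.waiting[key] = waiters[1:]
        return outcomes
```

The point that matters for the middleware is that an ack arrives only for a write that did not have to wait behind a conflicting one.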
14. Algorithm 2: execution of non-conflicting and conflicting writes
Assumption: Wi and Wj conflict; Wk does not conflict
[Figure: Pangea with one leader and two followers; writes Wi, Wj, Wk and acks]
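
The leader-first idea can be sketched as follows. `LeaderFirstRouter` is a hypothetical name and the leader is reduced to a table of tuple locks. A write goes to the leader first and reaches the followers only once the leader has acknowledged it, so a write blocked behind a conflicting one is held back while a non-conflicting write passes straight through.

```python
class LeaderFirstRouter:
    def __init__(self, follower_count):
        self.leader_owner = {}  # tuple key -> transaction that wrote it first
        self.held_back = []     # writes still waiting for the leader's ack
        self.follower_logs = [[] for _ in range(follower_count)]

    def write(self, txn, key):
        """Return True if the leader acked and the write was forwarded."""
        holder = self.leader_owner.get(key)
        if holder is not None and holder != txn:
            # Conflict on the leader: no ack yet, so followers never see it.
            self.held_back.append((txn, key))
            return False
        self.leader_owner[key] = txn
        # Ack received: forward the write to every follower.
        for log in self.follower_logs:
            log.append((txn, key))
        return True
```

With Wj and Wi writing the same tuple and Wk writing a different one, the followers receive Wj and Wk but not Wi. The leader's lock manager decides the order of conflicting writes, and the middleware never has to parse the SQL.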
15. Summary: Pangea protocol
- The first operation and the commit operation (Algorithm 1)
  - Mutual exclusion
  - (This guarantees all replicas create the same snapshot)
- Write operation (Algorithm 2)
  - Sends it only to the leader
  - Sends it to all followers after receiving an ack
  - (This provides tuple-level concurrency control)
- Read operation
  - Sends it to any slave replica unlike lazy replication
  - (This reduces the load of the leader)
-> See the proof in our paper
16. Evaluation
- TPC-W benchmark
  - Scaling parameters: 1 million items and 2.88 million customers
- Comparing
  - Pangea
  - LRM (Lazy Replication Middleware)
    - Implemented the algorithm of Ganymed [17]
- Prototype
  - Fewer than 2000 lines of C
  - PostgreSQL servers version 8.3.7
  - Tomcat version 6.0.18
17. Performance (browsing mix)
18. Performance (shopping mix)
19. Performance (ordering mix)
20. Conclusions
- Pangea, a new eager database replication middleware
  - Guarantees SI
  - Provides tuple-level concurrency control
  - Need not modify existing database servers
- Prototype: the code size is very small
- Experimental results
  - Provides higher performance
- Practical, effective middleware