Title: DCVS (Distributed Concurrent Versions System)
1. DCVS (Distributed Concurrent Versions System)
- Project group: Sumit Mittal, Ajay Gulati, Supratik Majumder
Special thanks to Dr. Cox, Dr. Druschel and Tracy
2. Overview of the CVS System
- Repository stored on a single server
- Client-server model
[Diagram: users 1-4 work against a central repository holding the latest files; check-out retrieves a working copy, update fetches the latest files, diff shows the difference with the working copy, and commit sends changes back.]
3. Overview of Distributed CVS
[Diagram: every participant acts as both server and user; requests and replies flow directly between these server/user nodes rather than through a central server.]
4. DCVS vs. CVS
- Goals (CVS vs. DCVS):
  - Availability: Low vs. High
  - Fault tolerance: Some vs. up to K - 1 node failures
  - Scalability: Low vs. Very high
- DCVS requirements:
  - Consistency
  - Transparency
5. Our DCVS System
[Diagram: nodes A through I placed around a circular ID-space.]
- Nodes distributed on a circular ID-space
- Each node stores a part of the repository
- Directories replicated on the nodes
6. Distribution of Directories
[Diagram: nodes A-I sit on the ring; directories are hashed onto the same circular ID-space.]
- Each directory is hashed onto the circular ID-space
7. Distribution of Directories
[Diagram: directories D1-D6 hashed to points on the ring among nodes A-I.]
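A minimal sketch of this hashing step, assuming SHA-1 maps directory names onto a 128-bit circular ID-space (the slides do not name the hash function, and the class and method names here are illustrative):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DirectoryHash {
    // The circular ID-space: 2^128 points, the size Pastry uses for node IDs.
    static final BigInteger RING = BigInteger.ONE.shiftLeft(128);

    // Map a directory name to a point on the ring.
    static BigInteger ringId(String dirName) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-1")
                                     .digest(dirName.getBytes(StandardCharsets.UTF_8));
        // Interpret the 160-bit digest as a non-negative integer, reduced into the ring.
        return new BigInteger(1, digest).mod(RING);
    }

    public static void main(String[] args) throws Exception {
        for (String dir : new String[] {"D1", "D2", "myDir"}) {
            System.out.println(dir + " -> " + ringId(dir).toString(16));
        }
    }
}
```

Because any node can recompute a directory's ring ID locally, no central index of directories is needed, which is what makes the location transparency claimed later possible.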
8. Replication
- Replication factor: 3
[Diagram: directory myDir is stored on its primary node A and on secondary nodes B and C.]
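A sketch of how a copy-set like {A, B, C} could be chosen, assuming the primary is the first live node clockwise from the directory's hashed ID and the K - 1 secondaries are the next nodes around the ring; the slides show this layout for K = 3 but do not spell out the selection rule:

```java
import java.util.*;

public class ReplicaSet {
    // Pick the primary and K-1 secondaries for a directory ID on a sorted ring of node IDs.
    static List<Long> replicasFor(long dirId, TreeSet<Long> nodeIds, int k) {
        List<Long> replicas = new ArrayList<>();
        // Primary: first node clockwise from the directory's ID (wrap around if needed).
        Long cur = nodeIds.ceiling(dirId);
        if (cur == null) cur = nodeIds.first();
        // Secondaries: the next k-1 nodes clockwise.
        while (replicas.size() < k && replicas.size() < nodeIds.size()) {
            replicas.add(cur);
            cur = nodeIds.higher(cur);
            if (cur == null) cur = nodeIds.first();
        }
        return replicas;
    }

    public static void main(String[] args) {
        TreeSet<Long> ring = new TreeSet<>(Arrays.asList(10L, 25L, 40L, 55L, 70L, 85L));
        // Directory hashed to 30: primary is node 40, secondaries 55 and 70 (K = 3).
        System.out.println(replicasFor(30L, ring, 3)); // prints [40, 55, 70]
    }
}
```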
9. Example Check-out
"I want directory myDir" (assume replication factor of 3)
[Diagram: the request is routed across the ring to the nodes holding myDir; the request completes and the user's working copy gets a version upgrade.]
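A hedged sketch of this check-out path; the Replica record and the null check standing in for reachability are illustrative stand-ins, not the project's actual interfaces:

```java
import java.util.*;

public class CheckoutSketch {
    // Hypothetical per-replica state: which node holds it and its latest version.
    record Replica(String node, long version) {}

    // Try the copy-set in order (primary first); the first replica that answers
    // completes the request, and the working copy is upgraded to its version.
    static long checkout(List<Replica> copySet, long workingVersion) {
        for (Replica r : copySet) {
            if (r != null) {                                  // stand-in for "node answered"
                return Math.max(workingVersion, r.version()); // version upgrade
            }
        }
        throw new IllegalStateException("no replica of the directory is reachable");
    }

    public static void main(String[] args) {
        List<Replica> copySet = Arrays.asList(
                null,                  // primary did not answer
                new Replica("B", 12),  // first secondary answers
                new Replica("C", 12));
        System.out.println(checkout(copySet, 9)); // prints 12: upgraded from version 9
    }
}
```

Falling through to a secondary is what turns replication into the availability gain claimed earlier: a check-out succeeds as long as any node in the copy-set answers.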
10. Example Commit
"I want to check in directory myDir" (assume replication factor of 3)
[Diagram: the check-in request is routed across the ring toward the nodes holding myDir.]
11. Example Commit
[Diagram: the copy-set, the set of replicas holding myDir, is identified.]
12. Example Commit
[Diagram: acknowledgements are requested from the nodes in myDir's copy-set.]
13. Example Commit
[Diagram: the copy-set nodes send their acknowledgements back.]
14. Example Commit
[Diagram: with the acknowledgements in hand, the requests are run and the check-in completes.]
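Tying slides 10 through 14 together, a minimal sketch of the commit path, assuming the node responsible for myDir gathers acknowledgements from the copy-set before running the check-in; the interface and method names are illustrative, and the majority threshold is inferred from the K/2 fault-tolerance claim on the next slide:

```java
import java.util.*;

public class CommitSketch {
    interface ReplicaNode {
        boolean acknowledge(String dir, long newVersion); // phase 1: promise to accept
        void run(String dir, long newVersion);            // phase 2: apply the check-in
    }

    // Commit a check-in of `dir` across its copy-set (primary plus secondaries).
    static boolean commit(String dir, long newVersion, List<ReplicaNode> copySet) {
        // Phase 1: request acknowledgements from every replica.
        List<ReplicaNode> acked = new ArrayList<>();
        for (ReplicaNode n : copySet) {
            if (n.acknowledge(dir, newVersion)) acked.add(n);
        }
        // Require a majority, so that roughly K/2 failed replicas can be tolerated.
        if (acked.size() <= copySet.size() / 2) return false;
        // Phase 2: run the queued request on the replicas that acknowledged.
        for (ReplicaNode n : acked) n.run(dir, newVersion);
        return true;
    }

    public static void main(String[] args) {
        ReplicaNode ok = new ReplicaNode() {
            public boolean acknowledge(String d, long v) { return true; }
            public void run(String d, long v) { System.out.println("applied " + d + "@" + v); }
        };
        System.out.println(commit("myDir", 13, Arrays.asList(ok, ok, ok)));
    }
}
```

Collecting acknowledgements before running the request is what keeps the replicas of myDir mutually consistent when several users commit concurrently.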
15. No free lunch! What do you pay?
- Overhead:
  - Maintaining consistency
  - Providing transparency
  - Keeping replicas
  - Executing CVS commands
  - Cost of joining/leaving the group
Let's see the cost of lunch!!
16. Formal Claims
- Consistency: the protocols for check-in, check-out, and update ensure consistency
- Transparency: the user is oblivious to the location of directories
- Reliability and fault tolerance: up to K/2 nodes can fail for each directory (arithmetic sketched below)
(K is the replication factor)
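The K/2 bound matches standard majority-quorum arithmetic. As a sketch, under the assumption (not stated on the slide) that each operation must gather acknowledgements from a majority q of a directory's K replicas:

```latex
q = \left\lfloor \frac{K}{2} \right\rfloor + 1
\qquad\Longrightarrow\qquad
f_{\max} = K - q = \left\lceil \frac{K}{2} \right\rceil - 1 \approx \frac{K}{2}
```

For the running example K = 3, q = 2 acknowledgements suffice, so one replica of each directory may be down.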
17. Experimental Claims
- Overhead (delay) is O(log N)
- Overhead is constant with respect to K
- Overhead is constant with respect to D
(N = number of nodes in the system, K = replication factor, D = number of directories)
18. Experimental Claims
- Join/leave latency is constant with respect to N
- Join/leave latency increases linearly with K
- Join/leave latency increases linearly with D
(N = number of nodes in the system, K = replication factor, D = number of directories; join/leave latency is measured as the number of directories transferred; a counting sketch follows below)
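A sketch of where the K and D dependence comes from, reusing the clockwise-successor placement rule assumed in the replication sketch: each of the D directories keeps K replicas, and a join moves exactly the replicas whose placement now includes the new node. All names here are illustrative:

```java
import java.util.*;

public class JoinCost {
    // Replicas of a directory ID: the k nodes clockwise from it on the sorted ring.
    static List<Long> replicas(long dirId, TreeSet<Long> ring, int k) {
        List<Long> out = new ArrayList<>();
        Long cur = ring.ceiling(dirId);
        if (cur == null) cur = ring.first();
        while (out.size() < k && out.size() < ring.size()) {
            out.add(cur);
            cur = ring.higher(cur);
            if (cur == null) cur = ring.first();
        }
        return out;
    }

    // Count the directory replicas a newly joined node must fetch: those
    // directories whose copy-set now includes the new node.
    static long transfersOnJoin(long newNode, TreeSet<Long> ring, List<Long> dirIds, int k) {
        TreeSet<Long> after = new TreeSet<>(ring);
        after.add(newNode);
        return dirIds.stream()
                     .filter(d -> replicas(d, after, k).contains(newNode))
                     .count();
    }

    public static void main(String[] args) {
        TreeSet<Long> ring = new TreeSet<>(Arrays.asList(10L, 40L, 70L));
        List<Long> dirs = Arrays.asList(5L, 15L, 35L, 55L, 65L, 90L);
        System.out.println(transfersOnJoin(25L, ring, dirs, 2)); // prints 3
    }
}
```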
19. Experimental Setup
- Implemented on Pastry
- Wrote a driver to run the simulations
- No specific architecture used; simulations run on available Linux machines (rivendel, nautilus)
- The number of nodes simulated is limited by the capability of a single machine
20. No. of Replicas vs. Overhead
[Plot: delay (no. of hops) vs. number of replicas, for 12-36 replicas.]
21. Nodes vs. Overhead
[Plot: delay (no. of hops) vs. number of nodes, for 10-300 nodes.]
22. Directories vs. Overhead
[Plot: delay (no. of hops) vs. number of directories, for 1-1000 directories.]
23. Latency vs. Replicas
[Plot: number of directories transferred vs. number of replicas, for 12-36 replicas.]
24. Latency vs. Nodes
[Plot: number of directories transferred vs. number of nodes, for 70-300 nodes.]
25. Latency vs. Directories
[Plot: number of directories transferred vs. number of directories, for 1-1000 directories.]
26. Conclusion
- Consistency and transparency are ensured
- The advantages far outweigh the overheads
- Scalability: overheads are constant, sublinear, or linear
So the lunch is really cheap!!
27. Discussion