Title: Fine-Grained Merging
1Fine-Grained Merging
Prasun Dewan
Department of Computer Science, University of North Carolina
dewan_at_unc.edu
2Issues raised by Coda
- Directory and file hoarding and merging
- Smaller grain than file.
- Non persistent data
3TACT Approach
- Programmer-defined consistency unit, called a conit.
- Each write indicates which conits it changes.
- ModifyBibliographyItem (key, new val)
- do the modification
- affectConit (key)
- Language-independent.
- Burden on programmer to identify conits.
- Since conits are procedurally defined, there is no hope of automatic merging.
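The write pattern on this slide can be sketched as below. This is a minimal illustration, not TACT's actual API: `ConitTracker`, `affect_conit`, and the per-conit counter of pending writes are hypothetical names and an assumed use of the declaration.

```python
from collections import defaultdict

class ConitTracker:
    """Hypothetical tracker: counts unpropagated writes per conit."""

    def __init__(self):
        self.pending = defaultdict(int)

    def affect_conit(self, conit, weight=1):
        # The write itself declares which conit it touches.
        self.pending[conit] += weight

bibliography = {}
tracker = ConitTracker()

def modify_bibliography_item(key, new_val):
    bibliography[key] = new_val   # do the modification
    tracker.affect_conit(key)     # declare the affected conit

modify_bibliography_item("item1", "first value")
modify_bibliography_item("item1", "second value")
modify_bibliography_item("item2", "other value")
```

The system only learns which conit a write touched, not what the write did, which is why it can bound divergence but cannot merge automatically.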
4Issues raised by Coda
- Automatic directory merging
- Inflexible resolution
- May want both server and client to delete for
delete to succeed (user cleaning up local hoard)
5Drawing Application Joint Editing
- Concurrent editing of same object is conflict
User B
User A
6Alternative policy exclusive editing
- For safety, as plans get finalized, concurrent
editing of drawing is conflict.
7Programmer-specified criteria
- Programmers should be able to specify the
consistency criteria used for their application
Consistency criteria for application X
serializable schedules (transactions execute as
if one after the other)
Consistency criteria for application Y
8User-specified consistency criteria
- Users cannot use low-level mechanisms (code,
predicates)
- but can use high-level mechanisms (tables,
buttons)
9Limitations of Coda
- Automatic but inflexible merging for directories.
- Flexible but manual merging for files.
- Supported types limited to directory and files.
10Suite and Sync
- How to support merging of types other than files and directories?
- Support a larger set of types.
- Programming language rather than OS types: atomic, record, sequence, hashtable.
- Files modelled by atomic.
- Directory modelled by hashtable.
- How to provide automatic merging for files?
- Must know how to decompose the file.
- Assume it can be modeled using standard
programming language types.
11Suite and Sync
- How to allow users and application programmers to specify the merge policy?
- Like the access control problem.
- Like an access matrix, define a merge matrix.
- Declarative mechanism usable by end-user and programmer.
- Higher-level than procedural mechanisms of merge procedures.
12Suite vs Sync
- Suite based on extension of C.
- C program preprocessed.
- Variables of C programs merged.
- Sync based on Java.
- Special classes ReplicatedAtomic, ReplicatedSequence, ReplicatedDictionary defined.
- Subclasses of these mergeable.
- Cannot handle arbitrary classes of objects
- Rover
16Merge process: overview
- Create change sets
- Merge them
- Apply results
- Table-driven algorithm
17Type-Based Merge Algorithm
- Mergeable versions of
- Record
- Sequence
- Dictionary
- Integer, Float, String
- Shared data defined by hierarchical composition
of these types
18Merge matrix: overview
- Table that specifies outcome of one merge decision
- Entries of table specified by programmers and users
19Merge matrix: general form
- Rows represent one change set, columns represent the other change set.
- Matrix entries (merge actions) specify how conflicts between operations are resolved.
Change set 2 operations
Change set 1 operations
20Merge action: general form
- General form of matrix action is a function
(O1, O2) -> (A1, A2)
21Merge matrix: Atomic
22Merge Matrix - Record
E.g. fields name, phone, address

Record           | Modify(name) | Modify(phone) | Modify(address) | null
Modify(name)     | Merge        | Both          | Both            | Row
Modify(phone)    | Both         | Merge         | Both            | Row
Modify(address)  | Both         | Both          | Merge           | Row
null             | Column       | Column        | Column          |
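The record matrix above can be held as a lookup table. The sketch below is illustrative only: `RECORD_MATRIX` and `record_action` are hypothetical names, `None` stands for the null row or column, and reading "Row" as "take change set 1's operation" and "Column" as "take change set 2's" is an interpretation of the slide.

```python
FIELDS = ["name", "phone", "address"]

# (row operation, column operation) -> merge action.
# Rows are change set 1, columns are change set 2.
RECORD_MATRIX = {}
for f1 in FIELDS:
    for f2 in FIELDS:
        # Same field changed on both sides: recurse; otherwise keep both.
        RECORD_MATRIX[(f1, f2)] = "Merge" if f1 == f2 else "Both"
    RECORD_MATRIX[(f1, None)] = "Row"      # only change set 1 modified
    RECORD_MATRIX[(None, f1)] = "Column"   # only change set 2 modified

def record_action(row_op, col_op):
    """Return the merge action for one pair of field modifications."""
    return RECORD_MATRIX[(row_op, col_op)]
```

Because the entries are plain data, a user interface could expose the same table for editing without any code changes.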
23Merge matrix: Record
24Merge matrix Philosophy
- Each element conflicts with each other element of parent
- At higher level indicate conflict
- In example at address book level, say modification of same record is conflict
- Or an element conflicts only with corresponding element
- At higher level recurse
- Resolve conflict at element level
- Approach essential to accommodate dynamic structures
- Merge matrix would have unbounded size and composition if each hashtable key had its own entry
25Merge matrix: Dictionary
26Merge matrix: Sequence
27Multiple merge policies
Consolidation
Reconciliation
28Change sets
- Not linear logs
- Structured, mirroring structure of data
- more efficient access
- automatic compaction
- computed as changes are made
- operations call setChanged()
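The contrast with a linear log can be sketched as follows. This is an assumption-laden illustration: `ReplicatedRecord` here is a toy stand-in rather than Sync's class, and `set_changed` is a Python spelling of the `setChanged()` call named on the slide.

```python
class ReplicatedRecord:
    """Toy mergeable record that maintains its own change set."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.changes = {}   # field name -> latest new value

    def set(self, name, value):
        self.fields[name] = value
        self.set_changed(name, value)

    def set_changed(self, name, value):
        # Keyed by field, so a later write to the same field replaces the
        # earlier entry: compaction happens as changes are made.
        self.changes[name] = value

linear_log = []   # what a plain operation log would hold
rec = ReplicatedRecord(name="Ann", phone="555-0100", address="1 Main St")
for name, value in [("phone", "555-0111"),
                    ("phone", "555-0122"),
                    ("address", "2 Oak Ave")]:
    rec.set(name, value)
    linear_log.append((name, value))
```

After three writes the log holds three entries, while the change set holds one entry per modified field and can be looked up directly by field name.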
29Planner Application
33Change sets
Temporary wall here
It's not a table, it's a wall
34Merge algorithm
- Pair up corresponding changes from change sets
- For each pair
- Look up action in merge matrix
- If action is merge call merge procedure
recursively on changed structure, else perform
indicated action
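The steps above can be sketched as a short recursive function. It is a simplification under stated assumptions: change sets are nested dicts keyed by element, `policy` is a hypothetical stand-in for a per-type merge matrix, and unresolved pairs are returned as `("CONFLICT", ...)` markers rather than handed to a user.

```python
def policy(key, c1, c2):
    """Hypothetical merge matrix lookup for one pair of changes."""
    if c2 is None:
        return "Row"        # only change set 1 touched this element
    if c1 is None:
        return "Column"     # only change set 2 touched this element
    if isinstance(c1, dict) and isinstance(c2, dict):
        return "Merge"      # both changed a structure: recurse
    return "Conflict"

def merge(cs1, cs2, matrix):
    """Merge two structured change sets (nested dicts)."""
    result = {}
    for key in sorted(cs1.keys() | cs2.keys()):   # pair up changes
        c1, c2 = cs1.get(key), cs2.get(key)
        action = matrix(key, c1, c2)              # look up action
        if action == "Merge":
            result[key] = merge(c1, c2, matrix)   # recurse
        elif action == "Row":
            result[key] = c1
        elif action == "Column":
            result[key] = c2
        else:
            result[key] = ("CONFLICT", c1, c2)
    return result

cs1 = {"alice": {"phone": "555-0100"}, "bob": {"name": "Robert"}}
cs2 = {"alice": {"address": "12 Elm St", "phone": "555-0199"},
       "carol": {"name": "Carol"}}
merged = merge(cs1, cs2, policy)
```

Only the `phone` field of `alice` ends up flagged; every other change is applied without user involvement.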
42Merge algorithm
AtomicMergeUnit true
44Suite merge tool
45Directory Merging: choice 1
46Directory Merging: choice 2
47Drawing Application Joint Editing
- Concurrent editing of same object is conflict
User B
User A
48Alternative policy exclusive editing
- For safety, as plans get finalized, concurrent
editing of drawing is conflict.
AtomicMergeUnit true
49Large policy space
- Merge matrices defined at each structural level of shared data
- Structure merge matrices policy space
50But not arbitrary policies
- Local knowledge used to detect and resolve conflicts. E.g.
- Put checked against put, remove, modify of the same key.
- Not any other keys, not any other operations
51Why more flexibility? Room Scheduler
- Synchronization should disallow concurrent insertions of reservations that overlap.
- Straightforward approach
- Key with each room, storing reservation record
- Cannot apply merge matrix
- Complicated approach
- Key is room + date + ½-hour slot
- E.g. SN 115, 1/30, 10:30
- Reservation of n ½ hr slots involves n keys
- Can use merge matrix
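The "complicated approach" can be sketched as below. The helper names `slot_keys` and `conflicting_keys` are hypothetical, and the year in the example dates is an arbitrary assumption since the slide gives only 1/30.

```python
from datetime import datetime, timedelta

def slot_keys(room, start, n_slots):
    """One dictionary key per half-hour slot covered by a reservation."""
    return [(room, start + timedelta(minutes=30 * i))
            for i in range(n_slots)]

def conflicting_keys(keys1, keys2):
    # Concurrent puts on the same key are exactly what a dictionary
    # merge matrix can detect, so overlap shows up as shared keys.
    return sorted(set(keys1) & set(keys2))

# User A reserves 10:30-12:00, user B reserves 11:30-12:30 (year assumed).
a = slot_keys("SN 115", datetime(2024, 1, 30, 10, 30), 3)
b = slot_keys("SN 115", datetime(2024, 1, 30, 11, 30), 2)
overlap = conflicting_keys(a, b)
```

With one key per room, the two reservations would be two different values under the same key and the matrix could not tell whether they overlap; with one key per slot, the overlap is the 11:30 key itself.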
52Per-Operation Merge Procedure
- Synchronization should disallow concurrent insertions of reservations that overlap.
- Fix by trying user-specified alternative times.
- Bayou provides per-operation dependency check and merge procedure.
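A per-operation dependency check and merge procedure might look like the sketch below. It is not Bayou's API: `apply_write` and `make_reservation` are hypothetical, and the dependency check is reduced to a simple predicate over an in-memory dict.

```python
def apply_write(db, update, dependency_check, merge_proc):
    """Apply one write: run the update if its dependency check holds,
    otherwise fall back to the write's own merge procedure."""
    if dependency_check(db):
        update(db)
        return "applied"
    return merge_proc(db)

def make_reservation(room, slot, alternates, who):
    """Build the (update, check, merge procedure) triple for one write."""
    def update(db):
        db[(room, slot)] = who

    def dependency_check(db):
        return (room, slot) not in db      # slot must still be free

    def merge_proc(db):
        for alt in alternates:             # try user-specified alternatives
            if (room, alt) not in db:
                db[(room, alt)] = who
                return f"moved to {alt}"
        return "rejected"

    return update, dependency_check, merge_proc

db = {}
r1 = apply_write(db, *make_reservation("SN 115", "10:30", ["11:00"], "A"))
r2 = apply_write(db, *make_reservation("SN 115", "10:30",
                                       ["11:00", "11:30"], "B"))
```

The conflict-handling logic travels with each write, which gives more flexibility than a table of actions at the cost of more programmer effort.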
53Classes of Mergeable Objects
- Coda
- Files and Directories.
- Bayou
- Tables
- Sync
- Subclasses of predefined Sync classes
- Not arbitrary objects - e.g. Stack
- Rover
- Arbitrary objects
- But programmer must pay!
54Classes of Replicated Objects
Sync: Tables, Sequences, Record, Atomic
Rover: Arbitrary Objects
Coda: Directory and Files
Bayou: Tables
55State Transition Diagram
- Serve cache miss
- Mark stale data
- Background adapted hoard walk
- Track changes trickle merging
- Serve cache miss
- Hoard walk
- Replace stale data
- Write through
Disconnection
Logical Reconnection
Physical Reconnection
- Detect conflicts
- Resolve conflicts
- Provide access to cached, possibly stale data
- Track changes to cached data
56Supporting Arbitrary Objects
- System has no knowledge of object
- Cannot automatically merge.
- Cannot automatically compress logs.
- What can it do?
- Keep track of connection state
- Queue methods in emulation state based on priority.
- Call application-provided procedures to compress logs.
- Call queued methods during merging.
- Rover-based consistency divergence
- TACT-based consistency divergence.
- Disallow local methods
- if remote object changed and sufficient connectivity present to check
- local object has expired
- Call application-provided procedures to check and fix conflicts during merging.
- Behave as a toolkit!
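The queue-then-replay behaviour listed above can be sketched as follows. This is a toy model, not Rover's interface: `QueuedObject`, `invoke`, and `reconnect` are hypothetical names, lower priority numbers are assumed to run first, and log compression and conflict checks are left out.

```python
import heapq
import itertools

class QueuedObject:
    """Toy wrapper that queues method calls while disconnected."""

    def __init__(self, state):
        self.state = state
        self.connected = True
        self._queue = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO order

    def invoke(self, method, *args, priority=0):
        if self.connected:
            method(self.state, *args)
        else:
            # Emulation state: queue the call by priority.
            heapq.heappush(self._queue,
                           (priority, next(self._seq), method, args))

    def reconnect(self):
        # Merging: call the queued methods in priority order.
        self.connected = True
        while self._queue:
            _, _, method, args = heapq.heappop(self._queue)
            method(self.state, *args)

log = []
obj = QueuedObject(log)
obj.connected = False
obj.invoke(list.append, "low", priority=5)
obj.invoke(list.append, "high", priority=1)
queued_before = list(log)
obj.reconnect()
```

The system treats each queued call as opaque; anything smarter, such as compressing the queue or repairing conflicts, has to come from application-provided procedures.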
57Incremental Customization?
Flexibility
Ease of specification
58Sync Incremental Customization
- Policy inheritance
- Policy specification
- Policy implementation
- Merge-aware types
- Merge matrices and lock tables
- Subclassing, new types
- merge algorithms, systems