Title: Disconnected Operations in CODA File System
1Disconnected Operations in CODA File System
JAMES J. KISTLER and M. SATYANARAYANAN (Carnegie Mellon University)
Presentation Review By Pathik Patel
University of Southern California
2CODA File System
- Developed for Distributed Machines
- Used in Disconnected Operations Environment
- Based on AFS (Andrew File System)
- Developed in UNIX Environment
- Replicates servers for high availability
- Uses Cache Management for high availability
3CODA Design Overview
Volume Storage Group (VSG)
Client Accessible VSGs
Unix File Server
Clients With Venus as Cache Manager
4Mechanisms to Increase Availability
- Server Replication
- Managing Disconnected Operations using Venus
- Updating Server When Disconnection ends
- Replica Updating after Disconnection
5Working Scenario for CODA
[Figure: six panels, (a) through (f), showing servers Nunki, Aludra, and Pollux and clients CL-1, CL-2, and CL-3, with the file value changing across panels (X12, X87, X33, X45).]
6Design Rationale
Design considerations are influenced by the following factors:
- Scalability
- Portable Workstations
- First Vs. Second Class Replication
- Optimistic Vs. Pessimistic Replica Control
7Design Rationale (contd)
- Scalability
  - Whole-file caching
  - Placing the burden on clients rather than servers
  - Avoidance of system-wide changes
- Portable Workstations
  - They are mobile
  - Users work with a selected subset of files
  - Users can predict their disconnections in most cases
8Design Rationale (contd)
- First Vs. Second Class Replication
  - First-class replicas reside on servers; they are reliable, clean, available, and complete.
  - Second-class replicas reside on clients; they are inferior along all of these dimensions.
  - A cache coherence protocol keeps both types of replica synchronized.
  - A second-class replica that degraded during disconnection is brought up to date after reconnection.
  - CODA also supports a sole server replica for cases where the performance cost of server replication is an issue.
9Design Rationale (contd)
- Pessimistic Vs. Optimistic Replica Control
  - The pessimistic approach requires a client to acquire exclusive control of a cached object before disconnection and to retain it until reconnection.
  - The pessimistic approach works when disconnections are short, but it degrades performance during long disconnections.
  - The optimistic approach updates replicas as new updates are released.
  - The optimistic approach requires sophisticated hardware and software to manage replicas.
  - In the optimistic approach, each client has its own accessible universe, to which it sends updates and from which it gets them.
10CODA Design Implementation
[Figure: Structure of the CODA Client. Components: Application, System Call Interface, Vnode Interface, CODA MiniCache, and Venus, with Venus connecting to the CODA servers.]
11CODA Design Implementation (contd)
- CODA Client structure
  - Venus is a user-level process.
  - To decrease the overhead on Venus, a MiniCache resides in the kernel.
  - File system calls are intercepted via the vnode interface and forwarded to the MiniCache.
  - If a cache miss occurs, Venus takes over and forwards the request to a CODA server.
12CODA Design Implementation (contd)
[Figure: Venus States and Transitions. States: Hoarding, Emulation, Reintegration. Transitions: Disconnection, Physical Reconnection, Logical Reconnection.]
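The three Venus states can be written down as a small state machine. The slide only lists the state and transition names; the pairing used here (disconnection leads from hoarding to emulation, physical reconnection from emulation to reintegration, logical reconnection from reintegration back to hoarding) follows the Kistler and Satyanarayanan paper as I understand it and should be read as an assumption. The names `State`, `Event`, `step`, and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


class Event(Enum):
    DISCONNECTION = "disconnection"
    PHYSICAL_RECONNECTION = "physical reconnection"
    LOGICAL_RECONNECTION = "logical reconnection"


# Assumed transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.HOARDING, Event.DISCONNECTION): State.EMULATION,
    (State.EMULATION, Event.PHYSICAL_RECONNECTION): State.REINTEGRATION,
    (State.REINTEGRATION, Event.LOGICAL_RECONNECTION): State.HOARDING,
}


def step(state, event):
    """Return the next state, or raise ValueError if the event does not apply."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{event.value!r} is not valid in state {state.value!r}"
        ) from None


def run(events, start=State.HOARDING):
    """Feed a sequence of events to the machine and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

A full cycle of disconnection, physical reconnection, and logical reconnection brings Venus back to hoarding, its normal connected state.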
13CODA Design Implementation (contd)
- Hoarding
  - Hoarding is the process of managing cached data in anticipation of disconnection.
  - Many factors, such as cache miss ratio, disconnection frequency, cache space, and freshness of cached objects, affect the performance of hoarding.
  - To manage this, CODA uses a hoard database (HDB) and hoard profiles, so that a hoard walk over this database can update the state of the cache using HDB rules.
  - The hoard database contains hoard profiles, which can be created easily from the command line or by simply specifying file importance.
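A hoard walk can be pictured as ranking candidate objects and keeping the best ones that fit in the cache. The sketch below is an assumption-laden illustration, not CODA's actual algorithm: the `Candidate` fields, the weighting in `priority`, and the greedy selection in `hoard_walk` are all hypothetical, chosen only to show user-assigned importance being blended with recent usage.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    hoard_priority: int   # importance from the user's hoard profile, 0 if none
    last_used: int        # logical clock tick of the most recent reference
    size: int             # space the object needs in the cache


def priority(c, now, hoard_weight=0.75):
    """Blend hoard priority with recency of use (the weights are assumptions)."""
    recency = 1.0 / (1 + now - c.last_used)
    return hoard_weight * c.hoard_priority + (1 - hoard_weight) * 100 * recency


def hoard_walk(candidates, cache_space, now):
    """Pick objects to keep cached, highest priority first, within cache_space."""
    keep = []
    used = 0
    ranked = sorted(candidates, key=lambda c: priority(c, now), reverse=True)
    for c in ranked:
        if used + c.size <= cache_space:
            keep.append(c.path)
            used += c.size
    return keep


candidates = [
    Candidate("/coda/thesis.tex", hoard_priority=90, last_used=1, size=4),
    Candidate("/coda/tmp.log", hoard_priority=0, last_used=9, size=4),
    Candidate("/coda/bin/editor", hoard_priority=60, last_used=5, size=4),
]
kept = hoard_walk(candidates, cache_space=8, now=10)
```

With room for only two objects, the recently used but unhoarded `tmp.log` loses out to the two files the user marked as important.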
14CODA Design Implementation (contd)
- Emulation
  - During emulation, sufficient logs of the update activity are generated so that the reintegration process can be managed easily.
  - Emulation uses RVM (Recoverable Virtual Memory) to store metadata about cached objects so that it can be retrieved in case of crash recovery.
  - Emulation enables a user to resume work after a shutdown or crash from where they left off.
  - Emulation can also exhaust resources, because the replay log and the file cache modified with updates become too large to manage.
15CODA Design Implementation (contd)
- Reintegration
  - Reintegration is a transitory state through which Venus passes in changing roles from pseudo-server to cache manager.
  - During reintegration, Venus propagates the changes recorded in the replay log and updates its cache to reflect the current server state.
  - It has two major processes:
    - 1) Replay Algorithm
    - 2) Conflict Handling
16CODA Design Implementation (contd)
- Replay Algorithm
  - First, Venus obtains permanent fids for cached objects from the server and uses them to replace the temporary fids in the replay log.
  - The replay log is then parsed so that the updates to cached objects can be processed.
  - After a successful replay, Venus frees the replay log; after an unsuccessful reintegration, Venus writes the replay log to a local replay file, and the affected cache entries are then refetched.
  - Replay can be done at different levels of granularity according to user needs; likewise, the objects related to a cached object can be reintegrated together.
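The replay steps can be sketched as follows. Everything here is a simplified assumption made for illustration: log records are plain dictionaries, `fid_map` stands in for the permanent fids obtained from the server, `can_apply` is a hypothetical check for whether a record may be replayed, and an in-memory `io.StringIO` plays the role of the local replay file. Staging the updates on a copy, so that a failed replay changes nothing, is also a choice of this sketch.

```python
import io
import json


def substitute_fids(replay_log, fid_map):
    """Replace temporary fids in log records with permanent fids."""
    return [dict(rec, fid=fid_map.get(rec["fid"], rec["fid"])) for rec in replay_log]


def reintegrate(replay_log, fid_map, server_files, can_apply, replay_file):
    """Replay the log against server_files.

    Returns (True, []) on success. On failure, writes the log to replay_file
    and returns (False, fids) where fids are the objects to refetch.
    """
    log = substitute_fids(replay_log, fid_map)
    staged = dict(server_files)          # work on a copy so failure changes nothing
    for rec in log:
        if not can_apply(rec, staged):
            json.dump(log, replay_file)  # keep the updates so they are not lost
            return False, sorted({r["fid"] for r in log})
        staged[rec["fid"]] = rec["data"]
    server_files.update(staged)          # commit every update together
    return True, []


log = [{"op": "store", "fid": "tmp-1", "data": "v2"}]
fid_map = {"tmp-1": "perm-7"}            # hypothetical fid values

server_ok = {}
ok, refetch = reintegrate(log, fid_map, server_ok, lambda rec, files: True, io.StringIO())

server_bad = {"perm-7": "other"}
saved = io.StringIO()
ok2, refetch2 = reintegrate(log, fid_map, server_bad, lambda rec, files: False, saved)
```

In the failing case the server copy is untouched, the log survives in `saved`, and `refetch2` names the objects whose cache entries should be fetched again.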
17CODA Design Implementation (contd)
- Conflict Handling
  - The disconnected operations of one client may conflict with activity at the servers or at other disconnected clients.
  - Conflicts are detected using a storeid kept for each cached object.
  - During reintegration, the server compares the storeid of each object named in a log entry with the storeid of its own replica.
  - Different algorithms are used to manage conflicts for directories and for files.
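The storeid comparison can be illustrated with a short sketch. The record layout, the `cached_storeid` field name, and the storeid values are hypothetical, and the sketch covers only the detection step; the separate handling of directories and files mentioned above is not modeled.

```python
def find_conflicts(replay_log, server_storeids):
    """Return fids whose server storeid differs from the one the client cached.

    Each log record carries the storeid the object had when the client cached
    it. A different storeid on the server means someone else updated the
    object while this client was disconnected.
    """
    conflicts = []
    for rec in replay_log:
        current = server_storeids.get(rec["fid"])
        if current is not None and current != rec["cached_storeid"]:
            conflicts.append(rec["fid"])
    return conflicts


replay_log = [
    {"op": "store", "fid": "perm-1", "cached_storeid": "s10"},
    {"op": "store", "fid": "perm-2", "cached_storeid": "s20"},
]
server_storeids = {"perm-1": "s10", "perm-2": "s21"}   # perm-2 changed on the server

conflicts = find_conflicts(replay_log, server_storeids)
```

Here only `perm-2` is reported, because its storeid on the server no longer matches the one recorded when the client cached it.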
18Critique of the Paper
- Strengths
  - Detailed description of the application
  - Fine-grained explanation
- Weaknesses
  - Very detailed explanation of the workings of low-level processes
  - Does not point to any commercial implementations
- Relevance to embedded systems
- Great work done until
- No research publications in the past 4 years
- I think the project is frozen