Distributed Operating Systems CS551 - PowerPoint PPT Presentation

Slides: 61
Provided by: scha67

1
Distributed Operating Systems CS551
  • Colorado State University
  • at Lockheed-Martin
  • Lecture 8 -- Spring 2001

2
CS551 Lecture 8
  • Topics
  • Distributed File Systems (Chapter 8)
  • Distributed Name Service
  • Distributed File Service
  • Distributed Directory Service
  • NFS
  • X.500
  • Distributed Synchronization (Chapter 10)
  • Global Time
  • Physical Clocks
  • Network Time Protocol (NTP)
  • Logical Clocks

3
Definitions
  • DFSs support the sharing of information in the
    form of files throughout an intranet. A
    well-designed file service provides access to
    files stored at a server with performance and
    reliability similar to files stored on local
    disks. A distributed file system enables
    programs to store and access remote files exactly
    as they do local ones, allowing users to access
    files from any computer in an intranet.
    (Coulouris, Dollimore, Kindberg, 2001)

4
Definitions, continued
  • In a DS, it is important to distinguish between
    the concepts of the file service and the file
    server. The file service is the specification of
    what the file system offers to its clients: the
    file system's interface to the clients. A file
    server, in contrast, is a process that runs on
    some machine and helps implement the file
    service. A system may have one file server or
    several. (Tanenbaum, 1995)

5
Upload/Download Model
[Diagram: the Server holds the Original File; the Client works on the client's copy and returns the Updated File]
Adapted from Tanenbaum (1995)
6
Remote Access Model
[Diagram: the Client requests access to the remote file on the Server; the file does not move]
Adapted from Tanenbaum (1995)
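
The two models on slides 5 and 6 can be contrasted in a small sketch. This is only an illustration: the FileServer class, its method names, and the in-memory dictionary standing in for the server's disk are assumptions, not part of the lecture.

```python
class FileServer:
    def __init__(self):
        self.files = {}          # filename -> bytes

    # Upload/download model: whole files cross the network.
    def download(self, name):
        return self.files[name]

    def upload(self, name, data):
        self.files[name] = data

    # Remote access model: the file stays on the server and the
    # client sends individual operations.
    def read(self, name, offset, length):
        return self.files[name][offset:offset + length]

    def write(self, name, offset, data):
        old = self.files[name]
        self.files[name] = old[:offset] + data + old[offset + len(data):]


server = FileServer()
server.upload("notes.txt", b"hello world")

# Upload/download: fetch, edit the client's copy, send it back.
copy = bytearray(server.download("notes.txt"))
copy[0:5] = b"HELLO"
server.upload("notes.txt", bytes(copy))

# Remote access: the edit is applied at the server.
server.write("notes.txt", 6, b"WORLD")
```

In the first model the server only ever sees whole files; in the second it sees every read and write.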
7
Terms
  • File system
  • an abstract view of secondary storage
  • responsible for
  • Global naming
  • File access
  • Overall file organization
  • Distributed Name Service
  • focuses on the issues related to filenames

8
Basic File Systems
  • File Storage
  • Structured versus non-structured
  • File Attributes
  • File name, size, owner, creation/modification
    dates, version, protection information
  • File Protection Modes
  • Read, write, execute, append, truncate, delete

9
Figure 8.4  Structured versus Unstructured Files.
10
Figure 8.5   Access Matrix.
11
Figure 8.6  Access List for File 1.
12
Goals of a DFS
  • Network Transparency
  • Looks like a traditional file system on a
    mainframe
  • User need not know a file's location
  • High Availability
  • Users should have easy access to files, wherever
    the users or files are located
  • Tolerant of failures

13
Architecture
  • On the Network
  • File servers hold the files
  • Clients make accesses to the servers
  • Name Server (does name resolution)
  • Maps names to directories/files
  • Cache Manager
  • Implements file caching
  • Often at both server and clients
  • Coordinates to avoid inconsistent file copies

14
Mechanisms of a DFS
  • Mounting
  • Binding together of different filename spaces to
    form a single name space
  • A name space is mounted to (or bound to) a
    mount point (or node in the name space)
  • Need to maintain mount information
  • Keep it at the clients
  • Keep it at the servers

15
Name Space Hierarchy
[Diagram: one name space hierarchy with nodes a through k, spread across Server X, Server Y, and Server Z]
Adapted from Singhal & Shivaratri (1994)
16
Mechanisms: Mounting, cont.
  • Keep it at the clients
  • Client must mount each required file system
  • e.g. Sun's NFS
  • Each client can see a different filename space
  • When moving files, each client may need updating
  • Keep it at the servers
  • Each client sees identical filename space
  • If files are moved between servers, only need to
    update the servers' information
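
A minimal sketch of mount information kept at a client, assuming a simple table that maps mount points to a server and a remote path. The MountTable class, the server names, and the paths are hypothetical; a real NFS client keeps this information in the kernel.

```python
class MountTable:
    """Client-side mount table: mount point -> (server, remote path)."""

    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point, server, remote_path):
        self.mounts[mount_point.rstrip("/")] = (server, remote_path.rstrip("/"))

    def resolve(self, path):
        # Pick the longest mount point that is a prefix of the path.
        best = None
        for point in self.mounts:
            if path == point or path.startswith(point + "/"):
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            return ("local", path)
        server, remote = self.mounts[best]
        return (server, remote + path[len(best):])


table = MountTable()
table.mount("/usr/share", "server-y", "/export/share")
table.mount("/home", "server-z", "/export/home")
print(table.resolve("/home/ann/a.txt"))
```

Because each client owns its own table, two clients can mount the same remote directory at different points and so see different filename spaces.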

17
Mechanisms, continued
  • Caching
  • Clients get copy of remote file information
  • Local memory, local disk, server memory
  • Improves performance
  • Hints
  • Guaranteeing that all data in cache is always
    valid is expensive
  • Some cached data can be used as a hint
  • If shown valid, then time is saved
  • If found invalid, can recover without serious
    problems
  • E.g. cache location of a file
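
The hint idea can be sketched as follows, assuming the cached item is a file's location. The LocationCache class and the dictionaries standing in for the name server and the file servers are illustrative only.

```python
class LocationCache:
    def __init__(self, name_server, file_servers):
        self.name_server = name_server      # filename -> server name (authoritative)
        self.file_servers = file_servers    # server name -> set of filenames held
        self.hints = {}                     # filename -> server name (may be stale)
        self.lookups = 0                    # count of name-server round trips

    def locate(self, filename):
        hint = self.hints.get(filename)
        # Try the hint first; the file server itself reveals whether it is wrong.
        if hint is not None and filename in self.file_servers[hint]:
            return hint
        # Hint missing or stale: fall back to the authoritative lookup.
        self.lookups += 1
        server = self.name_server[filename]
        self.hints[filename] = server
        return server


name_server = {"report.txt": "x"}
file_servers = {"x": {"report.txt"}, "y": set()}
cache = LocationCache(name_server, file_servers)

first = cache.locate("report.txt")     # name-server lookup
second = cache.locate("report.txt")    # answered from the hint

# The file migrates; the hint is now stale but harmless.
file_servers["x"].remove("report.txt")
file_servers["y"].add("report.txt")
name_server["report.txt"] = "y"
third = cache.locate("report.txt")     # hint fails, lookup repeated
```

A wrong hint costs one wasted request, never a wrong answer, which is what makes it cheaper than keeping the cache always valid.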

18
Mechanisms, concluded
  • Bulk Data Transfer
  • Big cost of communication is the communication
    protocol
  • So send multiple data blocks on each transfer
  • Less communication overhead
  • Less context switching
  • Fewer acknowledgements
  • Encryption
  • Enforce security
  • Before communication between two entities, use an
    authentication server to provide a key
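
For the bulk data transfer point on this slide, a rough cost model shows why batching helps. The fixed per-message overhead, the per-block cost, and the numbers used are assumed values in arbitrary time units, not measurements.

```python
import math


def transfer_cost(num_blocks, blocks_per_message, per_message_overhead, per_block_cost):
    """Total cost when every message carries a fixed protocol overhead."""
    messages = math.ceil(num_blocks / blocks_per_message)
    return messages * per_message_overhead + num_blocks * per_block_cost


one_at_a_time = transfer_cost(64, 1, 5.0, 1.0)   # 64 messages
bulk = transfer_cost(64, 8, 5.0, 1.0)            # 8 messages
print(one_at_a_time, bulk)
```

Under these assumptions the data cost is identical in both cases; only the number of times the protocol overhead is paid changes.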

19
DFS Design Issues
  • Naming and Name Resolution
  • Caches on Disk or Main Memory
  • Writing Policy
  • Cache Consistency
  • Availability
  • Scalability
  • Semantics

20
Naming and Name Resolution
  • Name Resolution
  • The process of mapping a name to an object, or
    in the case of replication, multiple objects
    (Singhal & Shivaratri, 1994)
  • Name Space
  • a collection of names which may or may not share
    an identical resolution mechanism
    (Singhal & Shivaratri, 1994)

21
Name Space Hierarchy
[Diagram: one name space hierarchy with nodes a through k, spread across Server X, Server Y, and Server Z]
Adapted from Singhal & Shivaratri (1994)
22
Figure 8.10  Name Space and Mounting in NFS.
23
Naming Definitions
  • Location independent: A file can be moved
    without changing the filename
  • Location transparent: Filename does not tell
    where the file is located

24
Location Transparency
  • Must be provided via global naming
  • Dependent on a name being location independent
  • E.g. a universal name
  • Example: social security number versus home
    street address

25
Figure 8.1  Telephone Routing.
26
Global Naming and Name Transparency
  • A global name space requires
  • Name resolution
  • Location resolution
  • Name resolution maps symbolic filenames to
    computer file names
  • Location resolution involves mapping global names
    to a location
  • Difficult if name transparency and location
    transparency are both supported

27
Figure 8.2  IP Hierarchical Name Space.
28
Naming Approaches
  • Add host name to names of files on that host
  • Provides unique names
  • Loses network transparency
  • Loses location transparency
  • Moving file to a different host causes change of
    filename
  • Possible changes to applications using that file
  • Easy to find a file

29
Naming Approaches, continued
  • Mount remote directories onto local directories
  • To do the mount, need to know host
  • Once mounted, references are location transparent
  • Can resolve filenames easily
  • However, this approach is difficult to implement
  • Not fault tolerant
  • File migration requires lots of updates

30
Naming Approaches, concluded
  • Use a single global directory
  • Does not have disadvantages of previous
    approaches
  • Variations found in Sprite and Apollo
  • Need a single computing facility or a few with
    lots of cooperation
  • Need system-wide unique filenames
  • Not good on a heterogeneous system
  • Not good on a wide geographic system

31
Naming Issues, continued
  • Contexts
  • Used to partition a name space
  • To avoid problems with system-wide unique names
  • Geographical, organizational, etc.
  • A name space in which to resolve a name
  • A filename has two parts
  • Context
  • Local filename
  • Almost like another level of directory
  • Used in x-Kernel logical file system

32
Naming Issues, concluded
  • Name Server
  • Maps names to files and directories
  • Centralized
  • Easy to use
  • A bottleneck
  • Not fault tolerant
  • Distributed
  • Servers deal with different domains
  • Several servers may be needed to deal with all
    the components in a filename
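
A minimal sketch of distributed name resolution, assuming each name server handles one domain and resolves one component of the filename. The NameServer class, the resolve function, and the example names are hypothetical.

```python
class NameServer:
    """Resolves one component of a name within its own domain."""

    def __init__(self, name):
        self.name = name
        self.entries = {}   # component -> NameServer (subdomain) or str (file location)

    def add(self, component, target):
        self.entries[component] = target


def resolve(root, path):
    """Walk the path one component at a time, hopping between servers."""
    visited = [root.name]
    current = root
    parts = path.strip("/").split("/")
    for i, component in enumerate(parts):
        target = current.entries[component]
        if isinstance(target, NameServer):
            current = target
            visited.append(target.name)
        elif i == len(parts) - 1:
            return target, visited
        else:
            raise KeyError(f"{component} is a file, not a domain")
    raise KeyError(f"{path} names a domain, not a file")


root = NameServer("root")
eng = NameServer("eng")
root.add("eng", eng)
eng.add("spec.txt", "server-x:/vol1/spec.txt")

location, servers_contacted = resolve(root, "/eng/spec.txt")
```

The list of servers contacted grows with the depth of the name, which is the cost the slide refers to when it says several servers may be needed.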

33
Figure 8.3  Distributed Solution for Name
Resolution.
34
DFS Design Issues, continued
  • File Cache Location
  • Main Memory
  • Can support diskless workstations
  • Faster
  • Similar to design of server memory cache
  • Competes with virtual memory system for space
  • Try to avoid data blocks being in both cache and
    virtual memory
  • Can't cache a large file
  • So needs to be able to handle blocks
    (block-oriented)

35
DFS Design Issues, continued
  • Cache Location, continued
  • Local Disk
  • Able to handle large files without affecting
    performance
  • Doesn't affect virtual memory system
  • Permits incorporation of portable workstations
    into distributed system
  • As per Coda

36
DFS Design Issues, continued
  • Cache Writing Policy
  • When should a modified cache block be sent to the
    server?
  • Write-through
  • Send all writes immediately to the servers
  • Reliable, little lost if there is a crash
  • Lose advantage of having a cache
  • Delayed writing

37
DFS Design Issues, continued
  • Cache Writing Policy, continued
  • Delayed writing
  • Forward writes to server after a delay
  • E.g. when a block is full
  • E.g. when the file is closed
  • E.g. when timer goes off (say every 30 seconds)
  • Takes advantage of cache
  • Crash could lose some data
  • What about short-lived files (e.g. temps)?
  • Perhaps server need not know about these
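
The two writing policies can be compared in a small sketch. The ClientCache class is an assumption made for illustration; a plain dictionary stands in for the server, and flush() stands in for whichever trigger is used (file close, full block, or timer).

```python
class ClientCache:
    def __init__(self, server, write_through):
        self.server = server            # dict standing in for the file server
        self.write_through = write_through
        self.blocks = {}                # block id -> data
        self.dirty = set()              # blocks modified but not yet sent
        self.messages = 0               # count of writes sent to the server

    def write(self, block, data):
        self.blocks[block] = data
        if self.write_through:
            self._send(block)
        else:
            self.dirty.add(block)

    def flush(self):
        """Called on close or when the timer fires (delayed writing)."""
        for block in sorted(self.dirty):
            self._send(block)
        self.dirty.clear()

    def _send(self, block):
        self.server[block] = self.blocks[block]
        self.messages += 1


wt = ClientCache({}, write_through=True)
dw = ClientCache({}, write_through=False)
for data in ("a", "b", "c"):
    wt.write(0, data)
    dw.write(0, data)

lost_on_crash = 0 not in dw.server   # nothing has reached the server yet
dw.flush()
```

Three writes to the same block cost three messages with write-through and one with delayed writing, at the price of a window in which a client crash loses the data.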

38
DFS Design Issues, continued
  • Cache Consistency
  • Server-Initiated
  • Server tells client that data needs to be updated
  • I.e. server needs good records
  • Client cache managers invalidate old data
  • Client-Initiated
  • Client cache manager makes sure client's data is
    okay with server before using
  • Then why bother with cache at all?
  • Both of these are expensive and require cooperation
    between clients and servers
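
A sketch of the server-initiated approach, assuming the server records which clients hold a cached copy of each file. The Server and Client classes and their method names are illustrative, and direct method calls stand in for network messages.

```python
class Server:
    def __init__(self):
        self.data = {}       # filename -> contents
        self.sharers = {}    # filename -> set of clients caching it

    def read(self, client, name):
        self.sharers.setdefault(name, set()).add(client)
        return self.data[name]

    def write(self, writer, name, contents):
        self.data[name] = contents
        # Tell every other client holding a copy to drop it.
        for client in self.sharers.get(name, set()) - {writer}:
            client.invalidate(name)
        self.sharers[name] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(self, name)
        return self.cache[name]

    def write(self, name, contents):
        self.cache[name] = contents
        self.server.write(self, name, contents)

    def invalidate(self, name):
        self.cache.pop(name, None)


server = Server()
server.data["f"] = "v1"
a, b = Client(server), Client(server)
a.read("f")
b.read("f")
a.write("f", "v2")          # server invalidates b's copy
stale_dropped = "f" not in b.cache
fresh = b.read("f")
```

The sharers table is the "good records" the slide mentions: the server must remember every client that might hold a copy.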

39
DFS Design Issues, continued
  • Cache Consistency, continued
  • Alternative
  • Do not allow file caching of shared, writeable
    files
  • In concurrent-write sharing, a file may be open at
    multiple clients with at least one client writing
  • Server needs to keep track of clients sharing
    files
  • Can be avoided by locking files

40
DFS Design Issues, continued
  • Cache Consistency, concluded
  • Issue: Sequential-write sharing
  • Occurs when a client opens a file that has been
    modified recently and closed by another client
  • Problem 1
  • When client opens a file, it may have outdated
    blocks in its cache
  • Solution: use timestamps on files and cached
    blocks
  • Problem 2
  • When client opens a file, current data blocks may
    still be waiting to be flushed in another
    client's cache
  • Solution: require all clients to flush modified
    file blocks when a new client opens file for
    writing
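
The timestamp solution to Problem 1 might look like the sketch below. It assumes a per-file version counter in place of a real timestamp and whole-file caching instead of blocks; the class and method names are hypothetical.

```python
class Server:
    def __init__(self):
        self.files = {}   # filename -> (version, contents)

    def store(self, name, contents):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, contents)

    def version(self, name):
        return self.files[name][0]

    def fetch(self, name):
        return self.files[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # filename -> (version, contents)

    def open(self, name):
        cached = self.cache.get(name)
        # Compare the cached version with the server's before trusting it.
        if cached is None or cached[0] != self.server.version(name):
            self.cache[name] = self.server.fetch(name)
        return self.cache[name][1]


server = Server()
server.store("f", "old")
reader = Client(server)
before = reader.open("f")

server.store("f", "new")     # another client modified and closed the file
after = reader.open("f")     # version mismatch forces a refetch
```

The check costs one small message per open, which is far cheaper than refetching the data every time.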

41
Figure 8.7  Approaches to Modification
Notification.
42
DFS Design Issues, continued
  • Availability
  • Files can be unavailable due to server failures
  • Availability achieved through replication
  • Copies at different servers
  • Problems
  • Overhead (file space)
  • Consistency
  • Need to maintain
  • Need to detect and correct inconsistencies

43
Availability, continued
  • Unit of replication
  • A file is the most common unit
  • Cedar, Roe, Sprite
  • Overall replica management is harder
  • Directory information about file may need to be
    stored (e.g. protection info)
  • Replicas of files belonging to a common directory
    may not have common file servers, requiring extra
    name resolutions

44
Availability, continued
  • Unit of replication, continued
  • A group of files or Volume
  • Used by Coda
  • Easier to associate information with the group
  • A waste if most of the files are not really
    shared
  • Compromise
  • Used in Locus
  • A user's files are a file group (primary pack)
  • A replica may just contain a subset of the pack

45
Availability, concluded
  • Replication Management
  • Keeps mutual consistency among the copies
  • Suggest a weighted voting scheme
  • Reads/writes can happen only by votes from
    current copies
  • Timestamps are kept on current copies
  • Designate one or more processes as agents for
    controlling access to copies
  • Locus: each file group has a synchronization
    site
  • Harp: a primary file server controls access
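
A weighted voting scheme can be sketched as below, assuming the usual quorum conditions: read votes plus write votes exceed the total, and write votes exceed half the total. The Replica and VotingFile classes, the vote weights, and the version numbers used in place of timestamps are all illustrative choices.

```python
class Replica:
    def __init__(self, votes):
        self.votes = votes
        self.version = 0
        self.value = None
        self.up = True


class VotingFile:
    def __init__(self, replicas, read_quorum, write_quorum):
        total = sum(r.votes for r in replicas)
        # Quorums must overlap so every read sees the latest write.
        assert read_quorum + write_quorum > total
        assert 2 * write_quorum > total
        self.replicas = replicas
        self.r = read_quorum
        self.w = write_quorum

    def _gather(self, needed):
        chosen, votes = [], 0
        for rep in self.replicas:
            if rep.up:
                chosen.append(rep)
                votes += rep.votes
                if votes >= needed:
                    return chosen
        raise RuntimeError("quorum not available")

    def read(self):
        newest = max(self._gather(self.r), key=lambda rep: rep.version)
        return newest.value

    def write(self, value):
        quorum = self._gather(self.w)
        version = max(rep.version for rep in quorum) + 1
        for rep in quorum:
            rep.version, rep.value = version, value


replicas = [Replica(2), Replica(1), Replica(1)]
f = VotingFile(replicas, read_quorum=2, write_quorum=3)
f.write("v1")
replicas[0].up = False      # the heaviest replica fails
value = f.read()            # two single-vote replicas still form a read quorum
try:
    f.write("v2")
    write_blocked = False
except RuntimeError:
    write_blocked = True
```

With one server down, reads continue but writes are refused, so the surviving copies can never diverge.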

46
Figure 8.8  Employing a Mapping Table for
Intermediate File Handles.
47
Figure 8.9  Distributed File Replication
Employing Group Communication.
48
DFS Design Issues: Scalability
  • Can the design deal with system as it grows?
  • Caching is used to improve client response time
  • But it introduces cache consistency problems

49
Scalability, continued
  • Server-initiated invalidation
  • Server keeps track of sharers
  • Notifies them if file is changed
  • Large system => busy server
  • Helps to note if file is read-only
  • Form a tree
  • Server deals directly with only delta clients
  • Each of these clients can serve delta clients
  • Etc. forming a tree for messages to propagate
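
The tree idea can be sketched as follows, assuming clients are assigned to parents in breadth-first order with a fan-out of delta. The function names and the client labels are hypothetical.

```python
def build_tree(clients, delta):
    """Map each node to the (at most delta) nodes it forwards to.

    The server is the root; clients are attached in breadth-first order.
    """
    nodes = ["server"] + list(clients)
    children = {node: [] for node in nodes}
    for i in range(1, len(nodes)):
        parent = nodes[(i - 1) // delta]
        children[parent].append(nodes[i])
    return children


def propagate(children, node="server", depth=0, reached=None):
    """Forward an invalidation down the tree; record the hop count per client."""
    if reached is None:
        reached = {}
    for child in children[node]:
        reached[child] = depth + 1
        propagate(children, child, depth + 1, reached)
    return reached


tree = build_tree(["c1", "c2", "c3", "c4", "c5", "c6"], delta=2)
hops = propagate(tree)
```

The server sends only delta messages however many clients share the file; the price is that the notice takes more hops to reach the clients at the leaves.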

50
Scalability, continued
  • Server structure
  • Decides how many clients a server can support
  • Single process that blocks during the I/O
  • Horrible: all clients must wait
  • Separate process per client
  • Context switching overhead from frequent requests
    from different clients
  • Thread per client
  • Cheaper context switching

51
Scalability, continued
  • Principle
  • Minimize cross-machine interaction
  • Use caching, hints, relaxed sharing semantics
  • Stringent semantics are less scalable
  • Avoid central control and central resources
  • Central authentication service, name server, etc.
  • Desire symmetry and autonomy
  • Each machine has equal role
  • Decentralized system administration

52
Scalability, concluded
  • Principle, concluded
  • Clustering
  • Partition system into a collection of clusters
  • Cluster: set of machines plus cluster server
  • Hope most requests are satisfied by local cluster
    server
  • Balance and locality
  • With reasonable locality, clusters can be a
    scalable building block

53
DFS Design Issues: Semantics
  • Characterizes the effects of accesses on files
  • Basic (Unix semantics)
  • A read operation returns the data stored by the
    last write operation
  • Expensive
  • Need a single coordinating server
  • Or no sharing
  • Or users need to use locks

54
Semantics, concluded
  • Session semantics
  • Writes are visible immediately to local clients
  • Changes to a file are visible to remote clients,
    only after closing the file
  • No attempt to maintain consistency
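
Session semantics can be illustrated with a sketch in which each open file gets a private working copy. The Session class is an assumption, a dictionary stands in for the server, and each session stands in for one client machine.

```python
class Session:
    def __init__(self, server_files, name):
        self.server_files = server_files
        self.name = name
        self.copy = server_files.get(name, "")   # private working copy

    def read(self):
        return self.copy

    def write(self, data):
        self.copy = data       # visible only within this session

    def close(self):
        # Changes reach the server, and later opens, only at close.
        self.server_files[self.name] = self.copy


files = {"f": "old"}
a = Session(files, "f")
b = Session(files, "f")

a.write("new")
seen_by_a = a.read()          # the writer sees its own change at once
seen_by_b = b.read()          # the other session does not
a.close()
still_seen_by_b = b.read()    # b's open session keeps its old copy
seen_by_c = Session(files, "f").read()
```

If both sessions write and then close, the last close simply wins, which matches the last bullet: nothing reconciles the two copies.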

55
Distributed Directory Service
  • Directory Structures
  • Hierarchical
  • Acyclic
  • E.g., Unix
  • Cyclic
  • Directory Management
  • List of active directories with files
  • Storage of directory structure

56
Directory Tree on one machine
[Diagram: a directory tree with nodes A, B, C, D, and E]
Adapted from Tanenbaum (1995)
57
Directory Graph on two machines
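[Diagram: a directory graph with nodes A, B, C, D, and E spread across two machines, annotated with the counts 0, 1, 2, 1, 1]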
Adapted from Tanenbaum (1995)
58
Distributed Directory Service
  • Directory Operations
  • Directory service
  • Create, rename, delete directories, etc.
  • File service
  • Create, rename, delete files, etc.

59
File Types
  • Library files (.lib, .dll)
  • Program files (.c, .cpp, .p, .java, .f)
  • Object-code files (.o, .obj)
  • Compressed files (.zip, .Z, .gz)
  • Archive files (.arc, .tar, .jar)
  • Graphic files (.gif, .jpeg, .ps, .dvi)
  • Sound files (.wav, .midi)
  • Index files (.idx)
  • Document files (.doc, .tex, .wp)

60
Example DFSs
  • Sun NFS
  • Sprite
  • Apollo DOMAIN
  • X-Kernel
  • Coda
  • Andrew
  • Amoeba