Title: Distributed System Structures
1. Distributed System Structures
- Background
- Topology
- Network Types
- Communication
- Communication Protocol
- Robustness
- Design Strategies
2. Learning Objectives
- What is a distributed system? A distributed OS?
- What are the advantages of a distributed system?
- What is data migration? Process migration? Why are they needed?
- How can pairs of processes wanting to communicate over a network be connected? What are the common schemes?
- How are failures detected in distributed systems? How does a distributed system recover from failure?
3. A Distributed System
4. Motivation
- Resource sharing
  - sharing and printing files at remote sites
  - processing information in distributed database
  - using remote specialized hardware devices
- Computation speedup: load sharing
- Maintain responsiveness: load balancing
- Availability: detect and recover from site failure, function transfer, reintegrate failed site
- Communication: message passing
5. Network Operating Systems
- Users aware of multiple machines
- Explicit access to other machines' resources
  - remote login to another machine (ssh)
  - transfer data from remote to local machines
    - file transfer protocol (FTP)
    - secure copy (scp)
    - hypertext transfer protocol (HTTP)
6. Distributed Operating Systems
- Users not aware of multiple machines
  - access to remote and local resources is similar
- Data migration
  - transfer data by transferring entire file, or only portions necessary for immediate task
- Process migration
  - transfer computation rather than data
7. Distributed Operating Systems
- Process migration: execute entire process, or parts, at different sites
  - load balancing: distribute processes to even out workload
  - computation speedup: subprocesses can run concurrently on different sites
  - hardware preference: process execution may require specialized hardware (e.g., different CPU)
  - software preference: required software may only be at one site
  - data access: run process remotely, rather than transfer all data
- Downsides?
8. Topology
- Sites in system can be physically connected in variety of ways, compared with respect to following criteria
  - basic cost: how expensive to link sites in system?
  - communication cost: time to send message from site A to B
  - availability: if link or site fails, can remaining sites still communicate?
- Topologies depicted as graphs
  - nodes correspond to sites
  - edge from node A to B: direct connection between sites
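The graph view makes two of the criteria easy to check mechanically. The sketch below is only an illustration: the four-site ring and star layouts and the function names are assumptions, not part of the slides. It treats basic cost as the number of links and tests availability by removing one link and checking whether every site is still reachable.

```python
from collections import deque

def reachable(links, start):
    """Return the set of sites reachable from `start` over undirected links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        site = queue.popleft()
        for nxt in neighbours.get(site, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def survives_link_failure(sites, links, failed):
    """True if all sites can still communicate once the `failed` link is removed."""
    remaining = [link for link in links if set(link) != set(failed)]
    return reachable(remaining, sites[0]) == set(sites)

# Hypothetical example topologies over the same four sites.
sites = ["A", "B", "C", "D"]
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
star = [("A", "B"), ("A", "C"), ("A", "D")]
```

In this example the ring costs one more link than the star but tolerates the loss of any single link, while the star loses a site as soon as one of its links fails.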
9. Network Topology
10. Network Types
- Local-Area Network (LAN): small geographical area
  - multiaccess bus, ring, or star network
  - growing popularity: wireless networking
  - speed range: approximately 10 Mbit/s to 1 Gbit/s
  - broadcast fast and cheap
  - nodes
    - usually workstations, personal computers
    - fewer servers, printers
11. Network Types
12. Network Types
- Wide-Area Network (WAN): geographically separated sites
  - point-to-point connections: long-haul lines, satellite links
  - speed: 10s to 100s of Mbit/s
  - nodes (communication processors)
    - usually high percentage are routers
    - can also be big servers
13. Communication Processors in a WAN
14. Communication
Design of communication network must address 5 basic issues
- Naming and name resolution: how do 2 processes locate each other to communicate?
- Routing strategies: how are messages sent through network?
- Packet strategies: variable-length or fixed-size?
- Connection strategies: how do 2 processes send sequence of messages?
- Contention: network is shared resource, so how are conflicting demands for use resolved?
15. Naming and Name Resolution
- Name systems in network
- Address messages with process-id
- Identify processes on remote systems by <host-name, identifier> pair
- Domain name service (DNS): specifies naming structure of the hosts, as well as name-to-address resolution (Internet)
16. Routing Strategies
- Fixed routing: path from A to B specified in advance; path changes only if hardware failure disables it
  - shortest path usually chosen so communication cost minimized
  - fixed routing cannot adapt to load changes
  - ensures messages will be delivered in order they were sent
- Virtual circuit: path from A to B fixed for session. Other sessions may have different paths from A to B
  - partial remedy to adapting to load changes
  - ensures messages will be delivered in order sent
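One way the fixed path could be computed in advance is with a shortest-path search over link costs. The sketch below uses Dijkstra's algorithm, which is a choice made here for illustration (the slides do not name an algorithm). The four-site cost table is hypothetical, and the function assumes the destination is reachable.

```python
import heapq

def shortest_path(costs, src, dst):
    """Dijkstra's algorithm over a dict {site: {neighbour: link_cost}}.

    Returns (path, total_cost). Assumes `dst` is reachable from `src`.
    """
    best = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        dist, site = heapq.heappop(heap)
        if site == dst:
            break
        if dist > best.get(site, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nxt, cost in costs[site].items():
            cand = dist + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = site
                heapq.heappush(heap, (cand, nxt))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], best[dst]

# Hypothetical communication costs between four sites.
costs = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}
ROUTE_A_TO_D = shortest_path(costs, "A", "D")
```

With fixed routing the result would be computed once and reused for every message. A virtual-circuit scheme could rerun the same search with current costs at the start of each session.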
17. Routing Strategies
- Dynamic routing: path for message chosen only when message sent
  - usually site sends message on link least used at that time
  - adapts to load changes by avoiding routing messages on heavily used path
  - messages may arrive out of order; remedy: sequence number on each message
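The sequence-number remedy can be sketched as a small receive-side buffer. The class and method names are hypothetical, and the sketch assumes numbering starts at 0 and that every message eventually arrives.

```python
class Resequencer:
    """Delivers messages in send order even if they arrive out of order."""

    def __init__(self):
        self.next_seq = 0   # sequence number expected next
        self.pending = {}   # messages that arrived early: seq -> payload

    def receive(self, seq, payload):
        """Accept one message; return the payloads now deliverable in order."""
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

A message that arrives early is held back until every lower-numbered message has been delivered.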
18. Packet Strategies
- Fixed size: e.g., ATM; simplifies switching
- Reliable: needs acknowledgement protocol
- Unreliable: e.g., datagrams; rely on higher layers to check delivery, check order
19. Connection Strategies
- Circuit switching: permanent physical link for duration of communication (e.g., telephone call)
- Message switching: temporary link established for duration of 1 message transfer (e.g., post-office mail)
- Packet switching: variable-length messages divided into fixed-length packets; each may take different path. Packets reassembled into messages as they arrive
Circuit switching: setup time, less overhead for shipping each message, may waste bandwidth (why is voice over IP growing?). Message and packet switching: less setup time, more overhead per message
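The divide-and-reassemble step of packet switching can be sketched as follows. The function names and the 4-byte packet size are assumptions for illustration. For brevity the last packet is simply allowed to be shorter, where a strictly fixed-length scheme would pad it.

```python
def packetize(message: bytes, size: int):
    """Split a message into (sequence number, chunk) packets of at most `size` bytes."""
    return [(i // size, message[i:i + size]) for i in range(0, len(message), size)]

def reassemble(packets):
    """Rebuild the message from packets received in any order."""
    return b"".join(chunk for _, chunk in sorted(packets))

packets = packetize(b"hello world", 4)
# Simulate packets taking different paths and arriving in reverse order.
restored = reassemble(list(reversed(packets)))
```

The per-packet sequence number is part of the extra overhead per message that the comparison above attributes to packet switching.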
20. Communication Protocol
Communication network organized in following layers
- Physical layer: mechanical, electrical details of physical transmission of bit stream
- Data-link layer: frames, or fixed-length parts of packets; includes
  - error detection
  - recovery from physical layer errors
- Network layer: provides connections, routes packets in network
  - address of outgoing packets
  - decoding address of incoming packets
  - maintaining routing information to respond to changing load levels
21. Communication Protocol
- Transport layer: low-level network access and message transfer between clients, including
  - partitioning messages into packets
  - maintaining packet order
  - flow control
  - generating physical addresses
- Session layer: implements sessions, or process-to-process communications protocols
- Presentation layer: resolves differences in formats among sites in network, including
  - character conversions
  - half duplex/full duplex (echoing)
  - not common in real networks (in application layer)
22. Communication Protocol
- Application layer: interacts directly with users, e.g.,
  - file transfer (ftp, http, etc.)
  - remote-login protocols (telnet, ssh)
  - email (smtp, etc.)
  - schemas for distributed databases
  - layer user sees, relatively independent of underlying technology
23. ISO Network Model
24. ISO Network Packet
25. TCP/IP Protocol Layers
26. Robustness
- Failure detection
- Reconfiguration
27. Failure Detection
- Detecting hardware failure difficult
- To detect link failure, handshaking protocol can be used
- Assume sites A and B established link. At fixed intervals, exchange I-am-up message indicating up and running
- If Site A doesn't receive message within fixed interval, assumes either (a) other site not up or (b) message lost
- Site A can now send Are-you-up? message to B
- If A doesn't receive reply, can repeat message or try alternative route to B
28. Failure Detection
- If Site A doesn't ultimately receive reply from Site B, concludes some type of failure occurred
- Types of failures
  - site B down
  - direct link between A and B down
  - alternative link from A to B down
  - message lost
- A can't determine exactly why failure occurred
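Site A's probing after a missed I-am-up message can be sketched as a retry loop over the available routes. The function name and the `send_are_you_up` callback are hypothetical stand-ins for real network code, and the retry count is an arbitrary assumption.

```python
def b_is_reachable(send_are_you_up, routes, retries=2):
    """Probe site B as site A would; return True if B answered on any route.

    `send_are_you_up(route)` is a hypothetical callback that sends one
    Are-you-up? message over `route` and returns True if a reply arrived
    before the timeout.
    """
    for route in routes:             # try the direct link, then alternatives
        for _ in range(retries):     # repeat in case a message was lost
            if send_are_you_up(route):
                return True
    # No reply on any route: some failure occurred, but A cannot tell
    # whether B is down, a link is down, or messages were lost.
    return False
```

A `False` result only tells A that some failure occurred. It does not say which of the failure types listed above is responsible.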
29. Reconfiguration
- When Site A determines failure occurred, must reconfigure system
  1. If link from A to B failed, broadcast that to every site
  2. If site failed, every other site also notified: services offered by failed site no longer available
- When link or site available again, must again broadcast that to all other sites
30. Design Issues
- Transparency: distributed system should appear as conventional, centralized system to user
- Fault tolerance: distributed system should continue to function in face of failure
- Scalability: as demands increase, should be easy to add new resources to accommodate increased demand
- Cluster: collection of semi-autonomous machines that acts as single system
31. Distributed File Systems
- Background
- Naming and Transparency
- Remote File Access
- Stateful vs. Stateless Service
- File Replication
32. Learning Objectives
- What is distributed file system (DFS)?
- What does transparency mean in DFS?
- What do terms location transparency and location independence mean for name mapping in DFS?
- Stateful vs. stateless: advantages and disadvantages of each type of DFS
33. Background
- Distributed file system (DFS): distributed implementation of classical shared file system; multiple users share files and storage resources
- DFS manages set of dispersed storage devices
- Overall storage space managed by DFS composed of different, remotely located, smaller storage spaces
- Usually correspondence between storage spaces and sets of files
34. DFS Structure
- Service: software entity running on 1 or more machines, providing particular type of function to a priori unknown clients
- Server: service software running on a single machine
- Client: process that can invoke service using set of operations that forms its client interface
- Client interface for file service formed by set of primitive file operations (create, delete, read, write)
- Client interface of DFS should be transparent, i.e., not distinguish between local and remote files
35. Clients and Servers
- Client/Server: a software concept
  - marketing dictates selling a machine as a server
  - can increase the price
- Roles can vary
  - machine A running application: an application server
  - machine A needs a file on B, so A becomes a client of B
36. Naming and Transparency
- Naming: mapping between logical and physical objects
- Multilevel mapping: abstraction of file hides details of how and where on disk file actually stored
- Transparent DFS hides location on network of file
- For file replicated on several sites, mapping returns set of locations of file's replicas
  - existence of multiple copies and location hidden
37. Naming Structures
- Location transparency: name doesn't reveal physical storage location
  - file name still denotes specific, if hidden, set of physical disk blocks
  - convenient way to share data
  - can expose correspondence between component units and machines if scheme breaks, e.g., machine moves
- Location independence: file name does not need to be changed when file's physical storage location changes
  - better file abstraction
  - promotes sharing storage space itself
  - separates naming hierarchy from storage-devices hierarchy
38. Naming Schemes: 3 Main Approaches
- Files named by combination of host name and local name; guarantees unique systemwide name
- Attach remote directories to local directories, giving appearance of coherent directory tree; must mount remote directories to access transparently
- Total integration of component file systems
  - single global name structure spans all files in system
  - if server unavailable, some arbitrary set of directories on different machines shouldn't disappear (as, e.g., in NFS)
39. Remote File Access
- Reduce network traffic by caching recently accessed disk blocks; repeated accesses handled locally
  - if needed data not already cached, copy brought from server
  - accesses on local cached copy
  - files identified with 1 master copy at server machine, but copies of (parts of) file scattered in different caches
  - cache-consistency problem: keeping cached copies consistent with master file
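A client-side block cache along these lines can be sketched as follows. The class name, the `fetch_from_server` callback, and the least-recently-used eviction rule are assumptions made for illustration. The slides only say that recently accessed blocks are cached.

```python
from collections import OrderedDict

class BlockCache:
    """Client-side cache of recently accessed disk blocks (LRU eviction)."""

    def __init__(self, fetch_from_server, capacity):
        self.fetch = fetch_from_server   # hypothetical callback: block number -> data
        self.capacity = capacity
        self.blocks = OrderedDict()      # block number -> data, oldest first
        self.server_requests = 0         # counts network round trips

    def read(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)    # mark as most recently used
            return self.blocks[block]
        self.server_requests += 1             # miss: bring a copy from the server
        data = self.blocks[block] = self.fetch(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used block
        return data
```

Repeated reads of the same block are served locally, so `server_requests` grows only on misses. The sketch does nothing about cache consistency.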
40. Cache Location: Disk vs. Memory
- Advantages of disk caches
  - more reliable
  - cached data on disk still there after recovery, no need to refetch
- Advantages of main-memory caches
  - workstations can be diskless
  - accessed faster
  - memory upgrade increases speed advantage
  - server caches (to speed up disk I/O) in main memory regardless of where user caches located
  - main-memory caches on user machine allow single caching mechanism for servers and users
41. Cache Update Policy
- Write-through: write data to disk as soon as cache modified
  - reliable, cache consistency easy, but poor performance
- Delayed-write: modifications written to cache, and to server later. Write accesses complete quickly; some data may be overwritten before write-back, and so need never be written at all
  - poor reliability: unwritten data lost whenever user machine crashes
  - variation: scan cache regularly, flush blocks modified since last scan
  - variation: write-on-close, writes data back to server when file closed. Best for files open for long periods, frequently modified
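The difference between the two policies can be sketched with a toy cache in which a plain dictionary stands in for the server's disk. The class and attribute names are hypothetical.

```python
class CachedFile:
    """Toy client cache supporting write-through or delayed-write."""

    def __init__(self, server, write_through):
        self.server = server              # dict standing in for server disk: block -> data
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()                # blocks modified but not yet written back

    def write(self, block, data):
        self.cache[block] = data
        if self.write_through:
            self.server[block] = data     # server updated immediately
        else:
            self.dirty.add(block)         # server updated later, on flush()

    def flush(self):
        """Write back modified blocks (periodic scan or write-on-close)."""
        for block in self.dirty:
            self.server[block] = self.cache[block]
        self.dirty.clear()
```

With delayed-write, two writes to the same block before a flush cost one server update instead of two. A crash before `flush()` would lose both.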
42. Consistency
- Is locally cached data consistent with master copy?
- Client-initiated approach
  - client initiates validity check
  - server checks whether local data consistent with master copy
- Server-initiated approach
  - server records, for each client, (parts of) files it caches
  - when server detects potential inconsistency, must react
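The client-initiated approach can be sketched with per-file version numbers. The version counter is an assumption made here (the slides do not say how validity is checked), and all class and method names are hypothetical. This sketch validates on every read, whereas a real client might check less often.

```python
class VersionedServer:
    """Holds the master copies; bumps a version number on every write."""

    def __init__(self):
        self.files = {}   # name -> (version, data)

    def write(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def is_current(self, name, version):
        """Validity check: does the client's version match the master copy?"""
        return self.files[name][0] == version

    def fetch(self, name):
        return self.files[name]

class ValidatingClient:
    """Caches files locally and asks the server to validate before each read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (version, data)

    def read(self, name):
        cached = self.cache.get(name)
        if cached is None or not self.server.is_current(name, cached[0]):
            cached = self.cache[name] = self.server.fetch(name)
        return cached[1]
```

Each validity check still costs a message to the server, but it is much smaller than transferring the file again.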
43. Stateful File Service
- Mechanism
  - client opens file
  - server fetches information about file from disk, stores in its memory, gives client identifier unique to client and open file
  - identifier used for subsequent accesses until session ends
  - server must reclaim main-memory space used by inactive clients
- Increased performance
  - fewer disk accesses
  - stateful server knows if file opened for sequential access and can read ahead next blocks
44. Stateless File Server
- Avoids state information by making each request self-contained
- Each request identifies file, position in file
- No need to establish and terminate connection by open and close operations
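The contrast between the two kinds of service can be sketched side by side. The in-memory `FILES` table, the class, and the function names are hypothetical. The stateful server keeps a per-open-file offset in memory, while the stateless read receives the file name and position in every request.

```python
# Hypothetical file store standing in for the server's disk.
FILES = {"notes.txt": b"distributed systems"}

class StatefulServer:
    """Remembers each open file and its current offset between requests."""

    def __init__(self):
        self.open_files = {}   # handle -> [name, offset]; lost if the server crashes
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, offset = self.open_files[handle]
        data = FILES[name][offset:offset + count]
        self.open_files[handle][1] = offset + len(data)   # advance implicit offset
        return data

def stateless_read(name, offset, count):
    """Self-contained request: identifies the file and the position in it."""
    return FILES[name][offset:offset + count]
```

If `open_files` is wiped by a crash, old handles stop working until state is rebuilt, whereas `stateless_read` can be answered by a freshly restarted server. The price is that every request carries the name and offset.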
45. Stateful vs. Stateless Service
- Failure Recovery
  - stateful server loses all volatile state in crash
    - restore state by recovery protocol based on dialog with clients, or abort operations underway when crash occurred
    - server must be aware of client failures to reclaim space for record of client process states (orphan detection and elimination)
  - stateless server: effects of server failure and recovery much less noticeable; newly reincarnated server can respond to self-contained request with no difficulty
46. Distinctions
- Penalties for using robust stateless service
  - longer request messages
  - slower request processing
  - additional constraints on DFS design
- Some environments require stateful service
  - server using server-initiated cache validation can't offer stateless service: records which files cached by which clients
  - UNIX use of file descriptors and implicit offsets inherently stateful: servers must maintain tables to map file descriptors to inodes, and store current file offset
47. File Replication
- Replicas of file on failure-independent machines
- Improves availability and can shorten service time
- Naming scheme maps replicated file name to 1 replica
  - existence of replicas should be invisible to higher levels
  - replicas distinguished by different lower-level names
- Updates: replicas of file denote same logical entity; update to any replica must be reflected on all others
- Demand replication: reading nonlocal replica causes it to be cached locally, thereby generating new nonprimary replica
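The naming and update rules above can be sketched with a toy replicated file. The class name and site names are hypothetical, and the sketch assumes every replica is reachable at write time, which a real DFS cannot take for granted.

```python
class ReplicatedFile:
    """One logical file stored as several replicas on different sites."""

    def __init__(self, sites):
        # Lower-level name (here simply the site name) -> replica contents.
        self.replicas = {site: b"" for site in sites}

    def read(self, up_sites):
        """Serve the read from any replica whose site is currently up."""
        for site in self.replicas:
            if site in up_sites:
                return self.replicas[site]
        raise OSError("no replica available")

    def write(self, data):
        """Reflect the update on every replica so all denote the same contents."""
        for site in self.replicas:
            self.replicas[site] = data
```

Callers use only the logical file. Which replica answered a read stays hidden from them.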
48. Summary
- DFS ideally hides distribution of files
- Naming key issue in achieving transparency
- Statelessness simplifies fault tolerance at cost in speed
- Replication aids fault tolerance at expense of consistency problems
- NFS in wide use, but other, better designs exist (more in CSSE4004/CSSE7014)