CS 603 Review - PowerPoint PPT Presentation

Slides: 68
Provided by: clif8
Transcript and Presenter's Notes


1
CS 603 Review
  • April 24, 2002

2
Seminar Announcements
  • Saurabh Bagchi, Hierarchical Error Detection in
    a Distributed Software Implemented Fault
    Tolerance (SIFT) Environment
  • April 25, 10:30-11:30, MSEE 239
  • Fabian E. Bustamante, The Active Streams
    Approach to Adaptive Distributed Systems
  • April 29, 10:30-11:30, CS 101

3
Review
  • Why do we want distributed systems?
  • Scaling
  • Heterogeneity
  • Geographic Distribution
  • What is a distributed system?
  • Transparency vs. Exposing Distribution
  • Hardware Basics
  • Communication Mechanisms

4
Basic Software Concepts
  • Hiding vs. Exposing
  • Distribution: Distributed OS
  • Location, but not distribution: Middleware
  • None: Network OS
  • Concurrency Primitives
  • Semaphores
  • Monitors
  • Distributed System Models
  • Client-Server
  • Multi-Tier
  • Peer to Peer

5
Communication Mechanisms
  • Shared Memory
  • Enforcement of single-system view
  • Delayed consistency: δ-Common Storage
  • Message Passing
  • Reliability and its limits
  • Stream-oriented Communications
  • Remote Procedure Call
  • Remote Method Invocation

6
RPC Mechanisms
  • DCE
  • Language / Platform Independent
  • Implementation Issues
  • Data Conversion
  • Underlying Mechanisms
  • Fault Tolerance Approaches
  • Java RMI
  • SOAP
  • Interoperable
  • Language independent
  • Transport independent (anything that moves XML)

7
Naming Requirements
  • Disambiguate only
  • Access resource given the name
  • Build a name to find a resource
  • Do humans need to use name?
  • Static/Dynamic Resource
  • Performance Requirements

8
Registry Example: X.500
  • Goal: Global white pages
  • Lookup anyone, anywhere
  • Developed by Telecommunications Industry
  • ISO standard directory for OSI networks
  • Idea: Distributed Directory
  • Application uses Directory User Agent to access a
    Directory Access Point
  • Basis for LDAP, ActiveDirectory

9
Directory Information Base (X.501)
  • Tree structure
  • Root is entire directory
  • Levels are groups
  • Country
  • Organization
  • Individual
  • Entry structure
  • Unique name
  • Build from tree
  • Attributes: Type/value pairs
  • Schema enforces type rules
  • Alias entries

10
X.500
  • Directory Entry
  • Organization level: CN=Purdue University, L=West
    Lafayette
  • Person level: CN=Chris Clifton, SN=Clifton,
    TITLE=Associate Professor
  • Directory Operations
  • Query, Modify
  • Authorization / Access control
  • To directory
  • Directory as mechanism to implement for others

11
X.500 Distributed Directory
  • Directory System Agent
  • Referrals
  • Replication
  • Cache vs. Shadow copy
  • Access control
  • Modifications at Master only
  • Consistency
  • Each entry must be internally consistent
  • DSA giving copy must identify as copy

12
Clock Synchronization
  • Definition: All nodes agree on time
  • What do we mean by time?
  • What do we mean by agree?
  • Lamport Definition: Events
  • Events partially ordered
  • Clock counts the order

13
Event-based definition (Lamport 78)
  • Define partial order of processes
  • $A \to B$: A happened before B. Smallest relation
    such that
  • If A and B in same process and A occurs first,
    $A \to B$
  • If A is sending a message and B is receipt of
    that message, $A \to B$
  • If $A \to B$ and $B \to C$, then $A \to C$
  • Clock C(x) is time x occurs
  • $C(x) = C_i(x)$ where x running on node i.
  • Clocks correct if $\forall a, b:\ a \to b \Rightarrow C(a) < C(b)$

14
Lamport Clock Implementation
  • Node i increments $C_i$ between any two successive
    events
  • If event a is sending of a message m from i to j,
  • m contains timestamp $T_m = C_i(a)$
  • Upon receiving m, set $C_j \ge$ current $C_j$ and $> T_m$
  • Can now define total ordering. $a \Rightarrow b$ iff
  • $C_i(a) < C_j(b)$, or
  • $C_i(a) = C_j(b)$ and $P_i < P_j$
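
The clock rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `LamportClock`, its method names, and the helper `total_order_key` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock for one node; node_id breaks ties in the total order."""
    node_id: int
    time: int = 0

    def tick(self) -> int:
        # Increment between any two successive local events.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is an event; the message carries the timestamp Tm = Ci(a).
        return self.tick()

    def receive(self, tm: int) -> int:
        # Move to a value no smaller than the current clock and greater than Tm.
        self.time = max(self.time, tm) + 1
        return self.time


def total_order_key(timestamp: int, node_id: int) -> tuple:
    # Compare timestamps first, then node ids to break ties.
    return (timestamp, node_id)


p1, p2 = LamportClock(node_id=1), LamportClock(node_id=2)
a = p1.tick()        # local event on node 1
tm = p1.send()       # node 1 sends message m
b = p2.receive(tm)   # node 2 receives m, so its clock jumps past Tm
```

Sorting events by `total_order_key` gives the total ordering: ties in the timestamp are resolved by node id.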

15
What if we want wall clock time?
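  • $C_i$ must run at correct rate
  • $\exists \kappa \ll 1$ such that $|dC_i(t)/dt - 1| < \kappa$
  • Synchronized
  • $\exists$ small $\epsilon$ such that $\forall i, j:\ |C_i(t) - C_j(t)| < \epsilon$
  • Assume transmission time between $\mu$ and $\mu + \xi$
  • Algorithm: Upon receiving message m, set $C_j(t) = \max(C_j(t), T_m + \mu)$
  • Theorem: Assume every $\tau$ seconds a message with
    unpredictable delay $\xi$ is sent over every arc.
    Then $\forall t \ge t_0 + \tau d$, $\epsilon \approx d(2\kappa\tau + \xi)$
    (d is the diameter of the network graph)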

16
Clock Synchronization: Limits
  • Best possible: Delay uncertainty
  • Actually $\epsilon(1 - 1/n)$
  • Synchronization with Faults
  • Faulty clock
  • Communication Failure
  • Malicious processor
  • Worst case: Can only synchronize if < 1/3 of
    processors are faulty
  • Better if clocks can be authenticated

17
Process Synchronization
  • Problem: Shared Resources
  • Model as sequential or parallel process
  • Assumes global state!
  • Alternative: Mutual Exclusion when Needed
  • Coordinator approach
  • Token Passing
  • Timestamp

18
Mutual Exclusion
  • Requirements
  • Does it guarantee mutual exclusion?
  • Does it prevent starvation?
  • Is it fair?
  • Does it scale?
  • Does it handle failures?

19
Mutual Exclusion: Colored Ticket Algorithm
  • Goals
  • Decentralized
  • Fair
  • Fault tolerant
  • Space Efficient
  • Idea: Numbered Tickets
  • Next number gets resource
  • Problem: Unbounded Space
  • Solution: Reissue blocks

20
Multi-Resource Mutual Exclusion
  • New Problem: Deadlock
  • Processes using all resources
  • Each needs additional resource to proceed
  • Dining Philosophers Problem
  • Coordinated vs. truly distributed solutions
  • Problems with deterministic solutions
  • Probabilistic solution: Lehmann and Rabin
  • Starvation / fairness properties

21
Distributed Transactions
  • ACID properties
  • Issues
  • Commit Protocols
  • Fault Tolerance
  • Why is this enough?
  • Failure Models and Limitations
  • Mechanisms
  • Two-phase commit
  • Three-phase commit

22
Two-Phase Commit (Lamport 76, Gray 79)
  • Central coordinator initiates protocol
  • Phase 1
  • Coordinator asks if participants can commit
  • Participants respond yes/no
  • Phase 2
  • If all votes yes, coordinator sends Commit
  • Participants respond when done
  • Blocks on failure
  • Participants must replace coordinator
  • If participant and coordinator fail, wait for
    recovery
  • While blocked, transaction must remain Isolated
  • Prevents other transactions from completing
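
The two phases above can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: participants are modeled as hypothetical callables that return their vote, an unreachable participant is counted as a "no" vote, and the blocking behavior on coordinator failure is not modeled.

```python
from typing import Callable, Dict

Vote = Callable[[], bool]  # hypothetical participant hook: True means "yes"


def two_phase_commit(participants: Dict[str, Vote]) -> str:
    """Return the coordinator's decision, "commit" or "abort"."""
    # Phase 1: ask every participant whether it can commit.
    votes = {}
    for name, can_commit in participants.items():
        try:
            votes[name] = can_commit()
        except Exception:
            # Assumption of this sketch: no answer counts as a "no" vote.
            votes[name] = False
    # Phase 2: send Commit only if all votes were yes, otherwise abort.
    return "commit" if all(votes.values()) else "abort"


decision = two_phase_commit({"site_a": lambda: True, "site_b": lambda: False})
```

A single "no" vote is enough to abort, which is why `decision` is "abort" in the usage line.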

23
Transaction Model
  • Transaction Model
  • Global Transaction State
  • Reachable State Graph
  • Local states potentially concurrent if a
    reachable global state contains both local states
  • Concurrency set C(s) is all states potentially
    concurrent with s
  • Sender set S(s) = {local states t : t sends m and
    s can receive m}
  • Failure Model
  • Site failure assumed when expected message not
    received in time
  • Independent Recovery

24
Problems with 2-PC
  • Blocking on failure
  • 3-PC as solution
  • Theorems on recovery limits
  • Independent recovery: No two-site failure
  • Non-independent recovery
  • Anything short of total failure okay
  • Recovery protocol for total failure

25
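3PC assuming timeout on receipt of message
[Figure: state diagrams for coordinator and participant; transitions are labeled message received / message sent]
Coordinator (states q1, w1, p1)
  • q1: xact request / start xact
  • w1: no / abort
  • w1: yes / pre-commit
  • p1: ack / commit
Participant (states q2, w2, p2)
  • q2: start xact / no
  • q2: start xact / yes
  • w2: abort / -
  • w2: pre-commit / ack
  • p2: commit / -
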
26
Termination Protocol
  • If participant times out in w2 or p2
  • Elect new Coordinator
  • If coordinator alive, would have
    committed/aborted
  • New coordinator requests state of all processes.
    Termination rules
  • If any aborted, broadcast abort
  • If any committed, broadcast commit
  • If all w2, broadcast abort
  • If any p2, send pre-commit and enter state p1
  • Complete failure protocol
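
The termination rules above can be written as a small decision function. This is a sketch only: the state labels "aborted" and "committed" and the fallback for states the rules do not mention are assumptions of the example, and the rules are applied in the order listed.

```python
from typing import List


def terminate(states: List[str]) -> str:
    """Action taken by a newly elected coordinator, given participant states.

    States use the names "q2", "w2", "p2", plus "aborted" and "committed".
    """
    if "aborted" in states:
        return "broadcast abort"
    if "committed" in states:
        return "broadcast commit"
    if all(s == "w2" for s in states):
        return "broadcast abort"
    if "p2" in states:
        return "send pre-commit, enter p1"
    # Not covered by the listed rules (for example a participant still in
    # q2); aborting here is this sketch's assumption.
    return "broadcast abort"


action = terminate(["w2", "p2", "w2"])
```

With one participant already in p2, the new coordinator moves everyone toward commit rather than aborting.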

27
Data Replication
  • Fault Tolerance
  • Hot backup
  • Catastrophic failure
  • Performance
  • Parallelism
  • Decreased reliance on network
  • Correctness criterion: Replication invisible
  • One-copy serializability (1SR)

28
Data Replication: How?
  • Goal: Ensure one-copy serializability
  • Write-all solution: All copies identical
  • Write goes to every site
  • Read from any site
  • Standard single-copy concurrency control
  • Guarantees 1SR
  • Single-copy concurrency control gives
    serializable execution
  • Equivalent to serial execution where all writes
    happen in one transaction

29
Write All Approach
[Figure: Writer and Reader timelines over the replicated copies; the copies hold the values 5 and 3 at different points]

30
Problem: Site Failure
  • Failure causes write to block
  • Must maintain locks
  • Clogs up entire system
  • Is this fault tolerance?
  • What about write all available?
  • T0: w0[xA] w0[xB] w0[yC] c0
  • B fails
  • T1: r1[yC] w1[xA] c1
  • B recovers
  • T2: r2[xB] w2[yC] c2
  • What is the serial equivalent order?
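
One way to answer the question is to brute-force it: list what each transaction reads from in the one-copy view and try every serial order. The sketch below does that for the three transactions above; the dictionary layout and function names are assumptions of the example, and it compares only the reads-from relation, which is already enough to rule out every order here.

```python
from itertools import permutations

# One-copy view of the history: what each transaction writes, and which
# transaction each read actually took its value from.
writes = {"T0": {"x", "y"}, "T1": {"x"}, "T2": {"y"}}
reads_from = {
    "T1": {"y": "T0"},  # T1 reads copy yC, written by T0
    "T2": {"x": "T0"},  # T2 reads copy xB, which missed T1's write to xA
}


def serial_reads_from(order):
    """Reads-from relation of a serial one-copy execution in this order."""
    last_writer, result = {}, {}
    for t in order:
        result[t] = {item: last_writer.get(item) for item in reads_from.get(t, {})}
        for item in writes[t]:
            last_writer[item] = t
    return result


def equivalent_serial_orders():
    wanted = {t: reads_from.get(t, {}) for t in writes}
    return [o for o in permutations(writes) if serial_reads_from(o) == wanted]


orders = equivalent_serial_orders()
```

Here `orders` comes back empty: T0 T1 T2 would make T2 read x from T1, and T0 T2 T1 would make T1 read y from T2, so no serial one-copy order matches the execution.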

31
Write All Available Fails: Even if no recovery!
32
Solutions
  • Validate availability on commit
  • Check if any failed writes now available
  • Check that all sites read or written still
    available
  • Enforces serializability for site failures
  • Doesn't work with communication failures!

33
Formalisms for Relaxed consistency
  • Goal: Relaxed consistency constraints
  • Meet application needs
  • Outperform true transparent replication
  • How do we ensure constraints meet needs?
  • Formalisms to describe application needs
  • Methods to prove constraints adequate

34
Quasi-Copies (Alonso, Barbará, Garcia-Molina 90)
  • Data Caching
  • Each site keeps copy of data likely to be used
    locally
  • Propagation cost of writes high
  • User-Defined Cache
  • Controlled Divergence
  • Weak consistency constraints
  • Bounds on the differences between copies
  • User defines constraints

35
Assumptions
  • Read-only copies
  • Updates sent to master copy
  • E.g., ORACLE Materialized View
  • User Specified Coherency
  • Strict limits
  • Hints
  • Example: Stock Purchase
  • Place order based on delayed price
  • Limit order to ensure price paid okay

36
Selection Conditions
  • Identification clause
  • Select/Project Query
  • Modifier Clause
  • Add / drop from cache
  • Compulsory or advisory cache
  • Static / Dynamic: As new objects meet the
    identification clause, are they cached?
  • Triggering delay on dynamic

37
Coherency Conditions
  • Default (always enforced): Value was true once
  • Delay W(x,a): Max time lag
  • Version V(x): Number of updates
  • Periodic P(x): Time for refresh
  • Arithmetic A(x): Bounded Difference
  • Combine conditions with logical operators
  • Multi-object conditions
  • Consistency conditions on a group
  • Order of application in a group
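
As a rough illustration of the delay, version, and arithmetic conditions, the sketch below checks a cached copy against the master's update history. The data layout, function names, and bound parameters are assumptions made for the example, not the paper's notation.

```python
from typing import List, Tuple

# Updates applied at the master copy, oldest first: (update time, value).
History = List[Tuple[float, float]]


def version_ok(history: History, cached_version: int, max_versions: int) -> bool:
    # Version condition: the cache is at most max_versions updates behind.
    return (len(history) - 1) - cached_version <= max_versions


def delay_ok(history: History, cached_version: int, now: float, max_lag: float) -> bool:
    # Delay condition: the cached value was current at most max_lag ago.
    if cached_version == len(history) - 1:
        return True
    went_stale_at = history[cached_version + 1][0]
    return now - went_stale_at <= max_lag


def arithmetic_ok(history: History, cached_version: int, max_diff: float) -> bool:
    # Arithmetic condition: cached and master values differ by at most max_diff.
    return abs(history[-1][1] - history[cached_version][1]) <= max_diff


prices = [(0.0, 100.0), (10.0, 101.0), (12.0, 104.0)]
# Conditions combine with ordinary logical operators.
usable = version_ok(prices, 0, 2) and arithmetic_ok(prices, 0, 5.0)
```

In the usage lines the cache still holds the first price; it is two updates and 4.0 behind the master, which satisfies both bounds.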

38
CS 603 Review
  • April 26, 2002

39
Remote Operation Mechanisms
  • Client-Server Model
  • Remote Procedure Call
  • Problem: Remote Site must already know what we
    want to do!
  • Process consists of
  • Code
  • Resources (files, devices, etc.)
  • Execution (data, stack, registers, etc.)
  • Fork copies everything
  • Is this needed?
  • Solution: Copy part of the process

40
So where are we?
  • Models for Remote Processing
  • Server: Request documented service
  • RPC: Request execution of existing procedure
  • What if operation we want isn't offered remotely?
  • Solution: Agents / Code Migration

41
Types of Code Migration
[Figure from Andrew Tanenbaum, Distributed Operating
Systems, 1995.]
42
Types of Code Migration
  • Weak mobility: Copy only code
  • Program starts from initial state
  • Example: Java applets
  • Strong mobility: Copy code and execution
  • Resume execution where it stopped
  • But doesn't necessarily have same resources (less
    than fork)
  • Example: D'Agents (later), cluster computing
    (Condor, LSF)

43
Types of Code Migration
  • Sender Initiated
  • Receiver Initiated
  • Examples
  • Java Applets
  • Receiver Initiated
  • Cluster computing
  • Sender Initiated?
  • Central manager initiated?

44
Types of Code Migration
  • Where executed?
  • In target process
  • In new process
  • Strong Mobility: Move vs. Copy
  • Migrate process: Ceases at originating site
  • Clone process: Two copies in parallel

45
Resource Binding
                              Resource to Machine Binding
Process to Resource Binding   Unattached      Fastened          Fixed
Identifier                    Move            Global Reference  Global Reference
Value                         Copy Value      Global Reference  Global Reference
Type                          Rebind Locally  Rebind Locally    Rebind Locally

46
The Hard Part: Resources
  • Migrated process still needs resources
  • Options to Connect to a Resource (Fuggetta et al.,
    1998)
  • Binding by identifier (e.g., URL)
  • Attach to the same resource
  • Binding by value (e.g., standard libraries)
  • Bind to equivalent resource
  • Bind by type (e.g., local printer)
  • Bind to resource with same function

47
The Hard Part: Resources
  • Alternative: Move the Resource
  • Unattached resources (e.g., data files)
  • Relatively easy to move
  • Fastened resource (e.g., database)
  • Expensive to move
  • Fixed resource (e.g., communications endpoint)
  • Can't be moved

48
DCOM: What is it?
  • Start with COM: Component Object Model
  • Language-independent object interface
  • Add interprocess communication

49
DCOM: Distributed COM
  • Looks like COM to the client
  • Built on DCE RPC
  • Extends to support full COM functionality

50
DCOM Architecture
51
Locating Objects: Activation
  • CoCreateInstance(Ex)(<CLSID>)
  • Interface pointer to uninitialized instance
  • Same as COM
  • CoGetInstanceFromFile, FromStorage
  • Create new instance
  • CoGetClassObject(<CLSID>)
  • Factory object that creates objects of <CLSID>
  • CoGetClassObjectFromURL
  • Downloads necessary code from URL and
    instantiates
  • Can take server name as parameter
  • Or default to server specified in DCOM
    configuration on client machine
  • HKEY_CLASSES_ROOT\APPID\<appid-guid>
  • "RemoteServerName"="<DNS name>"
  • Also store information in ActiveDirectory

52
DCOM vs. CORBA
  • CORBA
  • Single interface name
  • Multiple inheritance
  • Dynamic Invocation Interface
  • C++-style Exception Handling
  • Explicit and Implicit reference counts
  • Implemented by ORB with replaceable services
  • DCOM
  • Distinction between Class and Instance Identifier
  • Implement multiple interfaces
  • Type libraries for on-demand marshaling
  • 32 Bit Error Code
  • Explicit reference count only
  • Implemented by many independent services

53
What is .NET?
  • Language for distributed computation
  • C#, VB.NET, JScript
  • Protocols
  • SOAP, HTTP
  • Run-time environment
  • Common Language Runtime (CLR)
  • ActiveDirectory
  • Web Servers (ASP.NET)

54
COM/DCOM → .NET
  • DCOM
  • IDL
  • Name, Monikers
  • Registry / ActiveDirectory
  • C++, Visual Basic
  • DCE RPC
  • DCOM Network protocol (based on DCE standards)
  • .NET
  • Web Services Description Language (WSDL)
  • DISCO (URI grammar)
  • Universal Description Discovery and Integration
    (UDDI)
  • C#, VB.NET
  • SOAP
  • HTTP (presumed ubiquitous), SMTP (!?)

55
How .NET works
  • Query UDDI directory to get service location
  • Query service to get WSDL (interface
    specification)
  • Build call (XML) based on WSDL spec.
  • Make call using SOAP
  • Parse XML results based on WSDL spec.

56
Jini: Java Middleware
  • Tools to construct federation
  • Multiple devices, each with Java Virtual Machine
  • Multiple services
  • Uses (doesn't replace) Java RMI
  • Adds infrastructure to support distribution
  • Registration
  • Lookup
  • Security

57
Service
  • Basic unit of JINI system
  • Members provide services
  • Federate to share access to services
  • Services combined to accomplish tasks
  • Communicate using service protocol
  • Initial set defined
  • Add more on the fly

58
Infrastructure: Key Components
  • RMI
  • Basic communication model
  • Distributed Security System
  • Integrated with RMI
  • Extends JVM security model
  • Discovery/join protocol
  • How to register and advertise services
  • Lookup services
  • Returns object implementing service (really a
    local proxy)

59
Programming Model
  • Lookup
  • Leasing
  • Extends Java reference with notion of time
  • Events
  • Extends JavaBeans event model
  • Adds third-party transfer, delivery and
    timeliness guarantees, possibility of delay
  • Transaction Interfaces

60
Jini Component Categories
  • Infrastructure: Base features
  • Programming Model: How you use them
  • Services: What you build
  • Java / Jini Comparison

61
Failure Models
  • Failure: System doesn't give desired behavior
  • Component-level failure (can compensate)
  • System-level failure (incorrect result)
  • Fault: Cause of failure (component-level)
  • Transient: Not repeatable
  • Intermittent: Repeats, but (apparently)
    independent of system operations
  • Permanent: Exists until component repaired
  • Failure Model: How the system behaves when it
    doesn't behave properly

62
Failure Model (Flaviu Cristian, 1991)
  • Dependency
  • Proper operation of Database depends on proper
    operation of processor, disk
  • Failure Classification
  • Type of response to failure
  • Failure semantics
  • State of system after given class of failure
  • Failure masking
  • High-level operation succeeds even if it depends
    on failed services

63
Failure Classification
  • Correct
  • In response to inputs, behaves in a manner
    consistent with the service specification
  • Omission Failure
  • Doesn't respond to input
  • Crash: After first omission failure, subsequent
    requests result in omission failure
  • Timing failure (early, late)
  • Correct response, but outside required time
    window
  • Response failure
  • Value: Wrong output for inputs
  • State Transition: Server ends in wrong state

64
Crash Failure types (based on recovery behavior)
  • Amnesia
  • Server recovers to predefined state independent
    of operations before crash
  • Partial amnesia
  • Some part of state is as before crash, rest to
    predefined state
  • Pause
  • Recovers to state before omission failure
  • Halting
  • Never restarts

65
Failure Semantics
[Figure: u and r connected by link l, exchanging request sr and reply f(sr), then request sr' and reply f(sr')]
  • Max delay on link: d; max service time: p
  • Should get response in 2d + p
  • Assume omission failure only
  • If no response in 2d + p, resend request
  • What if performance failure possible?
  • Must distinguish between response to sr and sr'
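
A minimal sketch of the resend rule, assuming each request carries an id that the server echoes back so a late reply to an earlier request can be told apart from the reply to the current one. The `send` callable, the id scheme, and the attempt limit are hypothetical; time is only accounted for, not actually waited.

```python
from typing import Callable, Optional, Tuple

Reply = Tuple[int, str]  # (request id echoed by the server, result)


def call_with_retry(
    send: Callable[[int], Optional[Reply]],
    d: float,
    p: float,
    max_attempts: int = 3,
) -> Tuple[str, float]:
    """Resend after each 2d + p timeout; return (result, time spent waiting)."""
    timeout = 2 * d + p
    waited = 0.0
    for request_id in range(max_attempts):
        reply = send(request_id)  # None models an omission failure
        if reply is not None and reply[0] == request_id:
            return reply[1], waited
        # No reply within the timeout, or a reply tagged with an earlier
        # request id (a performance failure); ignore it and resend.
        waited += timeout
    raise TimeoutError(f"no reply after {max_attempts} attempts")


# Scripted server: one lost reply, one late reply to request 0, then success.
script = iter([None, (0, "stale"), (2, "ok")])
result, waited = call_with_retry(lambda request_id: next(script), d=1.0, p=0.5)
```

With d = 1.0 and p = 0.5 the timeout is 2.5, so two unanswered attempts account for 5.0 time units before the matching reply is accepted.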

66
Failure Semantics
  • Specification for service must include
  • Failure-free (normal) semantics
  • Failure semantics (likely failure behaviors)
  • Multiple semantics
  • Combine to give (weaker) semantics
  • Arbitrary failure semantics: Weakest possible
  • Choice of failure semantics
  • Is class of failure likely?
  • Probability of type of failure
  • What is the cost of failure
  • Catastrophic?

67
Failure Masking
  • Hierarchical failure masking
  • Dependency: Higher level gets (at best) failure
    semantics of lower level
  • Can compensate for lower level failure to improve
    this
  • Group Failure Masking
  • Redundant servers
  • Allows failure semantics of group to be higher
    than individuals
  • k-fault tolerant
  • Group can mask k concurrent group member failures
    from client

68
Fault Tolerance
  • A distributed program A is said to tolerate
    faults from a fault class F for an invariant P
    iff there exists a predicate T for which
  • At any configuration where P holds, T also holds
    (i.e., $P \Rightarrow T$)
  • Starting from any state where T holds, if any
    actions of A or F are executed, the resulting
    state will always be one in which T holds (i.e.,
    T is closed in A and T is closed in F)
  • Starting from any state where T holds, every
    computation that executes actions from A alone
    eventually reaches a state where P holds
  • If a program A tolerates faults from a fault
    class F for invariant P, we say that A is
    F-tolerant for P.

69
Forms of fault tolerance
           Live         Not live
Safe       Masking      Fail safe
Not safe   Nonmasking   none
  • For each entry, determine
  • F: Fault class handled
  • T: Set of states that can be reached

70
Reliable Multicast
  • Classes
  • Sender-initiated: Acknowledge all packets
  • Scales poorly in normal operation
  • Receiver-initiated: Request missing packets
  • Sender doesn't need receiver list
  • Scales poorly on failure (cascading failure?)
  • Tree-based, Ring-based protocols

71
Tree-based Protocols
  • Organize multicast group into tree
  • Children acknowledge to parent
  • Parent acknowledges when all children have
    acknowledged
  • Advantages
  • Sender doesn't need to know full group
  • Solves unbounded memory
  • Scalable
  • Disadvantages
  • Rate paced by slowest acknowledgement path in tree
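
The acknowledgement rule can be sketched as a recursive check over the tree. The tree layout, node names, and the function name `acked_subtree` are assumptions of this example; a real protocol would do this incrementally with messages rather than by inspecting global state.

```python
from typing import Dict, List, Set


def acked_subtree(node: str, children: Dict[str, List[str]], received: Set[str]) -> bool:
    """True if node may acknowledge to its parent.

    A node acknowledges once it has the packet itself and every one of
    its children has acknowledged.
    """
    return node in received and all(
        acked_subtree(child, children, received) for child in children.get(node, [])
    )


tree = {"sender": ["a", "b"], "a": ["c", "d"]}
received = {"sender", "a", "b", "c"}           # d has not received the packet
before = acked_subtree("sender", tree, received)
after = acked_subtree("sender", tree, received | {"d"})
```

The sender only ever hears from its direct children, and the whole tree is held back until the slowest path (here through d) has acknowledged.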

72
Ring-based protocols
  • Idea: Token site responsible for retransmit
  • Sender multicasts
  • Token site multicasts ACK
  • Receivers request retransmit from token site if
    ACK doesn't match what they have
  • Can only accept token if you've received
    everything acknowledged
  • Keep packets since last time you had token
  • Advantages
  • Space
  • Low load on sender

73
Disaster Recovery
  • Problem: Complete failure at single site
  • Must have multiple sites
  • Thus a distributed problem
  • Two examples
  • Distributed Storage: Palladio
  • Think wide-area RAID
  • Distributed Transactions: Epoch algorithm

74
Epoch Algorithm (Garcia-Molina, Polyzois, and
Hagmann 1990)
  • 1-Safe backup
  • No performance penalty
  • Multiple transaction streams
  • Use distribution to improve performance
  • Multiple Logs
  • Avoid single bottleneck

75
Algorithm Overview
  • Idea: Transactions that can be committed
    together grouped into epochs
  • Primaries write marker in log
  • Must agree when safe to write marker
  • Keep track of current epoch number
  • Master broadcasts when to end epoch
  • Backups commit epoch when all backups have
    received marker
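
A simplified sketch of the backup-side rule: nothing from an epoch is committed until every backup log holds that epoch's marker. The log representation (transaction ids as strings, markers as integers) and the function name are assumptions of the example, and multi-site transactions are ignored here.

```python
from typing import Dict, List, Union

LogEntry = Union[str, int]  # transaction id (str) or epoch marker (int)


def committable(logs: Dict[str, List[LogEntry]], epoch: int) -> List[str]:
    """Transactions the backups may commit once `epoch` has ended everywhere."""
    # Wait until every backup has received the marker for this epoch.
    if not all(epoch in log for log in logs.values()):
        return []
    done: List[str] = []
    for log in logs.values():
        # Entries logged before the marker belong to this epoch or earlier.
        done.extend(e for e in log[: log.index(epoch)] if isinstance(e, str))
    return sorted(done)


logs = {"backup1": ["T1", "T2", 1, "T4"], "backup2": ["T3"]}
early = committable(logs, 1)   # backup2 has no marker yet
logs["backup2"].append(1)
late = committable(logs, 1)
```

T4 sits after the marker in backup1's log, so it stays uncommitted until the next epoch ends.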

76
Correctness Criteria
  • Atomicity: If any writes of a transaction appear
    at backup, all must appear
  • If $\exists W(T_x, d)$ at backup then $\forall W(T_x, d')$, $W(T_x, d')$
    exists at backup
  • Consistency: If $T_i \to T_j$ at primary, then
  • Local: $T_j$ installed at backup $\Rightarrow$ $T_i$ installed at
    backup
  • Mutual: If $W(T_i, d)$ and $W(T_j, d)$, then $W(T_i, d) \to W(T_j, d)$
  • Minimum Divergence: If $T_j$ is at the backup and
    does not depend on a missing transaction, then it
    should be installed at the backup

77
Single-Mark Algorithm
  • Problem: Is it locally safe to mark when
    broadcast received?
  • Might be in the middle of a transaction
  • Solution: Share epoch at commit
  • Prepare to commit includes local epoch number
  • If received number greater than local, end epoch
  • At Backup: When all sites have epoch marker n, commit
    transactions where
  • C(Ti) precedes marker n
  • P(Ti) precedes marker n, local site is not coordinator, and
    coordinator has C(Ti) preceding marker n

78
Test Basics
  • Mechanics: Open book/notes
  • No electronic aids
  • Two questions
  • Each multi-part
  • Will include scoring suggestions
  • Underlying question: Do you understand the
    material?
  • No need to regurgitate best-in-literature
    answer
  • Reasonable self-designed solution fine
  • Key: Do you really understand your answer?
  • Can you build CORRECT distributed systems?