Title: Distributed Systems
1. Chapter 11: Advanced Distributed Systems
2. P2P Computing
- Definition 1: a class of applications that take advantage of resources (e.g., storage, cycles, content) available at the edge of the Internet.
  - Edge machines are often turned off, lack permanent IP addresses, etc.
- Definition 2: a class of decentralized, self-organizing distributed systems in which all or most communication is symmetric. (IPTPS '02)
- Lots of other definitions fit in between.
3. Applications: Computing
- Examples: SETI@home, United Devices, Genome@home, and many others
- The approach is suitable for a particular class of problems:
  - Massive parallelism
  - Low bandwidth-to-computation ratio
  - Error tolerance; independence from solving a particular task
- Problems:
  - Centralized
  - How to extend the model to problems that are not massively parallel?
  - Ability to operate in an environment with limited trust and dynamic resources
4. Applications: File Sharing
- The killer application to date
- Too many to list them all: Napster, FastTrack (KaZaA, iMesh), Gnutella (LimeWire, BearShare), Overnet, BitTorrent, etc.
- Decentralized control
- Building a (relatively) reliable data-delivery service from a large, heterogeneous set of unreliable components
(figure: FastTrack (KaZaA), 2003)
5. Applications: Content Streaming
- Streaming: the user plays the data as it arrives
- Examples: PPLive, SplitStream, etc.
6. Many Other P2P Applications
- Backup storage (HiveNet, OceanStore)
- Collaborative environments (Groove Networks)
- Web serving communities (uServ)
- Instant messaging (Yahoo, AOL)
- Anonymous email
- Censorship-resistant publishing systems (Eternity, Freenet)
- Spam filtering
7. Client/Server vs. P2P
8. Client/Server vs. P2P
9. Overlay Network
10. Overlay Network
- An abstract layer built on top of the physical network
- Neighbors in the overlay can be several hops apart in the physical network
- Why do we need overlays?
- Flexibility in:
  - Choosing neighbors
  - Forming and customizing the topology to fit application needs (e.g., short delay, reliability, high bandwidth)
  - Designing communication protocols among nodes
- A way around limitations in legacy networks
11. Abstract P2P Overlay Architecture
12. Network Communications Layer
- Describes the network characteristics of desktop machines connected over the Internet, or of small wireless or sensor-based devices connected in an ad-hoc manner.
13. Overlay Nodes Management Layer
- Covers the management of peers, including discovery of peers and routing algorithms for optimization.
14. Features Management Layer
- Deals with the security, reliability, fault-resiliency, and aggregated-resource-availability aspects of maintaining the robustness of P2P systems.
15. Services-Specific Layer
- Supports the underlying P2P infrastructure and the application-specific components through scheduling of parallel and computation-intensive tasks, and through content and file management.
16. Application-Level Layer
- Concerned with tools, applications, and services that are implemented with specific functionalities on top of the underlying P2P overlay infrastructure.
17. P2P Systems: Simple Model
18. Peer Software Architecture Model
- P2P substrate (key component)
  - Overlay management
    - Construction
    - Maintenance (peer join/leave/fail and network dynamics)
  - Resource management
    - Allocation (storage)
    - Discovery (routing and lookup)
- Substrates can be classified according to the flexibility of placing objects at peers
19. P2P Substrates: Classification
- Structured (tightly controlled; DHT-based)
  - Objects are rigidly assigned to specific peers
  - Looks like a distributed hash table (DHT)
  - Efficient search; guarantee of finding
  - No support for partial-name and keyword queries
  - Maintenance overhead
  - Examples: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet)
- Unstructured (loosely controlled)
  - Objects can be anywhere
  - Supports partial-name and keyword queries
  - Inefficient search; no guarantee of finding
  - Some heuristics exist to enhance performance
  - Examples: Gnutella, KaZaA (super-nodes), GIA
20. Types of P2P Systems
21. Napster (1)
- Sharing of music files
- Lists of files are uploaded to the Napster server
- Queries contain keywords of the required file
- The server returns the IP addresses of user machines holding the file
- File transfer is direct
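The central-index interaction can be sketched in a few lines. Class and method names below are hypothetical illustrations, not Napster's actual protocol:

```python
# Hypothetical sketch of a Napster-style central index.
# Peers register their file lists; queries return peer addresses,
# and the actual file transfer then happens directly between peers.

class CentralIndex:
    """Server-side index: filename -> set of peer addresses."""

    def __init__(self):
        self.index = {}

    def register(self, peer_ip, filenames):
        # Peers upload their file lists when they connect.
        for name in filenames:
            self.index.setdefault(name, set()).add(peer_ip)

    def query(self, keyword):
        # Return peers holding files whose name contains the keyword.
        return {name: peers for name, peers in self.index.items()
                if keyword in name}
```

The server never touches file contents; it only resolves locations, which is exactly why it becomes a scalability bottleneck and a single point of failure.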
22. Napster (2)
- Centralized model
- The Napster server ensures correct results
- The server is only used for finding the location of files
- Scalability bottleneck
- Single point of failure
- Denial-of-service attacks possible
- Lawsuits
23. Gnutella (1)
- Sharing of any type of file
- Decentralized search
- Queries are sent to the neighbor nodes
- Neighbors ask their own neighbors, and so on
- A Time-To-Live (TTL) field on queries limits propagation
- File transfer is direct
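TTL-limited flooding can be sketched as a breadth-first traversal. The topology and names below are made up for illustration; real Gnutella messages also carry hop counts and descriptor IDs:

```python
# Sketch of Gnutella-style query flooding with a TTL.
from collections import deque

def flood_query(graph, origin, holders, ttl):
    """Flood a query from `origin`; return the nodes holding the file
    that the query reached before its TTL expired."""
    seen = {origin}
    frontier = deque([(origin, ttl)])
    found = set()
    while frontier:
        node, t = frontier.popleft()
        if node in holders:
            found.add(node)      # in Gnutella the reply is back-propagated
        if t == 0:
            continue             # TTL exhausted: do not forward further
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, t - 1))
    return found
```

Note how the TTL trades reach for traffic: a file three hops away is invisible to a query sent with TTL 2.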
24-30. Gnutella Network
(figure: animation over a seven-node overlay, nodes 1-7; node 2 searches for file A)
- Steps:
  - Node 2 initiates a search for A (it does not know where A is), using flooding
  - It sends the query message to all its neighbors
  - Neighbors forward the message
  - Nodes that have A initiate a reply message
  - The query reply message is back-propagated along the query path
  - Node 2 gets the replies
  - Node 2 downloads A directly from a node that has it
31. Gnutella (2)
- Decentralized model
- No single point of failure
- Less susceptible to denial of service
- Poor scalability (flooding)
- Cannot ensure correct results
32. KaZaA
- A hybrid of Napster and Gnutella
- Super-peers act as local search hubs
  - Each super-peer is like a constrained Napster server
  - Super-peers are chosen automatically based on capacity and availability
- Lists of files are uploaded to a super-peer
- Super-peers periodically exchange file lists
- Queries are sent to super-peers
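The two-tier idea can be sketched as follows. Class and method names are invented for the example, not KaZaA's actual (proprietary) protocol:

```python
# Illustrative sketch of a super-peer search hub: ordinary peers
# register locally; super-peers periodically swap file lists so a
# query at one hub can also return peers known to other hubs.

class SuperPeer:
    def __init__(self):
        self.local = {}   # filename -> peers registered at this hub
        self.remote = {}  # filename -> peers learned from other hubs

    def register(self, peer, filenames):
        for name in filenames:
            self.local.setdefault(name, set()).add(peer)

    def exchange(self, other):
        # Periodic file-list exchange between super-peers.
        for name, peers in other.local.items():
            self.remote.setdefault(name, set()).update(peers)

    def query(self, name):
        return self.local.get(name, set()) | self.remote.get(name, set())
```

This limits flooding to the small super-peer tier while keeping the index itself decentralized.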
33. Freenet
- Ensures anonymity
- Decentralized search
- Queries are sent to the neighbor nodes
- Neighbors ask their own neighbors, and so on
- The query process is sequential (depth-first rather than flooding)
- Learning ability: nodes remember where earlier requests were satisfied
34. Structured P2P
- Second-generation P2P (overlay) networks
- Self-organizing
- Load-balanced
- Fault-tolerant
- Guarantees on the number of hops to answer a query
- Based on a distributed hash table interface
35. Distributed Hash Tables (DHT)
- A distributed version of the hash table data structure
- Stores (key, value) pairs
  - The key is like a filename
  - The value can be the file contents
- Goal: efficiently insert/lookup/delete (key, value) pairs
- Each peer stores a subset of the (key, value) pairs in the system
- Core operation: find the node responsible for a key
  - Map the key to a node
  - Efficiently route the insert/lookup/delete request to this node
36. DHT Generic Interface
- Node id: m-bit identifier (similar to an IP address)
- Key: sequence of bytes
- Value: sequence of bytes
- put(key, value)
  - Stores (key, value) at the node responsible for the key
- value = get(key)
  - Retrieves the value associated with the key (from the appropriate node)
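The interface can be demonstrated with a toy, in-process DHT. The hashing scheme below (SHA-1 truncated to a 16-bit space, successor-based responsibility) is a simplification for illustration, not any particular DHT's design:

```python
# Toy in-process DHT illustrating the generic put/get interface.
import hashlib

M = 2 ** 16  # size of the identifier space (m = 16 bits)

def key_id(key):
    # Map an arbitrary byte-string key into the m-bit identifier space.
    return int(hashlib.sha1(key).hexdigest(), 16) % M

class ToyDHT:
    def __init__(self, node_ids):
        # One local store per node, ordered by node id.
        self.nodes = {n: {} for n in sorted(node_ids)}

    def responsible(self, key):
        # First node id >= the key's id, wrapping around the ring.
        kid = key_id(key)
        for n in self.nodes:
            if n >= kid:
                return n
        return next(iter(self.nodes))  # wrap to the smallest id

    def put(self, key, value):
        self.nodes[self.responsible(key)][key] = value

    def get(self, key):
        return self.nodes[self.responsible(key)].get(key)
```

A real DHT differs in one essential way: `responsible` is not computed from a global node list but discovered by routing the request through the overlay.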
37. DHT Applications
- File sharing
- Databases
- Service discovery
- Chat service
- Publish/subscribe networks
38. DHT Desirable Properties
- Keys are mapped evenly to all nodes in the network
- Each node maintains information about only a few other nodes
- Messages are routed to nodes efficiently
- Node insertion/deletion affects only a few nodes
39. Chord API
- Node id: m-bit identifier (similar to an IP address)
- Key: m-bit identifier (hash of a sequence of bytes)
- Value: sequence of bytes
- API:
  - insert(key, value)
  - lookup(key)
  - update(key, newval)
  - join(n)
  - leave()
40. Consistent Hashing
41. Chord Operation (1)
- Nodes form a circle based on their node identifiers
- Each node is responsible for storing a portion of the keys
- The hash function ensures an even distribution of keys and nodes on the circle
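The property Chord relies on can be shown concretely: each key belongs to its successor node on the circle, so when a node joins, only the keys between it and its predecessor move. The identifier values below are chosen by hand for clarity (a 0..99 circle instead of 2^m):

```python
# Consistent hashing: keys map to their successor node on the circle,
# and adding a node remaps only one arc of keys.

def successor(ring, kid):
    """Node responsible for identifier `kid`: the first node id >= kid,
    wrapping around the circle."""
    for n in sorted(ring):
        if n >= kid:
            return n
    return min(ring)

ring = [10, 40, 80]      # node identifiers on a 0..99 circle
keys = [5, 25, 55, 90]   # key identifiers

before = {k: successor(ring, k) for k in keys}
after = {k: successor(ring + [60], k) for k in keys}  # node 60 joins
moved = [k for k in keys if before[k] != after[k]]    # only key 55 moves
```

With an ordinary hash-mod-N scheme, changing N would remap almost every key; here the join disturbs a single arc, which is what makes node churn affordable.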
42. Chord Ring Definition
- Finger table: node k stores pointers to the successors of k+1, k+2, k+4, ..., k+2^(m-1) (mod 2^m)
- Every data item can be located in O(log N) steps, with O(log N) routing entries stored per node
43. Chord Operation (2)
44. Chord Operation (3)
- Lookup: forward the query to the furthest known node that precedes the key
- The query reaches the target node in O(log N) hops
45. Scalable Lookup Scheme
- Finger table for N8
- finger[k] = first node that succeeds (n + 2^(k-1)) mod 2^m
46Lookup Using Finger Table
N1
lookup(54)
N56
N8
N51
N48
N14
N42
N38
N21
N32
47. Scalable Lookup Scheme

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor
      else
        n' = closest_preceding_node(id)
        return n'.find_successor(id)

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if (finger[i] ∈ (n, id))
          return finger[i]
      return n
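The pseudocode above can be simulated end-to-end over a static ring. The sketch below is mine (not Chord's reference implementation): it builds the finger tables for the example ring on the preceding slides (m = 6, ten nodes) and replays lookup(54) starting at N8:

```python
# Chord lookup simulation over a static ring (m = 6, ids 0..63).
M = 6
SPACE = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # must stay sorted

def successor(i):
    """First node whose id is >= i, wrapping around the circle."""
    i %= SPACE
    for n in NODES:
        if n >= i:
            return n
    return NODES[0]

def fingers(n):
    """finger[k] = successor(n + 2**(k-1)) for k = 1..m."""
    return [successor(n + 2 ** (k - 1)) for k in range(1, M + 1)]

def _in_open(x, a, b):        # x in the ring interval (a, b)
    return a < x < b if a < b else (x > a or x < b)

def _in_half_open(x, a, b):   # x in the ring interval (a, b]
    return a < x <= b if a < b else (x > a or x <= b)

def closest_preceding(n, key):
    """Highest finger of n lying strictly between n and key."""
    for f in reversed(fingers(n)):
        if f != n and _in_open(f, n, key):
            return f
    return successor(n + 1)   # fall back to the direct successor

def find_successor(n, key, hops=0):
    succ = successor(n + 1)   # n's immediate successor on the ring
    if _in_half_open(key, n, succ):
        return succ, hops
    return find_successor(closest_preceding(n, key), key, hops + 1)
```

Running `find_successor(8, 54)` forwards the query N8 → N42 → N51 and returns node 56 after two forwarding hops, matching the lookup(54) figure; each hop roughly halves the remaining ring distance, which is where the O(log N) bound comes from.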
48. Chord Properties
- In a system with N nodes and K keys:
  - Each node manages about K/N keys (with high probability)
  - The routing information stored at every node is bounded
- Lookups are resolved in O(log N) hops
- No delivery guarantees
- Poor network locality
49. Network Locality
- Nodes close on the ring can be far apart in the underlying network
50. Grid Computing
- What is a Grid? An integrated, advanced cyber-infrastructure that delivers:
  - Computing capacity
  - Data capacity
  - Communication capacity
- Analogy to the electrical power grid
51. History
- For many years, a few wacky computer scientists have been trying to help other scientists use distributed computing:
  - Interactive simulation (climate modeling)
  - Very large-scale simulation and analysis (galaxy formation, gravity waves, battlefield simulation)
  - Engineering (parameter studies, linked component models)
  - Experimental data analysis (high-energy physics)
  - Image and sensor analysis (astronomy, climate study, ecology)
  - Online instrumentation (microscopes, x-ray devices, etc.)
  - Remote visualization (climate studies, biology)
  - Engineering (large-scale structural testing, chemical engineering)
- In these cases, the scientific problems are big enough that they require people in several organizations to collaborate and share computing resources, data, and instruments.
52. Some Core Problems
- Too hard to keep track of authentication data (ID/password) across institutions
- Too hard to monitor system and application status across institutions
- Too many ways to submit jobs
- Too many ways to store and access files and data
- Too many ways to keep track of data
- Too easy to leave "dangling" resources lying around (robustness)
53. Challenging Applications
- The applications that Grid technology is aimed at are not easy applications!
- The reason these things haven't been done before is that people believed it was too hard to bother trying.
- If you're trying to do these things, you'd better be prepared for it to be challenging.
- Grid technologies are aimed at helping to overcome the challenges:
  - They solve some of the most common problems
  - They encourage standard solutions that make future interoperability easier
  - They were developed as parts of real projects
  - In many cases, they benefit from years of lessons from multiple applications
  - Ever-improving documentation, installation, configuration, and training
54. Earth System Grid
- Goal: address technical obstacles to the sharing and analysis of high-volume data from advanced earth-system models
55. Other Examples of Grids
- TeraGrid: NSF-funded, linking 5 major research sites at 40 Gb/s (www.teragrid.org)
- European Union Data Grid: grid for applications in high-energy physics, environmental science, and bioinformatics (www.eu-datagrid.org)
- Access Grid: collaboration systems using commodity technologies (www.accessgrid.org)
- Network for Earthquake Engineering Simulations Grid: grid for earthquake engineering (www.nees.org)
56. Current Status of the Grid
- Dozens of Grid projects in scientific and technical computing in the academic research community
- Consensus on key concepts and technologies (GGF: Global Grid Forum)
- The open-source Globus Toolkit is a standard for major protocols and services
- Funding agencies are funding many grid projects
- Business interest is emerging rapidly
- Standards are still emerging: grid services, Web Services Resource Framework
- Requires significant user training
57. Using the Grid Now
- A lot of work is needed to make applications grid-ready:
  - Adopt new algorithms for parallel computation
  - Change the user interface
- Applications have to be built on different architectures
- Applications and data need to be moved to different computers
- Security and licensing issues
- Requires a lot of system-administration expertise
- Largely UNIX-based
58. Software Layers
- Web browser or command window (user interface)
- Globus client on the user's workstation (certificates, job submission)
- Globus server on the master node (job manager)
- Queue managers and schedulers on the master node
- Applications running on grid clusters
59. Developing Grid Standards
(figure: increasing functionality and standardization over time)
60. Sand-Glass (Hourglass) Model
- Trying to force homogeneity on users is futile; everyone has their own preferences, sometimes even dogma.
- The Internet provides the model.
61Evolution of the Grid
App-specific Services
Open Grid Services Arch
Increased functionality, standardization
Web services
GGF OGSI, WSRF, (leveraging OASIS, W3C,
IETF) Multiple implementations, including Globus
Toolkit
X.509, LDAP, FTP,
Globus Toolkit
Defacto standards GGF GridFTP, GSI (leveraging
IETF)
Custom solutions
Time
62. Open Grid Services Architecture
- Defines a service-oriented architecture:
  - the key to effective virtualization
- Addresses vital Grid requirements:
  - utility, on-demand, system management, collaborative computing, etc.
- Builds on web service standards, extending them where needed
63. Grid and Web Services Convergence
- The definition of WSRF means that the Grid and web services communities can move forward on a common base.
64. Who Is the Grid For?
- Any Grid (distributed/collaborative) application or system involves several classes of people:
  - End users (e.g., scientists, engineers, customers)
  - Application/product developers
  - System administrators
  - System architects and integrators
- Each user class has unique skills and unique requirements.
- Which user class's needs are met varies from tool to tool (even within the Globus Toolkit).
65. What End Users Need
- Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a web browser!)
66. General Architecture
67. Grid Community Software
68. Social Policies/Procedures
- How will people use the system?
- Who will set up access control?
- Who creates the data?
- How will computational resources be added to the system?
- How will simulation capabilities be used?
- What will accounting data be used for?
- Not all problems are solved by technology!
- Understanding how the system will be used is important for narrowing the requirements.
69. What Is the Globus Toolkit?
- The Globus Toolkit is a collection of solutions to problems that frequently come up when trying to build collaborative distributed applications.
- Heterogeneity:
  - To date (v1.0 - v4.0), the Toolkit has focused on simplifying heterogeneity for application developers.
  - More vertical solutions are planned for future versions.
- Standards:
  - The goal has been to capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF).
  - The Toolkit also includes reference implementations of new/proposed standards from these organizations.
70. What Does the Globus Toolkit Cover?
71. Globus Toolkit Components
72. Comparisons of P2P and Grid