Title: Application Layer Overlays
1 Application Layer Overlays
- IS250
- Spring 2010
- John Chuang
2 Application Layer Overlay
- The Internet infrastructure, based on TCP/IP, provides
  - Global reachability
  - Reliable end-to-end transport
- Highly successful in supporting one-to-one (unicast) communication
- But there are some limitations
  - Difficult to deploy new network services (e.g., IP multicast, IP anycast, QoS, IPv6)
  - Lack of support for one-to-many (multicast) or even many-to-many (peer-to-peer) communication
  - End hosts have no control over what goes on in the network (e.g., no source routing or user-directed routing)
3 Application Layer Overlay
- One strategy: build an overlay network at the application layer
  - End hosts gain control over topology formation and routing to meet specific application needs
  - New applications and services can be deployed without changes to the TCP/IP infrastructure
4 Overlay Networks
- Logical topology
- Self-organized
- Dynamic
- Application specific
[Figure: application layer overlay drawn above the network layer]
5 Early Examples
- Domain Name System (DNS)
- 6bone: IPv6 over IPv4
- Mbone: multicast over unicast IP
- X-Bone
http://graphics.stanford.edu/papers/mbone/morepix/world-6bone.jpeg
http://www.mbone.cl.cam.ac.uk/mbone/mbone-small.gif
6 Some Overlay Networks
- Web Caching and Content Distribution Networks (CDNs)
- Application Layer Multicast (ALM)
- User Directed Routing
- Anonymous Routing
- Resilient overlay network
- Peer-to-Peer (P2P)
  - Unstructured P2P: Gnutella, Freenet, KaZaA, ...
  - Structured P2P: Distributed Hash Tables (DHTs)
7 Web Caching
- Improves download latency and content availability by storing local copies of popular web objects
- Web caches are L7 boxes
[Figure labels: web server, client]
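The caching idea on this slide can be sketched in a few lines. This is a minimal illustration, not how any particular cache product works: the LRU eviction policy, the `WebCache` class, and the `fake_origin` function are all assumptions made for the example.

```python
from collections import OrderedDict


class WebCache:
    """Tiny LRU cache of web objects keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_origin, capacity=2):
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin-fetch callable
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1
            self.store.move_to_end(url)  # mark as most recently used
            return self.store[url]
        self.misses += 1
        body = self.fetch_from_origin(url)  # only misses reach the web server
        self.store[url] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used object
        return body


origin_requests = []


def fake_origin(url):
    """Stand-in for the web server; records every request that reaches it."""
    origin_requests.append(url)
    return f"<body of {url}>"


cache = WebCache(fake_origin, capacity=2)
for url in ["/logo.png", "/logo.png", "/index.html", "/logo.png"]:
    cache.get(url)
print(cache.hits, cache.misses, origin_requests)
```

Only the first request for each object travels to the origin; repeated requests for the popular object are served from the local copy.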
8 Content Delivery Networks
- Clients are intelligently redirected to the nearest CDN server to download publisher content
- IP anycast (if it were available) could accomplish this easily
- In the absence of IP anycast, companies like Akamai construct CDNs as application layer overlay networks
[Figure labels: web server, CDN servers, client]
9 Method 1: DNS Redirect
Step 1: client queries DNS for the IP address of www.publisher.com; based on the client's IP address (in practice, typically the address of the client's local DNS server, which is what the publisher DNS sees), the reconfigured publisher DNS returns the IP address of the replica closest to the client
[Figure labels, slides 9-10: publisher DNS, local DNS, publisher, client, nearest replica]
10 Method 1: DNS Redirect
Step 2: client contacts the replica for the object
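The replica selection in Step 1 might look roughly like the sketch below. It assumes the publisher DNS keeps a static table from requester address blocks to replicas; the table, the `resolve` function, and every IP address (all taken from documentation-only ranges) are hypothetical. A real CDN would use measured proximity and load rather than a fixed table.

```python
import ipaddress

# Hypothetical mapping from requester address blocks to the replica judged closest.
REPLICA_FOR_PREFIX = {
    ipaddress.ip_network("198.51.100.0/25"): "192.0.2.10",    # replica "west"
    ipaddress.ip_network("198.51.100.128/25"): "192.0.2.20",  # replica "east"
}
ORIGIN_IP = "203.0.113.5"  # fallback: the publisher's own server


def resolve(name, requester_ip):
    """Answer a query for the publisher's name with the closest replica's address."""
    if name != "www.publisher.com":
        raise KeyError(name)
    addr = ipaddress.ip_address(requester_ip)
    for prefix, replica in REPLICA_FOR_PREFIX.items():
        if addr in prefix:
            return replica
    return ORIGIN_IP  # no replica known for this requester


answer_west = resolve("www.publisher.com", "198.51.100.7")
answer_east = resolve("www.publisher.com", "198.51.100.200")
answer_other = resolve("www.publisher.com", "203.0.113.99")
print(answer_west, answer_east, answer_other)
```

The same name resolves to different addresses depending on who asks, which is what lets Step 2 go straight to the nearby replica.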
11 Method 2: URL Redirect
Step 1: client queries DNS for the IP address of www.publisher.com
[Figure labels, slides 11-14: local DNS, publisher, client, CDN DNS, CDN server]
12 Method 2: URL Redirect
Step 2: client contacts the publisher; the publisher returns HTML with embedded objects' URLs pointing to the best CDN server
13 Method 2: URL Redirect
Step 3: client queries DNS for the IP address of the CDN server
14 Method 2: URL Redirect
Step 4: client contacts the CDN server; the CDN server returns the embedded objects
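Step 2 of the URL redirect method amounts to rewriting the URLs of embedded objects before the HTML is returned. The sketch below shows one way a publisher could do that; the CDN hostname `cdn.example.net` and the `rewrite_embedded_urls` helper are hypothetical, and a regular expression over `src` attributes is a simplification of real HTML processing.

```python
import re

CDN_HOST = "cdn.example.net"  # hypothetical CDN hostname


def rewrite_embedded_urls(html, publisher_host="www.publisher.com", cdn_host=CDN_HOST):
    """Point embedded objects (src attributes) at the CDN; leave page links alone."""
    pattern = re.compile(r'src="http://' + re.escape(publisher_host) + r'/([^"]*)"')
    return pattern.sub(lambda m: f'src="http://{cdn_host}/{m.group(1)}"', html)


page = (
    '<a href="http://www.publisher.com/about.html">About</a>'
    '<img src="http://www.publisher.com/img/logo.png">'
)
rewritten = rewrite_embedded_urls(page)
print(rewritten)
```

The page itself still comes from the publisher, while the image URL now names the CDN, so Steps 3 and 4 resolve and contact the CDN server instead.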
16 IP Multicast
- Network routers must implement IP Multicast to construct the delivery tree and forward packets to multicast group receivers
[Figure labels: routers, server, client]
17 Application Layer Multicast
- End hosts self-organize to construct the multicast delivery tree; messages are sent using IP unicast
- Sacrifice some efficiency (latency stretch) for deployability
- Various systems: ESM, Overcast, Promise, Scattercast, SplitStream, Yoid, ...
[Figure labels: routers, server, client]
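To make "latency stretch" concrete, the sketch below builds a delivery tree among end hosts and compares each host's latency along the tree with its direct unicast latency from the source. The latency figures, the host names, and the greedy Prim-style construction are all assumptions for illustration; the systems named on the slide each use their own tree-building protocols.

```python
# Hypothetical symmetric unicast latencies (ms) between end hosts; "S" is the source.
LATENCY = {
    ("S", "A"): 10, ("S", "B"): 40, ("S", "C"): 45,
    ("A", "B"): 35, ("A", "C"): 50, ("B", "C"): 10,
}


def lat(u, v):
    """Unicast latency between two hosts (0 for a host to itself)."""
    return 0 if u == v else LATENCY.get((u, v), LATENCY.get((v, u)))


def build_tree(source, hosts):
    """Greedy (Prim-style) tree: repeatedly attach the host closest to the tree."""
    parent = {source: None}
    remaining = set(hosts) - {source}
    while remaining:
        child, par = min(
            ((h, p) for h in remaining for p in parent),
            key=lambda hp: (lat(*hp), hp),
        )
        parent[child] = par
        remaining.remove(child)
    return parent


def overlay_latency(parent, host):
    """Latency from the source to host along the overlay tree."""
    total = 0
    while parent[host] is not None:
        total += lat(host, parent[host])
        host = parent[host]
    return total


tree = build_tree("S", ["S", "A", "B", "C"])
stretch = {h: overlay_latency(tree, h) / lat("S", h) for h in ["A", "B", "C"]}
print(tree)
print(stretch)
```

In this example the tree is the chain S, A, B, C: the source sends one copy instead of three, while host C receives the data after 55 ms rather than the 45 ms of a direct unicast, a stretch of about 1.22.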
19 IP Source Route
- IP source route allows end hosts to exercise some degree of route control
- However, many ISPs turned off the IP source routing option for security reasons
[Figure labels: routers, server, client, default route]
20 User Directed Routing
- Some applications would benefit from having some degree of control over route selection
  - Resiliency: e.g., resilient overlay network (RON), Detour
  - Anonymity: onion routing, MIX-nets, ...
[Figure labels: routers, server, client]
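The resiliency case can be illustrated with a one-hop detour: if the default path is down or slow, traffic is relayed through another overlay node. The sketch below is only an illustration of that idea under assumed measurements; the node names, the `RTT` table, and the `best_route` function are hypothetical and do not reproduce the actual RON or Detour algorithms.

```python
# Hypothetical measured round-trip times (ms) between overlay nodes.
# None marks a path that is currently failing.
RTT = {
    ("client", "server"): None,  # the default Internet route is down
    ("client", "relay1"): 30,
    ("relay1", "server"): 40,
    ("client", "relay2"): 20,
    ("relay2", "server"): 70,
}


def rtt(u, v):
    """Look up the measurement for a pair in either direction."""
    return RTT.get((u, v), RTT.get((v, u)))


def best_route(src, dst, relays):
    """Return (path, latency) for the direct path or the best one-hop detour."""
    candidates = []
    direct = rtt(src, dst)
    if direct is not None:
        candidates.append(([src, dst], direct))
    for r in relays:
        a, b = rtt(src, r), rtt(r, dst)
        if a is not None and b is not None:
            candidates.append(([src, r, dst], a + b))
    if not candidates:
        raise RuntimeError("no usable path")
    return min(candidates, key=lambda c: c[1])


during_outage = best_route("client", "server", ["relay1", "relay2"])
RTT[("client", "server")] = 50  # the default route recovers
after_recovery = best_route("client", "server", ["relay1", "relay2"])
print(during_outage, after_recovery)
```

While the default route is failing, the overlay reaches the server through relay1; once the direct path recovers and is faster, the same function selects it again.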
21 Onion Routing
- Application layer overlay for anonymous routing
- Existence of communication between Alice and Bob is not revealed to any third party
- Alice constructs an onion in which the message is successively encrypted with the keys of the intermediate routing nodes
- Each intermediate node peels one layer of the onion and forwards it to the next node
- Example system: Tor
http://tor.eff.org/overview.html.en
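The layering described above can be sketched as follows. This is a toy under loud assumptions: the "cipher" is an insecure XOR keystream built from SHA-256, each relay is assumed to share a symmetric key with Alice, and the relay names, keys, and the `next_hop|payload` layer format are invented for the example. Tor's actual circuit construction and cryptography are considerably more involved.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure): XOR data with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def build_onion(message: bytes, route, destination: str, keys) -> bytes:
    """Wrap the message in one layer per relay, innermost layer for the last relay."""
    next_hop, payload = destination, message
    for relay in reversed(route):
        payload = keystream_xor(keys[relay], next_hop.encode() + b"|" + payload)
        next_hop = relay
    return payload


def peel(relay: str, onion: bytes, keys):
    """A relay removes its own layer and learns only the next hop."""
    plain = keystream_xor(keys[relay], onion)
    next_hop, inner = plain.split(b"|", 1)
    return next_hop.decode(), inner


keys = {"R1": b"key-one", "R2": b"key-two", "R3": b"key-three"}  # hypothetical relay keys
route = ["R1", "R2", "R3"]
onion = build_onion(b"hello Bob", route, "Bob", keys)

hop, packet, trace = route[0], onion, []
while hop in keys:  # forward until the next hop is not a relay
    trace.append(hop)
    hop, packet = peel(hop, packet, keys)
print(trace, hop, packet)
```

Each relay sees only the name of the next hop and an opaque inner payload; only the last relay learns that the destination is Bob, and only the first is contacted by Alice.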
23 P2P
- Self-organized overlay network to support distributed storage, search and retrieval of content
- The killer app: free music and movies
- Individual peers contribute resources
  - Content
  - Network management (e.g., forwarding query messages)
- Desirable properties
  - Scalability
  - Performance (latency, recall)
  - Robustness
  - Anonymity, censorship-resistance
- Design challenges
  - Dynamic membership
  - Various forms of attacks
  - Free-riding behavior
24 P2P File-Sharing Networks
- 1st generation: centralized index
  - e.g., Napster
- 2nd generation: decentralized indices
  - e.g., Gnutella v0.4, Freenet
- 3rd generation: hierarchical
  - e.g., FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
- 4th generation
  - Structured topologies using DHTs, e.g., eMule, Overnet, BitTorrent
  - Parallel downloads, e.g., BitTorrent, Avalanche
  - Darknets, e.g., WASTE for small-scale F2F networks
25 Napster
- Maintains a centralized index that maps files to machines
- How to find a file
  - Query the index system → return a list of peers that store the requested file
  - Transfer the file directly from peer(s)
- Advantage
  - Simplicity: easy to implement sophisticated search engines on top of the index system
- Disadvantage
  - Single point of failure
[Figure: machines m1-m6 hold files A-F; central index: m1 A, m2 B, m3 C, m4 D, m5 E, m6 F]
Slide adapted from Ion Stoica, Nicolas Christin
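A centralized index of this kind is essentially a dictionary from file names to peers. The sketch below uses the machine-to-file assignment from the slide's index (m1 holds A through m6 holds F); the `CentralIndex` class, its method names, and the extra copy of E on m2 are assumptions added for illustration.

```python
from collections import defaultdict


class CentralIndex:
    """Centralized index mapping file names to the peers that store them."""

    def __init__(self):
        self.peers_for_file = defaultdict(set)

    def register(self, peer, files):
        """A peer announces the files it is willing to share."""
        for f in files:
            self.peers_for_file[f].add(peer)

    def unregister(self, peer):
        """Remove a departing peer from every entry."""
        for peers in self.peers_for_file.values():
            peers.discard(peer)

    def query(self, filename):
        """Return the peers that store filename; the transfer itself is peer-to-peer."""
        return sorted(self.peers_for_file.get(filename, ()))


index = CentralIndex()
for peer, filename in [("m1", "A"), ("m2", "B"), ("m3", "C"),
                       ("m4", "D"), ("m5", "E"), ("m6", "F")]:
    index.register(peer, [filename])
index.register("m2", ["E"])  # hypothetical: a second copy of E

holders_of_e = index.query("E")
index.unregister("m5")
holders_after_departure = index.query("E")
print(holders_of_e, holders_after_departure)
```

Every query touches only the index, which is why search is simple and also why the index is a single point of failure.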
26 Gnutella (v0.4)
- Flood the request
- How to find a file
  - Send request to all neighbors
  - Neighbors recursively propagate the request
  - Eventually a machine that has the file receives the request, and it sends back the answer
- Advantages
  - Totally decentralized, highly robust
- Disadvantages
  - The entire network can be swamped with a request
  - Can be alleviated using TTLs, but can then fail to locate files (and still high resource usage)
[Figure labels: machines m1-m6, files A-F]
Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5
Slide adapted from Ion Stoica, Nicolas Christin
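The flooding search and the TTL trade-off can be sketched as below. The neighbor lists for m1 and m3 follow the assumption stated on the slide; the m5-m6 link, the placement of files A-F on m1-m6 (carried over from the Napster slide), and the `flood` function are assumptions for the example rather than the Gnutella wire protocol.

```python
from collections import deque

# Neighbor lists per the slide's assumption; the m5-m6 link is a hypothetical addition.
NEIGHBORS = {
    "m1": ["m2", "m3"], "m2": ["m1"], "m3": ["m1", "m4", "m5"],
    "m4": ["m3"], "m5": ["m3", "m6"], "m6": ["m5"],
}
FILES = {"m1": "A", "m2": "B", "m3": "C", "m4": "D", "m5": "E", "m6": "F"}


def flood(origin, filename, ttl):
    """Breadth-first flood; returns (peers holding the file, messages sent)."""
    seen = {origin}
    found, messages = [], 0
    queue = deque([(origin, None, ttl)])
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL expired: the request is not forwarded further
        for nbr in NEIGHBORS[node]:
            if nbr == sender:
                continue  # do not send the request back where it came from
            messages += 1  # every forwarded copy costs a message
            if nbr in seen:
                continue  # a duplicate request is dropped by the receiver
            seen.add(nbr)
            if FILES.get(nbr) == filename:
                found.append(nbr)
            queue.append((nbr, node, remaining - 1))
    return found, messages


result_e = flood("m1", "E", ttl=2)
result_f_short = flood("m1", "F", ttl=2)
result_f_long = flood("m1", "F", ttl=3)
print(result_e, result_f_short, result_f_long)
```

With a TTL of 2, the request from m1 finds E on m5 but never reaches m6, so the search for F fails even though the file exists; raising the TTL to 3 finds it at the cost of more messages.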
27 Hierarchical Networks
- Use two-level hierarchy
  - Some nodes are elected as super nodes or ultra-peers
  - Each ultra-peer serves as centralized index for a portion of the network
  - If an ultra-peer does not know where to find an item, the query is forwarded to other ultra-peers
- Advantage
  - Reduces the amount of network traffic compared to naïve flooding
- Disadvantage
  - Ultra-peers vulnerable to attacks
  - Potential convergence problems when ultra-peers leave abruptly
- Used in FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
[Figure labels: machines m1-m4, files A-F]
Assume red nodes are ultra-peers
Slide adapted from Ion Stoica, Nicolas Christin
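A two-level search might be sketched as follows. The layout is hypothetical: the slide's figure does not survive in text form, so the ultra-peer names `u1` and `u2`, the assignment of leaf peers and files to them, and the `search` function are all assumptions made for illustration.

```python
# Hypothetical two-level layout: each ultra-peer indexes the files of its leaf peers.
ULTRA_PEERS = {
    "u1": {"m1": ["A"], "m2": ["B"]},
    "u2": {"m3": ["C", "D"], "m4": ["E", "F"]},
}


def build_index(leaves):
    """Build one ultra-peer's local index: file name -> list of leaf peers."""
    index = {}
    for peer, files in leaves.items():
        for f in files:
            index.setdefault(f, []).append(peer)
    return index


INDEX = {u: build_index(leaves) for u, leaves in ULTRA_PEERS.items()}


def search(home_ultra_peer, filename):
    """Ask the local ultra-peer first, then forward to the other ultra-peers."""
    order = [home_ultra_peer] + [u for u in INDEX if u != home_ultra_peer]
    for forwards, ultra in enumerate(order):
        peers = INDEX[ultra].get(filename)
        if peers:
            return peers, forwards  # forwards = ultra-peer forwards needed
    return [], len(order) - 1


local_hit = search("u1", "B")
remote_hit = search("u1", "E")
miss = search("u1", "Z")
print(local_hit, remote_hit, miss)
```

Queries travel only among ultra-peers rather than through every peer, which is where the traffic saving over flooding comes from.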
28 Structured Topologies
- Gnutella and KaZaA topologies are unstructured
  - Neighbor selection largely random
  - No guarantee that a file can be located, even if it exists in the network
- Distributed hash tables (DHTs) offer to solve this problem by constructing highly structured topologies
29 Distributed Hash Table (DHT)
- Applications: distributed search (e.g., p2p, CDNs, cooperative caching), application layer overlays for multicast, anycast, etc.
- Similar to traditional hash table data structure, except data is stored in distributed peer nodes
  - Each node is analogous to a bucket in a hash table
- Put(), Get() interface like a regular hash table
  - put(id, item)
  - item = get(id)
- Designed to scale to large numbers of nodes and to handle continual node arrivals, departures, or failures
- Various DHT designs
  - CAN, Chord, Kademlia, Pastry, Tapestry, Viceroy, etc.
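The put/get interface and the "node as bucket" analogy can be sketched as below. This toy keeps a global, sorted view of all nodes in one process, which a real DHT deliberately avoids; the `ToyDHT` class, the 16-bit identifier space, the SHA-1 hashing, and the successor-based placement rule are assumptions chosen for the example, not a description of any one of the designs listed.

```python
import bisect
import hashlib

M = 16  # identifier bits; hypothetical choice for this sketch


def ident(name: str) -> int:
    """Hash a name into the identifier space 0..2^M - 1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** M)


class ToyDHT:
    """Single-process stand-in for a DHT: every node is one bucket."""

    def __init__(self, node_names):
        self.ring = sorted(ident(n) for n in node_names)
        self.buckets = {node_id: {} for node_id in self.ring}

    def responsible_node(self, key_id):
        """First node whose id is >= key_id, wrapping around the ring."""
        i = bisect.bisect_left(self.ring, key_id)
        return self.ring[i % len(self.ring)]

    def put(self, key, item):
        self.buckets[self.responsible_node(ident(key))][key] = item

    def get(self, key):
        return self.buckets[self.responsible_node(ident(key))].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("song.mp3", "stored on peer m5")
print(dht.get("song.mp3"), dht.get("missing.mp3"))
```

Because the key alone determines which node is responsible, a get for an existing key always lands on the node that the matching put used.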
30 DHT Example: Chord
- Associate each node and item with a unique identifier in a one-dimensional space (0..2^m - 1)
- Each node x maintains a finger table
  - Fingers are neighbors
  - i-th entry in finger table is the first node that succeeds or equals x + 2^i (mod 2^m)
- An item identified by id is stored on the successor node of id
- Properties
  - Routing table size O(log(N)), where N is the total number of nodes
  - Guarantees that a file (if it exists) is found in O(log(N)) steps
Slide adapted from Ion Stoica, Nicolas Christin
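The finger table rule can be checked with a short computation. The sketch below assumes a global list of node identifiers is available (a real Chord node discovers its fingers through the protocol instead) and uses m = 3 with nodes 0, 1, 2 and 6, the values used in the Chord example slides; the function names are hypothetical.

```python
def successor(ident, nodes, m):
    """First node id that succeeds or equals ident on the ring of size 2^m."""
    ring = sorted(nodes)
    ident %= 2 ** m
    for n in ring:
        if n >= ident:
            return n
    return ring[0]  # wrap around past the largest identifier


def finger_table(x, nodes, m):
    """Rows (i, x + 2^i mod 2^m, successor) for node x."""
    return [(i, (x + 2 ** i) % 2 ** m, successor(x + 2 ** i, nodes, m))
            for i in range(m)]


nodes = [0, 1, 2, 6]
tables = {x: finger_table(x, nodes, m=3) for x in nodes}
for x, rows in tables.items():
    print(x, rows)

item_homes = {item: successor(item, nodes, 3) for item in (7, 1)}
print(item_homes)
```

With only three entries per node, each table points at nodes roughly 1, 2 and 4 identifiers away, which is what keeps the table size logarithmic in the size of the identifier space.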
31 Chord Example
- Assume m = 3, i.e., an identifier space 0..7
- Node n1(1) joins
Slide adapted from Ion Stoica, Nicolas Christin
32 Chord Example
- Assume m = 3, i.e., an identifier space 0..7
- Node n1(1) joins
- Node n2(2) joins
[Figure: identifier ring 0..7 with nodes at 1 and 2]
Finger Table (node 1)
i  id+2^i  succ
0  2       2
1  3       1
2  5       1
Finger Table (node 2)
i  id+2^i  succ
0  3       1
1  4       1
2  6       1
Slide adapted from Ion Stoica, Nicolas Christin
33 Chord Example
- Assume m = 3, i.e., an identifier space 0..7
- Node n1(1) joins
- Node n2(2) joins
- Nodes n3(0), n4(6) join
[Figure: identifier ring 0..7 with nodes at 0, 1, 2 and 6]
Finger Table (node 0)
i  id+2^i  succ
0  1       1
1  2       2
2  4       6
Finger Table (node 1)
i  id+2^i  succ
0  2       2
1  3       6
2  5       6
Finger Table (node 6)
i  id+2^i  succ
0  7       0
1  0       0
2  2       2
Finger Table (node 2)
i  id+2^i  succ
0  3       6
1  4       6
2  6       6
Slide adapted from Ion Stoica, Nicolas Christin
34 Insertion
- Items inserted: f1(7), f2(1)
[Figure: identifier ring 0..7 with nodes at 0, 1, 2 and 6]
Finger Table (node 0); Items: 7
i  id+2^i  succ
0  1       1
1  2       2
2  4       6
Finger Table (node 1); Items: 1
i  id+2^i  succ
0  2       2
1  3       6
2  5       6
Finger Table (node 6)
i  id+2^i  succ
0  7       0
1  0       0
2  2       2
Finger Table (node 2)
i  id+2^i  succ
0  3       6
1  4       6
2  6       6
Slide adapted from Ion Stoica, Nicolas Christin
35 Query
- Upon receiving a query for item id, a node
  - Checks if item is cached locally
  - If not, forwards the query to the largest node in its successor table (the succ column of its finger table) that does not exceed id
[Figure: identifier ring 0..7 with nodes at 0, 1, 2 and 6; a query labelled query(7)]
Finger Table (node 0); Items: 7
i  id+2^i  succ
0  1       1
1  2       2
2  4       6
Finger Table (node 1); Items: 1
i  id+2^i  succ
0  2       2
1  3       6
2  5       6
Finger Table (node 6)
i  id+2^i  succ
0  7       0
1  0       0
2  2       2
Finger Table (node 2)
i  id+2^i  succ
0  3       6
1  4       6
2  6       6
Slide adapted from Ion Stoica, Nicolas Christin
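The query forwarding can be traced with the finger tables and items from the slides. The sketch below is an interpretation rather than the slide's exact rule: it uses ring intervals in the style of the standard Chord lookup (hand the query to the immediate successor if it is responsible for the id, otherwise jump to the farthest finger that does not pass the id), and the assumption that query(7) starts at node 1 is ours. The function names are hypothetical.

```python
M = 3
FINGERS = {  # succ column of each node's finger table, as given in the slides
    0: [1, 2, 6],
    1: [2, 6, 6],
    2: [6, 6, 6],
    6: [0, 0, 2],
}
ITEMS = {0: {7}, 1: {1}, 2: set(), 6: set()}


def in_interval(x, a, b, size=2 ** M):
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    x, a, b = x % size, a % size, b % size
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps past identifier 0


def lookup(start, item_id):
    """Route a query; return (path of nodes visited, whether the item was found)."""
    node, path = start, [start]
    while item_id not in ITEMS[node]:
        succ = FINGERS[node][0]
        if in_interval(item_id, node, succ):
            path.append(succ)  # the immediate successor is responsible for item_id
            return path, item_id in ITEMS[succ]
        # otherwise jump to the farthest finger that does not pass item_id
        node = next(f for f in reversed(FINGERS[node])
                    if in_interval(f, node, item_id - 1))
        path.append(node)
    return path, True


print(lookup(1, 7))
print(lookup(2, 1))
print(lookup(1, 5))
```

Under these assumptions, a query for item 7 entering at node 1 is forwarded to node 6 and then to node 0, which stores the item; a query for an identifier that was never inserted (5) ends at the responsible node and reports a miss.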
36 Summary
- Difficult to deploy new network services at the network layer
- Response: build an overlay network at the application layer
  - End hosts gain control over topology formation and routing to meet specific application needs
  - New applications and services can be deployed without changes to the TCP/IP infrastructure
- Many flavors of application layer overlay networks
  - Web Caching and Content Distribution Networks (CDNs)
  - Application Layer Multicast (ALM)
  - Anonymous Routing (Tor)
  - Resilient overlay network (RON)
  - P2P file-sharing networks
  - Distributed Hash Tables (DHTs)