Title: DNS and CDNs (Content Distribution Networks)
1DNS and CDNs (Content Distribution Networks)
- Paul Francis
- Cornell Computer Science
2What do all of these have in common?
- http://www.cnn.com/news/story.html
- HTTP (web)
- mailto:francis@cs.cornell.edu
- Email
- sip:service@phone.verizon.com
- SIP (Session Initiation Protocol)
3They all have a DNS name somewhere
- http://www.cnn.com/news/story.html
- HTTP (web)
- francis@cs.cornell.edu
- Email
- sip:service@phone.verizon.com
- SIP (Session Initiation Protocol)
4Why is DNS so important?
- Names are easier to remember than IP addresses
- paul@129.48.55.233 ???
- And in any event, IP addresses are not dependable
- They change often (dialup)
- They are not all unique
5DNS is the core of the Internet
- So we (humans and applications) like to deal with dependable, stable, friendly DNS names
- The names get mapped into IP addresses by lower layers
- By the Domain Name System (DNS)
- Then the learned IP address is put into packets, and IP routing gets the packets across the Internet
6Picture of DNS query/reply
7Why all these dots?
- Why falcon.cs.cornell.edu?
- Why not cornell-falcon or something?
8It wasn't always that way
- Twenty years ago, this was a valid email address
- george@isi
- How did my computer learn the IP address of isi?
9The host table and DNS
- Before DNS, there was the host table
- This was a complete list of all the hosts in the Internet!
- It was copied every night to every machine on the Internet!
- At some point, this was perceived as a potential scaling bottleneck
- So a distributed directory called the Domain Name System (DNS) was invented
10The host table (historic)
11Distributed Directory
- A primary goal of DNS was to have a distributed host table, so that each site could manage its own name-to-address mapping
- But also, it should scale well!
12DNS is simple but powerful
- Only one type of query
- Query(domain name, RR type)
- Resource Record (RR) type is like an attribute type
- Answer(values, additional RRs)
- Limited number of RR types
- Hard to make new RR types
- Not for technical reasons
- Rather because each requires global agreement
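The single query type above can be sketched in a few lines of Python. This is only an illustration of the Query(domain name, RR type) to Answer(values, additional RRs) shape, not a real DNS implementation: the `RR` and `Answer` classes, the `query` function, and all records and addresses are made up for the example, and real MX records also carry a preference value that is left out here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RR:
    name: str    # owner domain name, e.g. "cs.cornell.edu."
    rtype: str   # RR type, e.g. "A", "NS", "MX"
    ttl: int     # seconds a cache may keep this record
    value: str   # type-specific data (an address, a server name, ...)


@dataclass
class Answer:
    values: list = field(default_factory=list)      # RRs matching the query
    additional: list = field(default_factory=list)  # extra RRs likely needed next


# Hypothetical records; the addresses are placeholders.
RECORDS = [
    RR("cs.cornell.edu.", "MX", 86400, "mail.cs.cornell.edu."),
    RR("mail.cs.cornell.edu.", "A", 86400, "10.1.1.25"),
    RR("foo.cs.cornell.edu.", "A", 86400, "10.1.1.1"),
]


def query(name, rtype):
    """Answer the one kind of question DNS supports: (domain name, RR type)."""
    values = [rr for rr in RECORDS if rr.name == name and rr.rtype == rtype]
    # When an answer names another host (MX, NS), attach that host's A records
    # so the client does not have to ask a second question.
    targets = {rr.value for rr in values if rr.rtype in ("MX", "NS")}
    additional = [rr for rr in RECORDS if rr.name in targets and rr.rtype == "A"]
    return Answer(values, additional)


answer = query("cs.cornell.edu.", "MX")
print([rr.value for rr in answer.values])      # ['mail.cs.cornell.edu.']
print([rr.value for rr in answer.additional])  # ['10.1.1.25']
```

The "additional RRs" part is what lets one round trip return both the mail server's name and its address.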
13DNS is the core of the Internet
- Global name space
- Can be the core of a naming or identifying scheme
- Global directory service
- Can resolve a name to nearly every computer on
the planet
14Important DNS RR types
- NS: Points to next Name Server down the tree
- A: Contains the IP address
- AAAA for IPv6
- MX: Contains the name of the mail server
- Service-oriented RR types
- SRV: Contains the host names and ports of services on servers
- One way to learn what port number to use
- NAPTR: Essentially a generalized mapping from one name space (e.g. phone numbers) to another (e.g. SIP URLs)
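As a rough sketch of how an SRV record lets a client learn which port to use, the Python below looks up a service name, picks the most preferred SRV entry, and then resolves its target with an A lookup. The `ZONE` table, the `locate_service` function, the `example.com` names, and the addresses are all assumptions made for illustration; SRV weights are ignored to keep it short.

```python
# Hypothetical zone data keyed by (name, RR type).
ZONE = {
    ("_sip._udp.phone.example.com.", "SRV"): [
        # (priority, weight, port, target)
        (20, 0, 5060, "backup.phone.example.com."),
        (10, 0, 5062, "sip1.phone.example.com."),
    ],
    ("sip1.phone.example.com.", "A"): ["192.0.2.10"],
    ("backup.phone.example.com.", "A"): ["192.0.2.20"],
}


def locate_service(service, proto, domain):
    """Return (ip_address, port) for the most preferred server, or None."""
    srv_name = f"_{service}._{proto}.{domain}"
    records = ZONE.get((srv_name, "SRV"), [])
    if not records:
        return None
    # Tuples compare by their first field, so min() picks the lowest
    # priority value, which is the most preferred server.
    _priority, _weight, port, target = min(records)
    addresses = ZONE.get((target, "A"), [])
    return (addresses[0], port) if addresses else None


print(locate_service("sip", "udp", "phone.example.com."))  # ('192.0.2.10', 5062)
```

The client never hard-codes the port: it comes back in the SRV record along with the server name.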
15DNS tree structure
(Figure: the DNS name tree; NS RRs point from each zone to the zones below it)
.
  edu.
    cornell.edu.
      cs.cornell.edu.
        foo.cs.cornell.edu A 10.1.1.1
        bar.cs.cornell.edu A 10.1.1.1
      eng.cornell.edu.
    cmu.edu.
    mit.edu.
  com.
  jp.
  us.
16Primary and secondary servers
(Figure: the cornell.edu. zone and the cs.cornell.edu. zone below it)
- RRs are initially configured into the primary server
- The primary server replicates RRs onto secondary servers periodically (updates are incremental)
- NS RRs point to both primary and secondary servers
17Resolver structure and configuration
(Figure: client host, resolver, and the name tree ., edu., com., jp., cornell.edu., cmu.edu., cs.cornell.edu., eng.cornell.edu.)
- Stub resolver resides on the client host and points to a configured recursive server
- Resolver manages DNS queries on behalf of stub resolvers
- Static configuration of root servers
18Resolver structure and configuration
(Figure: client host, resolver, and the name tree, with numbered query arrows)
- 1. Stub resolver sends recursive query
- 2, 3, 4. Resolver makes iterative queries to servers
- N. Resolver returns final answer to stub resolver (which also caches the result)
- Resolver caches results for efficiency
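The numbered steps above can be sketched as a toy resolver that starts at the root, follows NS pointers down the tree one server at a time, and caches the final answer. Everything here is a simplification under stated assumptions: the `SERVERS` table and server labels are invented, TTLs are ignored, and a real resolver would also cache the NS records it learns along the way.

```python
# Hypothetical server data: each server holds delegations (NS pointers to
# the next server down the tree) and/or final addresses (A records).
SERVERS = {
    "root": {"delegations": {"edu.": "edu-server"}, "addresses": {}},
    "edu-server": {"delegations": {"cornell.edu.": "cornell-server"}, "addresses": {}},
    "cornell-server": {"delegations": {"cs.cornell.edu.": "cs-server"}, "addresses": {}},
    "cs-server": {"delegations": {}, "addresses": {"foo.cs.cornell.edu.": "10.1.1.1"}},
}


class Resolver:
    def __init__(self):
        self.cache = {}        # name -> address (TTL handling omitted)
        self.queries_sent = 0  # iterative queries sent to servers

    def resolve(self, name):
        """Handle one recursive query from a stub resolver."""
        if name in self.cache:
            return self.cache[name]
        server = "root"  # statically configured starting point
        while True:
            self.queries_sent += 1
            data = SERVERS[server]
            if name in data["addresses"]:
                self.cache[name] = data["addresses"][name]
                return self.cache[name]
            # Follow the NS pointer whose zone is a suffix of the name.
            next_server = next(
                (srv for zone, srv in data["delegations"].items()
                 if name == zone or name.endswith("." + zone)),
                None,
            )
            if next_server is None:
                return None  # nobody is responsible for this name
            server = next_server


resolver = Resolver()
print(resolver.resolve("foo.cs.cornell.edu."), resolver.queries_sent)  # 10.1.1.1 4
print(resolver.resolve("foo.cs.cornell.edu."), resolver.queries_sent)  # 10.1.1.1 4
```

The second lookup sends no new queries, which is the efficiency gain the slide attributes to caching.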
19DNS cache management
- All RRs have Time-to-live (TTL) values
- When TTL expires, cache entries are removed
- NS RRs tend to have long TTLs
- Cached for a long time
- Reduces load on higher level servers
- A RRs may have very short TTLs
- On the order of one minute for some web services
- On the order of one day for typical hosts
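A minimal sketch of TTL-based cache management, assuming a simple in-memory dictionary. The `TTLCache` class and the injected `clock` parameter are illustrative choices (the clock is injectable only so expiry can be demonstrated without waiting); the names and addresses are placeholders.

```python
import time


class TTLCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (value, expiry time)

    def put(self, name, rtype, value, ttl):
        """Store a record for ttl seconds."""
        self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        """Return the cached value, or None if missing or expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]  # TTL expired: remove the entry
            return None
        return value


# Fake clock so the example runs instantly.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("www.example.com.", "A", "192.0.2.1", ttl=60)           # short TTL
cache.put("example.com.", "NS", "ns1.example.com.", ttl=86400)    # long TTL

now[0] = 61.0  # just over one minute later
print(cache.get("www.example.com.", "A"))  # None
print(cache.get("example.com.", "NS"))     # ns1.example.com.
```

After one minute the short-lived A record is gone, while the NS record with the long TTL is still served from cache.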
20Caching is the key to performance
- Without caching, the small number of machines at the top of the hierarchy would be overwhelmed
- But what if you want to change the IP address of a host? How do you change all those cached entries around the world?
- You can't: you wait until they time out on their own, then make your change
21Changing a DNS name
- Say your TTL was set to one day
- This means that even if you change DNS now, some hosts will continue to use the old address for a day
- So, give the host two IP addresses for a while (the old one and the new one)
- But DNS only answers with the new one
- After a day, the old one is cleaned out of caches, and you can remove it from the host
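The timing argument above is simple arithmetic: a cache may have fetched the old record an instant before DNS was changed, so the old address can linger for up to one full TTL after the change. A small sketch, with a made-up change time and a hypothetical helper name `safe_removal_time`:

```python
from datetime import datetime, timedelta


def safe_removal_time(dns_change_time, ttl_seconds):
    """Earliest time the old IP address can be removed from the host.

    Worst case: some cache fetched the old record just before the change,
    so it keeps answering with the old address for one full TTL.
    """
    return dns_change_time + timedelta(seconds=ttl_seconds)


change = datetime(2024, 1, 1, 12, 0)     # hypothetical moment DNS was updated
print(safe_removal_time(change, 86400))  # 2024-01-02 12:00:00
```

With a one-day TTL, the host has to keep answering on both addresses until the same time the next day.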
22DNS Issues
- DoS attacks on the (13) root servers
- DoS: Denial of Service
- Mis-configuration issues
- But on the whole DNS is an incredible system, and in many important respects is the core of the Internet
- http://www.cnn.com/news
- francis@cs.cornell.edu
23Next, Content Distribution Networks
- Idea here is to replicate a web server in many places over the Internet
- Latency to a single centralized web server farm may be too high
- A centralized web server farm may fail
24Content Routing Principle (a.k.a. Content Distribution Network)
(Figure: Internet topology with hosting centers, backbone ISPs, exchanges (IX), ISPs, and sites (S))
25Content Routing Principle (a.k.a. Content Distribution Network)
(Figure: hosting centers, backbone ISPs, exchanges (IX), ISPs, and sites (S), with an origin server (OS) and content servers (CS) added)
- Content origin is here, at the Origin Server (OS)
- Content Servers (CS) are distributed throughout the Internet
26Content Routing Principle (a.k.a. Content Distribution Network)
(Figure: hosting centers, backbone ISPs, exchanges (IX), ISPs, and sites (S), with origin server (OS), content servers (CS), and clients (C))
- Content is served from content servers nearer to the client
27Two basic types of CDN: cached and pushed
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
28Cached CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
29Cached CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
- Client requests content.
- CS checks its cache; on a miss, it gets the content from the origin server.
30Cached CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
- Client requests content.
- CS checks its cache; on a miss, it gets the content from the origin server.
- CS caches content, delivers to client.
31Cached CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
- Client requests content.
- CS checks its cache; on a miss, it gets the content from the origin server.
- CS caches content, delivers to client.
- Delivers content out of cache on subsequent requests.
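The cached CDN steps above amount to a pull-through cache. A minimal sketch, assuming an in-memory origin and a content server that caches forever; the class names `OriginServer` and `CachingContentServer` and the sample path are invented for the example, and expiry or invalidation is deliberately left out.

```python
class OriginServer:
    def __init__(self, content):
        self.content = content  # path -> body
        self.requests = 0       # how many times the origin was contacted

    def fetch(self, path):
        self.requests += 1
        return self.content[path]


class CachingContentServer:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # path -> body

    def get(self, path):
        if path not in self.cache:
            # Cache miss: get the content from the origin and keep a copy.
            self.cache[path] = self.origin.fetch(path)
        # Cache hit (or freshly filled): deliver from the cache.
        return self.cache[path]


origin = OriginServer({"/news/story.html": "<html>story</html>"})
cs = CachingContentServer(origin)
cs.get("/news/story.html")  # first request: miss, goes to the origin
cs.get("/news/story.html")  # later requests: served out of cache
print(origin.requests)      # 1
```

Only the first request reaches the origin server; every later request for the same path is answered locally.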
32Pushed CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
- Origin Server pushes content out to all CSs.
33Pushed CDN
(Figure: CDN topology with origin server (OS), content servers (CS), and clients (C))
- Origin Server pushes content out to all CSs.
- Request served from CSs.
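For contrast with the cached model, the pushed CDN can be sketched as an origin that copies every published item to all content servers up front, so requests never travel back to the origin. The class names `ReplicaServer` and `PushingOrigin` and the sample content are assumptions for illustration only.

```python
class ReplicaServer:
    """A content server that only holds what the origin has pushed to it."""

    def __init__(self):
        self.store = {}  # path -> body

    def receive(self, path, body):
        self.store[path] = body

    def get(self, path):
        return self.store.get(path)  # None if never pushed


class PushingOrigin:
    def __init__(self, content_servers):
        self.content_servers = content_servers

    def publish(self, path, body):
        # Push the content out to all content servers.
        for cs in self.content_servers:
            cs.receive(path, body)


replicas = [ReplicaServer(), ReplicaServer(), ReplicaServer()]
origin = PushingOrigin(replicas)
origin.publish("/logo.gif", b"GIF89a")
print([cs.get("/logo.gif") for cs in replicas])
```

Publishing an updated body under the same path overwrites every copy, which is the "keep pushed content synchronized" job that the origin takes on in this model.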
34CDN benefits
- Content served closer to client
- Less latency, better performance
- Load spread over multiple distributed CSs
- More robust (to ISP failure as well as other failures)
- Handle flash crowds better (load spread over ISPs)
- But well-connected, replicated Hosting Centers can do this too
35CDN costs and limitations
- Cached CDNs can't deal with dynamic/personalized content
- More and more content is dynamic
- Classic CDNs limited to images
- Managing content distribution is non-trivial
- Tension between content lifetimes and cache performance
- Dynamic cache invalidation
- Keeping pushed content synchronized and current
36What if lots of clients try to access the same CS?
(Figure: CDN topology with origin server (OS), content servers (CS), and many clients (C))
37How can the CDN spread this load around?
(Figure: CDN topology with origin server (OS), content servers (CS), and many clients (C))
38Guess what: DNS!
- Smart DNS server monitors load on the content servers
- When it answers a DNS request, it picks a server that is not overloaded (and near the client)
- The DNS answer has a small TTL (30 seconds to one minute)
- Small TTL allows the DNS load balancer to make fine-grained load decisions
- Can quickly offload a busy or even crashed content server
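One way the selection logic above might look, as a sketch only: the `ServerStatus` class, the `answer_query` function, the 0.8 load threshold, and the `distance_to_client` metric are all assumptions for illustration, not how any particular CDN does it. The function skips crashed servers, prefers the nearest server that is not overloaded, and returns the answer with a short TTL so the decision can be revisited soon.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    address: str
    load: float         # fraction of capacity in use (0.0 to 1.0)
    alive: bool = True  # False once monitoring sees a crash


def answer_query(servers, distance_to_client, max_load=0.8, ttl=30):
    """Return (address, ttl) for one DNS answer, or None if nothing is up.

    distance_to_client maps server address -> a hypothetical nearness
    metric for the requesting client (smaller is nearer).
    """
    alive = [s for s in servers if s.alive]
    if not alive:
        return None
    usable = [s for s in alive if s.load < max_load]
    if usable:
        # Nearest server among those that are not overloaded.
        best = min(usable, key=lambda s: distance_to_client[s.address])
    else:
        # Everything is busy: fall back to the least loaded server.
        best = min(alive, key=lambda s: s.load)
    return best.address, ttl


servers = [
    ServerStatus("192.0.2.1", load=0.95),               # nearest, but overloaded
    ServerStatus("192.0.2.2", load=0.40),
    ServerStatus("192.0.2.3", load=0.10, alive=False),  # crashed
]
distances = {"192.0.2.1": 1, "192.0.2.2": 5, "192.0.2.3": 2}
print(answer_query(servers, distances))  # ('192.0.2.2', 30)
```

Because the answer expires after 30 seconds, a client that was sent to a server that later becomes busy or crashes will ask again shortly and can be steered elsewhere.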
39How well do CDNs work?
- Hard to say
- Some evidence suggests they are not so good at picking nearby servers
- Internet bandwidth is improving, so it is not as important to pick nearby servers
- Central hosting centers are easier to manage, and perform increasingly well
- In fact, Akamai is beginning to find it difficult to justify its service!