Title: Chapter 4 - Internetworking
1Chapter 4 - Internetworking
2A Simple Internetwork
(router = gateway)
3Protocol Layers
4IP Service Model
- Packet Delivery Model
- Best Effort
- Global Addressing Scheme
- IP Addresses
5Packet Delivery Model
- Connectionless (datagram-based)
- Best-effort delivery (unreliable service)
- packets are lost
- packets are delivered out of order
- duplicate copies of a packet are delivered
- packets can be delayed for a long time
6Datagram format
- Version (4): currently 4
- Hlen (4): number of 32-bit words in header
- TOS (8): type of service (QoS; not widely used)
- Length (16): number of bytes in this datagram
- Ident (16): used by fragmentation
- Flags/Offset (16): used by fragmentation
- TTL (8): hop limit; decremented at each router, and the datagram is discarded when it reaches 0
- Protocol (8): demux key (TCP = 6, UDP = 17)
- Checksum (16): of the header only
- DestAddr and SrcAddr (32 each)
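The field list above can be turned into a small decoder. This is a minimal sketch, assuming a 20-byte header with no options; the function name `parse_ipv4_header` and the sample values are illustrative, and the checksum is read but not verified.

```python
import socket
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options ignored)."""
    (ver_hlen, tos, length, ident, flags_off,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_hlen >> 4,             # upper 4 bits
        "hlen_words": ver_hlen & 0x0F,        # header length in 32-bit words
        "tos": tos,
        "length": length,                     # total datagram length in bytes
        "ident": ident,
        "flags": flags_off >> 13,             # top 3 bits
        "offset_units": flags_off & 0x1FFF,   # fragment offset, 8-byte units
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "checksum": checksum,                 # not verified in this sketch
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample header (checksum left at 0 for illustration).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("10.3.2.4"),
                     socket.inet_aton("128.96.33.81"))
hdr = parse_ipv4_header(sample)
print(hdr["version"], hdr["hlen_words"], hdr["protocol"], hdr["src"], hdr["dst"])
```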
7Fragmentation and Reassembly
- Each network has some MTU
- Strategy
- fragment when necessary (MTU < Datagram)
- try to avoid fragmentation at source host
- refragmentation is possible
- fragments are self-contained datagrams
- use CS-PDU (not cells) for ATM
- delay reassembly until destination host
- do not recover from lost fragments
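To make the strategy concrete, here is a rough sketch of how a datagram's payload might be split for a given MTU. The function name `fragment`, the 20-byte header assumption, and the 1400-byte / MTU 532 numbers are illustrative choices, not taken from the slides. Offsets are expressed in 8-byte units, as in the Offset field.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return a list of (offset_in_8_byte_units, data_bytes, more_fragments)."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    frags = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len      # M flag: more fragments follow
        frags.append((sent // 8, size, more))
        sent += size
    return frags

# 1400 bytes of data crossing a network with a 532-byte MTU.
for offset, size, more in fragment(1400, 532):
    print(offset, size, more)
```

Each tuple describes a self-contained datagram; a later network with a smaller MTU could apply the same function to a fragment's data, adding the fragment's own offset.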
8Example
10Global Addresses
- Properties
- globally unique
- hierarchical: network + host
- Format
- Dot notation
- 10.3.2.4
- 128.96.33.81
- 192.12.69.77
Class   | Leading bits | Network | Host
Class A | 0            | 7 bits  | 24 bits
Class B | 1 0          | 14 bits | 16 bits
Class C | 1 1 0        | 21 bits | 8 bits
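The class of an address follows from the leading bits of its first octet. A minimal sketch, using the three dot-notation examples from the slide (the function name `address_class` is illustrative):

```python
def address_class(dotted: str) -> str:
    """Classify a dotted-quad IPv4 address by its leading bits."""
    first_octet = int(dotted.split(".")[0])
    if first_octet & 0b10000000 == 0b00000000:
        return "A"      # 0...   : 7-bit network, 24-bit host
    if first_octet & 0b11000000 == 0b10000000:
        return "B"      # 10...  : 14-bit network, 16-bit host
    if first_octet & 0b11100000 == 0b11000000:
        return "C"      # 110... : 21-bit network, 8-bit host
    return "other"      # not one of the three classes shown here

for addr in ("10.3.2.4", "128.96.33.81", "192.12.69.77"):
    print(addr, address_class(addr))
```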
11Datagram Forwarding
- Strategy
- every datagram contains destination's address
- if directly connected to destination network, then forward to host
- if not directly connected to destination network, then forward to some router
- forwarding table maps network number into next hop
- each host has a default router
- each router maintains a forwarding table
12Example (router R2)
Network Number | Next Hop
1              | R3
2              | R1
3              | interface 1
4              | interface 0
13Example
(Figure: example internetwork. Hn = host, Rn = router. Shows Network 1 (Ethernet), Network 2 (Ethernet), Network 3 (Token Ring), and Network 4 (point-to-point), hosts H1-H8, routers R1-R3, and R2's interface 0 and interface 1.)
14Address Translation
- Map IP addresses into physical addresses
- destination host
- next hop router
- Techniques
- encode physical address in host part of IP address
- table-based
- ARP
- table of IP to physical address bindings
- broadcast request if IP address not in table
- target machine responds with its physical address
- table entries are discarded if not refreshed
15- Request format
- HardwareType: type of physical network (e.g., Ethernet)
- ProtocolType: type of higher layer protocol (e.g., IP)
- HLEN and PLEN: length of physical and protocol addresses
- Operation: request or response
- Source/Target Physical/Protocol addresses
16Notes
- table entries time out in about 10 minutes
- update table with source when you are the target
- update table if you already have an entry for the source
- do not refresh table entries upon reference
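The ARP table rules above can be sketched as a small cache. This is an illustration only: the class name `ArpCache`, its method names, and the injectable `clock` parameter are assumptions made for the example, and no packets are actually sent.

```python
import time

class ArpCache:
    """IP-to-physical address bindings that expire unless refreshed."""

    def __init__(self, ttl_seconds=600.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # about 10 minutes, per the notes above
        self.clock = clock          # injectable so the timeout can be tested
        self.entries = {}           # ip -> (physical address, time stored)

    def lookup(self, ip):
        entry = self.entries.get(ip)
        if entry is None:
            return None             # caller would broadcast an ARP request
        mac, stored = entry
        if self.clock() - stored > self.ttl:
            del self.entries[ip]    # entry timed out
            return None
        return mac                  # note: a reference does not refresh it

    def on_arp_packet(self, src_ip, src_mac, i_am_target):
        # Learn the sender if we are the target, or refresh an existing entry.
        if i_am_target or src_ip in self.entries:
            self.entries[src_ip] = (src_mac, self.clock())

cache = ArpCache()
cache.on_arp_packet("128.96.33.81", "08:00:2b:e4:b1:02", i_am_target=True)
print(cache.lookup("128.96.33.81"))
```

The addresses in the usage lines are made-up sample values.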
17DHCP
(Figure: a host broadcasts its DHCP request on the local network; a DHCP relay unicasts it to the DHCP server, which may be on another network.)
18DHCP Packet Format
21Internet Control Message Protocol
- Echo (ping)
- Redirect (from router to source host)
- Destination unreachable (protocol, port, or host)
- TTL exceeded (so datagrams don't cycle forever)
- Checksum failed
- Reassembly failed
- Cannot fragment
22Routing
When everything's coming your way, you're in the
wrong lane
23Routing
- Forwarding versus Routing
- forwarding: to select an output port based on destination address and routing table
- routing: process by which routing table is built
- Network as a Graph
24Routing
- Problem: Find the lowest cost path between any two nodes
- Factors
- Static: topology, speed-of-light (SOL) delays
- Dynamic: load, congestion
- Scalable
25Distance Vector (RIP)
- Each node maintains a set of triples
- (Destination, Cost, NextHop)
- Each node sends updates to (and receives updates from) its directly connected neighbors
- periodically (on the order of several seconds)
- whenever its table changes (called triggered update)
- Used by Routing Information Protocol (RIP)
26Routing
- Each update is a list of pairs
- (Destination, Cost)
- Update local table if the advertised route is better
- smaller cost
- or it came from the current next hop for that destination
- Refresh existing routes; delete them if they time out
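The update rule can be sketched as follows. This is a simplified illustration: the function name `merge_update`, the unit link cost, and the contents of the two sample updates are assumptions, and route timeouts are left out.

```python
INFINITY = 16  # conventional "unreachable" cost in RIP

def merge_update(table, me, neighbor, update, link_cost=1):
    """Merge a neighbor's (Destination, Cost) list into table.

    table maps destination -> (cost, next_hop). Returns True if the table
    changed, which is when a triggered update would be sent.
    """
    changed = False
    for dest, cost in update:
        if dest == me:
            continue                                  # no route to ourselves
        new_cost = min(cost + link_cost, INFINITY)
        current = table.get(dest)
        better = current is None or new_cost < current[0]
        from_next_hop = current is not None and current[1] == neighbor
        if (better or from_next_hop) and current != (new_cost, neighbor):
            table[dest] = (new_cost, neighbor)
            changed = True
    return changed

# Node B starts out knowing only its direct neighbors A and C.
table_b = {"A": (1, "A"), "C": (1, "C")}
merge_update(table_b, "B", "C", [("B", 1), ("A", 1), ("D", 1)])
merge_update(table_b, "B", "A", [("B", 1), ("C", 1), ("E", 1), ("F", 1), ("G", 2)])
print(table_b)
```

Accepting any news from the current next hop, even a worse cost, is what lets a failure (cost infinity) propagate.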
27Example
Routing table at node B (initial)
Destination | Cost | Next Hop
A           | 1    | A
C           | 1    | C
D           | inf  | -
E           | inf  | -
F           | inf  | -
G           | inf  | -
28Example
(Figure: network graph with nodes A through G.)
Destination | Cost | Next Hop
A           | 1    | A
C           | 1    | C
D           | 2    | C
E           | 2    | A
F           | 2    | A
G           | 3    | A
29Routing loops
- Example 1
- F detects that link to G has failed
- F sets distance to G to infinity and sends update to A
- A sets distance to G to infinity since it uses F to reach G
- A receives periodic update from C with 2-hop path to G
- A sets distance to G to 3 and sends update to F
- F decides it can reach G in 4 hops via A
30Example 2
- Link from A to E fails
- A advertises distance of infinity to E
- B and C advertise a distance of 2 to E
- B decides it can reach E in 3 hops through C; advertises this to A
- A decides it can reach E in 4 hops through B; advertises this to C
- C decides that it can reach E in 5 hops...
31Heuristics to break routing loops
- set infinity to 16
- split horizon
- Don't send routes learned from a particular neighbor back to that neighbor (B doesn't send route (E, 2, A) to A)
- split horizon with poison reverse
- Include negative information (B sends route (E, infinity) to A)
- These only break simple loops; more complex techniques later
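Both heuristics amount to filtering what a node advertises to each neighbor. A minimal sketch, assuming the table layout destination -> (cost, next hop); the function name `build_update` is illustrative:

```python
INFINITY = 16  # "set infinity to 16"

def build_update(table, neighbor, poison_reverse=True):
    """Build the (Destination, Cost) list to send to one neighbor."""
    update = []
    for dest, (cost, next_hop) in table.items():
        if next_hop == neighbor:
            if poison_reverse:
                # Poison reverse: advertise the route back as unreachable.
                update.append((dest, INFINITY))
            # Plain split horizon: leave the route out entirely.
        else:
            update.append((dest, cost))
    return update

# B reaches E through A, so B must not offer A a usable route to E.
table_b = {"E": (2, "A"), "C": (1, "C")}
print(build_update(table_b, "A", poison_reverse=False))
print(build_update(table_b, "A", poison_reverse=True))
```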
32Example 2 with poison
- Link from A to E fails
- A advertises distance of infinity to E
- B and C advertise a distance of 2 to E
- B decides it can reach E in 3 hops through C; advertises this to A, sends infinity to C
- since C was using B to get to E, C advertises a distance of infinity to E
- Since B was using C to get to E, B advertises a distance of infinity to E
33RIP Network
Router C advertises to routers A and D: cost 0 to networks 2 and 3; cost 1 to networks 5, 6, and 1; and cost 2 to network 4
34RIP Packet format
35Link State (OSPF)
- Strategy: Send to all nodes (not just neighbors) information about directly connected links (not entire routing table).
- Link State Packet (LSP)
- id of the node that created the LSP
- cost of link to each directly connected neighbor
- sequence number (SEQNO)
- time-to-live (TTL) for this packet
- Example: Open Shortest Path First Protocol (OSPF)
36Reliable Flooding
- store most recent LSP from each node
- forward LSP to all nodes but the one that sent it
- generate new LSP periodically; increment SEQNO
- start SEQNO at 0 when reboot
- decrement TTL of each stored LSP; discard when TTL = 0
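The receive and aging rules can be sketched like this. It is a simplified illustration: LSPs are plain dictionaries with hypothetical keys `id`, `seqno`, and `ttl`, "most recent" is taken to mean "larger SEQNO", and acknowledgments and retransmission are omitted.

```python
def receive_lsp(store, lsp, from_neighbor, neighbors):
    """Store a newer LSP and return the neighbors it should be forwarded to."""
    held = store.get(lsp["id"])
    if held is not None and held["seqno"] >= lsp["seqno"]:
        return []                     # old or duplicate copy: drop it
    store[lsp["id"]] = lsp            # keep only the most recent LSP per node
    return [n for n in neighbors if n != from_neighbor]

def age(store):
    """Decrement the TTL of every stored LSP and discard expired ones."""
    for node_id in list(store):
        store[node_id]["ttl"] -= 1
        if store[node_id]["ttl"] <= 0:
            del store[node_id]

store = {}
lsp = {"id": "A", "seqno": 1, "ttl": 2, "links": {"B": 1}}
print(receive_lsp(store, lsp, "B", ["B", "C", "D"]))        # forwarded onward
print(receive_lsp(store, dict(lsp), "C", ["B", "C", "D"]))  # duplicate: dropped
```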
37Reliable Flooding
38Route Calculation (Dijkstra's)
39Route Calculation (in practice)
- Forward search algorithm
- Each switch maintains two lists
- Tentative and Confirmed
- Each list contains a set of triples
- (Destination, Cost, NextHop)
40- 1. Initialize Confirmed with an entry for me; cost = 0.
- 2. For the node just added to Confirmed (call it Next), select its LSP.
- 3. For each Neighbor of Next, calculate the Cost to reach this Neighbor as the sum of the cost from me to Next and from Next to Neighbor.
- 3.1. If Neighbor is currently in neither Confirmed nor Tentative, add (Neighbor, Cost, NextHop) to Tentative, where NextHop is the direction to reach Next.
- 3.2. If Neighbor is currently in Tentative and Cost is less than the current cost for Neighbor, then replace the current entry with (Neighbor, Cost, NextHop), where NextHop is the direction to reach Next.
- 4. If Tentative is empty, stop. Otherwise, pick the entry from Tentative with the lowest cost, move it to Confirmed, and return to step 2.
41Step | Confirmed                          | Tentative
1      | (D,0,-)                            |
2      | (D,0,-)                            | (B,11,B) (C,2,C)
3      | (D,0,-) (C,2,C)                    | (B,11,B)
4      | (D,0,-) (C,2,C)                    | (B,5,C) (A,12,C)
5      | (D,0,-) (C,2,C) (B,5,C)            | (A,12,C)
6      | (D,0,-) (C,2,C) (B,5,C)            | (A,10,C)
7      | (D,0,-) (C,2,C) (B,5,C) (A,10,C)   |
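The forward search steps can be sketched in a few lines. The link costs below are not given on the slides; they are an assumption, chosen so that running the search from node D reproduces the Confirmed entries in the trace above. The function name `forward_search` is also illustrative.

```python
def forward_search(me, lsps):
    """lsps maps node -> {neighbor: link cost}.

    Returns the Confirmed list as destination -> (cost, next_hop).
    """
    confirmed = {me: (0, None)}                      # step 1
    tentative = {}
    nxt = me
    while True:
        base_cost, base_hop = confirmed[nxt]
        for neighbor, link_cost in lsps[nxt].items():    # steps 2 and 3
            if neighbor in confirmed:
                continue
            cost = base_cost + link_cost
            # My own neighbors are their own next hop; otherwise inherit
            # the direction used to reach Next.
            hop = neighbor if nxt == me else base_hop
            if neighbor not in tentative or cost < tentative[neighbor][0]:
                tentative[neighbor] = (cost, hop)        # steps 3.1 and 3.2
        if not tentative:                                # step 4
            return confirmed
        nxt = min(tentative, key=lambda n: tentative[n][0])
        confirmed[nxt] = tentative.pop(nxt)

# Assumed link costs (symmetric), picked to match the trace for node D.
lsps = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}
routes = forward_search("D", lsps)
print(routes)
```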
42OSPF Header
43OSPF Link-State Advertisement
44Link State vs distance-vector
- Distance vector (original ARPANET)
- Node talks to direct neighbors
- Sends everything it has learned
- Link State (Later ARPANET)
- Node talks to everyone
- Only sends what it knows for sure
45Original ARPANET metric
- Practical measurements of cost
- measured number of packets enqueued on each link
- took neither latency nor bandwidth into consideration
46New ARPANET Metric
- Used link bandwidth and latency
- used delay rather than queue length for load
47New ARPANET metric
- stamp each incoming packet with its arrival time (AT)
- record departure time (DT)
- when link-level ACK arrives, compute
- Delay = (DT - AT) + Transmit + Latency
- if timeout, reset DT to departure time for retransmission
- link cost = average delay over some time period
48Problems with New metric
- under low load, static factors dominated cost; worked OK
- under high load, congested links had very high costs; packets oscillated between congested and idle links
- range of costs too large; preferred a path of 126 lightly loaded 56Kbps links to a 1-hop 9.6Kbps path
49Revised ARPANET metric
- replaced delay measurement with link utilization
- compressed dynamic range
- highly loaded link never has a cost more than 3 times its idle cost
- most expensive link only 7 times the cost of the least expensive
- high-speed satellite link more attractive than low-speed terrestrial link
- cost is a function of link utilization only at moderate to high loads.
50Cost Function
51Routing Characteristics
- Vern Paxson used traceroute to study 40,000 routes
- Probability of encountering serious route failure: 1/30, with problem lasting 30 seconds
- 2/3 of routes persist for days or weeks
- 1/3 of routes use a different path in each direction.
- Routes becoming less predictable
52Routing for Mobile Hosts
53Global Internet
54Scalability Issues
- IP hides hosts in address hierarchy, but...
- Inefficient use of address space
- class C network with 2 hosts (2/255 = 0.78% efficient)
- class B network with 256 hosts (256/65535 = 0.39% efficient)
- Too many networks
- today's Internet has tens of thousands of networks
- routing tables do not scale
- route propagation protocols do not scale
55Internet Structure
56Subnetting
- Add another level to address/routing hierarchy: subnet
- Subnet masks define variable partition of host part of class A and B addresses
- Subnets visible only within site
57Subnet Example
Forwarding table at router R1
Subnet Number | Subnet Mask     | Next Hop
128.96.34.0   | 255.255.255.128 | interface 0
128.96.34.128 | 255.255.255.128 | interface 1
128.96.33.0   | 255.255.255.0   | R2
58Forwarding Algorithm
- D = destination IP address
- for each entry <SubnetNum, SubnetMask, NextHop>
- D1 = SubnetMask & D
- if D1 = SubnetNum
- if NextHop is an interface
- deliver datagram directly to destination
- else
- deliver datagram to NextHop (a router)
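The forwarding algorithm can be tried against R1's table from the subnet example. A minimal sketch: the function name `next_hop`, the `default` parameter, and the sample destinations are illustrative.

```python
import ipaddress

# R1's forwarding table from the subnet example: (SubnetNum, SubnetMask, NextHop).
TABLE_R1 = [
    ("128.96.34.0",   "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0",   "255.255.255.0",   "R2"),
]

def next_hop(dest, table, default=None):
    """Return the next hop for dest, or default if no entry matches."""
    d = int(ipaddress.IPv4Address(dest))
    for subnet_num, subnet_mask, hop in table:
        d1 = d & int(ipaddress.IPv4Address(subnet_mask))   # D1 = SubnetMask & D
        if d1 == int(ipaddress.IPv4Address(subnet_num)):
            return hop
    return default      # a real router would fall back to its default router

for dest in ("128.96.34.15", "128.96.34.139", "128.96.33.81"):
    print(dest, "->", next_hop(dest, TABLE_R1))
```

A hop named "interface N" means direct delivery on that interface; anything else names a router.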
59Notes
- Would use a default router if nothing matches
- Not necessary for all ones in subnet mask to be contiguous
- Can put multiple subnets on one physical network
- Subnets not visible from the rest of the Internet
60Numbers
- www.icann.org: Internet Corporation for Assigned Names and Numbers
- www.arin.net is our authority and has more details
- Names and numbers have been privatized. The US government used to allocate them
61The big picture
62Supernetting
- Assign block of contiguous network numbers to nearby networks
- Called CIDR: Classless Inter-Domain Routing
- Represent blocks with a single pair
- <first_network_address, count>
- Restrict block sizes to powers of 2
- Use a bit mask (CIDR mask) to identify block size
- All routers must understand CIDR addressing
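Python's standard `ipaddress` module can illustrate the aggregation. The sixteen class C network numbers below (192.4.16 through 192.4.31) are an assumed example, not taken from the slides; because the block is contiguous, a power of 2 in size, and aligned, it collapses to one prefix.

```python
import ipaddress

# Sixteen contiguous class C networks: 192.4.16.0/24 ... 192.4.31.0/24.
nets = [ipaddress.ip_network(f"192.4.{i}.0/24") for i in range(16, 32)]

# collapse_addresses merges adjacent networks into the fewest prefixes.
block = list(ipaddress.collapse_addresses(nets))
print(block)                 # a single /20 covering the whole block
print(block[0].netmask)      # the corresponding CIDR mask
```

A block that is not a power of 2 in size, or not aligned, collapses to several prefixes rather than one.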
63Route Propagation
- Idea: Impose a second hierarchy on the network that limits what routers talk to each other. (The first hierarchy is the address hierarchy that governs how packets are forwarded.)
- Autonomous System (AS)
- corresponds to an administrative domain
- examples: University, company, backbone network
- assign each AS a 16-bit number
- Two-level route propagation hierarchy
- interior gateway protocol (each AS selects its own)
- exterior gateway protocol (Internet-wide standard)
64Popular Interior Gateway Protocols
- RIP: Routing Information Protocol
- developed at Berkeley
- distributed with Unix
- distance-vector algorithm (updates go to neighbors)
- based on hop-count
- OSPF: Open Shortest Path First
- recent Internet standard
- uses link-state algorithm (updates are broadcast to all routers)
- supports load balancing
- supports authentication
65EGP Exterior Gateway Protocol
- Overview
- designed for tree-structured Internet
- concerned with reachability, not optimal routes
- Protocol messages
- neighbor acquisition: one router requests that another be its peer; peers exchange reachability information
- neighbor reachability: one router periodically tests to see if the other router is still reachable; exchange HELLO/ACK messages; uses a k-out-of-n rule
- routing updates: peers periodically exchange their routing tables (distance-vector)
66EGP Example
(Figure: EGP example. Gateways G1-G5 connect networks N1-N6; the update for the exterior neighbor (other system), with source net N1, lists distance/network pairs 1 N2, 1 N3, 1 N4, 2 N5, 2 N6.)
67BGP-4 Border Gateway Protocol
- Assumes the Internet is an arbitrarily interconnected set of AS's (Autonomous Systems). If we define local traffic as traffic that originates at or terminates on nodes within an AS, and transit traffic as traffic that passes through an AS, we can classify AS's into three types:
- Stub AS: an AS that has only a single connection to one other AS.
- Multihomed AS: an AS that has connections to more than one other AS, but refuses to carry transit traffic.
- Transit AS: an AS that has connections to more than one other AS, and is designed to carry both transit and local traffic.
68Autonomous System (AS)
- Each AS has
- One or more border routers
- One BGP speaker that advertises
- local networks
- other reachable networks (transit AS only)
- gives path information
- Still pass information about every network
69BGP Example
(Figure: example network running BGP.)
- "Backbone" network (AS 1)
- Regional provider A (AS 2)
- Customer P (AS 4): 128.96, 192.4.153
- Customer Q (AS 5): 192.4.32, 192.4.3
- Regional provider B (AS 3)
- Customer R (AS 6): 192.12.69
- Customer S (AS 7): 192.4.54, 192.4.23
70BGP Example
- Speaker for AS 2 advertises reachability to P and Q
- Networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached directly from AS 2.
- Speaker for backbone network then advertises
- Networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached along the path <AS 1, AS 2>.
- Speaker can also cancel previously advertised paths
71Domain Divided into areas
(Figure: a routing domain divided into Area 0, Area 1, Area 2, and Area 3, containing routers R1-R9.)
72Routing Basics
- Minimize the size of routing tables
- Create Autonomous routing systems
- Simplify routing
- hierarchical routing
- Optimize within the Autonomous system
73Classless interdomain routing (CIDR)
- Give multiple class C addresses instead of class B
- BGP would require a table entry for each of these
- CIDR aggregates adjacent networks
- network number consists of a base and length
74Next Generation IP (IPv6)
75Major Features
- 128-bit addresses (1500/square foot of the earth's surface)
- Multicast
- Real-time service
- Authentication and security
- Autoconfiguration
- End-to-end fragmentation
- Protocol extensions
76IPv6 Addresses
- Classless addressing/routing (similar to CIDR)
- Notation: x:x:x:x:x:x:x:x (x = 16-bit hex number)
- contiguous 0s are compressed: 47CD::A456:0124
- IPv6 compatible IPv4 address: ::128.42.1.87
- Address assignment
- provider-based (can't change provider easily)
- geographic
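The notation rules can be checked with Python's standard `ipaddress` module. This is only an illustration of the text form; note that Python prints addresses in lowercase and also drops leading zeros within each 16-bit group.

```python
import ipaddress

# Full eight-group form of the compressed example address.
addr = ipaddress.IPv6Address("47CD:0000:0000:0000:0000:0000:A456:0124")
print(addr.compressed)   # run of zero groups collapsed to "::"
print(addr.exploded)     # all eight groups written out

# An IPv4 address carried in the low 32 bits, written in mixed notation.
compat = ipaddress.IPv6Address("::128.42.1.87")
print(int(compat) == int(ipaddress.IPv4Address("128.42.1.87")))
```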
77Prefix     | Use
0000 0000    | Reserved
0000 0001    | Unassigned
0000 001     | Reserved for NSAP Allocation
0000 010     | Reserved for IPX Allocation
0000 011     | Unassigned
0000 1       | Unassigned
0001         | Unassigned
001          | Unassigned
010          | Provider-Based Unicast Address (IPv4-like)
011          | Unassigned
100          | Reserved for Geographic-Based Unicast Addresses
101          | Unassigned
110          | Unassigned
1110         | Unassigned
1111 0       | Unassigned
1111 10      | Unassigned
1111 110     | Unassigned
1111 1110 0  | Unassigned
1111 1110 10 | Link Local Use Addresses (no global uniqueness)
1111 1110 11 | Site Local Use Addresses (no global uniqueness)
1111 1111    | Multicast Addresses
Roaming
78IPv6 Header
- 40-byte base header
- Extension headers (fixed order, mostly fixed length)
- fragmentation
- source routing
- authentication and security
- other options
79Transition
- Gradual Transition with IPv4 and IPv6
- Dual Stack (both supported on some nodes)
- Tunneling
- When v6 passes through v4 network
- Encapsulate v6 inside v4 packet with a v6 router as a destination
- destination router then sends v6 packet
- lose QoS and other desirable features in v4 segment
80Tunneling
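(Figure: tunneling an IPv6 packet across an IPv4 network. Labels: IPv6 addresses IPV6A and IPV6B, tunnel endpoints IPV6C/IPV4Y and IPV6D/IPV4Z, the IPv4 segment between them, and packets marked B, Z.)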
81IPv6 Sockets programming
- New address family AF_INET6
- New address data type in6_addr
- New address structure sockaddr_in6
82in6_addr
struct in6_addr {
    uint8_t s6_addr[16];            /* 128-bit IPv6 address */
};
83sockaddr_in6
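struct sockaddr_in6 {
    uint8_t         sin6_len;       /* length of this structure (where defined) */
    sa_family_t     sin6_family;    /* AF_INET6 */
    in_port_t       sin6_port;      /* transport-layer port */
    uint32_t        sin6_flowinfo;  /* flow information */
    struct in6_addr sin6_addr;      /* IPv6 address */
};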
84Dual Server
- In the future it will be important to create servers that handle both IPv4 and IPv6.
- The work is handled by the O.S. (which contains protocol stacks for both v4 and v6)
- automatic creation of IPv6 address from an IPv4 client (IPv4-mapped IPv6 address).
85IPv6 server
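(Figure: protocol stack of an IPv6 server on a dual-stack host; IPv4 clients reach it as IPv4-mapped IPv6 addresses, above TCP and the datalink.)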
86IPv6 Clients
- If an IPv6 client specifies an IPv4 address for the server, the kernel detects and talks IPv4 to the server.
- DNS support for IPv6 addresses can make everything work.
- gethostbyname() returns an IPv4 mapped IPv6 address for hosts that only support IPv4.
87IPv6 - IPv4 Programming
- The kernel does the work, we can assume we are talking IPv6 to everyone!
- In case we really want to know, there are some macros that determine the type of an IPv6 address.
- We can find out if we are talking to an IPv4 client or server by checking whether the address is an IPv4 mapped address.
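In C this check is done with a macro such as IN6_IS_ADDR_V4MAPPED. The same idea can be sketched in Python with the standard `ipaddress` module; the helper name `ipv4_peer` and the sample addresses are illustrative.

```python
import ipaddress

def ipv4_peer(addr_text):
    """Return the embedded IPv4 address if addr_text is IPv4-mapped, else None."""
    addr = ipaddress.IPv6Address(addr_text)
    mapped = addr.ipv4_mapped       # None unless the address is ::ffff:a.b.c.d
    return str(mapped) if mapped is not None else None

print(ipv4_peer("::ffff:128.42.1.87"))   # peer is really an IPv4 host
print(ipv4_peer("47cd::a456:124"))       # native IPv6 peer
```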
88Internet Multicast
89Overview
- IPv4
- class D addresses
- demonstrated with Mbone (uses tunneling)
- Place least significant 23 bits of IP number in last 23 bits of ETH/FDDI address
- group bit in the first octet of the Ethernet address (the first bit transmitted) indicates multicast
- Integral part of IPv6
- problem is making it scale
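The 23-bit mapping for IPv4 can be sketched as below. The fixed prefix 01:00:5e used for IPv4 multicast on Ethernet is not stated on the slide; it is supplied here as background, and the function name `multicast_mac` is illustrative.

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Map a class D (multicast) IPv4 address to an Ethernet address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError("not a class D address")
    low23 = int(ip) & 0x7FFFFF              # least significant 23 bits
    mac = 0x01005E000000 | low23            # prefix 01:00:5e, top bit of low 24 is 0
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(multicast_mac("224.0.0.1"))
print(multicast_mac("224.128.0.1"))   # differs only in a dropped bit: same MAC
```

Because only 23 of the 28 group bits are kept, several groups share each Ethernet address, so the receiving host still has to check the IP destination.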
90Link-State Multicast
- Each host on a LAN periodically announces the groups it belongs to (IGMP).
- Augment update message (LSP) to include set of groups that have members on a particular LAN.
- Each router uses Dijkstra's algorithm to compute shortest-path spanning tree for each source/group pair.
- Each router caches tree for currently active source/group pairs.
91Example
92Distance-Vector Multicast
- Reverse Path Broadcast (RPB)
- Each router already knows that shortest path to destination S goes through router N.
- When receive multicast packet from S, forward on all outgoing links (except the one on which the packet arrived), iff packet arrived from N.
- Eliminate duplicate broadcast packets by only letting parent for LAN (relative to S) forward
- shortest path to S (learn via distance vector)
- smallest address to break ties
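The basic reverse-path check can be sketched as a single decision. This is a simplification: links are identified by the neighbor at the other end, the parent-selection rule for shared LANs is omitted, and the function name `rpb_forward` and the sample routes are illustrative.

```python
def rpb_forward(routes, links, source, arrived_on):
    """Return the links on which to forward a packet that came from source.

    routes maps destination -> next hop on the shortest path to it.
    """
    if routes.get(source) != arrived_on:
        return []       # did not arrive via the shortest path to S: drop
    return [link for link in links if link != arrived_on]

routes = {"S": "N"}                 # shortest path to S goes through N
links = ["N", "X", "Y"]
print(rpb_forward(routes, links, "S", "N"))   # forwarded on the other links
print(rpb_forward(routes, links, "S", "X"))   # arrived the wrong way: dropped
```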
93Reverse Path Multicast (RPM)
- Goal: Prune networks that have no hosts in group G
- Step 1: Determine if LAN is a leaf with no members in G
- leaf if parent is only router on the LAN
- determine if any hosts are members of G using IGMP
- Step 2: Propagate "no members of G here" information
- augment <Destination, Cost> update sent to neighbors with set of groups for which this network is interested in receiving multicast packets.
- only happens when multicast address becomes active.
94PIM
95(Figure: PIM operation. Labels: RP, routers R1-R5, G, and a Host.)