Chapter 4 Internetworking - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 4 Internetworking

Description:

Chapter 4 Internetworking 4.1 Simple Internetworking (IP) 4.2 Routing 4.3 Global Internet 4.4 Multicast 4.5 Multiprotocol Label Switching (MPLS) 4.1 Simple ... – PowerPoint PPT presentation

Slides: 129
Provided by: klp8

Transcript and Presenter's Notes

Title: Chapter 4 Internetworking


1
Chapter 4 Internetworking
  • 4.1 Simple Internetworking (IP)
  • 4.2 Routing
  • 4.3 Global Internet
  • 4.4 Multicast
  • 4.5 Multiprotocol Label Switching (MPLS)

2
4.1 Simple Internetworking (IP)
  • Best Effort Service Model
  • Global Addressing Scheme
  • ARP (Address Resolution Protocol)
  • ICMP (Internet Control Message Protocol)

3
IP Internet
  • Concatenation of Networks
  • Protocol Stack

4
Service Model
  • Connectionless (datagram-based)
  • Best-effort delivery (unreliable service)
  • packets may be lost
  • packets may be delivered out of order
  • duplicate copies of a packet may be delivered
  • packets can be delayed for a long time
  • Datagram format

5
Fragmentation and Reassembly
  • Each network has some MTU
  • Design decisions
  • fragment when necessary (MTU < Datagram)
  • try to avoid fragmentation at source host
  • re-fragmentation is possible
  • fragments are self-contained datagrams
  • use CS-PDU (not cells) for ATM
  • delay reassembly until destination host
  • do not recover from lost fragments
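The fragmentation rules above can be sketched in a few lines of Python. This is only an illustration: the function name `fragment`, the 20-byte header, and the 1400-byte payload over a 532-byte MTU are assumptions, not values from the slides. It relies on the IPv4 rule that fragment offsets count 8-byte units, so every fragment except the last carries a multiple of 8 data bytes.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return one (offset_in_8_byte_units, data_len, more_fragments) tuple per fragment."""
    # Each fragment is a self-contained datagram, so it carries its own header.
    # Data per fragment is rounded down to a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        data_len = min(max_data, payload_len - offset)
        more = offset + data_len < payload_len  # False only on the last fragment
        fragments.append((offset // 8, data_len, more))
        offset += data_len
    return fragments


# Hypothetical example: 1400 data bytes crossing a network with MTU 532.
frags = fragment(1400, 532)
```

Reassembly is left to the destination host, which is why the sketch only produces fragments and never tries to merge them.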

6
Example
7
Global Addresses
  • Properties
  • globally unique
  • hierarchical: network + host
  • Dot Notation
  • 10.3.2.4
  • 128.96.33.81
  • 192.12.69.77

8
Datagram Forwarding
  • Strategy
  • every datagram contains destination's address
  • if connected to destination network, then forward
    to host
  • if not directly connected, then forward to some
    router
  • forwarding table maps network number into next
    hop
  • each host has a default router
  • each router maintains a forwarding table
  • Example (R2)
    Network Number    Next Hop
    1                 R3
    2                 R1
    3                 interface 1
    4                 interface 0

9
Address Translation
  • Map IP addresses into physical addresses
  • destination host
  • next hop router
  • Techniques
  • encode physical address in host part of IP
    address
  • table-based
  • ARP
  • table of IP to physical address bindings
  • broadcast request if IP address not in table
  • target machine responds with its physical address
  • table entries are discarded if not refreshed

10
ARP Details
  • Request Format
  • HardwareType: type of physical network (e.g.,
    Ethernet)
  • ProtocolType: type of higher layer protocol
    (e.g., IP)
  • HLEN and PLEN: length of physical and protocol
    addresses
  • Operation: request or response
  • Source/Target-Physical/Protocol addresses
  • Notes
  • table entries timeout in about 10 minutes
  • update table with source when you are the target
  • update table if already have an entry
  • do not refresh table entries upon reference
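A minimal sketch of the ARP table behaviour described in these notes, assuming a hypothetical `ArpCache` class and explicit `now` timestamps (in seconds) so the timing is easy to follow. The 600-second timeout stands in for "about 10 minutes".

```python
ARP_TIMEOUT = 600.0  # seconds; stands in for "about 10 minutes"


class ArpCache:
    """Hypothetical IP-to-physical-address table with timed-out entries."""

    def __init__(self, timeout=ARP_TIMEOUT):
        self.timeout = timeout
        self._table = {}  # ip -> (physical address, time the binding was installed)

    def learn(self, ip, mac, now, i_am_target):
        # Update the binding if we already have an entry for the sender,
        # or add a new one when we are the target of the request.
        if ip in self._table or i_am_target:
            self._table[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self._table.get(ip)
        if entry is None:
            return None  # caller would broadcast an ARP request
        mac, installed = entry
        if now - installed > self.timeout:
            del self._table[ip]  # entry was not refreshed in time
            return None
        return mac  # a lookup does not refresh the timestamp


cache = ArpCache()
cache.learn("128.96.33.81", "08:00:2b:e4:b1:02", now=0.0, i_am_target=True)
```

The MAC address in the usage line is made up; only the IP address comes from the dot-notation examples earlier in the deck.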

11
ARP Packet Format
12
Internet Control Message Protocol (ICMP)
  • Echo (ping)
  • Redirect (from router to source host)
  • Destination unreachable (protocol, port, or host)
  • TTL exceeded (so datagrams don't cycle forever)
  • Checksum failed
  • Reassembly failed
  • Cannot fragment

13
Redirect
[Figure: Network (1), Network (2), gateways G1 and G2, hosts H1 and H2]
  • G2 finds that H1 is directly connected and
    will inform H1 to redirect the IP datagrams
    to G2.

14
4.2 Routing
  • Forwarding vs Routing
  • forwarding: to select an output port based on
    destination address and routing table
  • routing: process by which routing table is built
  • Network as a Graph
  • Problem: find the lowest-cost path between two nodes
  • Factors
  • static topology
  • dynamic load

15
Distance Vector
  • Each node maintains a set of triples
  • (Destination, Cost, NextHop)
  • Directly connected neighbors exchange updates
  • periodically (on the order of several seconds)
  • whenever table changes (called triggered update)
  • Each update is a list of pairs
  • (Destination, Cost)
  • Update local table if receive a better route
  • smaller cost
  • came from next-hop
  • Refresh existing routes; delete if they time out
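The update rule on this slide can be sketched as follows. The function name `dv_update`, the table layout, and the value 16 for infinity (borrowed from RIP) are assumptions for illustration. A route is replaced when the advertised path is cheaper, or when the update comes from the current next hop and reports a different cost.

```python
INFINITY = 16  # assumption: RIP-style "infinity"


def dv_update(table, neighbor, link_cost, advertised):
    """Merge one neighbor's (Destination, Cost) pairs into the local table.

    table:      dest -> (cost, next_hop), modified in place
    advertised: dest -> cost as seen by the neighbor
    Returns True if the table changed (which would trigger an update).
    """
    changed = False
    for dest, cost in advertised.items():
        new_cost = min(cost + link_cost, INFINITY)
        current = table.get(dest)
        if current is None:
            better = new_cost < INFINITY  # first route to this destination
        else:
            cur_cost, cur_hop = current
            # Smaller cost, or news (good or bad) from the current next hop.
            better = new_cost < cur_cost or (cur_hop == neighbor and new_cost != cur_cost)
        if better:
            table[dest] = (new_cost, neighbor)
            changed = True
    return changed


# Hypothetical table: G is currently reached in 3 hops via A.
table = {"A": (1, "A"), "G": (3, "A")}
changed = dv_update(table, "C", 1, {"G": 1, "D": 1})
```

Timeouts and periodic refresh are omitted here; a fuller version would also store a timestamp with each triple.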

16
Routing Table Example (Node B)
  • Destination Cost NextHop
  • A 1 A
  • C 1 C
  • D 2 C
  • E 2 A
  • F 2 A
  • G 3 A

17
Routing Loops
  • Example 1
  • F detects that link to G has failed
  • F sets distance to G to infinity and sends update
    to A
  • A sets distance to G to infinity since it uses F
    to reach G
  • A receives periodic update from C with 2-hop path
    to G
  • A sets distance to G to 3 and sends update to F
  • F decides it can reach G in 4 hops via A

18
Routing Loops
  • Example 2
  • link from A to E fails
  • A advertises distance of infinity to E
  • B and C advertise a distance of 2 to E
  • B decides it can reach E in 3 hops; advertises
    this to A
  • A decides it can reach E in 4 hops; advertises
    this to C
  • C decides that it can reach E in 5 hops

19
Distance Vector link cost changes
  • Link cost changes
  • node detects local link cost change
  • updates routing info, recalculates distance
    vector
  • if DV changes, notify neighbors

At time t0, y detects the link-cost change,
updates its DV, and informs its neighbors. At
time t1, z receives the update from y and
updates its table. It computes a new least cost
to x and sends its neighbors its DV. At time
t2, y receives z's update and updates its
distance table. y's least costs do not change,
and hence y does not send any message to z.
"Good news travels fast."
20
Distance Vector link cost changes
[Figure: distance tables Dy and Dz over time; good news travels fast and the algorithm terminates]
21
Distance Vector link cost changes
  • Link cost changes
  • bad news travels slow - count to infinity
    problem!
  • 44 iterations before algorithm stabilizes
  • z (y) does not know that the least distance from
    y (z) to x that y (z) tells z (y) is the distance
    of the path y-z-y-x (z-y-x)

[Figure: x, y, z example with link costs 60, 1, 4, and 50; algorithm continues on!]
22
Distance Vector poisoned reverse
  • If Z routes through Y to get to X
  • Z tells Y its (Z's) distance to X is infinite (so
    Y won't route to X via Z)
  • will this completely solve count to infinity
    problem?
  • Loops involving three or more nodes cannot be
    solved using the technique

[Figure: x, y, z example with link costs 60, 1, 4, and 50; algorithm terminates]
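Poisoned reverse only changes what a node advertises, not how it computes routes. A small sketch, assuming a hypothetical `advertisement_for` helper and a table that maps each destination to a (cost, next hop) pair:

```python
INFINITY = float("inf")


def advertisement_for(table, neighbor):
    """Build the distance vector sent to one neighbor, with poisoned reverse.

    table: dest -> (cost, next_hop)
    Any destination reached through `neighbor` is advertised back to that
    neighbor with infinite cost, so the neighbor will not route via us.
    """
    return {
        dest: (INFINITY if next_hop == neighbor else cost)
        for dest, (cost, next_hop) in table.items()
    }


# Hypothetical table at Z: X is reached via Y, W is a direct neighbor.
z_table = {"X": (5, "Y"), "W": (2, "W")}
to_y = advertisement_for(z_table, "Y")
to_w = advertisement_for(z_table, "W")
```

As the slide notes, this prevents two-node loops only; loops through three or more nodes can still form.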
23
RIP (Routing Information Protocol)
  • Distance vector algorithm
  • Included in BSD-UNIX Distribution in 1982
  • Distance metric of hops (max 15 hops)

Source node A
24
RIP advertisements
  • Distance vectors exchanged among neighbors every
    30 sec via Response Message (also called
    advertisement)
  • Each advertisement a list of up to 25
    destination subnets within AS

25
RIP Example
[Figure: networks w, x, y, z and routers A, B, C, D]
Routing table in D
Destination Network    Next Router    Num. of hops to dest.
w                      A              2
y                      B              2
z                      B              7
x                      --             1
...                    ...            ...
26
RIP Example
Advertisement from A to D
Dest    Next    hops
w       -       -
x       -       -
z       C       4
...     ...     ...
Routing table in D (after the advertisement)
Destination Network    Next Router    Num. of hops to dest.
w                      A              2
y                      B              2
z                      A (was B)      5 (was 7)
x                      --             1
...                    ...            ...
27
RIP Link Failure and Recovery
  • If no advertisement heard after 180 sec -->
    neighbor or link declared dead
  • routes via neighbor invalidated
  • new advertisements sent to neighbors
  • neighbors in turn send out new advertisements (if
    tables changed)
  • link failure info quickly propagates to entire
    net
  • poison reverse used to prevent ping-pong loops
    (infinite distance = 16 hops)

28
RIP Table processing
  • RIP routing tables managed by application-level
    process called route-d (daemon)
  • advertisements sent in UDP packets, periodically
    repeated

[Figure: two protocol stacks, each with routed above Transport (UDP), network (IP) with its forwarding table, link, and physical layers]
29
Link State
  • Strategy
  • send to all nodes (not just neighbors)
    information about directly connected links (not
    entire routing table)
  • Link State Packet (LSP)
  • id of the node that created the LSP
  • cost of link to each directly connected neighbor
  • sequence number (SEQNO)
  • time-to-live (TTL) for this packet

30
Link State (cont)
  • Reliable flooding
  • store most recent LSP from each node
  • forward LSP to all nodes but one that sent it
  • generate new LSP periodically
  • increment SEQNO
  • start SEQNO at 0 when reboot
  • decrement TTL of each stored LSP
  • discard when TTL = 0
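The reliable-flooding rules above can be sketched as two small functions. The names `receive_lsp` and `age_lsps` and the dictionary layout of an LSP are assumptions made for this example; acknowledgements and retransmission between neighbors are left out.

```python
def receive_lsp(store, lsp, from_neighbor, neighbors):
    """Process one incoming LSP and return the neighbors to forward it to.

    store: node id -> most recent LSP seen from that node
    lsp:   dict with keys "id", "seqno", "ttl", "links"
    """
    current = store.get(lsp["id"])
    if current is not None and lsp["seqno"] <= current["seqno"]:
        return []  # old or duplicate copy: do not store, do not forward
    store[lsp["id"]] = dict(lsp)  # keep our own copy of the newest LSP
    # Forward to all neighbors except the one that sent it.
    return [n for n in neighbors if n != from_neighbor]


def age_lsps(store):
    """Decrement the TTL of each stored LSP and discard expired ones."""
    for node_id in list(store):
        store[node_id]["ttl"] -= 1
        if store[node_id]["ttl"] <= 0:
            del store[node_id]


store = {}
lsp = {"id": "A", "seqno": 1, "ttl": 2, "links": {"B": 1}}
first = receive_lsp(store, lsp, "B", ["B", "C", "D"])
second = receive_lsp(store, lsp, "C", ["B", "C", "D"])
```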

31
Reliable Flooding
32
Route Calculation
  • Dijkstra's shortest path algorithm
  • Let
  • N denote the set of nodes in the graph
  • l(i, j) denote the non-negative cost (weight) for
    edge (i, j)
  • s denote this node
  • M denote the set of nodes incorporated so far
  • C(n) denote the cost of the path from s to node n
  • M = {s}
  • for each n in N - {s}
  • C(n) = l(s, n)
  • while (N != M)
  • M = M union {w} such that C(w) is the minimum
    for
  • all w in (N - M)
  • for each n in (N - M)
  • C(n) = MIN(C(n), C(w) + l(w, n))
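A runnable version of the route calculation, kept close to the notation above (M, C, l). It uses a priority queue instead of scanning N - M for the minimum, which is a swapped-in implementation detail rather than what the slide literally describes. The graph and its link costs are a hypothetical example.

```python
import heapq


def dijkstra(links, s):
    """Least-cost paths from s.

    links: node -> {neighbor: cost}, i.e. l(i, j)
    Returns C: node -> cost of the least-cost path from s.
    """
    C = {s: 0}
    M = set()          # nodes incorporated so far
    heap = [(0, s)]    # (tentative cost, node)
    while heap:
        cost, w = heapq.heappop(heap)
        if w in M:
            continue   # stale heap entry
        M.add(w)
        for n, l_wn in links[w].items():
            # C(n) = MIN(C(n), C(w) + l(w, n))
            if n not in M and cost + l_wn < C.get(n, float("inf")):
                C[n] = cost + l_wn
                heapq.heappush(heap, (C[n], n))
    return C


def undirected(edges):
    """Build the adjacency map from (a, b, cost) triples."""
    links = {}
    for a, b, cost in edges:
        links.setdefault(a, {})[b] = cost
        links.setdefault(b, {})[a] = cost
    return links


# Hypothetical six-node graph.
graph = undirected([
    ("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2), ("v", "w", 3),
    ("x", "w", 3), ("x", "y", 1), ("w", "y", 1), ("w", "z", 5), ("y", "z", 2),
])
costs = dijkstra(graph, "u")
```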

33
A Link-State Routing Algorithm
  • Dijkstra's algorithm
  • net topology, link costs known to all nodes
  • accomplished via link state broadcast
  • all nodes have same info
  • computes least cost paths from one node
    (source) to all other nodes
  • gives forwarding table for that node
  • iterative: after k iterations, know least cost
    path to k destinations
  • Notation
  • c(x,y): link cost from node x to y; = ∞ if not
    direct neighbors
  • D(v): current value of cost of path from source
    to destination v
  • p(v): predecessor node along path from source to
    v
  • N': set of nodes whose least cost path
    definitively known

34
Dijkstra's Algorithm (u = source node)
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
35
Dijkstra's algorithm example
Step    N'        D(v),p(v)    D(w),p(w)    D(x),p(x)    D(y),p(y)    D(z),p(z)
0       u         2,u          5,u          1,u          ∞            ∞
1       ux        2,u          4,x                       2,x          ∞
2       uxy       2,u          3,y                                    4,y
3       uxyv                   3,y                                    4,y
4       uxyvw                                                         4,y
5       uxyvwz
36
Dijkstra's algorithm example
37
Dijkstra's algorithm example
38
Dijkstra's algorithm, discussion
  • Algorithm complexity: n nodes
  • each iteration: need to check all nodes, w, not
    in N'
  • n(n+1)/2 comparisons: O(n^2)
  • more efficient implementations possible: O(n log n)
  • Oscillations possible
  • e.g., link cost = amount of carried traffic

39
OSPF (Open Shortest Path First)
  • "open": publicly available; defined in RFC 2328
  • Uses Link State algorithm
  • Link-State packet dissemination
  • Topology map at each node
  • Route computation using Dijkstra's algorithm
  • OSPF advertisement carries one entry per neighbor
    router
  • Advertisements disseminated to entire AS (via
    flooding)
  • Carried in OSPF messages directly over IP (rather
    than TCP or UDP)

40
OSPF advanced features (not in RIP)
  • Security: all OSPF messages authenticated (to
    prevent malicious intrusion)
  • Load balancing: multiple same-cost paths allowed
    (only one path in RIP)
  • For each link, multiple cost metrics for
    different TOS (e.g., satellite link cost set
    low for best effort, high for real time)
  • Integrated uni- and multicast support
  • Multicast OSPF (MOSPF) uses same topology data
    base as OSPF
  • Hierarchical OSPF in large domains.

41
Hierarchical OSPF
  • An OSPF autonomous system (AS) can be configured
    into areas
  • Exactly one OSPF area in the AS is configured to
    be the backbone area
  • Each area runs its own OSPF link-state routing
    algorithm
  • Two-level hierarchy: local area, backbone.
  • Link-state advertisements only in area
  • each node has detailed area topology; only knows
    direction (shortest path) to nets in other areas.

42
Hierarchical OSPF
43
Hierarchical OSPF
  • Four types of routers
  • Internal routers: perform only intra-AS routing
  • Area border routers: belong to both an area and
    the backbone
  • Backbone routers: run OSPF routing limited to
    backbone.
  • Boundary routers: connect to other ASs.

44
OSPF Advertisement Format
Header Format
Link-State Advertisement
45
Comparison of LS and DV algorithms
  • Message complexity
  • LS: with n nodes, E links, O(nE) messages sent
  • DV: exchange between neighbors only
  • convergence time varies
  • Speed of Convergence
  • LS: O(n^2) algorithm requires O(nE) messages
  • may have oscillations
  • DV: convergence time varies
  • may be routing loops
  • count-to-infinity problem
  • Robustness: what happens if router malfunctions?
  • LS
  • node can advertise incorrect link cost
  • each node computes only its own table
  • DV
  • DV node can advertise incorrect path cost
  • each node's table used by others
  • errors propagate through network

46
Metrics
  • Original ARPANET metric
  • measures number of packets queued on each link
  • took neither latency nor bandwidth into
    consideration
  • New ARPANET metric
  • stamp each incoming packet with its arrival time
    (AT)
  • record departure time (DT)
  • when link-level ACK arrives, compute
  • Delay = (DT - AT) + Transmit + Latency
  • if timeout, reset DT to departure time for
    retransmission
  • link cost = average delay over some time period

47
Metrics
  • Still has problems
  • Under light load, it worked well, since the two
    static factors of delay dominated the cost.
  • Under heavy load, a congested link would start to
    advertise a very high cost. This caused all the
    traffic to move off that link, leaving it idle,
    so it would then advertise a low cost.
  • The range of link values was much too large.
  • Fine Tuning
  • compressed dynamic range
  • replaced Delay with link utilization

48
Revised ARPANET routing metric versus link
utilization
49
Revised ARPANET routing metric versus link
utilization
  • A highly loaded link never shows a cost of more
    than three times its cost when idle
  • The most expensive link is only seven times the
    cost of least expensive
  • A high-speed satellite link is more attractive
    than a low-speed terrestrial link
  • Cost is a function of link utilization only at
    moderate to high loads.

50
4.3 Global Internet Structure
  • Tree Structure of the Internet in 1990

[Figure: NSFNET backbone; regional networks BARRNET, Westnet, and MidNet; end-user sites Stanford, Berkeley, PARC, NCAR, UA, UNM, UNL, KU, ISU]
51
Global Internet
  • One of the salient features of this topology is
    that it consists of end-user sites (e.g.,
    Stanford University) that connect to service
    provider networks (e.g., BARRNET)
  • Each provider and end user is likely to be an
    administratively independent entity: an Autonomous
    System (AS).
  • Scalability problems
  • Scalability of routing
  • Address utilization
  • Subnetting deals with address space utilization
  • Classless routing or supernetting tackles both
    address utilization and routing scalability

52
Subnetting
  • Inefficient use of Hierarchical Address Space
  • class C with 2 hosts (2/255 = 0.78% efficient)
  • class B with 256 hosts (256/65535 = 0.39%
    efficient)
  • Still Too Many Networks
  • routing tables do not scale
  • route propagation protocols do not scale
  • Subnetting provides an elegantly simple way to
    reduce the total number of networks that are
    assigned
  • The idea is to take a single IP network number
    and allocate the IP addresses with that network
    number to several physical networks (subnets).

53
Subnetting
  • Add another level to address/routing hierarchy:
    subnet
  • Subnet masks define variable partition of host
    part
  • A single network number can be shared among
    multiple networks; this involves configuring all
    the nodes on each subnet with a subnet mask.
  • Subnets visible only within site

54
Subnet Example
H1 -> H2: 255.255.255.128 AND 128.96.34.139 = 128.96.34.128
R1: 255.255.255.128 AND 128.96.34.139 = 128.96.34.128
  • Forwarding table at router R1
  • Subnet Number Subnet Mask Next Hop
  • 128.96.34.0 255.255.255.128 interface 0
  • 128.96.34.128 255.255.255.128 interface 1
  • 128.96.33.0 255.255.255.0 R2

55
Forwarding Algorithm
  • D = destination IP address
  • for each entry (SubnetNum, SubnetMask, NextHop)
  • D1 = SubnetMask & D
  • if D1 = SubnetNum
  • if NextHop is an interface
  • deliver datagram directly to D
  • else
  • deliver datagram to NextHop
  • Use a default router if nothing matches
  • Not necessary for all 1s in subnet mask to be
    contiguous
  • Can put multiple subnets on one physical network
  • Subnets not visible from the rest of the Internet
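The forwarding algorithm above, written out in Python using the forwarding table at R1 from the subnet example. The function name `forward` and the "default router" placeholder are assumptions; the standard-library `ipaddress` module is used only to turn dotted-quad strings into integers for the bitwise AND.

```python
import ipaddress

# Forwarding table at router R1: (SubnetNum, SubnetMask, NextHop)
R1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def to_int(dotted):
    return int(ipaddress.IPv4Address(dotted))


def forward(dest, table=R1_TABLE, default="default router"):
    """Return the next hop (an interface or a router) for destination D."""
    d = to_int(dest)
    for subnet_num, subnet_mask, next_hop in table:
        d1 = d & to_int(subnet_mask)      # D1 = SubnetMask & D
        if d1 == to_int(subnet_num):      # if D1 = SubnetNum
            return next_hop
    return default                        # nothing matched


hop = forward("128.96.34.139")
```

For 128.96.34.139 the mask 255.255.255.128 leaves 128.96.34.128, which matches the second entry, so the datagram goes out on interface 1.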

56
Classless Routing (CIDR) Supernetting
  • CIDR: Classless Inter-Domain Routing
  • A technique that addresses two scaling concerns
  • the growth of backbone routing tables, and
  • the potential for the 32-bit IP address space to
    be exhausted well before the 4 billionth host is
    attached to the Internet.
  • Even though subnetting can help to assign
    addresses carefully, it does not get around the
    fact that any AS with more than 255 hosts wants a
    class B address, which leads to exhaustion of the
    IP address space.

57
Classless Routing (CIDR) Supernetting
  • CIDR tries to balance the desire to minimize the
    number of routes that a router needs to know
    against the need to hand out addresses
    efficiently
  • Assign block of contiguous network numbers to
    nearby networks
  • Represent blocks with a single pair
  • (first_network_address, count)
  • Restrict block sizes to powers of 2
  • Use a bit mask (CIDR mask) to identify block size
  • All routers must understand CIDR addressing

58
Route aggregation with CIDR
[Figure: customers 128.112.128/24 through 128.112.135/24 attach to an ISP, which advertises 128.112.128/21]
  • Since all of the customers are reachable through
    the same provider network, it can advertise a
    single route to all of them by just advertising
    the common 21-bit prefix they share
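The aggregation can be checked with Python's standard `ipaddress` module. This sketch assumes the ISP serves the eight consecutive /24 customer blocks from 128.112.128/24 through 128.112.135/24, which is what the endpoints in the figure suggest.

```python
import ipaddress

# Eight contiguous customer networks: 128.112.128.0/24 ... 128.112.135.0/24
customers = [ipaddress.ip_network(f"128.112.{third}.0/24") for third in range(128, 136)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregated = list(ipaddress.collapse_addresses(customers))
```

The block size is a power of 2 (eight networks) and starts on an 8-network boundary, which is why a single 21-bit prefix covers it exactly.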

59
IP Forwarding Revisited
  • Find the network number in a packet and then
    lookup that number in a forwarding table.
  • Reexamine this assumption with CIDR
  • Prefix lengths: 2-32 bits
  • Prefixes may overlap
  • Some addresses may match more than one prefix.
  • Longest Prefix Matching (LPM)
  • For example
  • 171.69 (16-bit prefix)
  • 171.69.10 (24-bit prefix)
  • 171.69.10.5 matches both
  • 171.69.20.5 only matches 171.69
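Longest prefix matching for the example above, as a simple linear scan. Real routers use tries or hardware lookup structures; the function name and the next-hop labels "hop A" and "hop B" are hypothetical.

```python
import ipaddress


def longest_prefix_match(dest, table):
    """Return the next hop of the longest matching prefix, or None.

    table: CIDR prefix string -> next hop
    """
    addr = ipaddress.ip_address(dest)
    best = None  # (matching network, next hop)
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


table = {
    "171.69.0.0/16": "hop A",   # 171.69 (16-bit prefix)
    "171.69.10.0/24": "hop B",  # 171.69.10 (24-bit prefix)
}
first = longest_prefix_match("171.69.10.5", table)
second = longest_prefix_match("171.69.20.5", table)
```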

60
Interdomain Routing (BGP)
  • AS: routing domain
  • Routing Policies
  • Two major interdomain routing protocols
  • Exterior Gateway Protocol (EGP)
  • Border Gateway Protocol (BGP-4)

61
BGP-4: Border Gateway Protocol
  • AS Types
  • stub AS: has a single connection to one other AS
  • carries local traffic only
  • multihomed AS: has connections to more than one
    AS
  • refuses to carry transit traffic
  • transit AS: has connections to more than one AS
  • carries both transit and local traffic
  • Each AS has
  • one or more border routers
  • one BGP speaker that advertises
  • local networks
  • other reachable networks (transit AS only)
  • gives path information

62
Today's multibackbone Internet
63
BGP Example
  • Speaker for AS2 advertises reachability to P and
    Q
  • network 128.96, 192.4.153, 192.4.32, and 192.4.3,
    can be reached directly from AS2
  • Speaker for backbone advertises
  • networks 128.96, 192.4.153, 192.4.32, and 192.4.3
    can be reached along the path (AS1, AS2).
  • Speaker can cancel previously advertised paths

64
Internet inter-AS routing BGP
  • BGP (Border Gateway Protocol): the de facto
    standard
  • BGP provides each AS a means to
  • Obtain subnet reachability information from
    neighboring ASs.
  • Propagate the reachability information to all
    routers internal to the AS.
  • Determine good routes to subnets based on
    reachability information and policy.
  • Allows a subnet to advertise its existence to
    the rest of the Internet: "I am here"

65
BGP basics
  • Pairs of routers (BGP peers) exchange routing
    information over semi-permanent TCP connections:
    BGP sessions
  • Note that BGP sessions do not correspond to
    physical links.
  • When AS2 advertises a prefix to AS1, AS2 is
    promising it will forward any datagrams destined
    to that prefix towards the prefix.
  • AS2 can aggregate prefixes in its advertisement

66
Aggregation of prefixes
  • 138.16.64/24
  • 138.16.65/24
  • 138.16.66/24
  • 138.16.67/24
  • => 138.16.64/22

67
Distributing reachability info
  • With eBGP session between 3a and 1c, AS3 sends
    prefix reachability information to AS1.
  • 1c can then use iBGP to distribute this new
    prefix reachability information to all routers in
    AS1
  • 1b can then re-advertise the new reachability
    information to AS2 over the 1b-to-2a eBGP session
  • When router learns about a new prefix, it creates
    an entry for the prefix in its forwarding table.

68
Path attributes BGP routes
  • When advertising a prefix, advertisement includes
    BGP attributes.
  • prefix + attributes = route
  • Two important attributes
  • AS-PATH: contains the ASs through which the
    advertisement for the prefix passed, e.g., AS 67, AS 17
  • used to detect and prevent looping advertisements
  • also used in choosing among multiple paths to the
    same prefix
  • NEXT-HOP: indicates the specific internal-AS
    router to next-hop AS. (There may be multiple
    links from current AS to next-hop AS.)
  • When gateway router receives route advertisement,
    uses import policy to accept/decline.

69
BGP route selection
  • Router may learn about more than 1 route to any
    one prefix. Router must select route.
  • Elimination rules invoked sequentially until one
    route remains
  • Local preference value attribute: a policy decision
    made by the AS's network administrator
  • Shortest AS-PATH
  • Closest NEXT-HOP router: hot potato routing
  • Additional criteria
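Applying elimination rules one after another until one route remains is equivalent to sorting on a tuple of criteria. A sketch under stated assumptions: the field names `local_pref`, `as_path`, and `igp_cost` (cost to reach the NEXT-HOP router) are hypothetical, higher local preference is taken to win, and the "additional criteria" are not modelled.

```python
def select_route(routes):
    """Pick one route from several candidates for the same prefix.

    Rules, in order: highest local preference, then shortest AS-PATH,
    then closest NEXT-HOP router (hot potato routing).
    """
    return min(
        routes,
        key=lambda r: (-r["local_pref"], len(r["as_path"]), r["igp_cost"]),
    )


# Hypothetical candidate routes to one prefix.
routes = [
    {"name": "r1", "local_pref": 100, "as_path": ["AS2", "AS3"], "igp_cost": 5},
    {"name": "r2", "local_pref": 200, "as_path": ["AS4", "AS5", "AS6"], "igp_cost": 9},
    {"name": "r3", "local_pref": 200, "as_path": ["AS7", "AS6"], "igp_cost": 7},
    {"name": "r4", "local_pref": 200, "as_path": ["AS8", "AS6"], "igp_cost": 3},
]
best = select_route(routes)
```

Here r1 loses on local preference, r2 loses on AS-PATH length, and r4 beats r3 because its NEXT-HOP router is closer.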

70
BGP messages
  • BGP messages exchanged using TCP.
  • BGP messages
  • OPEN: opens TCP connection to peer and
    authenticates sender
  • UPDATE: advertises new path (or withdraws old)
  • KEEPALIVE: keeps connection alive in absence of
    UPDATEs; also ACKs OPEN request
  • NOTIFICATION: reports errors in previous message;
    also used to close connection

71
BGP routing policy
  • A,B,C are provider networks
  • X, W, Y are customers (of provider networks)
  • X is dual-homed: attached to two networks
  • X does not want to route from B via X to C
  • .. so X will not advertise to B a route to C

72
BGP routing policy (2)
  • A advertises to B the path AW
  • B advertises to X the path BAW
  • Should B advertise to C the path BAW?
  • No way! B gets no revenue for routing CBAW
    since neither W nor C is B's customer
  • B wants to force C to route to W via A
  • B wants to route only to/from its customers!

73
Why different Intra- and Inter-AS routing ?
  • Policy
  • Inter-AS: administrator wants control over how
    its traffic is routed and who routes through its net.
  • Intra-AS: single admin, so no policy decisions
    needed
  • Scale
  • hierarchical routing saves table size, reduces
    update traffic
  • Performance
  • Intra-AS: can focus on performance
  • Inter-AS: policy may dominate over performance

74
IP Version 6
  • Features
  • 128-bit addresses (classless)
  • multicast
  • real-time service
  • authentication and security
  • autoconfiguration
  • end-to-end fragmentation
  • protocol extensions
  • Header
  • 40-byte base header
  • extension headers (fixed order, mostly fixed
    length)
  • fragmentation
  • source routing
  • authentication and security
  • other options

75
4.4 Broadcast/Multicast routing
  • Broadcast routing - deliver a packet from a
    source node to all other nodes
  • Multicast routing - deliver a packet from a
    source node to a subset of other nodes

76
Source-duplication versus in-network duplication
(a) source duplication, (b) in-network duplication
77
Broadcast routing algorithms
  • Uncontrolled flooding
  • Controlled flooding
  • Spanning-tree broadcast

78
Uncontrolled flooding
  • The source node sends a copy of the packet to all
    of its neighbors
  • When a node receives a broadcast packet, it
    duplicates the packet and forwards it to all of
    its neighbors (except the neighbor from which it
    receives the packet)
  • Problems
  • If the graph has cycles, then one or more copies
    of each broadcast packet will cycle indefinitely
  • Broadcast storm

79
Controlled flooding
  • Sequence-number-controlled flooding
  • Source node puts its address and a broadcast
    sequence number into a broadcast packet
  • Each node maintains a list of the source address
    and sequence number of each packet it has
    received
  • When a node receives a broadcast packet
  • If the packet is in the list, the packet is
    dropped
  • Otherwise, the packet is duplicated and forwarded

80
Controlled flooding
  • Reverse path forwarding
  • When a router receives a broadcast packet, it
    duplicates and forwards the packet only if the
    packet arrives on the link that is on its own
    shortest unicast path back to the source

81
Controlled flooding
  • Drawback
  • Some of the nodes receive redundant packets

Ideally, every node should receive only one copy
of the broadcast packet.
82
Spanning-tree broadcast
  • Spanning tree: a tree that contains all nodes in
    a graph
  • Minimum spanning tree: a spanning tree whose
    cost is the minimum among all the spanning trees
    of a graph
  • Broadcast along a spanning tree

(a) Broadcast initiated at A
(b) Broadcast initiated at D
83
Construction of Spanning-tree
  • Many algorithms have been developed
  • Center-based approach
  • Select a center node (rendezvous or core)
  • Each node unicasts tree-join message to the
    center node
(a) Stepwise construction of spanning tree

(b) Constructed spanning tree
84
Multicast Routing Problem Statement
  • Goal: find a tree (or trees) connecting routers
    having local multicast group members
  • tree: not all paths between routers used
  • source-based: different tree from each sender to
    receivers
  • shared-tree: same tree used by all group members

Shared tree
85
Approaches for building multicast trees
  • source-based tree: one tree per source
  • shortest path trees
  • reverse path forwarding
  • group-shared tree: group uses one tree
  • minimal spanning (Steiner)
  • center-based trees

we first look at basic approaches, then specific
protocols adopting these approaches
86
Shortest Path Tree
  • multicast forwarding tree: tree of shortest path
    routes from source to all receivers
  • Dijkstra's algorithm

[Figure: source S and routers R1-R7. Legend: router with attached group member; router with no attached group member; link used for forwarding, i indicates order link added by algorithm]
87
Reverse Path Forwarding
  • rely on router's knowledge of unicast shortest
    path from it to sender
  • each router has simple forwarding behavior
  • if (multicast datagram received on incoming link
    on shortest path back to sender)
  • then flood datagram onto all outgoing links
  • else ignore datagram
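The per-router RPF check is a single comparison against the unicast forwarding table. In this sketch the function name `rpf_forward`, the link names, and the table layout are assumptions; as in the common formulation, the arrival link is excluded when flooding.

```python
def rpf_forward(arrival_link, source, next_hop_link, links):
    """Reverse path forwarding decision for one multicast datagram.

    next_hop_link: destination -> outgoing link on the unicast shortest path
    links:         all links attached to this router
    Returns the links to flood the datagram onto (empty list = ignore).
    """
    # Accept only if the datagram arrived on the link this router would
    # itself use to send unicast traffic back to the source.
    if next_hop_link.get(source) != arrival_link:
        return []
    return [link for link in links if link != arrival_link]


# Hypothetical router with three links; its shortest path to S uses link1.
unicast = {"S": "link1"}
links = ["link1", "link2", "link3"]
accepted = rpf_forward("link1", "S", unicast, links)
ignored = rpf_forward("link2", "S", unicast, links)
```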

88
Reverse Path Forwarding example
[Figure: source S and routers R1-R7. Legend: router with attached group member; router with no attached group member; datagram will be forwarded; datagram will not be forwarded]
  • result is a source-specific reverse SPT
  • may be a bad choice with asymmetric links

89
Reverse Path Forwarding pruning
  • forwarding tree contains subtrees with no
    multicast group members
  • no need to forward datagrams down subtree
  • prune messages sent upstream by router with no
    downstream group members

[Figure: source S and routers R1-R7, with prune messages (P). Legend: router with attached group member; router with no attached group member; prune message; links with multicast forwarding]
90
Shared-Tree Steiner Tree
  • Steiner Tree: minimum cost tree connecting all
    routers with attached group members
  • problem is NP-complete
  • excellent heuristics exist
  • not used in practice
  • computational complexity
  • information about entire network needed
  • monolithic: rerun whenever a router needs to
    join/leave

91
Center-based trees
  • single delivery tree shared by all
  • one router identified as center of tree
  • to join
  • edge router sends unicast join-message addressed
    to center router
  • join-message processed by intermediate routers
    and forwarded towards center
  • join-message either hits existing tree branch for
    this center, or arrives at center
  • path taken by join-message becomes new branch of
    tree for this router

92
Center-based trees an example
Suppose R6 chosen as center
[Figure: routers R1-R7. Legend: router with attached group member; router with no attached group member; path order in which join messages generated (1, 2, 3)]
93
Internet Multicasting Routing: DVMRP
  • DVMRP: distance vector multicast routing
    protocol, RFC1075
  • flood and prune: source-based tree, reverse path
    forwarding
  • RPF tree based on DVMRP's own routing tables
    constructed by communicating DVMRP routers
  • no assumptions about underlying unicast
  • initial datagram to multicast group flooded
    everywhere via RPF
  • routers not wanting group send upstream prune
    messages

94
DVMRP continued
  • soft state: DVMRP router periodically (1 min.)
    forgets that branches are pruned
  • multicast data again flows down unpruned branch
  • downstream router: reprune or else continue to
    receive data
  • routers can quickly regraft to tree
  • following IGMP join at leaf
  • odds and ends
  • commonly implemented in commercial routers
  • Mbone routing done using DVMRP

95
Tunneling
  • Q: How to connect "islands" of multicast routers
    in a "sea" of unicast routers?

logical topology
physical topology
  • multicast datagram encapsulated inside normal
    (non-multicast-addressed) datagram
  • normal IP datagram sent thru tunnel via regular
    IP unicast to receiving multicast router
  • receiving multicast router decapsulates to get
    multicast datagram

96
PIM: Protocol Independent Multicast
  • Not dependent on any specific underlying unicast
    routing algorithm (like RIP, OSPF, works with
    all)
  • Two different multicast distribution scenarios
  • Dense
  • group members densely packed, in close
    proximity.
  • Sparse
  • # of routers with group members is small wrt
    total # of routers
  • group members widely dispersed

97
Consequences of Sparse-Dense Dichotomy
  • Sparse
  • no membership until routers explicitly join
  • receiver-driven construction of multicast tree
    (e.g., center-based)
  • bandwidth and non-group-router processing
    conservative
  • Dense
  • group membership by routers assumed until routers
    explicitly prune
  • data-driven construction of multicast tree (e.g.,
    RPF)
  • bandwidth and non-group-router processing
    profligate

98
PIM - Dense Mode
  • Flood-and-prune RPF, similar to DVMRP, but
  • underlying unicast protocol provides RPF
    information for incoming datagram
  • less complicated (less efficient) downstream
    flood than DVMRP
  • reduces reliance on underlying routing algorithm
  • has protocol mechanism for router to detect if it
    is a leaf-node router

99
PIM - Sparse Mode
  • Center-based approach
  • router sends join message to rendezvous point
    (RP)
  • intermediate routers update state and forward
    join
  • after joining via RP, router can switch to
    source-specific tree

[Figure: routers R1-R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]
100
PIM - Sparse Mode
  • Sender(s)
  • unicast data to RP, which distributes down
    RP-rooted tree
  • RP can extend multicast tree upstream to source
  • RP can send stop message to the source if no
    attached receivers
  • no one is listening!

[Figure: routers R1-R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]
101
4.5 MultiProtocol Label Switching (MPLS)
  • Prior Work
  • MPLS Overview
  • MPLS Architecture

102
Prior Work
  • Tag Switching (Cisco)
  • Aggregate Route-Based IP Switching (ARIS, IBM)
  • IP Navigator
  • IFMP-IP Switching (Ipsilon)
  • Cell Switching Router (CSR, Toshiba)

103
Prior Work
  • Tag switching is based on the control-driven
    approach. The set up of LSPs (Label Switched
    Paths) closely follows control messages such as
    routing updates and RSVP messages.
  • Aggregate route-based IP switching (ARIS) is
    also based on the control-driven approach and is
    very similar to tag switching. ARIS introduces
    the concept of an egress identifier (FECs) to
    express the granularity of LSPs.
  • IP Navigator is again a control-driven protocol.
    It uses OSPF as the internal routing protocol
    within a routing domain. Explicit routing is used
    to set up the VCs.

104
Prior Work
  • Ipsilon Flow Management Protocol (IFMP) is a
    traffic-driven protocol. When the number of
    packets from a flow exceeds a predetermined
    threshold, the controller uses IFMP to set up an
    LSP for the particular flow.
  • The Cell Switch Router (CSR) proposal is similar
    to IP switching. CSR is primarily designed as a
    device for interconnecting ATM clouds. Within an
    LIS (logical IP subnet), ATM Forum standards are
    used to connect hosts and switches together.
  • Multiple LISs are then interconnected with CSRs
    that are capable of running both IP forwarding
    and cell forwarding. The setup of LSPs is
    data-driven for best effort traffic and
    RSVP-driven for flows that require resource
    reservation.
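
The traffic-driven trigger described for IFMP above can be sketched as a per-flow packet counter. This is a minimal model, not the protocol: the class name, the flow identifier, and the threshold value are hypothetical, and adding a flow to `switched` merely stands in for the controller setting up an LSP.

```python
from collections import Counter


class FlowClassifier:
    """Counts packets per flow and marks a flow as label-switched once
    its count exceeds a predetermined threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()
        self.switched = set()

    def on_packet(self, flow_id):
        """Return how this packet is handled: 'switched' or 'routed'."""
        if flow_id in self.switched:
            return "switched"
        self.counts[flow_id] += 1
        if self.counts[flow_id] > self.threshold:
            # Stand-in for signalling an LSP for this flow; the packet
            # that crosses the threshold is itself still routed.
            self.switched.add(flow_id)
        return "routed"


fc = FlowClassifier(threshold=3)
results = [fc.on_packet("flow-1") for _ in range(6)]
```

Short-lived flows never cross the threshold and stay on the routed path, so label state is spent only on flows that last.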

105
MPLS Overview
  • RFC 3812
  • The IETF MPLS working group aims to standardize
    a base technology that integrates the label
    swapping forwarding paradigm with network layer
    routing.
  • Cisco is the major contributor to the MPLS
    working group.
  • Substituting "Label" for "Tag" in Tag Switching
    roughly gives MPLS

106
Core mechanisms of MPLS
  • Semantics assigned to a stream label
  • Labels are associated with specific streams of
    data.
  • Forwarding Methods
  • Forwarding is simplified by the use of the short
    fixed length labels to identify streams.
  • Forwarding may require simple functions such as
    looking up a label in a table, swapping labels,
    and possibly decrementing and checking a TTL.
  • Label Distribution Methods
  • Allow nodes to determine which labels to use for
    specific streams.

107
Native IP Forwarding
  • IP routing: both the packet forwarding and route
    determination processes in an IP network.
  • Native IP forwarding (NIF): hop-by-hop,
    destination-based packet forwarding.
  • Each packet's next hop and output port are
    determined by a longest-prefix-match forwarding
    table lookup.
  • Additional packet classification may also be
    performed to derive output port queuing and
    scheduling rules.
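
A longest-prefix-match lookup can be sketched with Python's standard `ipaddress` module. The forwarding table below (prefixes, next-hop names, port numbers) is hypothetical, and the linear scan is for clarity only; real routers use tries or hardware lookup structures.

```python
import ipaddress


def longest_prefix_match(table, dst):
    """Return the (next_hop, port) entry of the most specific prefix
    in `table` that contains `dst`, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    best_net, best_entry = None, None
    for prefix, entry in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_entry = net, entry
    return best_entry


# Hypothetical forwarding table: prefix -> (next hop, output port).
table = {
    "0.0.0.0/0": ("R1", 1),
    "128.96.0.0/16": ("R2", 2),
    "128.96.33.0/24": ("R3", 3),
}
specific = longest_prefix_match(table, "128.96.33.81")
broader = longest_prefix_match(table, "128.96.40.1")
default = longest_prefix_match(table, "10.3.2.4")
```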

108
A Simplified NIF forwarding engine
[Figure: NIF forwarding engine. Labels: input ports, IP header and IP payload, longest-prefix-match lookup, forwarding table, next hop and port, packet classification, queuing and scheduling rules, output ports.]
Packet classification keys: IP source and destination addresses, IP protocol type, DiffServ (DS) or TOS byte, and TCP/UDP port numbers.
109
Per-Hop classification, queuing, and scheduling
[Figure labels: Port 1 through Port M, Classify, Queue, S, Port N.]
110
A Simplified LSR forwarding engine
[Figure: LSR forwarding engine. Labels: input ports, MPLS label and MPLS payload, switching table, next hop and port, queuing and scheduling rules, output ports.]
111
Traffic Engineering
  • Conventional IP routing attempts to find the
    shortest path between a packet's current location
    and its intended destination.
  • Hot spots: packet loss rates, latency, and
    jitter increase as the average load on a router
    rises.
  • Solutions: (1) faster routers, (2) alternate
    routes.
  • Routing policy may also require traffic
    engineering. For example, the external link
    between R6 and A3 may have been funded solely by
    A2 and A3. Therefore, A1's traffic must not be
    allowed to traverse it.
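
One way to picture such a policy override is a shortest-path computation that simply skips forbidden links. The sketch below uses Dijkstra's algorithm over a hypothetical topology with unit link costs; the graph is invented for illustration and is not the network in the slides, and only the R6-A3 restriction comes from the example above.

```python
import heapq


def shortest_path(graph, src, dst, forbidden=frozenset()):
    """Dijkstra over `graph` (node -> {neighbor: cost}), skipping any
    link listed in `forbidden` as frozenset({u, v}).
    Returns (cost, path) or None if dst is unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, weight in graph[node].items():
            if nbr not in seen and frozenset((node, nbr)) not in forbidden:
                heapq.heappush(queue, (cost + weight, nbr, path + [nbr]))
    return None


# Hypothetical topology with unit link costs.
graph = {
    "A1": {"R1": 1},
    "R1": {"A1": 1, "R6": 1, "R2": 1},
    "R6": {"R1": 1, "A3": 1},
    "R2": {"R1": 1, "R4": 1},
    "R4": {"R2": 1, "A3": 1},
    "A3": {"R6": 1, "R4": 1},
}
plain = shortest_path(graph, "A1", "A3")
policy = shortest_path(graph, "A1", "A3", forbidden={frozenset(("R6", "A3"))})
```

With the R6-A3 link excluded, A1's traffic takes the longer route through R2 and R4.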

112
Traffic Engineering
-- Override the shortest path route
[Figure: an IP backbone of routers R1 to R6 connects Access 1, Access 2, and Access 3 to Destination D. Marked paths: route from A2 to D, desired route from A1 to D, actual route from A1 to D.]
113
Signaling and Provisioning
  • Signaling: when network (re)configuration can be
    requested by users at any time and achieved
    within milliseconds or seconds.
  • Provisioning: when the reaction time for
    (re)configuration is measured in minutes or
    hours.
  • In either case, the (re)configuring action
    involves establishing (or modifying) information
    used by routers or switches to control their
    forwarding actions, including
  • forwarding (routing) information,
  • classification rules, and/or
  • queuing and scheduling parameters.

114
Core MPLS Components
  • The basic routing approach
  • Routing is accomplished through the use of
    standard L3 routing protocols (e.g. OSPF and
    BGP).
  • The information maintained by the L3 routing
    protocols is then used to distribute labels to
    neighboring nodes that are used in the forwarding
    of packets.
  • Labels
  • Label semantics, Label granularity, Label
    assignment, Label stack and forwarding
    operations.

115
Label Semantics
  • The label is nothing more than a shorthand for an
    aggregate stream of user data.
  • The meaning of the label is a strictly local
    issue between two neighboring nodes.
  • MPLS could be employed between any two
    neighboring nodes, even if no other nodes in the
    network participate in MPLS.
  • When MPLS is used between more than two nodes,
    then the operation between any two neighboring
    nodes could be interpreted as independent of the
    operation between any other pair of nodes.

116
Label Granularity
  • A device that uses the label to forward packets
    will forward all packets with the same label in
    the same way.
  • A Forwarding Equivalence Class (FEC) is a set of
    L3 packets which are all forwarded in the same
    manner by a particular Label Switching Router
    (LSR).
  • For unicast IP traffic, the granularity of a
    label allows various levels of aggregation in a
    Label Information Base (LIB).
  • For IP multicast, the natural binding of a label
    would be to a multicast tree.

117
Label assignment
  • Label assignment involves allocating a label, and
    then binding a label to a route.
  • Label assignment can be driven by control traffic
    or data traffic. (discussed later.)
  • Label withdrawal is primarily a matter of garbage
    collection, that is, collecting unused labels
    so that they may be reassigned.

118
Routing Aggregation
[Figure: routers R1 to R6, Access 1 to Access 3, Destination D, and the numbers 1 to 5 marked on the diagram.]
119
Forwarding Component
  • Label Stack and Forwarding Operations
  • label swap: looking up the incoming label to
    determine the outgoing label, encapsulation,
    port, and any additional information which may
    pertain to the stream, such as a particular queue
    or other QoS-related treatment.
  • label push: when a packet first enters an MPLS
    domain, the packet is associated with a label.
  • label pop: when a packet leaves an MPLS domain,
    the label is removed.
  • The label stack is useful within hierarchical
    routing domains.
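
The three operations can be sketched on a label stack modeled as a Python list with the top of the stack at index 0. The label values, port name, and `swap_table` layout are hypothetical.

```python
def push(stack, label):
    """Ingress: place a new label on top of the stack."""
    return [label] + stack


def swap(stack, swap_table):
    """Core LSR: look up the top (incoming) label and replace it with
    the outgoing label. Returns the new stack and the output port."""
    out_label, port = swap_table[stack[0]]
    return [out_label] + stack[1:], port


def pop(stack):
    """Egress: remove the top label."""
    return stack[1:]


# Hypothetical switching table: incoming label -> (outgoing label, port).
swap_table = {17: (42, "port2")}

s1 = push([], 17)                # packet enters the MPLS domain
s2, port = swap(s1, swap_table)  # label 17 is rewritten to 42
s3 = push(s2, 99)                # entering a nested (inner) domain
s4 = pop(s3)                     # leaving the inner domain
s5 = pop(s4)                     # leaving the outer domain
```

Pushing a second label on entry to an inner domain and popping it on exit leaves the outer label untouched, which is why the stack suits hierarchical routing.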

120
Encapsulation
  • Label-based forwarding makes use of various
    pieces of information, including a label or stack
    of labels, and possibly additional information
    such as a TTL field.
  • MPLS encapsulation: encapsulates the label
    information and the information used for
    label-based forwarding.
  • An encapsulation scheme may make use of the
    following fields:
  • label, TTL, class of service, stack indicator,
    next header type indicator, and checksum

121
MPLS label stack encoding
[Figure: MPLS frame delivered to the link layer. A stack of label entries, from stack top to stack bottom, precedes the original packet. Each entry carries Label (20 bits), Exp (3 bits, COS), S (1 bit), and TTL (8 bits).]
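
The per-entry layout in the figure (Label 20 bits, Exp 3 bits, S 1 bit, TTL 8 bits, 32 bits in total) can be packed and unpacked with bit shifts. The helper names and default values below are hypothetical; the sketch assumes network byte order and that S is set only on the bottom entry of the stack.

```python
import struct


def pack_entry(label, exp, s, ttl):
    """Pack one 32-bit entry: Label(20) | Exp(3) | S(1) | TTL(8)."""
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data):
    """Inverse of pack_entry: returns (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data)
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF


def encode_stack(labels, exp=0, ttl=64):
    """Encode a label stack, top first; only the last entry has S=1."""
    last = len(labels) - 1
    return b"".join(
        pack_entry(label, exp, int(i == last), ttl)
        for i, label in enumerate(labels)
    )


entry = pack_entry(16, 0, 1, 64)
stack_bytes = encode_stack([100, 200])
top = unpack_entry(stack_bytes[:4])
bottom = unpack_entry(stack_bytes[4:])
```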
122
Label Assignment
  • Topology driven (Tag)
  • In response to normal processing of routing
    protocol control traffic
  • Labels are pre-assigned: no label setup latency
    at forwarding time.
  • Request driven (RSVP)
  • In response to normal processing of request based
    control traffic
  • May require a large number of labels to be
    assigned.
  • Traffic driven (Ipsilon)
  • The arrival of data at an LSR triggers label
    assignment and distribution.
  • Label setup latency; potential for packet
    reordering.

123
Label Distribution
  • Explicit Label Distribution
  • Downstream label allocation
  • label allocation is done by the downstream LSR
  • most natural mechanism for unicast traffic
  • Upstream label allocation
  • label allocation is done by the upstream LSR
  • may be used for optimality for some multicast
    traffic
  • A unique label for an egress LSR within the MPLS
    domain
  • Any stream to a particular MPLS egress node could
    use the label of that node.
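
Downstream label allocation can be sketched as follows: the downstream LSR picks the label it wants to receive for a stream and tells its upstream neighbor, which then uses that value as its outgoing label. The `LSR` class, its method names, and the FEC string are hypothetical; this is not LDP, only the allocation direction.

```python
import itertools


class LSR:
    """Toy label switching router that tracks label bindings per FEC."""

    def __init__(self, name, first_label=16):
        # Starting at 16 assumes label values 0-15 are reserved.
        self.name = name
        self._next_label = itertools.count(first_label)
        self.in_labels = {}   # FEC -> label expected on incoming packets
        self.out_labels = {}  # FEC -> (label to send, downstream name)

    def allocate(self, fec):
        """Allocate (or reuse) the local incoming label for `fec`."""
        if fec not in self.in_labels:
            self.in_labels[fec] = next(self._next_label)
        return self.in_labels[fec]

    def learn_binding(self, fec, label, downstream):
        """Record a binding advertised by a downstream neighbor."""
        self.out_labels[fec] = (label, downstream.name)


def advertise_downstream(downstream, upstream, fec):
    """Downstream allocates the label; upstream installs it as outgoing."""
    label = downstream.allocate(fec)
    upstream.learn_binding(fec, label, downstream)
    return label


r1, r2 = LSR("R1"), LSR("R2")
label = advertise_downstream(r2, r1, "10.0.0.0/8")
again = advertise_downstream(r2, r1, "10.0.0.0/8")
```

Because the receiver chooses the label, it can guarantee that the value is unique in its own table, which is one reason this direction suits unicast traffic.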

124
Label Distribution
  • Explicit Label Distribution Protocol (LDP)
  • Reliability by transport protocol or as part of
    LDP.
  • Separate routing computation and label
    distribution.
  • Piggybacking on Other Control Messages
  • Use existing routing/control protocol for
    distributing routing/control and label
    information.
  • OSPF, BGP, RSVP, PIM
  • Combine routing and label distribution.
  • Label purge mechanisms
  • By time out
  • Exchange of MPLS control packets

125
Label Distribution Protocol
  • LDP Peer
  • Two LSRs that exchange label/stream mapping
    information via LDP
  • LDP messages
  • Discovery messages (via UDP)
  • announce and maintain the presence of LSR
  • Session messages
  • maintain session between LDP peers
  • Advertisement message
  • label operation (Label distribution)
  • Notification message
  • advisory information and signal error
    information
  • Error notification: signals fatal errors
  • Advisory notification: status of the LDP session
    or of some previous message received from the peer.

126
Label Swapping
  • Labeled Packet
  • Maps the incoming label to a next-hop label and
    determines where to forward the packet.
  • Encodes the new label stack into the packet, and
    then forwards it.
  • Unlabeled Packet
  • LSR analyzes the L3 header to determine the
    packet's stream.
  • Maps the stream to a next hop and determines
    where to forward the packet.
  • Encodes the new label stack into the packet, and
    then forwards it.
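
The two cases can be sketched in one forwarding function. The packet representation (a dict with `dst` and `labels`, top label first), the `label_table` and `fec_table` names, and all label values are hypothetical; the stream of an unlabeled packet is approximated here by its longest matching destination prefix.

```python
import ipaddress


def forward(packet, label_table, fec_table):
    """Return (new_packet, next_hop) for a labeled or unlabeled packet.

    label_table: incoming label -> (outgoing label, next hop)
    fec_table:   prefix -> (outgoing label, next hop)
    Raises KeyError / ValueError if no entry matches.
    """
    labels = packet["labels"]
    if labels:
        # Labeled packet: swap the top label.
        out_label, next_hop = label_table[labels[0]]
        new_labels = [out_label] + labels[1:]
    else:
        # Unlabeled packet: classify by L3 header, then impose a label.
        addr = ipaddress.ip_address(packet["dst"])
        candidates = [
            (ipaddress.ip_network(prefix), entry)
            for prefix, entry in fec_table.items()
        ]
        matching = [(net, entry) for net, entry in candidates if addr in net]
        _, (out_label, next_hop) = max(matching, key=lambda m: m[0].prefixlen)
        new_labels = [out_label]
    return {**packet, "labels": new_labels}, next_hop


fec_table = {"192.12.69.0/24": (30, "R2")}
label_table = {30: (31, "R3")}

p0 = {"dst": "192.12.69.77", "labels": []}
p1, hop1 = forward(p0, label_table, fec_table)  # ingress: label imposed
p2, hop2 = forward(p1, label_table, fec_table)  # core: label swapped
```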

127
Use of MPLS in a Hierarchy
128
Conclusion
  • MPLS improves the scalability of hop-by-hop
    routing and forwarding, and provides traffic
    engineering capabilities for better network
    provisioning.
  • It decouples forwarding from routing and allows
    multi-protocol support without requiring changes
    to the basic forwarding paradigm.
  • Generalized MPLS (GMPLS)
  • λMPLS (optical, wavelength-based)