Title: VPLS
1 VPLS
- Yaakov (J) Stein, November 2004
- Chief Scientist
- RAD Data Communications
2 Contents
- Tunneling Ethernet
- VPNs
- MPLS and PWs
- L2VPNs
- LDP vs. BGP
- Generalizations
- L3VPNs
3 Tunneling Ethernet
4 Ethernet limitations
- Ethernet LAN is the most popular LAN
- but Ethernet cannot be made into a WAN
- Ethernet is limited in distance between stations
- Ethernet is limited in number of stations on segment
- Ethernet is inefficient in finding destination address
- Ethernet only prunes network topology, does not route
- so the architecture that has emerged is Ethernet private networks
- connected by public networks of other types (e.g. IP)
[Figure: two LANs interconnected by a WAN]
5 Traditional WAN architecture
- this model is sensible when traffic contains a given higher layer
- Ethernet header is removed at ingress and a new header added at egress
- this model is not transparent Ethernet LAN interconnect
- Ethernet LANs with multiple higher layer packet types
- (e.g. IPv4, IPv6, IPX, SNA, CLNP, etc.) can't be interconnected
- raw L2 Ethernet frames cannot be sent
- the Ethernet layer is terminated at WAN ingress
- the traffic is no longer Ethernet at all
6 Tunneling Ethernet frames
- users with multiple sites want to connect their LANs
- so that all locations appear to be on the same LAN
- this requires tunneling of all Ethernet L2 frames (not only IP)
- between one LAN and another
- the entire Ethernet frame needs to be preserved
- (except perhaps the FCS which can be regenerated at egress)
7 Tunneling encapsulation
- for simplicity, let's think of an IP network
- the traditional architecture uses the following packet formats
- the VPN model (Ether-IP) uses the following packet formats
[Figure: packet formats for each model across the WAN]
8 Ethernet over HDLC/FR/ATM
- Ethernet frames can be carried over various WANs
- HDLC: not standardized, Cisco-HDLC
- FR: RFC 2427 / STD 55 (ex RFC 1490)
- ATM: RFC 2684 (ex RFC 1483), LANE
- entire Ethernet frame (or IP packet) is used as payload
9 Ethernet over SONET/SDH
- Ethernet over SONET/SDH (EoS) and low-rate TDM
- entire Ethernet frame is placed in SONET/SDH payload
- Formats
- Generic Framing Procedure (GFP): SDH/OTN - G.7041, PDH - G.8040
- Virtual Concatenation (VC) with/without Link Capacity Adjustment Scheme (LCAS)
- Link Access Procedure for SDH (LAPS)
- unlike POS, EoS allows bandwidth sharing between Ethernet ports
- but SONET/SDH is an expensive infrastructure
10 VPNs
11 Virtual Private Networks
[Figure: customer sites connected across the service provider network]
- Service Providers (SPs) with packet switched networks (PSNs)
- want to offer customers site interconnect service
- since the private networks are interconnected over a public PSN
- this results in a Virtual Private Network
- unlike the traditional WAN architecture
- the entire Ethernet frame must be tunneled through the PSN
- hence it is sometimes called Transparent LAN Service (TLS)
12 Basic (L2,L3)VPN model
[Figure: two Attachment Circuits joined by an emulated link]
- AC: Attachment Circuit
13 (L2,L3)VPN in more detail
[Figure: customer 1 and customer 2 networks attached through CEs to the provider network at both ends]
- Key
- C: Customer router/switch
- CE: Customer Edge router/switch
- P: Provider router/switch
- PE: Provider Edge router/switch
14 VPN Challenges
[Figure: SP network connecting customer sites; labeled addresses 192.115.243.79, 192.115.243.19, and a second 192.115.243.19]
- Security
- Private IP addresses
- Multiple higher-layer protocols
- SP resource requirements
- Complex provider - customer relationship
15 VPN types
- Legacy
- proprietary leased-line (not virtual)
- Frame Relay over E1/T1
- ATM over E1 or multiple-E1
- Pure IP
- IPSec tunnel
- L2TP tunnel
- MPLS L3VPN
- 2547bis
- MPLS L2VPN
- VPWS / VPLS
16 MPLS and PWs
17 What is MPLS?
- the Internet core is now mostly MPLS
- label switching adds the strength of CO (connection-oriented) to CL (connectionless) forwarding
- label switching has three stages
- routing (topology determination) using L3 (IP) protocols
- path setup (label binding and distribution)
- data forwarding
- label switching
- speeds up forwarding
- decreases forwarding table size (by using local labels)
- load balance by explicitly setting up paths
- complete separation of routing and forwarding algorithms
- so no new routing algorithm needed
- but new signaling algorithm may be needed
- MultiProtocol Label Switching (MPLS)
- is multiprotocol - from above and below
- can run on IP router or ATM switch with only SW upgrade (but HW helps)
- supports a label stack
- support for traffic engineering and QoS guarantees
18 MPLS Architecture
- label switching is needed in the core, access can be L3 forwarding
- core interfaces the access at the edge (ingress, egress)
- LSR: router that can perform label switching
- LER: LSR with non-MPLS neighbors (LSR at edge of core network)
- LSP: unidirectional path used by label switched forwarding (ingress to egress)
- not every packet needs label switching (e.g. only small number of packets, no QoS)
19 Where is the MPLS label?
- unlike TCP, the CO layer lies under the CL layer
- if there is a broadcast L2 (e.g. Ethernet), the CO layer lies above it
- hence, MPLS switching is sometimes called layer 2.5 switching
20 MPLS Shim Header
- when a shim header is needed, its format should be: Label, Exp, Stack bit, TTL
- Label: there are 2^20 different labels (plus 2^20 multicast labels)
- Exp (CoS): left undefined by IETF WG
- was CoS in Cisco Tag Switching
- could influence packet queuing
- Stack bit: S=1 indicates bottom of label stack
- TTL: decrementing hop count
- used to eliminate infinite routing loops
- generally copied from/to IP TTL field
- Special (reserved) labels
- 0 IPv4 explicit null
- 1 router alert
- 2 IPv6 explicit null
- 3 implicit null
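The field list above can be made concrete with a short sketch. It assumes the standard 32-bit label stack entry layout (20-bit Label, 3-bit Exp, 1-bit S, 8-bit TTL, in that order); the function names `pack_shim` and `unpack_shim` are illustrative only.

```python
import struct

def pack_shim(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Pack one 32-bit label stack entry in network byte order."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 2**3:
        raise ValueError("Exp must fit in 3 bits")
    if not 0 <= ttl < 2**8:
        raise ValueError("TTL must fit in 8 bits")
    # Label occupies the top 20 bits, then Exp (3), S (1), TTL (8).
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def unpack_shim(data: bytes) -> dict:
    """Split the first 4 bytes of data back into the four fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }

# Reserved label 3 (implicit null), bottom of stack, TTL 64.
entry = pack_shim(label=3, exp=0, bottom=True, ttl=64)
print(entry.hex())  # 00003140
```

Because the label field is 20 bits wide, `pack_shim` rejects any label of 2^20 or above.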
21 How are labels used?
- binding
- label assigned by downstream LSR
- per port or per LSR label space
- control driven vs. data driven (traffic driven)
- distribution
- upstream label distribution
- piggyback label distribution on routing protocols (e.g. BGP)
- Label Distribution Protocol (LDP)
- forwarding
- read top label L
- consult Incoming Label Map (forwarding table)
- perform label stack operation (pop L, swap L -> M, swap L -> M and push N)
- forward based on L's Next Hop Label Forwarding Entry
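The forwarding steps above can be sketched as a table lookup followed by a stack operation. This is a minimal illustration, not a router implementation: the `NHLFE` class, the operation names, and the dictionary used as the Incoming Label Map are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class NHLFE:
    """Illustrative Next Hop Label Forwarding Entry."""
    op: str                        # "pop", "swap" or "swap_push"
    next_hop: str
    swap_to: Optional[int] = None  # M in "swap L -> M"
    push: Optional[int] = None     # N in "swap L -> M and push N"

def forward(stack: List[int], ilm: Dict[int, NHLFE]) -> Tuple[str, List[int]]:
    """Process a packet whose label stack is given top-first.

    Returns the next hop and the label stack to send.
    """
    top, rest = stack[0], stack[1:]   # read top label L
    entry = ilm[top]                  # consult the Incoming Label Map
    if entry.op == "pop":
        new_stack = rest
    elif entry.op == "swap":
        new_stack = [entry.swap_to] + rest
    elif entry.op == "swap_push":
        new_stack = [entry.push, entry.swap_to] + rest
    else:
        raise ValueError(f"unknown operation {entry.op!r}")
    return entry.next_hop, new_stack

ilm = {
    17: NHLFE("swap", "P2", swap_to=22),
    18: NHLFE("pop", "PE2"),
    19: NHLFE("swap_push", "P3", swap_to=30, push=40),
}
print(forward([17, 100], ilm))  # ('P2', [22, 100])
```

Only the top label is examined; labels deeper in the stack are carried through unchanged.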
22 MPLS solves IP address problem
[Figure: SP network; the address 192.115.243.19 is labeled twice]
- assume customers 1 and 2 use overlapping IP addresses
- then C-routers have inconsistent tables
- ingress PE-router pushes a label
- P-routers see only the MPLS label - not LAN IP addresses
- P-routers don't see IP addresses - no ambiguity
- PE routers know how to map CE LANs
23 Simple MPLS LAN Extension
[Figure: LANs attached through ACs at both ends of the MPLS network]
- each LAN mapped to pair of (unidirectional) LSPs
- supports all LAN traffic types (CE is Ethernet switch, not IP router)
- each Ethernet frame encapsulated with MPLS label
- supports various AC technologies
- scaling problem
- requires large number of LSPs
- P-routers need to reserve resources for each LAN instance
24 (Martini) Pseudowires
- transport MPLS tunnel set up between PEs
- multiple PWs may be set up inside tunnel
- Ethernet frame encapsulated with 2 labels
- P-routers do not reserve resources for each VPN instance
25 More on Pseudowires
- encapsulation via Martini drafts draft-ietf-pwe3-xxx-encap
- L2 can be Ethernet, but also ATM or FR
- setup via PW control protocol draft-ietf-pwe3-control-protocol
- based on targeted LDP
- Problems
- supports only point-to-point LAN interconnect (VPWS)
- need to manually configure PW for every VPN instance
- need to set up 2 unidirectional tunnels for every pair of PEs
26 Ethernet Pseudowire packet
- outer label specifies MPLS tunnel
- inner label contains PW label to support multiple Ethernet PWs in a single MPLS tunnel
- optional control word
- enables detection of out-of-order and lost packets
- Ethernet Frame
- by default no FCS trailer (but there is separate FCS retention draft)
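The packet structure above can be sketched as follows. The example assumes the standard 32-bit label stack entry layout and a 32-bit control word whose first 16 bits are zero and whose last 16 bits carry a sequence number; the exact control word format is defined in the PWE3 encapsulation documents, so treat that layout and the helper names as illustrative.

```python
import struct
from typing import Optional

def shim(label: int, bottom: bool, ttl: int = 255, exp: int = 0) -> bytes:
    """One MPLS label stack entry: Label(20) | Exp(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl)

def encapsulate(frame_with_fcs: bytes, tunnel_label: int, pw_label: int,
                seq: Optional[int] = None) -> bytes:
    """Wrap an Ethernet frame for transport over a PW inside an MPLS tunnel."""
    payload = frame_with_fcs[:-4]            # default: FCS is not carried
    packet = shim(tunnel_label, bottom=False)  # outer label: MPLS tunnel
    packet += shim(pw_label, bottom=True)      # inner label: PW, bottom of stack
    if seq is not None:
        # Assumed control word layout: 16 zero bits, then 16-bit sequence number.
        packet += struct.pack("!HH", 0, seq & 0xFFFF)
    return packet + payload

frame = bytes(60) + b"\xde\xad\xbe\xef"      # dummy 60-byte frame plus 4-byte FCS
pkt = encapsulate(frame, tunnel_label=1000, pw_label=2000, seq=1)
print(len(pkt))  # 72 = 4 + 4 + 4 + 60
```

The egress PE would strip both labels and the control word, then regenerate the FCS before handing the frame to the attachment circuit.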
27 L2VPNs
28 VPWS
[Figure: two CEs attached by ACs to PEs across the provider network]
- Virtual Private Wire Service is a L2 point-to-point service
- it emulates a wire supporting the Ethernet physical layer
- set up MPLS tunnel between PEs
- set up Ethernet PW inside tunnel
- CEs appear to be connected by a single L2 circuit
- (can also make VPWS for ATM, FR, etc.)
29 VPLS
- VPLS emulates a LAN over an MPLS network
- set up MPLS tunnel between every pair of PEs (full mesh)
- set up Ethernet PW inside tunnels, for each VPN instance
- CEs appear to be connected by a single LAN
- PE must know where to send Ethernet frames
- but this is what an Ethernet bridge does
30 VPLS
[Figure: three PEs, each containing a VPLS module (V) and a bridging module (B)]
- a VPLS-enabled PE has, in addition to its MPLS functions
- VPLS code module (IETF drafts)
- Bridging module (standard IEEE 802.1D learning bridge)
- SP network (inside rectangle) looks like a single Ethernet bridge!
- Note: if CE is a router, then PE only sees 1 MAC per customer location
31 VPLS bridge
- PE maintains a separate bridging module for each VPN (VPLS instance)
- VPLS bridging module must perform
- MAC learning
- MAC aging
- flooding of unknown MAC frames
- replication (for unknown/multicast/broadcast frames)
- unlike true bridge, Spanning Tree Protocol is not used
- limited traffic engineering capabilities
- scalability limitations
- slow convergence
- forwarding loops are avoided by split horizon
- PE never forwards packet from MPLS network to another PE
- not a limitation since there is a full mesh of PWs
- so always send directly to the right PE
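The bridging behavior above (learning, aging, flooding, and split horizon) can be sketched in a few lines. This is a toy model under stated assumptions: ports are plain strings, MAC addresses are colon-separated hex strings, and the class name `VplsBridge` and the 300-second default age are illustrative choices, not values from the drafts.

```python
import time
from typing import Dict, Optional, Set, Tuple

class VplsBridge:
    """Toy per-VPLS-instance bridge on one PE."""

    def __init__(self, acs: Set[str], pws: Set[str], max_age: float = 300.0):
        self.acs, self.pws, self.max_age = set(acs), set(pws), max_age
        self.fdb: Dict[str, Tuple[str, float]] = {}   # MAC -> (port, last seen)

    def receive(self, port: str, src: str, dst: str,
                now: Optional[float] = None) -> Set[str]:
        """Return the set of ports the frame should be sent out of."""
        now = time.monotonic() if now is None else now
        # MAC aging: drop entries not refreshed within max_age.
        self.fdb = {m: (p, t) for m, (p, t) in self.fdb.items()
                    if now - t <= self.max_age}
        # MAC learning: remember where the source address lives.
        self.fdb[src] = (port, now)
        from_pw = port in self.pws
        is_group = int(dst.split(":")[0], 16) & 1     # broadcast/multicast bit
        entry = self.fdb.get(dst)
        if entry is not None and not is_group:
            out = entry[0]
            # Split horizon: never forward from one PW to another PW.
            if out == port or (from_pw and out in self.pws):
                return set()
            return {out}
        # Flood and replicate; frames from a PW go to ACs only.
        targets = self.acs | (set() if from_pw else self.pws)
        return targets - {port}

bridge = VplsBridge(acs={"ac1"}, pws={"pw2", "pw3"})
print(bridge.receive("ac1", "00:00:00:00:00:01", "00:00:00:00:00:02", now=0.0))
```

The first frame has an unknown destination, so it is replicated onto both PWs; once the remote MAC is learned, later frames go out of a single PW.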
32 VPLS code module
- VPLS signaling
- establish PWs between PEs per VPLS
- VPLS autodiscovery
- locates PEs participating in VPLS instance
- obtain frame from bridge
- encapsulate Ethernet frames
- and inject packet into PW
- retrieve packet from PW
- removes PW encapsulation
- and forward Ethernet frame to bridge
33 L2VPN vs. L3VPN
[Figure: three PEs around a provider network marked "?"]
- in L2VPN CEs appear to be connected by single L2 network
- PEs are transparent to L3 routing protocols
- CEs are routing peers
- in L3VPN CE routers appear to be connected by a single L3 network
- CE is routing peer of PE, not remote CE
- PE maintains routing table for each VPN
34 IPLS (IP-only LAN Service)
- mechanisms may be simplified if Ethernet frames carry only IP traffic
- enables upgrade of IP routers to support VPLS-like services
- in this case CE devices are routers, not switches
- frames are still forwarded based on MAC DA (not L3VPN)
- but MAC forwarding tables updated via PW signaling, not 802.1D
- PE snoops IP and ARP frames to discover CEs connected to it
- creates (AC,VPN-ID,IP-addr,MAC-addr) entry
- creates PWs to all PEs participating in VPN-ID
- sends entries to these PEs
- Address Resolution Protocol (ARP) messages are proxied
- rather than being carried transparently
- PE searches entries it has received
- can support different AC types (Ethernet and FR)
- ARP Mediation ensures proper mapping
35 LDP vs. BGP
36 LDP vs. BGP
- both use TCP for reliable transport (LDP uses UDP for hellos)
- both are hard-state protocols
- both use TLV format for parameters
- BGP
- multiprotocol (IPv4, IPv6, IPX, MPLS)
- highly complex protocol
- provides routing / label distribution
- built-in autodiscovery mechanism
- LDP
- MPLS only
- simpler protocol
- only label distribution
- extendable for autodiscovery
37 BGP
- layout: marker (16B) | length (2B) | type (1B) | data (variable); the first three fields form the 19B header
- marker can be used for authentication (TCP MD5 signature)
- length is total BGP PDU length, including header
- type
- OPEN (for session initialization)
- UPDATE (add, change and withdraw routes)
- NOTIFICATION (return error messages, terminate session)
- KEEPALIVE (heartbeat)
- KEEPALIVE packet consists of 19B header only
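As a sketch of the header described above, the following packs and parses the 19-byte BGP header. It assumes an all-ones marker (the value used when the marker carries no authentication data) and the BGP-4 type codes 1 to 4; the function names are illustrative.

```python
import struct

OPEN, UPDATE, NOTIFICATION, KEEPALIVE = 1, 2, 3, 4
MARKER = b"\xff" * 16          # all ones when no authentication is carried
HEADER_LEN = 19

def bgp_message(msg_type: int, data: bytes = b"") -> bytes:
    """Prefix data with marker, total length (header included) and type."""
    length = HEADER_LEN + len(data)
    return MARKER + struct.pack("!HB", length, msg_type) + data

def parse_header(pdu: bytes):
    """Return (total length, type, data) for one BGP PDU."""
    _marker, length, msg_type = struct.unpack("!16sHB", pdu[:HEADER_LEN])
    return length, msg_type, pdu[HEADER_LEN:length]

keepalive = bgp_message(KEEPALIVE)
print(len(keepalive), parse_header(keepalive))  # 19 (19, 4, b'')
```

A KEEPALIVE is therefore just the header with type 4 and length 19.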
38 BGP state machine
- idle: no session (awaiting session initialization)
- connect: attempting to connect to peer
- active: started TCP 3-way handshake (router busy)
- open sent: have sent OPEN message
- open confirm: received peer's OPEN message, awaiting KEEPALIVE
- established: BGP session up and running
39 BGP OPEN
- layout: version (1B) | my AS (2B) | hold time (2B) | BGP-ID (4B) | op len (1B) | opt parameters (variable)
- version (3 or 4)
- my AS: identifier of autonomous system
- hold time: max time (sec) between receipt of messages
- BGP ID: sender's BGP identifier
- op len: length (bytes) of optional parameters
- opt parameters: TLVs
40 BGP UPDATE
- layout: WR len (2B) | withdrawn routes (var) | PA len (2B) | path attributes (var) | NLRI (var)
- Withdrawn Routes: list of routes no longer to be used (NLRI format - see below)
- Path Attributes: route specific information (see next page)
- Network Layer Reachability Information: (classless) routing information
- the NLRI is a list of address-prefixes
- each prefix must be masked from the left to the length specified
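The prefix masking rule above can be illustrated with a small encoder and decoder for IPv4 NLRI. It assumes the BGP-4 encoding in which each entry is a one-byte prefix length in bits followed by just enough bytes to hold the prefix; the helper names are illustrative.

```python
import ipaddress
from typing import List

def encode_prefix(cidr: str) -> bytes:
    """Encode one IPv4 prefix as <length in bits><minimal prefix bytes>."""
    # strict=False masks the address from the left to the prefix length.
    net = ipaddress.ip_network(cidr, strict=False)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]

def decode_nlri(data: bytes) -> List[str]:
    """Decode a concatenated list of encoded IPv4 prefixes."""
    prefixes, i = [], 0
    while i < len(data):
        plen = data[i]
        nbytes = (plen + 7) // 8
        raw = data[i + 1:i + 1 + nbytes] + bytes(4 - nbytes)   # pad to 4 bytes
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{plen}")
        i += 1 + nbytes
    return prefixes

nlri = encode_prefix("10.0.0.0/8") + encode_prefix("192.115.243.19/20")
print(nlri.hex(), decode_nlri(nlri))
```

Here 192.115.243.19/20 is masked to 192.115.240.0/20 and occupies only three prefix bytes on the wire.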
41 BGP UPDATE - Path Attributes
- flags
- O: optional/well-known bit (1 = optional, 0 = well-known)
- well-known attributes must be recognized by all BGP implementations
- if a well-known attribute is unrecognized, BGP sends notification and session is closed
- T: transitive/nontransitive bit
- if 1 and attribute unrecognized it is passed along, else silently ignored
- well-known attributes are always transitive
- C: complete/partial bit (for optional transitive attributes only)
- L: attribute length bit (0: attribute length is 1B, 1: length is 2B)
- type code
- ORIGIN, AS_PATH, NEXT_HOP, MED, LOCAL_PREF,
- AGGREGATOR, COMMUNITY, ORIGINATOR_ID
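The flag handling can be sketched as below. The bit positions follow the BGP-4 specification, in which the four flags occupy the high-order bits of the flags octet (optional 0x80, transitive 0x40, partial 0x20, extended length 0x10) and a set optional bit means the attribute is optional rather than well-known. The function names and returned action strings are illustrative.

```python
def decode_attr_flags(flags: int) -> dict:
    """Decode the high-order four bits of a path attribute flags octet."""
    return {
        "optional": bool(flags & 0x80),          # 0 means well-known
        "transitive": bool(flags & 0x40),
        "partial": bool(flags & 0x20),
        "length_bytes": 2 if flags & 0x10 else 1,
    }

def on_unrecognized(flags: int) -> str:
    """What a speaker does with an attribute type it does not recognize."""
    f = decode_attr_flags(flags)
    if not f["optional"]:
        return "notify"        # well-known attributes must be recognized
    return "pass_along" if f["transitive"] else "ignore"

print(decode_attr_flags(0x40))  # a well-known, transitive attribute
print(on_unrecognized(0xC0))    # pass_along
```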
42 BGP NOTIFICATION
- layout: error code (1B) | error subcode (1B) | data (var)
- all notification messages cause BGP session to close
- error codes include
- message header error
- open message error
- update message error
- hold timer expired
- state machine error
- other fatal error
43 LDP
- layout: version (2B) | length (2B) | LDP-ID (6B) | messages (variable); the first three fields form the 10B header
- version: presently 1
- length: PDU length, excluding version and length fields
- LDP-ID: identifies label space of sending LDP peer
- LSR-ID (4B): globally unique LSR ID
- label space ID (2B): for per-port label spaces
- (zero for per-platform label spaces)
- messages: zero or more TLVs (see next page)
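A minimal sketch of the header fields above, showing that the length field counts the LDP-ID and the messages but not the version and length fields themselves. The function name `ldp_pdu` and the use of a dotted IPv4 address as the LSR-ID are assumptions for the example.

```python
import socket
import struct

def ldp_pdu(lsr_id: str, label_space: int, messages: bytes = b"",
            version: int = 1) -> bytes:
    """Build an LDP PDU: version (2B), length (2B), LDP-ID (6B), messages."""
    ldp_id = socket.inet_aton(lsr_id) + struct.pack("!H", label_space)
    length = len(ldp_id) + len(messages)   # excludes version and length fields
    return struct.pack("!HH", version, length) + ldp_id + messages

pdu = ldp_pdu("10.0.0.1", label_space=0)   # per-platform label space
print(pdu.hex())  # 000100060a0000010000
```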
44 LDP messages
- layout: type (2B) | length (2B) | message-ID (4B) | mandatory parameters (variable) | optional parameters (variable)
- type
- U: unknown message bit
- if message type unknown to receiver
- U=0: receiver returns notification to sender
- U=1: receiver silently ignores
- length: message length, excluding type and length fields
- Message-ID: unique ID for message (for matching with returned notification)
- if there are mandatory parameters, they must appear in a specific order
- optional parameters may appear in any order
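The message layout and the U bit rule can be sketched as follows. The example assumes the U bit is the most significant bit of the 2-byte type field, leaving 15 bits for the type itself, and uses 0x0201 as the KeepAlive type code per the LDP specification; the function names are illustrative.

```python
import struct

KEEPALIVE = 0x0201   # LDP KeepAlive message type

def ldp_message(msg_type: int, msg_id: int, mandatory: bytes = b"",
                optional: bytes = b"", unknown_bit: bool = False) -> bytes:
    """Build one LDP message; length excludes the type and length fields."""
    body = struct.pack("!I", msg_id) + mandatory + optional
    first = (0x8000 if unknown_bit else 0) | (msg_type & 0x7FFF)
    return struct.pack("!HH", first, len(body)) + body

def on_unknown_type(message: bytes) -> str:
    """Receiver action when the message type is not recognized."""
    (first,) = struct.unpack("!H", message[:2])
    return "ignore" if first & 0x8000 else "notify"

msg = ldp_message(KEEPALIVE, msg_id=7)
print(msg.hex())  # 0201000400000007
```

A KeepAlive has no parameters, so its length field is 4 (the message-ID only).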
45 LDP message types
- Hello (UDP, for discovery)
- Initialization (specifies LDP version, label space range, parameters)
- KeepAlive (heartbeat)
- Notification (error, e.g. unsupported version, unknown/malformed msg, timer expired)
- Address (LSR advertises its interface IP address(es) to peers)
- Address Withdraw (LSR revokes previously advertised interface IP address)
- Label Mapping (downstream LSR advertisement of a label mapping for a FEC)
- Label Withdraw (downstream LSR informing that binding is revoked)
- Label Request (upstream LSR request for binding in downstream-on-demand mode)
- Label Release (upstream LSR informing that binding no longer needed)
- Label Abort Request (upstream LSR asks to revoke request before satisfied)
46 LDP state machine
- LSR periodically transmits hello UDP messages
- multicast to all routers on subnet group
- targeted to preconfigured IP address
- LSRs listen on this UDP port for hello messages
- when LSR receives hello from another LSR
- it opens a TCP connection to that other LSR
- or (for extended discovery)
- it unicast transmits a hello back to the other LSR
- LSR with higher ID sends session initialization message
- other LSR accepts (sends keepalive) or rejects
- informative or keepalive messages sent
47 Provisioning VPLS
48 Provisioning
- customers may want their SP to take an active role
- in managing their networks
- Provider Provisioned VPN (PPVPN) refers to VPN
- for which SP participates in management and provisioning
- by provisioning we mean (at least)
- setting up the ACs (often manual configuration)
- assigning global VPN-ID to VPN instances
- discovery of all PEs that participate in a VPN instance
- associating AC with VPN at PE
- providing PEs with information needed to set up tunnels
- configuring tunnels with necessary characteristics
49 Autodiscovery
- we have assumed that each PE knows
- which PEs participate in particular VPN instance
- manual configuration is problematic logistically
- autodiscovery refers to automatically finding all PEs in a given VPN
- each PE "discovers" other PEs by means of some protocol
- BGP (to be discussed later)
- RADIUS (Remote Authentication Dial In User Service)
- CEs = RADIUS users, PEs = Network Access Servers (NAS)
- PE can authenticate CEs and find other PEs
- targeted LDP (Stokes draft now abandoned)
- advertise FEC in LDP
- new TLV in label mapping message contains VPN-id, P or PE, capabilities
50 PWE control
- a PW is a bidirectional entity (two LSPs in opposite directions)
- a PW connects two forwarders
- PW setup via targeted LDP signaling
- 2 different LDP TLVs can be used
- PWid FEC
- Generalized ID FEC
- PWid FEC
- to use it, both sides of PW are provisioned with a unique (32b) value
- each PW endpoint independently initiates LSP setup
- LSPs bound together into a single PW
51 Generalized ID
- for each forwarder we have a PE-unique Attachment Identifier (AI)
- <PE, AI> must be globally unique
- frequently useful to group a set of forwarders into an attachment group
- where PWs may only be set up among members of a group
- then Attachment Identifier (AI) consists of
- Attachment Group Identifier (AGI) (which is basically a VPN-id)
- Attachment Individual Identifier (AII)
- the LSPs making up the PW are
- <PE1, (AGI, AII1), PE2, (AGI, AII2)> and
- <PE2, (AGI, AII2), PE1, (AGI, AII1)>
- we also need to define
- Source Attachment Identifier (SAI = AGI + SAII)
- Target Attachment Identifier (TAI = AGI + TAII)
- receiving PE can map TAI uniquely to AC
52 VPWS Provisioning
- Double Sided Provisioning
- each AC provisioned with local name, remote PE address, and remote name
- during signaling, local name is sent as SAII, remote name as TAII (AGI null)
- to connect 2 ACs by a PW
- local name = remote name (PWid FEC) or
- local name of each must be remote name of the other
- Single Sided Provisioning with Discovery
- each AC provisioned with local name (VPN-id) and AII
- during signaling, local name is sent as AGI
- to connect 2 ACs by a PW
- both must have the same VPN-id
- only one needs to be provisioned with remote name (local name of other AC)
- neither needs to be provisioned with the address of the remote PE
- during auto-discovery procedure
- each PE advertises its <VPN-id, local AII> pairs
- each PE compares its local <VPN-id, remote AII> pairs
- with <VPN-id, local AII> pairs from other PEs
- if match then need to connect
- local name sent as SAII, remote AII sent as TAII, VPN-id as AGI
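The matching step of single sided provisioning can be sketched as a set lookup. The data shapes here (dictionaries for locally provisioned ACs, a per-PE set of advertised pairs) and the function name `find_pws` are hypothetical; only the comparison rule comes from the slide.

```python
from typing import Dict, List, Optional, Set, Tuple

def find_pws(local_acs: List[Dict[str, Optional[str]]],
             advertised: Dict[str, Set[Tuple[str, str]]]
             ) -> List[Tuple[str, str, str, str]]:
    """Match local <VPN-id, remote AII> pairs against advertised
    <VPN-id, local AII> pairs from other PEs.

    Returns (remote PE, AGI, SAII, TAII) for each PW to signal.
    """
    to_signal = []
    for ac in local_acs:
        if ac["remote_aii"] is None:      # this side was not given a remote name
            continue
        wanted = (ac["vpn_id"], ac["remote_aii"])
        for pe in sorted(advertised):
            if wanted in advertised[pe]:
                # VPN-id as AGI, local name as SAII, remote AII as TAII.
                to_signal.append((pe, ac["vpn_id"], ac["local_aii"], ac["remote_aii"]))
    return to_signal

local_acs = [{"vpn_id": "vpnA", "local_aii": "site1", "remote_aii": "site2"}]
advertised = {"PE2": {("vpnA", "site2")}, "PE3": {("vpnB", "site2")}}
print(find_pws(local_acs, advertised))  # [('PE2', 'vpnA', 'site1', 'site2')]
```

PE3 advertises the same AII under a different VPN-id, so no PW is set up toward it, and the local PE never needed PE2's address in its configuration.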
53 VPLS Provisioning
- every VPLS instance is assigned a unique VPN-id
- PEs are preconfigured or find each other using auto-discovery
- if PE detects VPN-id to which it belongs
- it sets up a PW
- during signaling
- VPN-id is sent as the AGI field
- SAII and TAII are set to null
54 LDP VPLS
- ex-Lasserre-VKompella draft, now draft-ietf-l2vpn-vpls-ldp
- authors: Marc Lasserre (Riverstone) and Vach Kompella (Alcatel)
- supported by Cisco, Nortel, Alcatel, Riverstone, Extreme, Luminous, Corrigent, Hatteras, Overture, RAD
- use LDP for
- PW setup and tear-down signaling
- explicit withdrawal of MACs (force relearning)
- full mesh of targeted LDP sessions between VPLS-enabled PEs
- automatically establish a full mesh of Ethernet PWs
- participating PE sends an unsolicited label mapping message
- to every other PE, specifying VPN-ID (preferably with generalized PWid FEC element)
- if receiving PE accepts,
- it sends a label mapping message back
55 BGP VPLS
- ex-Kompella draft, now draft-ietf-l2vpn-vpls-bgp
- authors: Kireeti Kompella, Yakov Rekhter (Juniper)
- uses BGP4 (with multiprotocol extensions) for
- autodiscovery (uses Route Target extended community as VPN-ID)
- PW setup and tear-down (uses Network Layer Reachability Information)
- force MAC relearning (uses Relearn Sequence Number TLV)
- protocol essentially identical to RFC2547bis (to be discussed later)
56 BGP VPLS signaling
- define demultiplexor = VPN-ID + ingress PE
- VPLS Edge (VE) advertises VPLS NLRIs for each VPLS instance
- NLRI defines demultiplexors for all PEs in VPLS instance
- extended attribute encodes PE capabilities
- if new PE joins VPLS
- new NLRI seamlessly adds new label
- coalesce to a single NLRI with temporary service disruption
- PE sets up PW when it receives an NLRI for VPLS
- to leave VPLS instance PE withdraws NLRI
- remote PEs remove PWs
57 Generalizations
58 Distributed (Generic) VPLS
- L2VPN framework allows decomposition of PE
- User-Facing PE (U-PE) performs Bridge functions
- MAC learning, forwarding decisions
- Network-Facing PE (N-PE) performs VPLS functions
- establishes tunnels, PWs
- U-PE is inexpensive CLE, good for MTU applications
59 Hierarchical VPLS
[Figure: MTUs attached by spoke PWs to a full mesh of PEs]
- straight VPLS has a problem: N^2 PWs are used
- which means N^2 LDP sessions, and N^2 floods and replications
- to improve scalability, can use hub-and-spoke topology
- if VPLS is in multi-tenant buildings, local PE is MTU
- HVPLS PEs are full mesh, but do not perform bridging
- spoke PW set up between PE and MTU (note end-point is virtual bridge)
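The scaling argument can be checked with a little arithmetic. This sketch counts bidirectional PWs per VPLS instance, assuming a full mesh needs one PW per pair of devices and that each MTU in the hierarchical case has a single spoke PW to one core PE; the device counts in the example are made up.

```python
def full_mesh_pws(n: int) -> int:
    """Bidirectional PWs in a full mesh of n devices: n(n-1)/2, i.e. order N^2."""
    return n * (n - 1) // 2

def hvpls_pws(core_pes: int, mtus: int) -> int:
    """Full mesh among core PEs plus one spoke PW per MTU."""
    return full_mesh_pws(core_pes) + mtus

# 100 edge devices in a flat VPLS versus 10 core PEs serving 90 MTUs.
print(full_mesh_pws(100))   # 4950
print(hvpls_pws(10, 90))    # 135
```

Each PW consists of two unidirectional LSPs, so the LSP counts are double these figures in both cases.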
60 L3VPNs
61 BGP MPLS VPNs (2547bis)
- presently most popular provider managed VPN
- originally specified in RFC 2547, updated in draft called 2547bis
- transports IPv4 (IPv6) traffic in MPLS tunnels
- uses BGP for route distribution
- since SPs commonly use BGP for routing
- 2547 is not an overlay model
- CE routers at different sites are not routing peers
- they do not directly exchange routing information
- they don't even need to know of each other
- so customer needn't manage a backbone or virtual backbone
- no inter-site routing problems
62 BGP MPLS VPNs (cont.)
- only PE routers maintain VPN information
- P routers needn't maintain any customer routing information
- C routes either manually configured in PE
- or advertised to PE using BGP, OSPF, etc.
- PE advertises routes to remote PEs using BGP
- remote PEs advertise routes to their CEs using BGP, OSPF, etc.
- IP address overlap solved using Route Distinguisher (RD)
63 2547bis architecture
[Figure: CE is an IP router; CE is a peer of its PE but not of the remote CE]
- Virtual router (peering) model, not tunneling
- PE maintains Virtual Route Forwarding table for each VPN
- BGP (with multiprotocol extensions) used for label distribution
- in order to support private IP addresses
- PE prepends 8B Route Distinguisher (unique to site) to IP address
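How the Route Distinguisher removes address overlap can be sketched as follows. The example assumes the type 0 RD layout (2-byte type field set to 0, 2-byte AS number, 4-byte assigned number) and a private AS number chosen for illustration; the function name `vpn_ipv4` is hypothetical.

```python
import socket
import struct

def vpn_ipv4(asn: int, assigned: int, ip: str) -> bytes:
    """Prepend an 8-byte type 0 Route Distinguisher to an IPv4 address."""
    rd = struct.pack("!HHI", 0, asn, assigned)   # type, AS number, assigned number
    return rd + socket.inet_aton(ip)             # 8B RD + 4B address = 12B

# Two customers using the same private address, kept apart by different RDs.
a = vpn_ipv4(65000, 1, "192.115.243.19")
b = vpn_ipv4(65000, 2, "192.115.243.19")
print(a.hex())  # 0000fde800000001c073f313
print(a != b)   # True
```

The two 12-byte values differ only in the RD, so a PE can hold both routes without ambiguity.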
64 L2VPN vs. L3VPN
- L2VPN
- C switch connects to L2 circuits
- BGP or LDP
- all L3 traffic types
- only Ethernet L2
- Cs responsible for routing
- overlay model
- simple customer-SP interface
- C peering scales as VPN size
- scaling problem
- L3VPN
- C router peers with SP router
- BGP
- limited to IP traffic
- supports different L2 technologies
- SP responsible for routing
- peer model
- complex customer-SP interface
- C peering independent of VPN size
- scales well