Title: Control
1 Control
- Fred Kuhns
- fredk_at_arl.wustl.edu
- Applied Research Laboratory
- Department of Computer Science and Engineering
- Washington University in St. Louis
2 Virtual Networking Basic Concepts
- Substrate Links interconnect adjacent Substrate Routers.
- A Substrate Router hosts one or more Meta Router instances.
- Meta Links interconnect adjacent Meta Routers.
  - Defined within substrate link context.
- Substrate links may be tunneled within existing networks (IP, MPLS, etc.).
3 Adding a Node
- Install new substrate router
- Define meta-links between meta nodes (routers or hosts)
- Create substrate links between peers
- Instantiate meta router(s)
4 System Components
- General purpose processing engines (PE/GP).
  - Shared: PlanetLab VM environment.
    - Local PlanetLab node manager to configure and manage VMs
      - vserver, vnet may change to support substrate functions
    - Implement substrate functions in kernel
      - rate control, mux/demux, substrate header processing
  - Dedicated: no local substrate functions
    - May choose to implement substrate header processing and rate control.
    - Substrate uses VLANs to ensure isolation (VLAN bound to MRid)
    - Can use 802.1Q priorities to isolate traffic further.
- NP blades (PE/NP).
  - Shared: user supplies parse and header formatting code.
  - Dedicated: user has full access to and control over the hardware device
- General Meta-Processing Engine (MPE) notes
  - Use loopback to enforce rate limits between dedicated MPEs
  - Legacy node modeled as dedicated MPE; use loopback blade to remove/add substrate headers.
- Substrate links: interconnect substrate nodes
  - Meta-links defined within their context.
  - Assume an external entity configures end-to-end meta-nets and meta-links
5 Switch
- Switch Blade Specs
  - Promentum ATCA-2210
  - http://www.radisys.com/products/ds-page.cfm?productdatasheetsid=1191
  - 20-port 10GE fabric switch
    - 14 10GE links to user slots
    - 4 10GE links for external connections (up/cross links) on front panel
  - 24-port 1GE Base switch
    - 14 1GE links to user slots
    - 1GE link to redundant switch blade
    - 1 10GE and 4 1GE links for external connections (up/cross links) on front panel
  - Wire-speed L2 and L3 switching
  - 4K IEEE 802.1Q VLANs
  - Etc.
- Traversing the Switch
  - Switching is based on Ethernet Destination Address
  - Isolation is based on VLAN.
  - One VLAN will be assigned to each MetaNet present on a Substrate Router.
  - All switch traffic for a MetaNet will be required to use its assigned VLAN.
  - Frames from a MetaNet will only be transmitted to a port which is allowed to receive the specified VLAN.
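The switch-traversal rules above can be sketched as a small model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the forwarding table is keyed on (VLAN, destination address), and unknown destinations are dropped rather than flooded, which is a simplification of what a real L2 switch does.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FabricSwitch:
    """Toy model of VLAN-based isolation (hypothetical names throughout)."""
    # port number -> set of VLAN ids the port is allowed to receive
    allowed_vlans: dict = field(default_factory=dict)
    # (vlan, destination Ethernet address) -> output port
    mac_table: dict = field(default_factory=dict)

    def enable_vlan(self, port: int, vlan: int) -> None:
        self.allowed_vlans.setdefault(port, set()).add(vlan)

    def learn(self, vlan: int, mac: str, port: int) -> None:
        self.mac_table[(vlan, mac)] = port

    def forward(self, vlan: int, dst_mac: str) -> Optional[int]:
        """Return the output port, or None if the frame is dropped."""
        # Switching is based on the Ethernet destination address ...
        port = self.mac_table.get((vlan, dst_mac))
        if port is None:
            return None  # simplification: no flooding in this sketch
        # ... and isolation on the VLAN: the port must accept this VLAN.
        if vlan not in self.allowed_vlans.get(port, set()):
            return None
        return port


switch = FabricSwitch()
switch.enable_vlan(port=3, vlan=10)          # MetaNet on VLAN 10 may use port 3
switch.learn(vlan=10, mac="02:00:00:00:00:0a", port=3)
switch.learn(vlan=20, mac="02:00:00:00:00:0b", port=3)  # VLAN 20 not enabled on port 3
```

With this setup a VLAN 10 frame for the first address reaches port 3, while any VLAN 20 frame is dropped because port 3 was never enabled for VLAN 20.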
6 Packet Processing
- Key features
  - 16 32-bit 1.4 GHz microengines
  - peak instruction rate >20 GIPS
  - 8 hw contexts per processor
  - support >50 i/byte (input and output)
  - pipeline connections for streaming
  - four QDR SRAM interfaces and three RDRAM interfaces
  - high IO bandwidth (up to 20G)
  - XScale control processor
  - encryption/decryption engine
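As a rough check of the peak instruction rate quoted above, assume each microengine issues at most one instruction per clock cycle (an assumption; the slide does not state the issue rate):

```latex
16 \times 1.4\ \text{GHz} \times 1\ \tfrac{\text{instruction}}{\text{cycle}}
  = 22.4\ \text{GIPS} > 20\ \text{GIPS}
```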
7 System Architecture
- General purpose blades.
  - shared blades run Plab OS
    - no change to current apps
  - also support dedicated blades
  - use separate blade server to preserve ATCA slots for NPs
- NP blades.
  - support dedicated PEs
    - control from Vserver on PE/GP
  - shared PE options
    - shared NP for fast path
    - shared NP with plugins
- 10 GE fabric switch
  - VLANs used to isolate metarouters
  - uplinks for connecting to multiple chassis
- Good ratio of PEs to LC: 3:1
[Figure: chassis layout. Labels: compute blade with disk; Radisys 7010; Radisys 7010 with RTM; up to 10 1GE interfaces; Line Card; PE/GP; PE/NP; 10 GE Switch; Switch Blade; 1 GE for control, 10 Gb/s for data.]
8 Block Diagram of a Meta-Router
- Control/Management using Base channel (Control Net, IPv4)
- Meta Interfaces (MI): MIs are connected to meta-links
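[Figure: meta-router block diagram. Labels: meta-interfaces 0 to 5 with rates 1G, 1G, .5G, 2G, 1G, .5G; MPEk1, MPEk2, MPEk3; control, data path, data path; internal rates .1G, .1G, 3G, 3G, .1G, .1G.]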
- MPEs are interconnected in the data plane by a meta-switch. Each packet includes the Meta-Router and Meta-PE identifier.
- Some Substrate-detected errors or events are reported to the Meta-Router control MPE.
- Meta-Processing Engines (MPE)
  - virtual machine, COTS PC, NPU, FPGA
  - PEs differ in ease of programming and performance
  - MR may use one or more PEs, possibly of different types
- The first Meta-Processing Engine (MPE) assigned to Meta-Network MNetk is called MPEk1.
[Figure labels: Meta Switch, Meta-Router.]
9 System Block Diagram
[Figure: system block diagram. Labels: RTM, 10 x 1GbE, LC, PE/NP, PE/GP, PCI, GP CPU, xscale, NPU-A, NPU-B, TCAM, 2x1GE, GbE interface, Fabric Ethernet Switch (10Gbps, data path), Base Ethernet Switch (1Gbps, control), I2C (IPMI), map VLANX to VLANY, Node Server, Loopback, user login accounts, Node Manager, Shelf manager.]
10 Top-Level View (exported) of the Node
Exported Node Resource List (Processing engines, Substrate Links):
- PE/GP: (control, IPaddr) (platform, x86) (type, linux_vserver)
- PE/NP: (control, IPaddr) (platform, IXP2800) (type, IXP_SHARED)
- S-Link: (type, p2p) (peer, _Desc_) (BW, XGbps)
- PE/GP: (control, IPaddr) (platform, x86) (type, dedicated)
- PE/NP: (control, IPaddr) (platform, IXP2800) (type, IXP_DEDICATED)
- S-Link: (type, p2p) (peer, XXX) (BW, XXGbps)
[Figure labels: Node Server, Substrate Control, user login accounts, Node Manager.]
11 Substrate: Enabling an MR
- Allocate control-plane MPE (required)
- Allocate data-plane MPEs
- Enable VLANk on fabric switch ports
- Enable control over Base switch (IP-based)
- Update shared MPEs for MI and inter-MPE traffic
- Define Meta-Interface mappings
- Use loopback to define interfaces internal to the system node.
- Update host with local Net gateway
[Figure: Meta-Router MR1 for MNetk. Labels: Host (located within node); PE, PE, PE; ports 0 to 7; local; 10GbE (fabric); loopback; LC, LC; Line card; Substrate.]
12 Block Diagram
- Line cards map each received packet to an MR and MI (lookup table).
- Each (MR, MI) pair is assigned its own rate-controlled queue.
- Meta-Interfaces are rate controlled.
- Meta-net control and management functions (configure, stats, routing, etc.) communicate with the MR over the separate base switch.
[Figure: node block diagram. Labels: Line Card (x4); Lookup table; map to (MR, MI); Shared PE; Dedicated PE; Shared PE/NP; Shared PE/GP; MR1, MR2, MR3, MR4, MR5; MR5 MI1; Fabric Switch; VMM; VM manager; Node Server; Node M.; meta-router; meta-net5 control; Base switch (control); Internet; App-level service. Open questions marked on the figure: "VMM?", "slice/MN VMs?".]
13 Partitioning the Control Plane
- Substrate manager
  - Initialization: discover system HW components and capabilities (blades, links, etc.)
  - Hides low-level implementation details
  - Interacts with shelf manager for resetting boards or detecting failures.
- Node manager
  - Initialization: request system resource list
  - Operational: allocate resources to meta-networks (slice authorities?)
  - Request substrate to reset MPEs
- Substrate assumptions
  - All MNets (slices) with a locally defined meta-router/service (sliver) have a control process to which the substrate can send exception packets and event notifications.
  - Communication
    - out-of-band uses Base interface and internal IP addresses
    - in-band uses data plane and MPE id.
  - Notifications
    - ARP errors, improperly formatted frame, interface down/up, etc.
  - If a meta-link is a pass-through link then the Node manager is responsible for handling meta-net level errors/event notification, for example when a link goes down.
14 Initialization: Substrate Resource Discovery
- Creates list of devices and their Ethernet addresses
  - Network Processor (NP) blades
    - Type: network-processor, Arch: ixp2800, Memory: 768MB (DRAM), Disk: 0, Rate: 5Gbps
  - General Processor (GP) blades
    - Type: linux-vserver, Arch: X, Memory: X, Disk: X, Rate: X
  - Line Card blades
    - not exposed to node manager, used to implement meta-interfaces
    - another entity creates substrate links to interconnect peer substrate nodes.
    - create table mapping line card blades, physical links and Ethernet addresses.
- Internal representation
  - Substrate device ID: <ID, SDid>
  - If device has a local control daemon: <Control, IP Address>
  - Type: Processing Engine (NP/GP)
    - <Platform, (Dual IXP2800 | Xeon ???)>, <Memory, >, <Storage, >, <Clock, (1.4GHz???)>, <Fabric, 10GbE>, <Base, 1GbE>, ???
  - Type: Line Card
    - <Platform, Dual IXP2800>, <Ports, <Media, Ethernet>, <Rate, 1Gbps>>, ???
  - Substrate Links
    - <Type, p2p>, <Peer, Ethernet Address>, <Rate Limit>,
    - Meta-Link list: <MLid, MLI>, <MR, MRid>,
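The discovery output described above could be held in records like the following. This is a minimal sketch, not the substrate's actual data model: the field names, the example Ethernet and control addresses, and the physical link names are all made up for illustration. Only the NP blade attributes (ixp2800, 768MB DRAM, disk 0, 5Gbps) come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubstrateDevice:
    sd_id: int                        # <ID, SDid>
    dev_type: str                     # "network-processor", "linux-vserver" or "line-card"
    ether_addr: str                   # Ethernet address recorded at discovery
    control_ip: Optional[str] = None  # <Control, IP Address>, if a local daemon exists
    arch: str = "X"
    memory: str = "X"
    disk: str = "X"
    rate: str = "X"


def export_resources(devices):
    """Records visible to the node manager: line cards are not exposed."""
    return [d for d in devices if d.dev_type != "line-card"]


def line_card_table(devices, links):
    """Map line card SDid -> (Ethernet address, list of physical links)."""
    return {d.sd_id: (d.ether_addr, links.get(d.sd_id, []))
            for d in devices if d.dev_type == "line-card"}


# Illustrative inventory; addresses and link names are hypothetical.
devices = [
    SubstrateDevice(1, "network-processor", "02:00:00:00:00:01", "10.0.0.1",
                    arch="ixp2800", memory="768MB (DRAM)", disk="0", rate="5Gbps"),
    SubstrateDevice(2, "linux-vserver", "02:00:00:00:00:02", "10.0.0.2"),
    SubstrateDevice(3, "line-card", "02:00:00:00:00:03"),
]
physical_links = {3: ["port0", "port1"]}

exported = export_resources(devices)
lc_table = line_card_table(devices, physical_links)
```

Keeping line cards in a separate table mirrors the slide's point that they are used to implement meta-interfaces rather than being offered to the node manager as processing engines.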
15 Initialization: Exported Resource Model
- List of available elements
  - Attributes of interest?
    - Platform: IXP2800, PowerPC, ARM, x86
    - Memory: DRAM/SRAM
    - Disk: XGB
    - Bandwidth: 5Gbps
    - VM_Type: linux-vserver, IXP_Shared, IXP_Dedicated, G__Dedicated
    - Special: TCAM
  - network-processor: NP-Shared, NP-Dedicated
  - General purpose: GP-Shared (linux-vserver), GP-Dedicated
  - Each element is assigned an IP address for control (internal control LAN)
- List of available substrate links
  - Access networks (expect Ethernet LAN interface): substrate link is multi-access
    - Attributes: Access: multi-access, Available Bandwidth, Legacy protocol(s) (e.g. IP), Link protocol (e.g. Ethernet), Substrate ARP implementation.
  - Core interface: assume point-to-point, bandwidth controlled
    - Attributes: Access: Substrate, Bandwidth, Legacy protocol?
16 Instantiate a router: Register MNet
- Substrate assumptions
  - All MNets (slices) with a locally defined meta-router/service (sliver) will have defined a control process to which the substrate can send exception packets and event notifications.
  - Communication: out-of-band uses Base interface and internal IP addresses, in-band uses data plane. ???
  - Notifications: ARP errors, improperly formatted frame, interface down/up, etc.
  - If a meta-link is a pass-through link then the Node manager is responsible for handling errors/event notification.
- Node manager actions
  - Request binding of MNidk to allocated device (use SDid from initialization)
  - Substrate enables VLANk on applicable ports of the fabric switch
  - Allocate hardware resources (see following discussion for different scenarios)
  - If control module already instantiated then notify it of the MR location (IP address of control interface).
  - If creating control entity then register it with any line cards with meta-router interfaces (for exception traffic). ???
17 Instantiate a router: Register Meta-Router (MR)
- Define MR-specific Meta-Processing Engines (MPE)
  - Register MR ID MRidk with substrate
    - substrate allocates VLANk and binds it to MRidk
  - Request Meta-Processing Engines
    - shared or dedicated, NP or GP; if shared then relative allocation (rspec)
    - shared: implies internal implementation has support for substrate functions
    - dedicated w/substrate: user implements substrate functions.
    - dedicated no/substrate: implies substrate will remove any substrate headers from data packets before delivering to MPE. For legacy systems.
    - indicate if this MPE is to receive control events from substrate (Control_MPE).
    - substrate returns MPE id (MPid) and control IP (MPip) address for each allocated MPE
    - substrate internally records Ethernet address of MPE and enables VLAN on applicable port
    - substrate assumes that any MPE may send data traffic to any other MPE
    - MPE specifies target MPE rather than MI when sending packet.
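The registration steps above might look roughly like this from the substrate's side. It is a sketch under several assumptions: the `Substrate` class, its method names and the device tuples are hypothetical, VLAN ids are simply handed out in increasing order, and every request consumes a whole device (sharing one device among several MRs is not modeled).

```python
from itertools import count


class Substrate:
    """Hypothetical substrate-side bookkeeping for MR and MPE registration."""

    def __init__(self, devices):
        # Each device: (kind, ethernet_address, control_ip, fabric_port).
        self.free = list(devices)
        self.vlan_of_mr = {}     # MRid -> VLAN
        self.mpes = {}           # MPid -> allocation record
        self.port_vlans = {}     # fabric port -> set of enabled VLANs
        self._vlans = count(2)   # assumption: start at 2, leaving VLAN 1 unused
        self._mpids = count(1)

    def register_mr(self, mr_id):
        """Allocate a VLAN for the MR and bind it to the MR id."""
        if mr_id not in self.vlan_of_mr:
            self.vlan_of_mr[mr_id] = next(self._vlans)
        return self.vlan_of_mr[mr_id]

    def request_mpe(self, mr_id, kind, mode, control_mpe=False):
        """Allocate one MPE of the given kind; return (MPid, MPip)."""
        vlan = self.vlan_of_mr[mr_id]   # KeyError if the MR was never registered
        for i, dev in enumerate(self.free):
            if dev[0] == kind:
                _, ether, ip, port = self.free.pop(i)
                break
        else:
            raise LookupError(f"no free {kind} device")
        mpid = next(self._mpids)
        # Record the MPE's Ethernet address and enable the MR's VLAN on its port.
        self.mpes[mpid] = {"mr": mr_id, "mode": mode, "ether": ether,
                           "control": control_mpe}
        self.port_vlans.setdefault(port, set()).add(vlan)
        return mpid, ip


substrate = Substrate([
    ("NP", "02:00:00:00:00:11", "10.0.0.11", 3),
    ("GP", "02:00:00:00:00:12", "10.0.0.12", 4),
])
vlan = substrate.register_mr("MR1")
ctl_mpid, ctl_ip = substrate.request_mpe("MR1", "GP", "shared", control_mpe=True)
np_mpid, np_ip = substrate.request_mpe("MR1", "NP", "dedicated w/substrate")
```

Here the GP engine becomes the control MPE and the NP engine a data-plane MPE, both reachable on the same VLAN.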
18 Instantiate a router: Register Meta-Router (MR)
- Create meta-interfaces (with BW constraints)
  - create meta-interfaces associated with external substrate links
    - request meta-interface id (MIid) be bound to substrate link x (SLx).
    - we need to work out the details of how an SL is specified
    - we need to work out the details of who assigns inbound versus outbound meta-link identifiers (when they are used). If there is a downstream node then some entity (node manager?) reports the outgoing label. This node assigns the inbound label.
    - multi-access substrate/meta link: node manager or meta-router control entity must configure meta-interface for ARP. Set local meta-address and send destination address with output data packet.
    - substrate updates tables to bind MI to receiving MPE (i.e. where substrate sends received packets)
  - create meta-interfaces for delivery to internal devices (for example, legacy PlanetLab nodes)
    - create meta-interface associated with an MPE (i.e. the end system)
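One possible shape for the meta-interface bindings is sketched below. The slide leaves open how an SL is specified and who assigns labels, so everything here is an assumption: the table layout, the method names, per-link sequential inbound labels assigned locally, and an outbound label that is filled in later when some other entity reports it.

```python
class MetaInterfaceTable:
    """Hypothetical binding table: MI -> substrate link, labels, receiving MPE."""

    def __init__(self):
        self.bindings = {}      # (mr_id, mi_id) -> binding record
        self.inbound = {}       # (sl_id, inbound label) -> (mr_id, mi_id, mpid)
        self._next_label = {}   # sl_id -> next free inbound label

    def bind(self, mr_id, mi_id, sl_id, bw_mbps, recv_mpid):
        """Bind an MI to a substrate link; this node picks the inbound label."""
        label = self._next_label.get(sl_id, 1)
        self._next_label[sl_id] = label + 1
        self.bindings[(mr_id, mi_id)] = {
            "sl": sl_id, "bw_mbps": bw_mbps,
            "in_label": label, "out_label": None, "mpe": recv_mpid,
        }
        self.inbound[(sl_id, label)] = (mr_id, mi_id, recv_mpid)
        return label

    def set_outbound_label(self, mr_id, mi_id, label):
        """Record the outgoing label reported for the downstream node."""
        self.bindings[(mr_id, mi_id)]["out_label"] = label

    def receiving_mpe(self, sl_id, in_label):
        """Where the substrate sends a packet received with this label."""
        entry = self.inbound.get((sl_id, in_label))
        return None if entry is None else entry[2]


table = MetaInterfaceTable()
label_a = table.bind("MR1", 0, "SL1", bw_mbps=500, recv_mpid=1)
label_b = table.bind("MR2", 0, "SL1", bw_mbps=100, recv_mpid=2)
table.set_outbound_label("MR1", 0, 42)
```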
19 Line Cards: Assumptions
- Initially use a simplified model
  - Core interfaces have point-to-point substrate links which correspond (physically or logically) to physical links.
  - LAN interfaces only support legacy IP traffic
20 Scenarios
- Shared PE/NP: send request to device controller on the XScale
  - Allocate memory for MR Control Block
  - Allocate microengine and load MR code for Parser and Header Formatter
  - Allocate meta-interfaces (output queues) and assign bandwidth constraints
- Dedicated PE/NP
  - Notify device control daemon that it will be a dedicated device. May require loading/booting a different image?
- Shared GP
  - use existing/new PlanetLab framework
- Dedicated GP
  - legacy PlanetLab node
  - other
21 IPv4
- Create the default IPv4 Meta-Router, initially in the non-forwarding state.
  - Register MetaNet: output Meta-Net ID MNid
  - Instantiate IPv4 router: output Meta-Router ID MRid
- Add interfaces for legacy IPv4 traffic
  - Substrate supports defining a default protocol handler (Meta-Router) for non-substrate traffic.
  - for protocol = IPv4, send to IPv4 meta-router (specify the corresponding MPE).
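The default protocol handler could be a simple demultiplexer on the Ethernet type field, sketched here. The EtherType 0x0800 for IPv4 is standard; the value used to mark substrate frames is purely an assumption for this example (an IEEE local experimental EtherType), as are the class and method names. 802.1Q-tagged frames are not handled in this sketch.

```python
import struct

ETHERTYPE_IPV4 = 0x0800        # standard EtherType for IPv4
ETHERTYPE_SUBSTRATE = 0x88B5   # ASSUMPTION: local experimental EtherType as a stand-in


class ProtocolDemux:
    """Hypothetical demux: substrate frames vs. legacy traffic with a default handler."""

    def __init__(self):
        self.default_handlers = {}   # EtherType -> MPid of the default meta-router

    def set_default_handler(self, ethertype, mpid):
        self.default_handlers[ethertype] = mpid

    def classify(self, frame):
        """Return ("substrate", None), ("legacy", mpid) or ("drop", None)."""
        if len(frame) < 14:          # shorter than an untagged Ethernet header
            return ("drop", None)
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype == ETHERTYPE_SUBSTRATE:
            return ("substrate", None)
        mpid = self.default_handlers.get(ethertype)
        if mpid is None:
            return ("drop", None)    # no default handler registered
        return ("legacy", mpid)


demux = ProtocolDemux()
demux.set_default_handler(ETHERTYPE_IPV4, mpid=7)   # MPE of the IPv4 meta-router

ipv4_frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + b"payload"
arp_frame = bytes(12) + struct.pack("!H", 0x0806) + b"payload"
substrate_frame = bytes(12) + struct.pack("!H", ETHERTYPE_SUBSTRATE) + b"payload"
```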
22 General Control/Management
- Meta-routers use Base channel to send requests to control entity on associated MPE devices
- Node manager sends requests to central substrate manager (xml-rpc?)
  - requests to configure, start/stop and tear down meta-routers (MPEs and MIs).
- Substrate enforces isolation and polices/monitors meta-router sending rates.
  - Rate exceeded error: if an MPE violates rate limits then its interface is disabled and the control MPE is notified (over Base channel).
- Shared NP
  - XScale daemon
  - requests: start/stop forwarding; allocate shared memory for table; get/set statistic counters; set/alter MR control block; add/remove lookup table entries.
  - Lookup entries can be added to send data packets to control MPE; packet header may contain a tag to indicate the reason the packet was sent
  - mechanism for allocating space for MR-specific code segments.
- Dedicated NP
  - MPE controls XScale. When XScale boots, a control daemon is told to load a specific image containing user code.
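The rate-exceeded behaviour can be illustrated with a token bucket. The slide does not say how sending rates are measured, so the token bucket is a swapped-in technique chosen for this sketch, and the class name, the event string and the callback standing in for the Base channel are all hypothetical. Time is passed in explicitly so the example is deterministic.

```python
class RateMonitor:
    """Token bucket that disables the interface on the first violation."""

    def __init__(self, rate_bps, burst_bytes, notify):
        self.rate = rate_bps / 8.0       # refill rate in bytes per second
        self.burst = float(burst_bytes)  # bucket depth in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0
        self.enabled = True
        self.notify = notify             # stand-in for a Base channel message

    def send(self, nbytes, now):
        """Return True if a packet of nbytes may be sent at time now (seconds)."""
        if not self.enabled:
            return False
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes > self.tokens:
            # Violation: disable the interface and tell the control MPE.
            self.enabled = False
            self.notify("RATE_EXCEEDED")
            return False
        self.tokens -= nbytes
        return True


events = []
monitor = RateMonitor(rate_bps=8000, burst_bytes=1500, notify=events.append)
results = [
    monitor.send(1000, now=0.0),   # within the initial burst
    monitor.send(1000, now=0.5),   # 500 left + 500 refilled = 1000 tokens
    monitor.send(1000, now=0.6),   # only about 100 tokens: violation
    monitor.send(1, now=10.0),     # interface stays disabled
]
```

Re-enabling the interface is left out; presumably that would be an explicit request from the control side.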
23 ARP for Access Networks
- The substrate offers an ARP service to meta-routers
- Meta-router responsibilities
  - before enabling an interface it must register the meta-network address associated with the meta-interface
  - send destination (next-hop) meta-net address with packets (part of substrate internal header). Substrate will use ARP with this value.
  - if meta-router wants to use a multicast or broadcast address then it must also supply the link layer destination address. So the substrate must also export the link layer type.
- Substrate responsibilities
  - all substrate nodes on an access network must agree on meta-net identifiers (MLIs)
  - issues ARP requests/responses using supplied meta-net addresses and meta-net id (MLI).
  - maintain ARP table and time out entries according to relevant RFCs.
  - ARP Failed error: if ARP fails for a supplied address then substrate must send packet (or packet context) to control MPE of meta-router.
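A minimal sketch of the substrate-side ARP table follows, assuming entries are keyed on (MLI, meta-net address) and expire after a fixed timeout. The class, the `resolve` callback that stands in for the actual request/response exchange, the event string and the 60-second timeout are all assumptions made for the example.

```python
class SubstrateArp:
    """Hypothetical ARP cache keyed on (MLI, next-hop meta-net address)."""

    def __init__(self, timeout_s, resolve, to_control_mpe):
        self.timeout = timeout_s
        self.resolve = resolve                # stand-in for an ARP request/response
        self.to_control_mpe = to_control_mpe  # delivers failed packets to the control MPE
        self.table = {}                       # (mli, meta_addr) -> (link_addr, expiry)

    def lookup(self, mli, next_hop, packet, now):
        """Return the link layer address for next_hop, or None on ARP failure."""
        key = (mli, next_hop)
        entry = self.table.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                   # fresh cache hit
        self.table.pop(key, None)             # missing or timed out
        link_addr = self.resolve(mli, next_hop)
        if link_addr is None:
            # ARP failed: hand the packet to the meta-router's control MPE.
            self.to_control_mpe("ARP_FAILED", packet)
            return None
        self.table[key] = (link_addr, now + self.timeout)
        return link_addr


requests = []
failures = []
known = {"meta-addr-1": "02:00:00:00:00:01"}   # illustrative neighbor


def fake_resolve(mli, addr):
    requests.append((mli, addr))
    return known.get(addr)


arp = SubstrateArp(60.0, fake_resolve, lambda reason, pkt: failures.append((reason, pkt)))
first = arp.lookup(5, "meta-addr-1", b"p1", now=0.0)    # triggers a request
cached = arp.lookup(5, "meta-addr-1", b"p2", now=30.0)  # served from the table
expired = arp.lookup(5, "meta-addr-1", b"p3", now=61.0) # entry timed out, re-resolved
missing = arp.lookup(5, "meta-addr-2", b"p4", now=61.0) # resolution fails
```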