Title: Endsystem Support for Network Virtualization
1 Endsystem Support for Network Virtualization
2 Overview
- Context
- Endsystem networking model
- Protocol instances: user or kernel space?
  - pros and cons
  - explore user-space protocols
  - propose a kernel-level model
3 Context: Virtual (Diversified) Networking
[Figure labels: substrate link, substrate router, virtual router, virtual link, virtual end-system]
4 Simulates Star Topology for Substrate Links
[Figure labels: VLANX1, VLANX2, ..., VLANXN; vNetX VR1]
- Internetworking over a diversified network
- Ethernet example
  - VLANs are used to provide the equivalent of a virtualized wire connecting an endsystem to a specific access router.
  - All vnets on an endsystem share a common VLAN.
  - Use priority queuing (802.1P/Q) to isolate vnet traffic.
  - Use admission control (static or dynamic) to provide bandwidth guarantees to vnet traffic.
  - The substrate layer on endsystems enforces per-VLAN and per-vnet bandwidth constraints.
- Each host-to-substrate-router connection is assigned a distinct VLAN, so N hosts imply N VLANs on the Ethernet.
- An alternative is to define one VLAN tree for each protocol suite (i.e., vnet).
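As a rough illustration of how a frame could carry both the per-host VLAN and an 802.1P priority, the sketch below packs and parses the 4-byte 802.1Q tag (TPID 0x8100, then 3 bits of priority, 1 DEI bit, and a 12-bit VLAN ID). The function names and the VLAN/priority assignment are assumptions made for this example; the slides do not specify them.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def build_vlan_tag(vlan_id: int, priority: int, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag: TPID followed by PCP/DEI/VID."""
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority (PCP) must fit in 3 bits")
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes):
    """Return (vlan_id, priority) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag[:4])
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0x0FFF, tci >> 13


# Hypothetical assignment: this host reaches its substrate router over
# VLAN 101, and vnetX traffic is sent at the highest priority.
tag = build_vlan_tag(vlan_id=101, priority=7)
```

With one VLAN per host, only the VLAN ID changes between hosts; the priority field is what separates vnet traffic from best-effort traffic on the shared Ethernet.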
5 vnetX traffic uses high priority queues
[Figure: Ethernet hub with high- and low-priority TX queues]
6 Substrate Link as a VLAN Tree
[Figure label: VLANX]
- One VLAN is used for all virtual net traffic to/from a substrate router.
7 Multiple Substrate Links
- Three VLANs are used for all virtual net traffic to/from a substrate router.
- Corresponds to 3 substrate links:
  - Low priority: default for best-effort traffic
  - Medium priority: for virtual nets with soft performance requirements (average bandwidth)
  - High priority: for isochronous or low-delay, interactive applications
[Figure labels: VLANdgram, VLANhigh, VLANmed]
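One way to picture the three substrate links sharing a transmit path is strict priority scheduling: always drain the high queue first, then medium, then the best-effort datagram queue. This is a minimal sketch under that assumption; the slides name the three classes but do not say which scheduling discipline is used, and the class and queue names here are hypothetical.

```python
from collections import deque

# Highest priority first; names mirror VLANhigh, VLANmed, VLANdgram.
PRIORITY_ORDER = ("high", "med", "dgram")


class StrictPriorityTx:
    """Transmit side with one FIFO queue per substrate link class."""

    def __init__(self):
        self.queues = {name: deque() for name in PRIORITY_ORDER}

    def enqueue(self, vlan_class, frame):
        self.queues[vlan_class].append(frame)

    def dequeue(self):
        """Return (class, frame) from the highest non-empty queue, or None."""
        for name in PRIORITY_ORDER:
            if self.queues[name]:
                return name, self.queues[name].popleft()
        return None
```

Strict priority can starve the lower classes, which is one reason admission control on the high and medium classes matters.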
8 Multiple vNets per Host
[Figure labels: ether addr/vlan (x3); vlan 1, vlan 2, vlan 3]
- Substrate link: serves to connect an endsystem to a substrate router. Virtualization of a physical cable or wire: a packet enters one end, exits the other, and is opaque within. Simplex or duplex?
- Substrate interface (need better term?): endsystem abstraction representing a substrate link.
  - Ethernet: <interface, VLAN, dest>.
  - Could be an IP tunnel.
  - Not required to be point-to-point.
- Virtual link: represents the logical interconnection of adjacent network nodes for a given protocol suite.
  - Point-to-point. Simplex or duplex?
- Virtual interface: endsystem abstraction representing one end of a virtual link. The substrate defines the mechanism for multiplexing onto a common substrate link, for example a virtual link identifier (VLI) in a substrate header. Simplex or duplex?
[Figure labels: ethernet LAN; filter on Ethernet address and VLAN membership for substrate router]
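To make the multiplexing idea concrete, here is a small sketch of several virtual interfaces sharing one substrate link by means of a VLI carried in a substrate header. The header layout (16-bit VLI plus 16-bit payload length) and all names are assumptions for illustration only; the slides do not define the actual header format.

```python
import struct

# Hypothetical substrate header: 16-bit VLI followed by 16-bit payload length.
SUBSTRATE_HDR = struct.Struct("!HH")


def encapsulate(vli, payload):
    """Prefix a virtual-net packet with the substrate header."""
    return SUBSTRATE_HDR.pack(vli, len(payload)) + payload


def decapsulate(frame):
    """Return (vli, payload); the payload itself stays opaque."""
    vli, length = SUBSTRATE_HDR.unpack_from(frame)
    payload = frame[SUBSTRATE_HDR.size:SUBSTRATE_HDR.size + length]
    if len(payload) != length:
        raise ValueError("truncated substrate frame")
    return vli, payload


class SubstrateInterface:
    """One substrate link shared by several virtual interfaces."""

    def __init__(self):
        self.virtual_interfaces = {}  # VLI -> list of received payloads

    def add_virtual_interface(self, vli):
        self.virtual_interfaces[vli] = []

    def receive(self, frame):
        """Demultiplex on the VLI; return False if no interface matches."""
        vli, payload = decapsulate(frame)
        queue = self.virtual_interfaces.get(vli)
        if queue is None:
            return False
        queue.append(payload)
        return True
```

The substrate only looks at the VLI; what the payload means is left to the protocol suite behind that virtual interface.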
9 Multiple next hop VRs?
[Figure labels: Host A on vnetX; vNetX VR1, vNetX VR2, vNetX VR3; VLANXA1, VLANXA2, VLANXA3; ethernet switched LAN]
- Not a fundamental part of the model, but it is consistent with the current model used for TCP/IP in endsystems.
- Allows us to implement TCP/IP as a virtual net protocol without changing the basic model.
10 TCP/IP as an Example Protocol
[Figure labels: IP; IP Route Table; vint0 (eth0, VLANX), LL info: SR1 addr, VLI; standard ethernet interface; ethernet device; direct connect; VLANX; ethernet LAN; Substrate Router SR1]
- Substrate interface: Ethernet interface. Destination address by ARP.
  - Directly connected: destination IP address -> ARP -> enet addr
  - Gateway: gateway's IP -> ARP -> enet addr
- VLAN virtual interface:
  - Directly connected: not used; model only for internetworking
  - Gateway: VLI assigned by substrate; ethernet dest. addr is that of the substrate router (SR1)
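The route lookup on this slide can be sketched as a longest-prefix match that returns a virtual interface together with its link-level information (the substrate router's Ethernet address and the VLI). The table contents, addresses, and field names below are hypothetical, chosen only to mirror the vint0 (eth0, VLANX) label in the figure.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualInterface:
    name: str     # e.g. "vint0"
    device: str   # underlying Ethernet device, e.g. "eth0"
    vlan: str     # substrate link (VLAN) used to reach the substrate router
    sr_addr: str  # Ethernet address of the substrate router (assumed value)
    vli: int      # virtual link identifier assigned by the substrate


class RouteTable:
    def __init__(self):
        self.routes = []  # list of (network, VirtualInterface)

    def add(self, prefix, vint):
        self.routes.append((ipaddress.ip_network(prefix), vint))

    def lookup(self, dst):
        """Longest-prefix match; returns the virtual interface or None."""
        addr = ipaddress.ip_address(dst)
        matches = [(net, v) for net, v in self.routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]


table = RouteTable()
table.add("0.0.0.0/0", VirtualInterface("vint0", "eth0", "VLANX",
                                        "02:00:00:00:00:01", 17))
```

Because every packet on a virtual interface goes to the substrate router, the lookup result already contains the Ethernet destination and no per-destination ARP is needed on that path.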
11 OS Kernel Block Diagram
[Figure: OS kernel block diagram. Labels, roughly top to bottom:]
- User Space (Applications)
- Socket Interface; ops; AST processing; callback
- TCP; IP; routes; TC/AST; qdisc; poll
- SW int (AST); task management; util; scheduler; callout Q
- Hardware independent layer: clock handler; process accounting, scheduling, time management; device independent I/O; ethernet
- Interrupt processing
- Hardware dependent layer: configuration registers, MMU (TLB, cache, VM), bus and peripherals; system exception handlers; eth0, uart, timer; OS ISR demux
- Hardware: HW interrupt/exception
12 User or kernel Space protocols?
- Each has pros and cons
- User space protocols
  - easier to implement and debug
  - easier to introduce new protocols (not tightly dependent on socket layer knowing about the new protocol)
  - easier to isolate and protect protocols and apps from each other (leverage process model)
- Kernel level protocols
  - easier to integrate into existing framework (simplifies support for system interface functions like select/poll)
  - simplifies intra-protocol security and protection (since protocol runs within trusted kernel)
  - simplifies kernel demultiplexing to correct protocol context (endpoint)
  - increased efficiency
13 User Space Protocol Implementation
- Uncommon outside of the high-performance community, which wants zero-copy and specialized demux keys.
- Problems: asynchronous processing, life cycle, authentication and demultiplexing to endpoints
  - latency in delivering packets (e.g., acks) to user space
  - increased overhead in per-packet processing before a drop/keep decision is made
  - processing received acks
  - timeouts and retransmissions
  - establishing connections and security: snooping, masquerading
  - supporting select and poll
  - protocols where the connection may outlive the process (TCP's TIME_WAIT)
  - global routing and address resolution tables
  - global connection tables
  - need to know what other ports are being used (locally)
  - accepting/rejecting new connections
14 Assumptions
- Assumptions
  - Applications using different VNs (or no VN) will need to communicate using the various IPC mechanisms
  - We want to manage all aspects of network I/O but not the use of other traditional resources (memory, files, etc.)
  - CPU, memory and interface bandwidth controlled at the virtual net granularity
  - intra-VN, implementers should have the mechanisms to support QoS and security
  - simple mechanism for adding new protocols/VNs
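As one possible reading of "interface bandwidth controlled at the virtual net granularity", the sketch below keeps a token bucket per virtual net. The token bucket is a standard technique swapped in for illustration; the slides do not say which mechanism would be used, and the rates, burst sizes, and names are assumptions. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Token bucket limiting one virtual net's transmit bandwidth."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0       # refill rate in bytes per second
        self.burst = float(burst_bytes)  # maximum bucket depth in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0                  # time of the previous update

    def allow(self, nbytes, now):
        """Return True and consume tokens if nbytes may be sent at time now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical per-vnet limits enforced by the substrate layer.
limits = {
    "vnetX": TokenBucket(rate_bps=8000, burst_bytes=1500),
    "vnetY": TokenBucket(rate_bps=16000, burst_bytes=3000),
}
```

Keeping one bucket per vnet means a busy virtual net exhausts only its own tokens and cannot consume another vnet's share.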
15 User Space Protocols
- Chandramohan A. Thekkath, Thu D. Nguyen, Evelyn Moy, Edward D. Lazowska, Implementing network protocols at user level, IEEE/ACM Transactions on Networking (TON), v.1 n.5, pp. 554-565, Oct. 1993
- Chris Maeda, Brian Bershad, Protocol Service Decomposition for High-Performance Networking, Proceedings of the 14th ACM Symposium on Operating Systems Principles, pp. 244-255, December 1993
- Aled Edwards, Steve Muir, Experiences implementing a high performance TCP in user-space, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, pp. 196-205, 1995
- Kieran Mansley, Engineering a User-Level TCP for the CLAN Network, Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications, pp. 228-236, 2003
16 User-Space Protocols: Global Issues
- Routing: direct packets to/from the correct endpoint/interface
  - How is traffic demultiplexed and sent to the correct endpoint/process?
    - In-kernel filters
  - Where are the routing tables and how are they maintained?
    - route fixed when the connection is established, or located in shared memory
- Control (using IPv4 as an example)
  - Address resolution protocols/tables?
  - Other control protocols, for example ICMP, IGRP, others?
  - Where are the routing protocols implemented?
- Management
  - Must manage a protocol's namespace (for example, port numbers in IPv4).
  - Common programming technique: allow the protocol instance to select the local address part
    - specify port 0 and addr 0, then the implementation will assign correct values
  - Passive connect model?
    - In IPv4 a server listens on a port (host, port, proto) for a connection request. To establish a connection, a unique (to the endsystem) port number is assigned and a new socket allocated.
  - Socket-oriented system calls must be supported. On UNIX, must support non-blocking I/O with select and poll.
  - Connection lifetime may outlast the process.
    - For example TCP TIME_WAIT, or simply waiting for a final ack or resending if no ack is received.
- Security: we must provide sufficient mechanisms for protocol developers
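The namespace-management point (port 0 means "pick one for me") can be sketched as a small allocator that a per-vnet control daemon might own. The class name, port range, and error handling are assumptions for illustration; they are not taken from the slides or from any particular OS.

```python
class PortNamespace:
    """Port namespace for one protocol instance, owned by a single manager."""

    def __init__(self, low=49152, high=65535):
        self.low, self.high = low, high  # range used for automatic assignment
        self.in_use = set()
        self.next_port = low

    def bind(self, port=0):
        """Bind an explicit port, or let the namespace choose when port is 0."""
        if port == 0:
            port = self._pick_free()
        elif port in self.in_use:
            raise OSError("port already in use")
        self.in_use.add(port)
        return port

    def _pick_free(self):
        span = self.high - self.low + 1
        for i in range(span):
            candidate = self.low + (self.next_port - self.low + i) % span
            if candidate not in self.in_use:
                self.next_port = candidate + 1
                return candidate
        raise OSError("no free ports")

    def release(self, port):
        self.in_use.discard(port)
```

The point of the sketch is that the table of ports in use has to live in one place that every process using the protocol can reach, which is exactly what is awkward when each process runs its own user-space protocol library.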
17 User Space Configurations
- Given these global issues there are two likely configurations:
  - all traffic passes through a common protocol daemon in user space
  - a control daemon implements a basic set of control functions while a user library implements the majority of data path functions
- Prior work has shown the latter approach to be superior.
- Having all traffic pass through a common protocol daemon => at least one extra copy operation (kernel -> daemon -> user process)
- A better solution is for a daemon to insert relatively simple packet filters in the kernel for established connections, which direct incoming packets to endpoints and filter outgoing packets from endpoints.
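A minimal sketch of the filter approach, assuming filters are exact matches on a connection 4-tuple and are grouped by VLI: packets for established connections go straight to the owning endpoint, and anything else on a known vnet falls back to that vnet's control daemon. The data layout and the "daemon"/"drop" return values are illustrative assumptions, not the design in the slides.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int


class VnetDemux:
    """In-kernel demux stage: VLI selects a filter set, FlowKey an endpoint."""

    def __init__(self):
        self.filters = {}  # VLI -> {FlowKey: endpoint}

    def register_vnet(self, vli):
        self.filters.setdefault(vli, {})

    def insert_filter(self, vli, key, endpoint):
        # Called on behalf of the control daemon once a connection is set up.
        self.filters.setdefault(vli, {})[key] = endpoint

    def remove_filter(self, vli, key):
        self.filters.get(vli, {}).pop(key, None)

    def deliver(self, vli, key):
        """Return the endpoint, "daemon" for unmatched traffic, or "drop"."""
        if vli not in self.filters:
            return "drop"  # no such virtual net on this endsystem
        return self.filters[vli].get(key, "daemon")
```

Only connection setup and teardown pass through the daemon, so established traffic avoids the extra kernel -> daemon -> user process copy.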
18 User-Space Passive Open
[Figure labels: vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet]
0. listen/accept (passive open)
1. connection request (in)
2. ack (out)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections
- Outgoing: compare against the connection-specific outgoing filter.
- Incoming: use the VLI to access the incoming filters and use them to demux to a filter set and/or socket.
19 User-Space Active Open
[Figure labels: vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet]
0. connect
1. connection request (out)
2. ack (in)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections
- Outgoing: compare against the connection-specific outgoing filter.
- Incoming: use the VLI to access the incoming filters and use them to demux to a filter set and/or socket.
20 User-Space Datagram (Connectionless)
Daemon fills in the local address and binds to the socket. No restrictions on destination.
[Figure labels: vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet]
0. open(any)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local address)
3. data, established connections
- Outgoing: compare against the connection-specific outgoing filter.
- Incoming: use the VLI to access the incoming filters and use them to demux to the socket. In this case only the local part is used.
21 User-Space Datagram (Connectionless)
Daemon fills in both local and destination addresses. Destination restricted.
[Figure labels: vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet]
0. open(local and remote addr)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local and remote)
3. data, established connections
- Outgoing: compare against the connection-specific outgoing filter.
- Incoming: use the VLI to access the incoming filters and use them to demux to the socket.
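The difference between the two connectionless cases comes down to whether the filter's remote part is a wildcard. A rough sketch, assuming a filter is a (local, remote) pair where remote may be None to mean "any destination"; the types, addresses, and socket names are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

Address = Tuple[str, int]  # (address, port)


class DatagramFilter(NamedTuple):
    local: Address
    remote: Optional[Address]  # None: only the local part is used

    def matches(self, local, remote):
        if self.local != local:
            return False
        return self.remote is None or self.remote == remote


def demux(filters, local, remote):
    """Return the socket of the first matching filter, or None to drop."""
    for flt, sock in filters:
        if flt.matches(local, remote):
            return sock
    return None


filters = [
    # Unrestricted: any peer may reach this socket.
    (DatagramFilter(("10.0.0.1", 5000), None), "sockA"),
    # Restricted: only one peer is accepted.
    (DatagramFilter(("10.0.0.1", 6000), ("10.0.0.9", 53)), "sockB"),
]
```

The same matching rule can be applied on the outgoing side, so a restricted socket cannot send to a destination other than the one the daemon filled in.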
22 User-Space App exits
TCP enters TIME_WAIT after close.
[Figure labels: vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux; connection filters; ethernet; drop]
1. connection close (out)
2. ack (in/out)
3. remove filters
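One way to read this sequence: when the application exits, the control daemon takes over the connection, handles the close and final acks, keeps the filters in place for the TIME_WAIT period, and then removes them so that late packets are dropped at the demux. The sketch below models only that bookkeeping; the 60-second value, the class, and the return strings are assumptions, and time is passed in explicitly.

```python
TIME_WAIT_SECONDS = 60.0  # hypothetical duration, not taken from the slides


class ControlDaemon:
    """Tracks who owns each connection filter and when it expires."""

    def __init__(self):
        self.filters = {}  # connection id -> (owner, expiry time or None)

    def established(self, conn):
        self.filters[conn] = ("app", None)

    def app_exited(self, conn, now):
        # The connection outlives the process: the daemon becomes the owner
        # until TIME_WAIT ends.
        if conn in self.filters:
            self.filters[conn] = ("daemon", now + TIME_WAIT_SECONDS)

    def demux(self, conn, now):
        """Return "app", "daemon", or "drop" for a packet arriving at now."""
        entry = self.filters.get(conn)
        if entry is None:
            return "drop"
        owner, expiry = entry
        if expiry is not None and now >= expiry:
            del self.filters[conn]  # step 3: remove filters
            return "drop"
        return owner
```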
23 Extensible protocol frameworks in the kernel
- Herbert Bos, Bart Samwel, Safe Kernel Programming
in the OKE, Proceedings of the fifth IEEE
Conference on Open Architectures and Network
Programming, June 2002
24 OKE
- Context: For performance reasons it is useful to permit third parties to load optimized modules into the kernel.
- Problem: Third-party code is untrusted, so loading it into the kernel will compromise system security and reliability. Could use a safe execution environment like Java, but that incurs expensive runtime checks.
- Solution: create a set of mechanisms and policies to permit non-root users to safely load untrusted application modules into kernel space with minimal impact on runtime performance.
  - Safety: use a trusted compiler to enforce policies (constraints). The constraints are designed to ensure the untrusted module will not adversely affect the kernel (core and loadable modules) or unrelated processes.
  - User privileges: vary the enforced constraints based on user privileges (customizable language)
  - Termination: well-defined termination boundaries to protect system state
  - Enforcement: static and dynamic checks, language extensions
  - Ease of use: familiar development environment using Cyclone (type-safe C extension) and kernel modules.
- Contribution: definition of a safe kernel programming environment that meets competing needs:
  - performance
  - safety
  - ease of use
  - hosted in a commodity OS
25 Considerations
- Identified areas where modules may impact system behavior:
  - Program correctness: language restrictions for safety, and enforced coding conventions
  - Memory access: static and dynamic enforcement of memory access rules
  - Kernel module access: static and dynamic enforcement of kernel module (interface) access restrictions
  - Resource usage: bounded (deterministic or limited)
26 Pushing protocols into the Kernel
- Positives
  - All the issues associated with user-space protocols simply go away: global tables and connection state live in the kernel, for the lifetime of the kernel
  - Performance, efficiency, existing code base
  - Enhances intra-protocol security
  - Simplifies integration with existing network I/O subsystems and interfaces
- Negatives
  - Isolation: more difficult to isolate the system from protocol instances. Inter-protocol isolation is difficult.
  - Security: proving trust/security is more difficult
  - Implementation and debugging are more difficult in the kernel
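27 Kernel-Space Protocols
Rework!
[Figure: kernel-space protocol stack. Labels, roughly top to bottom:]
- Application(s); /dev/protoX, /dev/vnet
- User Space (Applications)
- Endpoints: udpport, tcpport, rawIP, vnetep, vnetep
- Protocols: TCP, UDP, RAW IP, vnet
- TCP instances: TCP1, TCP2, ..., TCPn
- TCP/IP; IP; routes (route to interface)
- SW interrupt; HW interrupt
- Hardware: HW interrupt/exception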