Title: Reflections on the Development of Active Programmable Networks

1. Reflections on the Development of Active Programmable Networks
- Ken Calvert
- University of Kentucky
- Laboratory for Advanced Networking
2. Some Personal History
- First AN discussions 1995 w/Zegura, Bhattacharjee
- Initial skepticism
  - "Why would you want to do that?"
- Excitement
  - About the power of the concept
  - About the prospect of developing a new Internet
- Resignation
  - About the difficulty of convincing people of the need
  - "What we have seems to work fine?"
- DARPA Active Nets (later FTN) program from June 1997 through June 2004
  - CANEs project (Georgia Tech)
  - ActiveCast (University of Kentucky/Georgia Tech)
- Architectural Framework, 1998
  - Defined roles, interfaces for EEs, NodeOS
3. What Active Networking is About (from a presentation ca. 2000)
- Research Challenge
  - What abstractions make up the programming interface?
  - What node functionality is useful to applications?
  - How can services be composed while preserving correctness and performance?
  - How can the network be protected against malicious or malfunctioning programs?
  - How can a programming interface scale to 10^8 users and nodes, 10^6 concurrent flows, 10^9 packets/second?
4. Three Projects
- CANEs (1997-2002)
- Concast (1999-2003)
- Ephemeral State Processing (1999-present)
5. CANEs Project Goals
- Prototype an EE supporting structured composition of independently-implemented network-based functionality
  - Modularity: primitive elements + composition mechanism
  - Model: Unix tools (awk/grep/ls/cat/sort/...) + pipes
- Show benefits of user-controlled functionality in the net
  - Bring application knowledge and network knowledge together in space-time
  - Application-specific adaptation to congestion
  - In-network caching
  - Reliable multicast (with UMass/TASC)
  - Mobility
6. CANEs Project Goals
- Allow for fast forwarding of plain old vanilla traffic
  - Generic Forwarding Function that could be implemented in hardware
- Reason formally about composition and resulting global behavior
  - Establish correctness of the underlying fixed functionality
  - Identify sufficient conditions for user-supplied code to preserve that correctness
7. CANEs Packet-Processing Model
[Diagram: Generic Forwarding Function between incoming and outgoing channels, with predefined slots where customizing code (e.g., active congestion control) is inserted]
8. Application-Intelligent Discard for MPEG
- Principle: P, B frames depend on I frames
  - Frames spread over many packets
  - GOP: (typically) one I frame, few P frames, many B frames
- Discard approaches
  - Discard application-layer units (e.g., frames, GOPs)
  - Static priorities (e.g., I frame higher than P, B)
  - Drop P, B if corresponding I already dropped
  - Evict P, B from queue to make room for I
- Evaluation metrics
  - Application-layer quality (e.g., SNR, I-frames received)
  - Network impact (e.g., received bytes discarded)
9. Experiment Configuration
[Diagram: a background traffic source and an MPEG source (avg rate 725 kbps) feed an active IP router ahead of a 2 Mbps bottleneck link]
10. Result: I-frames Received
One active router, bottleneck 2 Mbps, MPEG source averages 725 kbps
11. Result: Data Discarded at Receiver
12. Result: Frame-by-frame Behavior
13. CANEs Lessons Learned
- Few applications need customized processing at every hop
  - Capsule model is overkill
- Useful, powerful model: system-supplied fixed processing + user-supplied variable processing
  - Fixed functionality can be hardened, optimized
  - User-supplied functionality can be constrained for safety
  - Eases burden of proving correctness
  - Less general than language-based approaches
- Importance of timer-driven processing
- Importance of naming
  - topologies
  - reusable configurations of underlying + injected programs
14. Three Projects
- CANEs (1997-2002)
- Concast (1999-2003)
- Ephemeral State Processing (1999-present)
15. ActiveCast

New Ideas
- Network service scalability through anonymity
  - Deploying active code should not require
    - Explicit knowledge of topology
    - Enumeration of specific sites
  - → hide details of finding, activating nodes
- Concast: N-to-1 service, dual of multicast
  - Single address represents many senders
  - Many sends → one receive
- Anycast, Speccast
  - Packets delivered to any/every node satisfying user specification (e.g., "dest: any within 2 km of pt. x, with capabilities ...")
- Ephemeral State Processing
  - Use small, fixed amount of state + short, fixed lifetime

[Diagram: anycast "deploy" to a node X; concast "merge" toward the receiver]

Impact
- Enable application-friendly active networks by packaging the power of a programmable network platform into easy-to-use, yet customizable network services.
- Manifold increase in efficiency/scalability of applications by hiding details of group size in all aspects; extend benefits of multicast to both directions of transmission.
- Applicable to a wide range of many-to-many applications: sensor data collection/fusion, routing/dissemination of real-time data.

Schedule (June 1999 to May 2002)
- Design, specify concast and anycast APIs
- Prototype anycast implementation
- Anycast specification language
- Prototype concast service
- Prototype net recon service
- Evaluate, refine concast spec
- Analyze design parameters for network recon
- Publish concast, anycast APIs
- Comparative analysis of anycast performance

PIs: University of Kentucky (K. Calvert, J. Griffioen); Georgia Institute of Technology (E. Zegura)
16. Concast Motivating Problem
- Many multicast applications involve feedback
  - Retransmission requests for reliability
  - Loss rates for congestion avoidance
- Sending feedback via existing channels is ugly
  - Sender deals with individual receivers, destroying abstraction
  - Implosion limits scalability
- No many-to-one channel exists!
17. Our Solution: Concast Network Service
- Scalability through abstraction
  - Single identifier (concast group ID) represents an arbitrary number of senders
- Benefits both receiver and network
  - Multiple sends result in a single message delivery
  - Trade additional processing in routers for reduced bandwidth requirements
18. Concast Semantics
- Conservative: hardwire various merge semantics into the network; user selects at flow setup time
- Liberal: user specifies merge computation to be carried out by the network (intermediate systems)
  - E.g., by downloading Java bytecodes
- Challenge
  - Allow customization of merge semantics
  - Within a practically-implementable framework
  - That limits resource consumption (and other dangers)
19. Concast Programming Interface
- Application-defined Merge Specification comprises:
  - getTag(IPDatagram) → Tag
    - Returns a tag extracted from the packet
    - Packets p, p' merged iff getTag(p) = getTag(p')
  - merge(MergeState, IPDatagram, FlowState) → MergeState
    - Updates state of the merge computation for incoming packet
  - done(MergeState) → boolean
    - Returns true when ready to forward merged packet
  - buildDatagram(MergeState) → IPDatagram
    - Constructs the packet to be forwarded from saved state
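To make the four operations concrete, here is a hedged Python sketch of one merge specification: computing the maximum of values sent by group members (one of the application-independent merging uses mentioned later). The class name, dict-based datagrams, and the "expected group size" termination rule are illustrative assumptions, not the Concast implementation.

```python
class MaxMergeSpec:
    """Illustrative merge specification: forward the maximum of the values
    sent by `group_size` senders in the same application-level round."""

    def __init__(self, group_size):
        self.group_size = group_size  # expected number of senders (assumption)

    def get_tag(self, dgram):
        # Packets carrying the same round number are merged together
        return dgram["round"]

    def merge(self, state, dgram, flow_state):
        # Fold one packet into the running (max, count) merge state
        if state is None:
            state = {"max": dgram["value"], "count": 0}
        state["max"] = max(state["max"], dgram["value"])
        state["count"] += 1
        return state

    def done(self, state):
        # Ready to forward once every member's contribution is merged
        return state["count"] >= self.group_size

    def build_datagram(self, state):
        # Construct the single outgoing packet from saved merge state
        return {"round": None, "value": state["max"]}
```

Any associative, commutative operator (sum, min, count) fits the same four-operation shape, which is what makes the interface reusable across applications.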
20. Generic Hop-by-Hop Processing

ProcessDatagram(IPAddr R, ConcastGroupID G, IPDatagram m)
  FlowState fsb; MergeState s; Tag t

  fsb = lookupFlow(R, G)
  if (fsb != null) {
    t = fsb.getTag(m)
    s = getMergeState(t, fsb)
    s = updateTTL(s, m)
    s = fsb.merge(s, m, fsb)
    if (fsb.done(s)) {
      (s, m) = fsb.buildDatagram(s)
      forwardDG(fsb, s, m)
    }
    putMergeState(fsb, s, t)
  }
  // else drop quietly
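The hop-by-hop loop above can be sketched as runnable Python. The flow-block class, the packet-counting merge, and the resetting of per-tag state after forwarding are assumptions made so the sketch is self-contained; the original defines these through the merge specification and NodeOS support.

```python
class FlowBlock:
    """Illustrative flow state block: counts packets per tag and says a
    merged summary is ready after `n` arrivals."""

    def __init__(self, n):
        self.n = n
        self.merge_state = {}  # tag -> per-tag merge state (here, a count)

    def get_tag(self, m):
        return m["tag"]

    def merge(self, s, m):
        return (s or 0) + 1

    def done(self, s):
        return s >= self.n

    def build_datagram(self, s):
        return {"tag": None, "merged": s}


def process_datagram(flows, R, G, m, forwarded):
    """One hop's processing: look up the flow, merge, forward when done."""
    fsb = flows.get((R, G))
    if fsb is not None:
        t = fsb.get_tag(m)
        s = fsb.merge(fsb.merge_state.get(t), m)
        if fsb.done(s):
            forwarded.append(fsb.build_datagram(s))
            s = None  # assumption: per-tag state resets after forwarding
        fsb.merge_state[t] = s
    # else: drop quietly
```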
21. Ways to Use Concast
- Application-specific merging
  - Filtering, aggregating telemetry
  - Merging media streams (demonstrated: audio, video)
- Application-independent merging protocols
  - Collecting maximum (or any associative, commutative operator) of group members' sent values
  - E.g., reliable multicast feedback
- Protocol-independent generic services
  - Duplicate suppression (based on hash of IP payload)
  - Aggregation of small packets (TCP acks) [ICNP 2000]
22. Small-packet Aggregation: The Problem
- Small packets require disproportionate resources
  - There is always a fixed per-packet overhead
  - Today, forwarding lookups are most expensive
  - Router performance: packets/second (not bytes/sec)
  - Goal: minimum-sized datagrams at wire speed
- Small packets are a significant fraction of traffic
  - TCP acknowledgements: 40 bytes
  - CAIDA (1998 data):
    - Half of all packets are 50 bytes or less
    - 60% of all packets are 100 bytes or less
23. Solution Idea
- Aggregate small packets traveling in the same direction into larger packets
  - Delay small packets for aggregation
  - Break up downstream (at destination) for ultimate delivery
- Benefits
  - For network: reduced switching load in some places
    - Amortize one forwarding lookup over multiple packets
  - For application:
    - Fewer lost acks → better throughput
- Dangers
  - For network: increased processing load in some places
  - For application: reduced throughput under certain conditions
24. Multiplexing With Concast: Senders
[Diagram: sender protocol stack; Applications over TCP/UDP, a Mux layer, then IP with Concast, over the Network Interface]
25. Multiplexing With Concast: Routers
[Diagram: router stack; Mux Merge alongside Concast at the IP layer, between two Network Interfaces]
26. Multiplexing With Concast: Receiver
[Diagram: receiver stack; a Demux layer above IP with Concast, over the Network Interface, feeding TCP/UDP]
27. Multiplex Packet Structure
- IP Header
- Multiplex Header: Initial Time-to-Live, Max Total Delay Allowed, Amount Delayed So Far, Max Per-hop Delay
- Payload 0: Source Address, Original TTL, Protocol, Payload Length, payload data
- Payload 1
- ...
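As a sketch of how the fields above fit together on the wire, here is an illustrative encoding/decoding pair in Python. The field widths (one byte for TTLs and protocol, two bytes for delays and lengths, four-byte source addresses) are assumptions; the slide does not specify them.

```python
import struct

# Assumed field widths for the sketch only:
# multiplex header: initial TTL (B), max total delay (H),
#                   amount delayed so far (H), max per-hop delay (H)
MUX_HDR = struct.Struct("!BHHH")
# per-payload header: source address (4s), original TTL (B),
#                     protocol (B), payload length (H)
SUB_HDR = struct.Struct("!4sBBH")

def pack_multiplex(init_ttl, max_delay, delayed, hop_delay, payloads):
    """Build the multiplex packet body: header, then each sub-payload."""
    out = MUX_HDR.pack(init_ttl, max_delay, delayed, hop_delay)
    for src, ttl, proto, data in payloads:
        out += SUB_HDR.pack(src, ttl, proto, len(data)) + data
    return out

def unpack_multiplex(buf):
    """Recover the multiplex header and the list of sub-payloads."""
    hdr = MUX_HDR.unpack_from(buf)
    off, result = MUX_HDR.size, []
    while off < len(buf):
        src, ttl, proto, n = SUB_HDR.unpack_from(buf, off)
        off += SUB_HDR.size
        result.append((src, ttl, proto, buf[off:off + n]))
        off += n
    return hdr, result
```

Carrying each payload's original TTL and protocol is what lets the demux point downstream reconstruct the constituent datagrams for ultimate delivery.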
28. Router Processing Context
[Diagram: multiplex packets are diverted to Concast Processing and a holding area for delayed multiplex packets, while non-concast packets pass through unmodified]
29. Evaluating Effectiveness: ns2 Simulation Study
- Example application: web server
  - Many simultaneous TCP connections
  - Multiplex TCP acks only
- Simulated workload
  - 4KB web page transmitted to 200 clients
- Two traffic scenarios
  - Low-loss: minimal UDP cross traffic
  - High-loss: add 40 TCP flows of cross traffic
- All senders specify the same Max Local Delay
- Effectiveness metrics
  - Total throughput of all connections
  - Fraction of aggregated acks
30. Expected Behavior
- Increasing delay → increased aggregation
  - Decreased loss due to packet-oriented queue sizes
  - Increased throughput
- Increasing delay increases RTT
  - Longer slow-start
  - Longer completion time
  - Decreased throughput
- Aggregation more effective when queues are full
31. High-Loss Scenario
32. Low-Loss Scenario
33. Concast Partial Deployment Benefits
- Edges of the network generally have a greater compute/transmission bandwidth ratio
- Deploy concast at domain egress nodes
[Diagram: many senders (S) spread across domains, with concast merging at egress nodes toward a single receiver (R)]
34. Partial Deployment Effectiveness
- Number of packet-hops to collect a value from every group member
- 4900-node transit-stub graphs
35. Concast Lessons Learned
- Partial deployment (at domain boundaries) can provide substantial benefits
- Fixed + variable framework provides a defensible programming interface
- Trust is a potential show-stopper
  - Problem: setting up concast sessions across multiple service providers
  - Mutual distrust:
    - Providers want only paying customers to get this premium service
    - Users want only trustworthy providers' nodes handling their packets
  - Anonymity means users have to rely on providers to enforce their policies
- Problem-specific solutions are easier to sell than generic platforms
  - Experience with IETF Reliable Multicast Transport working group
36. Three Projects
- CANEs (1997-2002)
- Concast (1999-2003)
- Ephemeral State Processing (1999-present)
37. The Building-Block Approach to Extending Router Functionality
- Internet Protocol philosophy
  - Keep router functions simple
  - Push responsibility for constructing end-to-end services to end systems
- Building-block function should be:
  - Flexible: applicable to more than one kind of problem
    - ...including presently unknown problems
  - Useful: deployable today
    - ...to solve (or assist in solving) one or more real problems
  - Scalable: IP-like (i.e., bounded) resource requirements
    - ...that can be handled on/near the fast path in hardware
38. User-controlled State
- Conventional wisdom: not scalable
  - Too expensive to provide for 100K flows
  - Overhead: setup, soft-state refresh, garbage collection
- Limiting factors
  - Time-space product of memory usage (per flow)
  - Management overhead (including trust)
- Solution: Ephemeral State
  - A fixed-lifetime, associative store
  - Store and retrieve fixed-size values
  - Identified by fixed-size, randomly-chosen tags
  - Bindings persist for a fixed time, then vanish
  - No management overhead
39. Ephemeral State Store
- Set of pairs of natural numbers (t, v)
  - At most one pair in the set for any value of t
- Access functions
  - put(t, v): establishes that the set contains (tag, value)
  - get(t): if there exists v such that (t, v) is in the set, returns v; else returns null
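A minimal sketch of the store, assuming a 10-second lifetime (the actual lifetime is a fixed system parameter; the value here is arbitrary) and lazy expiry on access rather than a background sweep:

```python
import time

class EphemeralStateStore:
    """Sketch of an ephemeral state store: (tag, value) bindings that
    vanish after a fixed lifetime, with no explicit cleanup protocol."""

    def __init__(self, lifetime=10.0, clock=time.monotonic):
        self.lifetime = lifetime
        self.clock = clock
        self.store = {}  # tag -> (value, expiry time)

    def put(self, tag, value):
        # At most one binding per tag; (re)binding restarts the lifetime
        self.store[tag] = (value, self.clock() + self.lifetime)

    def get(self, tag):
        entry = self.store.get(tag)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self.store[tag]  # binding has vanished
            return None
        return value
```

The point of the design is what is absent: no setup, no refresh messages, no garbage-collection protocol, because every binding's time-space product is bounded by construction.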
40. ESP: Ephemeral State Processing
- Ephemeral State Store (ESS)
  - Associative memory: set of (64-bit tag, 64-bit value) pairs
  - One ESS per processing context
- Packet-borne instructions (one per packet)
  - Each instruction defines a fixed-length computation
  - Operands: values in ESS, packet fields, node-specific values
  - Comparable to machine instructions of a general-purpose computer
  - On termination, forward or discard packet
- Routers support a common instruction set
- Wire protocol
  - ESP instructions carried in payload or shim header
  - Packets recognized and executed hop-by-hop
- End-to-end services
  - Construct by sequencing packets in time and space
41. Example ESP Instruction

COUNT(p)   // p: packet; p.C: tag carried in ESP header;
           // p.thresh: immediate value carried in ESP header
  c = get(p.C)
  if (c is null) c = 0
  c = c + 1
  put(p.C, c)
  if (c <= p.thresh) forward p else discard p

- Effect: bind tag C to a count of packets processed; forward those for which the count on arrival is less than p.thresh
42. Input/Output/Central Processing Contexts
[Diagram, built up across slides 42-45: normal IP input and output processing connected by the switch fabric, with ESP placed in the input context, the output context, both contexts, or a central context]
46. Uses for ESP
- Controlling packet flow
  - Make drop/forward decisions based on state at node or interface
  - Example: duplicate suppression
- Identifying interior nodes with specific properties
  - Reveal just enough of the topology to find what is needed
  - Example: finding multicast branch points for unicast-based service
- Processing user data
  - Simple hierarchical computations scale better
  - Example: aggregating feedback from a multicast group (a la Concast)
47. Network-Processor Implementation
[Diagram: network processors attached on either side of an off-the-shelf router]
- Add ESP to an existing router
- Non-ESP packets pass straight through
48. Example Application
[Diagram: off-the-shelf router with ESP running on network processors at the MAC layer]
- Add ESP to an existing router
- Non-ESP packets pass straight through
49. Example Application
[Diagram: same configuration, showing the diverted path]
- Add ESP to an existing router
- Non-ESP packets pass straight through
- ESP packets diverted for processing
50. Implementing ESP on the IXP 2400: Performance
One MicroEngine (out of 8), 600 MHz; stream of COUNT instructions; input rate 954 Mbps; F = fraction of packets creating new tag bindings

Number of Threads   Throughput (F = 7%)   Throughput (F = 28%)
1                   383 Mbps              347 Mbps
2                   662 Mbps              605 Mbps
4                   954 Mbps              871 Mbps
8                   954 Mbps              954 Mbps
51. Implementing ESP on the IXP 2400: Performance
52. ESP Lessons Learned
- Simplicity is the key to scalability
  - Lightweight functionality enables port-based processing, parallelism
- "Too cheap to meter" eliminates trust issues
  - (cf. IP's best-effort datagram service)
- Memory bandwidth is a constant bottleneck
53. Some Final Observations
- Our focus has moved closer to the hardware over time
  - That is A Good Thing!
- Finding truly compelling example applications is very hard
  - Note: I do not refer to a "killer app"
  - I am not convinced that sensor networks change this observation
- Getting researchers to agree on a vision is even harder
- Every new network service should be designed with a business model (and corresponding trust model) in mind
- Exploring the concept of programmable networks has been an extremely interesting, rewarding endeavor
54. Acknowledgement
- Work described in this talk was done in collaboration with colleagues and students, whose contributions are gratefully acknowledged:
  - Ellen Zegura, Jim Griffioen, James Sterbenz, Bobby Bhattacharjee, Wen Su, Billy Mullins, Amit Sehgal, Leonid Poutievski, Srinivasan Venkatramen
- Along with the support of DARPA, Intel Corporation, Cisco Systems, and the US National Science Foundation