Title: Fast Incremental Updates for Pipelined Forwarding Engines
1. Fast Incremental Updates for Pipelined Forwarding Engines
- Authors: Anindya Basu, Girija Narlikar
- Published in: IEEE/ACM Transactions on Networking, 2005
- Presenter: Yen Cheng Liu
- Date: 11/30
2. Outline
- Introduction
- Background
- Solving the pipelined architecture problem
- Route update characteristics
- Memory optimization
- Reducing write bubbles
3. Introduction
- The paper focuses on ASIC-based packet forwarding engines that utilize pipelining
- Main issues when handling route updates:
- The memory allocated to the trie must be balanced across stages
- The memory locations modified by an update must be limited in number and balanced across stages
4. Introduction
- Main contributions of the paper:
- An algorithm to build a trie with balanced memory allocation across pipeline stages
- Multiple optimizations aimed at reducing the number of modifications in each stage due to route updates
- A software-based scheme to process updates (similar to a shadow trie)
- Flexible
- Cost effective
5. Background
6. Pipelined Lookups Using Tries
- Each trie level is stored in a different pipeline stage
- A leaf-pushed trie is used
- The longest matching prefix is always at a leaf of the traversed path
- Updates are applied using write bubbles
- Each bubble consists of a sequence of (stage, location, value) triples, at most one triple per stage (see the sketch below)
- Minimizing the number of write bubbles reduces the disruption to the lookup process
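A minimal sketch of how such a write bubble might be represented in software; the class and field names here are illustrative choices, not the paper's data structures:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PipelineWrite:
    stage: int     # pipeline stage (trie level) to modify
    location: int  # memory address within that stage
    value: int     # new entry value (e.g., child pointer or next-hop index)

@dataclass
class WriteBubble:
    # One slot per stage; None means the bubble performs no write in that stage.
    # Keeping at most one write per stage lets the bubble occupy each stage for
    # only a single cycle as it flows down the pipeline.
    writes: List[Optional[PipelineWrite]]

def make_bubble(num_stages: int, triples: List[PipelineWrite]) -> WriteBubble:
    slots: List[Optional[PipelineWrite]] = [None] * num_stages
    for w in triples:
        assert slots[w.stage] is None, "at most one write per stage per bubble"
        slots[w.stage] = w
    return WriteBubble(slots)
```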
7. Solving the Pipelined Architecture Problem
- Forwarding engine model:
- A trie component that constructs and updates the routing trie
- A packing component that packs writes from a batch of consecutive route updates into write bubbles that are sent down the pipeline (a greedy sketch follows)
- A pipeline component that actually simulates the traversal of these write bubbles through a multi-stage pipeline
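A hedged sketch of what a packing component could do under the one-write-per-stage-per-bubble assumption; it ignores any ordering constraints between the writes of a single route update across stages, and the function name is mine:

```python
from collections import defaultdict, deque

def pack_into_bubbles(raw_writes, num_stages):
    """Greedily pack a batch of (stage, location, value) writes into write
    bubbles, each carrying at most one write per stage, while preserving the
    order of writes within every individual stage."""
    per_stage = defaultdict(deque)            # stage -> FIFO of (location, value)
    for stage, location, value in raw_writes:
        per_stage[stage].append((location, value))

    bubbles = []
    while any(per_stage[s] for s in range(num_stages)):
        bubble = [None] * num_stages          # one slot per pipeline stage
        for stage in range(num_stages):
            if per_stage[stage]:
                bubble[stage] = per_stage[stage].popleft()
        bubbles.append(bubble)
    return bubbles
```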
8. Forwarding engine model
9. Solving the Pipelined Architecture Problem
- Assumptions:
- The initial trie construction takes as input a snapshot of the entire routing table
- Bubbles are processed by the pipeline in the same order as they are generated by the packing component
- Only tries with fixed strides are considered
- Focus on leaf-pushed tries
- Writes to different pipeline stages can be combined into a single write bubble
- The packing component is permitted to pack pipeline writes from multiple route updates into a single write bubble
- Focus on IPv4 lookups
- The next-hop information is stored in a separate next-hop table that is distinct from the pipelined trie
10. Routing table observations
- Because 24-bit prefixes dominate today's routing tables, most route updates affect 24-bit prefixes
- The number of short prefixes is very low; however, an update to a short prefix causes a large number of modifications (the first level often has a stride of 12-16 bits)
- The address blocks allocated to an ISP's customers are sub-blocks of the address block allocated to the ISP, so the prefixes corresponding to the customers of a given ISP are typically neighboring 24-bit prefixes
- A link failure (recovery) in an ISP network disconnects (reconnects) some or all of its customer networks, which are represented by neighboring prefixes in the routing trie
- A large proportion of routes that are withdrawn get added back a few minutes later
11. Memory optimization
- Designing non-pipelined tries
- Controlled prefix expansion (CPE) constructs memory-efficient fixed-stride tries for the set of prefixes in a routing table using dynamic programming
- CPE notation:
- nodes(i): the number of nodes at level i of the 1-bit trie
- If one level terminates at bit position i and the next level terminates at bit position j (j > i), that next level requires nodes(i+1) * 2^(j-i) trie entries
- T[j, r]: the minimum memory required to cover the first j+1 bits using r trie levels
12. Designing non-pipelined tries
- The (r-1)th level is terminated at the bit position m that minimizes the total memory (a code sketch of this DP follows):
- T[j, r] = min over m in {r-2, ..., j-1} of ( T[m, r-1] + nodes(m+1) * 2^(j-m) )
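A runnable sketch of this dynamic program (the classic CPE recurrence); the function name, the boundary-recovery bookkeeping, and the input format node_counts[i] = number of 1-bit-trie nodes at level i are my choices, not taken from the paper:

```python
import math

def cpe_min_memory(node_counts, W, k):
    """CPE DP as summarized on the slides. node_counts[i] = number of nodes at
    level i of the 1-bit trie (root = level 0, at least W entries), W = address
    length in bits (32 for IPv4), k = number of trie levels. Returns the minimum
    total memory in trie entries and the terminating bit position of each level."""
    INF = math.inf
    # T[j][r]: minimum memory needed to cover bit positions 0..j with r levels.
    T = [[INF] * (k + 1) for _ in range(W)]
    choice = [[-1] * (k + 1) for _ in range(W)]

    for j in range(W):
        T[j][1] = 2 ** (j + 1)              # one level with a stride of j+1 bits
    for r in range(2, k + 1):
        for j in range(r - 1, W):           # need at least one bit per level
            for m in range(r - 2, j):       # previous level ends at bit position m
                cost = T[m][r - 1] + node_counts[m + 1] * 2 ** (j - m)
                if cost < T[j][r]:
                    T[j][r], choice[j][r] = cost, m

    # Recover the terminating bit position of each of the k levels.
    bounds, j = [], W - 1
    for r in range(k, 1, -1):
        bounds.append(j)
        j = choice[j][r]
    bounds.append(j)
    return T[W - 1][k], bounds[::-1]
```

For example, for IPv4 and an 8-level trie this would be called as cpe_min_memory(node_counts, 32, 8), with node_counts taken from the 1-bit trie built over the prefix set.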
13. Implications for memory usage and update performance
- CPE does not attempt to distribute the memory equally across pipeline stages
14. A New Algorithm for Pipelined Architectures
- The new algorithm, MinMax, is based on CPE
- Constraints:
- Each level in the fixed-stride trie must fit in a single pipeline stage
- The maximum memory allocated to a stage (over all stages) is minimized
- The total memory used is minimized, subject to the first two constraints
15. A New Algorithm for Pipelined Architectures
- The 1st and 3rd constraints are satisfied by the following equations
16. A New Algorithm for Pipelined Architectures
- Memory allocated to the rth level of the multi-bit trie: nodes(m+1) * 2^(j-m), where the (r-1)th level ends at bit position m and the rth level ends at bit position j
- Maximum memory allocated to any trie level: the maximum of these per-level allocations
- MinMax finds the split points that minimize this maximum (a code sketch follows)
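A hedged sketch of a MinMax-style dynamic program in the spirit of these slides: it minimizes the largest per-level memory and breaks ties by total memory. The lexicographic formulation, names, and input format are mine and may differ in detail from the paper's exact recurrences:

```python
import math

def minmax_strides(node_counts, W, k):
    """MinMax-style variant of the CPE DP: best[j][r] holds
    (max per-level memory, total memory) for covering bits 0..j with r levels,
    compared lexicographically so the per-level maximum dominates."""
    INF = math.inf
    best = [[(INF, INF)] * (k + 1) for _ in range(W)]
    choice = [[-1] * (k + 1) for _ in range(W)]

    for j in range(W):
        best[j][1] = (2 ** (j + 1), 2 ** (j + 1))   # one level of stride j+1 bits
    for r in range(2, k + 1):
        for j in range(r - 1, W):
            for m in range(r - 2, j):               # previous level ends at bit m
                prev_max, prev_total = best[m][r - 1]
                level_mem = node_counts[m + 1] * 2 ** (j - m)
                cand = (max(prev_max, level_mem), prev_total + level_mem)
                if cand < best[j][r]:               # lexicographic comparison
                    best[j][r], choice[j][r] = cand, m

    # Recover the terminating bit position of each level.
    bounds, j = [], W - 1
    for r in range(k, 1, -1):
        bounds.append(j)
        j = choice[j][r]
    bounds.append(j)
    return best[W - 1][k], bounds[::-1]
```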
17. A New Algorithm for Pipelined Architectures
- Applying the constraints
- Main goal: reduce the maximum memory across stages
- A memory-efficient trie typically has smaller strides and hence less replication of routes in the trie
18. A New Algorithm for Pipelined Architectures
- Worst-case memory bound
- The paper derives a bound on the maximum memory per stage for a k-level trie
19. Performance
20. Reducing write bubbles
- Four optimizations to achieve this goal:
- Separating out updates to short routes
- Node pull-ups
- Eliminating excess writes
- Caching deleted subtrees
21. Separating out updates to short routes
- Updates to short routes are separated out and handled specially (see the sketch below)
- Example: the addition of a 7-bit route can cause up to 2^11 writes (first-level stride of 16 bits)
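A minimal sketch of the idea, assuming prefixes shorter than the first-level stride are kept in a small separate structure so their updates avoid the expansion writes above; the dispatch function and the separate-table mechanism shown here are illustrative assumptions, not the paper's exact design:

```python
def apply_route_update(prefix, prefix_len, next_hop, first_stride,
                       short_table, trie_update_writes):
    """Dispatch one route update. A prefix shorter than the first-level stride
    would be leaf-pushed into up to 2**(first_stride - prefix_len) locations of
    the first stage, so it is stored in a small separate table instead
    (illustrative mechanism). Returns the number of pipeline writes generated."""
    if prefix_len < first_stride:
        short_table[(prefix, prefix_len)] = next_hop   # one software-side write
        return 0
    # Long routes follow the normal path: the trie component computes the
    # per-stage writes, which are then packed into write bubbles.
    return len(trie_update_writes(prefix, prefix_len, next_hop))
```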
22. Node pull-up
23. Node pull-up
- State trie
- The pull-up information (in the form of a changed stride length) is stored in the node where the pull-up has occurred
- A software-based state trie can store this information (a sketch follows)
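A small sketch of what such a state-trie node could look like, assuming the software keeps one bookkeeping node per hardware trie node; the class and function names are illustrative:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class StateTrieNode:
    """Software-only bookkeeping node: records the stride actually in use at
    this point of the pipelined trie, including any pull-up that changed it."""
    default_stride: int                     # stride from the fixed-stride design
    pulled_up_stride: Optional[int] = None  # set once a pull-up has occurred here
    children: Dict[int, "StateTrieNode"] = field(default_factory=dict)

    def effective_stride(self) -> int:
        if self.pulled_up_stride is not None:
            return self.pulled_up_stride
        return self.default_stride

def record_pullup(node: StateTrieNode, extra_bits: int) -> None:
    """Record in software that this node now consumes extra_bits additional bits,
    so later updates compute their pipeline writes against the new layout."""
    node.pulled_up_stride = node.effective_stride() + extra_bits
```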
24. Eliminating excess writes
- Neighboring routes are often added with the same timestamp, so a batch of updates frequently writes the same trie locations more than once; only the final values need to reach the pipeline (see the sketch below)
25. Eliminating excess writes
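A minimal sketch of this coalescing step, assuming writes are represented as (stage, location, value) triples as in the earlier sketches; the function name is mine:

```python
def eliminate_excess_writes(raw_writes):
    """Keep only the last write to each (stage, location) within a batch of
    (stage, location, value) writes; intermediate values that would be
    overwritten anyway never need to be sent down the pipeline."""
    final_value = {}
    order = []                              # first-seen order of touched locations
    for stage, location, value in raw_writes:
        key = (stage, location)
        if key not in final_value:
            order.append(key)
        final_value[key] = value            # later writes in the batch win
    return [(s, l, final_value[(s, l)]) for (s, l) in order]
```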
26. Caching deleted subtrees
27. Caching deleted subtrees
- When a route withdrawal causes a subtree to be deleted, the trie component caches the subtree in software and remembers the location of the cached subtree in the pipeline memory
- Therefore, the only information that must be stored with the cached subtree is the prefix that was pushed down and the last route in the subtree that was withdrawn
28. Caching deleted subtrees
- Memory requirements
- The cache size is limited
- FIFO replacement is applied when the cache is full (see the sketch below)
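A sketch of a bounded FIFO cache for this optimization, based on the information the slides say must be kept per cached subtree; the class layout and method names are illustrative assumptions:

```python
from collections import OrderedDict

class DeletedSubtreeCache:
    """Bounded FIFO cache: a withdrawn subtree stays where it is in pipeline
    memory, and software remembers only its location, the prefix that was
    pushed down into it, and the last route withdrawn from it."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self.entries = OrderedDict()   # last_withdrawn_route -> (location, pushed_prefix)

    def cache(self, last_withdrawn_route, pipeline_location, pushed_prefix):
        if len(self.entries) >= self.max_entries:
            self.entries.popitem(last=False)   # FIFO: evict the oldest cached subtree
        self.entries[last_withdrawn_route] = (pipeline_location, pushed_prefix)

    def reuse(self, added_route):
        """If a recently withdrawn route is added back, return the cached entry so
        the subtree can be reattached with a single pointer write."""
        return self.entries.pop(added_route, None)
```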
29. Reducing write bubbles
- Benefits of applying each optimization individually, and together with the other optimizations
30. Reducing write bubbles
- Three schemes are compared here
31. Reducing write bubbles
- The experiments show that 4-6 pipeline stages work best when all optimizations are applied
32. Prefix Table Dynamics
- Performing a large number of incremental updates may cause the trie to gradually become unbalanced
- MinMax may need to be re-applied