Title: SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification
1 SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification
- Fang Yu (1), T. V. Lakshman (2)
- Marti Austin Motoyama (1), Randy H. Katz (1)
- (1) EECS Department, UC Berkeley; (2) Bell Laboratories, Lucent Technologies
2 Outline
- Introduction to multi-match classification
- Multi-match classification using TCAM
  - May consume a large amount of TCAM memory
  - May consume high power
- Set Splitting Algorithm (SSA)
  - A memory and power efficient scheme for multi-match classification
- Simulation results
- Conclusions
3 Packet Classification
- Single-match classification
  - Assumption: all the filters are associated with priorities
  - Only the highest-priority match matters
  - E.g., longest prefix match
[Figure: a packet, divided into packet header and packet payload]
- Multi-match classification
  - Report all matching results
  - No priority among filters
  - Intrusion detection systems: identify all the related rules
  - Also required by accounting applications
4 Ternary CAM (TCAM)
- Fully associative memory: compares the input string with all the entries in parallel
- For multiple matches, reports the index of the first match
- Each cell takes one of three logic states
  - 0, 1, and ? (don't care)
[Figure: TCAM array, labeling a cell, an entry, and the entry width]
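The first-match behaviour described on this slide can be modelled in a few lines of Python. This is a minimal software sketch, not a description of any TCAM hardware: entries are strings over `0`, `1`, `?`, and the function names (`ternary_match`, `tcam_lookup`, `all_matches`) and the sample entries are assumptions made for illustration.

```python
from typing import List, Optional

WILDCARD = "?"  # the "don't care" state

def ternary_match(entry: str, key: str) -> bool:
    """True if every cell of the entry is a wildcard or equals the key bit."""
    return len(entry) == len(key) and all(
        cell == WILDCARD or cell == bit for cell, bit in zip(entry, key)
    )

def tcam_lookup(entries: List[str], key: str) -> Optional[int]:
    """Model of a TCAM lookup: index of the first matching entry, or None."""
    for index, entry in enumerate(entries):
        if ternary_match(entry, key):
            return index
    return None

def all_matches(entries: List[str], key: str) -> List[int]:
    """What multi-match classification needs instead: every matching index."""
    return [i for i, e in enumerate(entries) if ternary_match(e, key)]

# Hypothetical 4-bit entries, for illustration only.
entries = ["10??", "1???", "0?1?"]
first = tcam_lookup(entries, "1011")      # only the first match is reported
everything = all_matches(entries, "1011")  # the full multi-match answer
```

A real TCAM compares all entries in parallel; the loop here only reproduces the result, not the timing.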
5 Challenges of Multi-Match Classification Using TCAM
- High speed
  - TCAM is fast (e.g., 4 ns); however, a TCAM returns only the first match result
  - We want all the matching results within a few cycles
  - What if the TCAM returned a bit vector of the matching results?
    - Processing the bit vector can take time if the bit vector is long
    - Not efficient: the vector is sparse in most cases
- Memory efficient
  - 9 Mbit to 18 Mbit TCAMs are priced at $200-300
- Power efficient
- Easy update
6 Previous Solutions: Geometric Intersection-Based Solution (Hot Interconnects '04)
- Add additional intersection filters
- High speed
  - Returns all the matching results within one cycle
- Memory efficient
  - Creates 10N intersection filters for the Snort rule set
  - May create O(N^F) intersection filters in the worst case
- Energy efficient
- Easily updatable
7 Previous Solution: MUD (SIGCOMM '05)
- Encode the index of each entry and include the encoded value (the discriminator field) in each TCAM entry
- Search the TCAM with the initial discriminator field set to all don't-cares
- After finding a matching result at index j, search again with discriminator field value > j
8 Previous Solution: MUD (Cont.)
- High speed
  - 1 + d + (k-2)(d-1) = O(dk) TCAM lookups to get k matching results
  - d is the logarithm of the number of entries in the TCAM (d = log2 N)
  - Decreased to 1 + d(k-1)/r with DIRPE, where r is smaller than d
- Memory efficient
- Energy efficient
  - All the entries in the TCAM are accessed each time, which leads to high power consumption
- Easily updatable
- Our goal: find a memory and power efficient solution
9 Observation
[Figure: filters F1 ... FN and their match regions (matching F1, matching FN, matching F1 and FN). Original: N filters, O(N^2) intersections, 1 TCAM lookup. Two sets: N filters, 1 intersection, 2 TCAM lookups.]
- Split filters into two sets to reduce intersections
- Report the union of results from all sets
  - No need to include the intersections of filters from different sets
- Decreases the number of filters in TCAM, which decreases power consumption
- Increases the number of TCAM accesses
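The trade-off on this slide can be illustrated with a small Python sketch, under the assumption that each TCAM entry carries the list of filters it stands for and that a lookup returns only the first matching entry. The filters F1 = `1???` and F2 = `?0??`, their overlap `10??`, and all function names are hypothetical.

```python
def matches(pattern, key):
    """Ternary comparison: '?' matches either bit value."""
    return all(p in ("?", k) for p, k in zip(pattern, key))

def first_match(tcam, key):
    """One TCAM lookup: filter ids attached to the first matching entry."""
    for pattern, filter_ids in tcam:
        if matches(pattern, key):
            return set(filter_ids)
    return set()

def multi_set_lookup(tcams, key):
    """One lookup per set; the answer is the union of the per-set results."""
    result = set()
    for tcam in tcams:
        result |= first_match(tcam, key)
    return result

# Single TCAM: the overlap of F1 and F2 needs its own entry, placed first.
single_tcam = [("10??", {"F1", "F2"}), ("1???", {"F1"}), ("?0??", {"F2"})]

# Two sets: F1 and F2 live in different sets, so no intersection entry is needed.
split_tcams = [[("1???", {"F1"})], [("?0??", {"F2"})]]

for key in ("1011", "1111", "0000", "0111"):
    assert first_match(single_tcam, key) == multi_set_lookup(split_tcams, key)
```

In this toy case the split layout stores two entries instead of three and gives the same answers, at the cost of two lookups per packet instead of one.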
10 Problem Definition
- Given a set of filters F = (F1, F2, ..., FN)
- Filters create a set of intersections I = (I1, I2, ..., IM)
  - e.g., I1 = intersection of (F1, F5, F6)
- How to divide the filters into several sets
  - Residual intersection set I': intersections from filters in the same set
  - N I
  - Number of sets (TCAM accesses) is minimum
- NP-hard problem!
11 Split Rules into Two Sets
- Still an NP-hard problem (known as maximum set splitting or maximum hypergraph cut)
- Best known approximation algorithms
  - Yield a performance ratio of 0.72 relative to the optimum solution
  - Require quadratic programming, which is slow when the number of filters is large
- Our SSA algorithm
  - Removes at least half of the intersections
  - O(NM) complexity, where N is the total number of filters and M is the total number of intersections
12 Maximum Satisfiability Problem
- Maximum satisfiability problem
  - A set of literals F1, ¬F1, F2, ¬F2, ..., FN, ¬FN
  - A set of clauses; each clause is a subset of literals
    - E.g., C1 = F1 ∨ F5 ∨ F6
  - Goal: find an assignment of F that satisfies a maximum number of clauses
13 Johnson's Algorithm for the Maximum Satisfiability Problem
- Assign each clause c a weight 2^(-|c|), where |c| is the number of literals in c
  - E.g., the weight of C1 = F1 ∨ F5 ∨ F6 is 2^(-3)
- Let Fi be any variable that hasn't been assigned a value yet
- If the total weight of the clauses containing Fi is higher than that of the clauses containing ¬Fi
  - Assign Fi a true value and remove all clauses containing Fi
  - Multiply the weight of all the clauses containing ¬Fi by 2
- Otherwise
  - Assign Fi a false value and remove all clauses containing ¬Fi
  - Multiply the weight of all the clauses containing Fi by 2
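The weight rule above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: literals are encoded as nonzero integers (`+i` for Fi, `-i` for ¬Fi), variables are processed in index order, ties go to false (the "Otherwise" branch), and the names `johnson_max_sat` and `count_satisfied` are invented for this example.

```python
from fractions import Fraction

def johnson_max_sat(num_vars, clauses):
    """Greedy MAX-SAT heuristic following the slide's weight rule."""
    live = {i: set(c) for i, c in enumerate(clauses)}                 # unresolved clauses
    weight = {i: Fraction(1, 2 ** len(c)) for i, c in live.items()}  # 2^(-|c|)
    assignment = {}
    for var in range(1, num_vars + 1):
        pos = [i for i, c in live.items() if var in c]
        neg = [i for i, c in live.items() if -var in c]
        value = sum(weight[i] for i in pos) > sum(weight[i] for i in neg)
        assignment[var] = value
        satisfied, shrunk = (pos, neg) if value else (neg, pos)
        false_literal = -var if value else var
        for i in satisfied:                # clauses made true: remove them
            live.pop(i, None)
            weight.pop(i, None)
        for i in shrunk:                   # clauses that lost a literal
            if i not in live:
                continue
            live[i].discard(false_literal)
            if live[i]:
                weight[i] *= 2             # one fewer literal left to save it
            else:
                del live[i], weight[i]     # every literal is false: clause lost
    return assignment

def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

# Hypothetical instance with 3 variables and 4 two-literal clauses.
clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
assignment = johnson_max_sat(3, clauses)
```

Exact fractions are used for the weights so that the comparison is not affected by floating-point rounding.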
14 Johnson's Theorem
- If all the clauses have at least k literals
  - Johnson's algorithm can satisfy at least (2^k - 1)/2^k of the total clauses
  - e.g., for k = 2, it satisfies at least 3/4 of the clauses
- It is proved that (2^k - 1)/2^k is the best approximation bound for k > 2
15 Filter Set Split Algorithm (SSA)
- Convert the set splitting problem into a maximum satisfiability problem
  - Each filter corresponds to a variable
  - For any intersection (e.g., I1 = intersection of F1, F5, and F6), add two clauses
    - C = F1 ∨ F5 ∨ F6 and C' = ¬F1 ∨ ¬F5 ∨ ¬F6
  - Total number of clauses is 2M, where M is the number of intersections
- Run Johnson's algorithm and assign each filter Fi either a true value (put in set one) or a false value (put in set two)
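The reduction on this slide can be sketched in Python. The code below covers only the two ends of the pipeline: building the 2M clauses from the intersections, and turning a truth assignment into two filter sets. The assignment itself is hand-picked here for illustration; in SSA it would come from Johnson's algorithm. Filter ids are plain integers, literals are signed integers, and all names are assumptions.

```python
def build_clauses(intersections):
    """Each intersection (a tuple of filter ids) yields two clauses."""
    clauses = []
    for members in intersections:
        clauses.append([+f for f in members])  # true if some member is true
        clauses.append([-f for f in members])  # true if some member is false
    return clauses

def split_filters(filter_ids, assignment):
    """True goes to set one, false goes to set two."""
    set_one = {f for f in filter_ids if assignment[f]}
    set_two = set(filter_ids) - set_one
    return set_one, set_two

def residual_intersections(intersections, set_one, set_two):
    """Intersections whose filters all landed in one set still need TCAM entries."""
    return [m for m in intersections if set(m) <= set_one or set(m) <= set_two]

# Hypothetical input: six filters and three intersections.
filters = [1, 2, 3, 4, 5, 6]
intersections = [(1, 5, 6), (2, 3), (3, 4)]
clauses = build_clauses(intersections)  # 2M = 6 clauses

# Hand-picked assignment, standing in for the output of a MAX-SAT heuristic.
assignment = {1: True, 2: True, 3: False, 4: False, 5: False, 6: False}
set_one, set_two = split_filters(filters, assignment)
left_over = residual_intersections(intersections, set_one, set_two)
```

With this assignment, only the intersection of filters 3 and 4 stays inside a single set, so it is the only one that would still need an intersection entry.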
16 Filter Set Split Algorithm (SSA) (cont.)
- According to Johnson's theorem
  - At least 3/4 of the clauses are satisfied: 2M × 3/4 = 1.5M
  - So at least 0.5M of the intersections have both clauses satisfied
- Suppose that for the intersection of F1, F5, and F6, both C = F1 ∨ F5 ∨ F6 and C' = ¬F1 ∨ ¬F5 ∨ ¬F6 are satisfied
  - At least one of F1, F5, F6 is true and at least one is false
  - F1, F5, F6 are split into different sets, so this intersection doesn't need to be present in the TCAM
- So at least 50% of the intersections are removed!
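The step from 1.5M satisfied clauses to 0.5M removed intersections can be spelled out as a short counting argument. Let x be the number of intersections with both clauses satisfied and S the total number of satisfied clauses (both symbols are introduced here for illustration). Any assignment satisfies at least one clause of every pair: if C is unsatisfied, all its filters are false, so C' is satisfied. The k = 2 bound applies under the assumption that every intersection involves at least two filters.

```latex
\begin{align*}
S &= 2x + (M - x) = M + x, \\
S &\ge \tfrac{3}{4}\cdot 2M = \tfrac{3}{2}M
\;\Longrightarrow\; x \ge \tfrac{1}{2}M .
\end{align*}
```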
17 Review of the SSA Scheme
- High speed
  - Deterministic lookup rate. E.g., if filters are split into two sets, only 2 TCAM lookups per packet are needed.
  - Sets are logically independent, so lookups can be parallelized
- Memory efficient
  - Guarantees the removal of at least 50% of the intersections each time the filter set is split into two sets
- Energy efficient
  - Low memory requirement
  - Accesses each filter only once per packet
- Easily updatable
  - A new filter can be inserted into whichever set creates the fewest new intersections
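The update rule in the last bullet can be sketched as follows. This is a simplified illustration, not the authors' procedure: filters are ternary strings, and the number of existing filters that overlap the new one is used as a stand-in for the number of new intersections (a proxy that also counts cases where one filter fully contains the other). The names `overlaps`, `choose_set`, and `insert_filter` are hypothetical.

```python
def overlaps(a, b):
    """Two ternary patterns overlap if no position has conflicting fixed bits."""
    return all(x == y or "?" in (x, y) for x, y in zip(a, b))

def choose_set(sets, new_filter):
    """Index of the set in which the new filter overlaps the fewest filters."""
    costs = [sum(overlaps(new_filter, f) for f in s) for s in sets]
    return costs.index(min(costs))

def insert_filter(sets, new_filter):
    """Insert into the cheapest set and report which set was chosen."""
    target = choose_set(sets, new_filter)
    sets[target].append(new_filter)
    return target

# Hypothetical two-set layout and a new filter to insert.
sets = [["1???", "10??"], ["?0??", "0???"]]
chosen = insert_filter(sets, "11??")
```

Here the new filter overlaps one filter in the first set and none in the second, so it is placed in the second set.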
18 Simulation Setup
- Tests on the Snort rule header sets
- Compare SSA with two TCAM-based solutions
  - MUD
  - Geometric Intersection-based solution
- Compare SSA with two representative software-based solutions
  - HiCuts
  - EGT-PC
- Evaluation metrics
  - Memory consumption
  - Lookup rate
  - Power consumption
  - Update cost
19 Memory Usage
[Figure: total number of extra intersection filters in TCAMs]
[Figure: total number of TCAM entries used]
20 Classification Speed
- MUD
  - One packet may match up to 12 unique filters, requiring a maximum of 20 TCAM lookups
  - Common packets such as HTTP packets match 4 unique filters and may require 5-9 TCAM lookups. A Napster packet requires 9 to 15 TCAM lookups
- Geometric Intersection-based solution
  - 1 TCAM lookup per packet
- SSA-2
  - 2 TCAM lookups per packet
- SSA-4
  - 4 TCAM lookups per packet
  - If the average packet size is 402.7 bytes, SSA-4 operates at a 201.35 Gbps classification rate
  - Worst case, if every packet is 40 bytes, SSA-4 achieves a 20 Gbps rate
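The two SSA-4 rates above follow from simple arithmetic if one assumes the 4 ns TCAM access time quoted earlier in the deck and four lookups performed one after another per packet. The function name and the `lookup_ns` default are assumptions made for this sketch.

```python
def classification_rate_gbps(packet_bytes, lookups_per_packet, lookup_ns=4.0):
    """Bits classified per nanosecond, which is numerically the rate in Gbps."""
    bits = packet_bytes * 8
    time_ns = lookups_per_packet * lookup_ns  # lookups assumed sequential
    return bits / time_ns

worst_case = classification_rate_gbps(40, 4)     # 320 bits / 16 ns = 20 Gbps
average = classification_rate_gbps(402.7, 4)     # 3221.6 bits / 16 ns, about 201.35 Gbps
```

If the lookups for the different sets were issued in parallel rather than sequentially, the per-packet time and therefore these figures would change accordingly.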
21 Update Cost
- Update cost in terms of newly inserted filters
22 Power Consumption
- Energy used by a TCAM is linear in
  - The number of entries searched in parallel
  - The number of TCAM accesses per packet
- Metric: total TCAM entries accessed per packet
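The metric can be computed as the sum, over all TCAM accesses made for one packet, of the number of entries searched in each access. The sketch below uses made-up sizes purely to show the calculation; they are not measurements from the simulation, and the function name is hypothetical.

```python
def entries_accessed_per_packet(entries_per_access):
    """Sum of entries searched over all TCAM accesses made for one packet."""
    return sum(entries_per_access)

# Hypothetical sizes, for illustration only (not measured results).
total_entries = 1000

# A scheme that searches every entry on each of 5 lookups.
repeated_full_search = entries_accessed_per_packet([total_entries] * 5)

# A two-set split in which each set is searched exactly once.
two_set_split = entries_accessed_per_packet([520, 510])
```

The comparison shows why searching each entry only once per packet matters: the cost grows with the number of lookups whenever every lookup touches the whole TCAM.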
23 Software Solutions
- HiCuts
  - A high percentage of wildcards generates a high degree of filter duplication (on average, one filter is duplicated 3108 times)
- EGT-PC
  - Many Snort rules apply to the same source and destination addresses
  - A packet may match 153 filters if we consider source and destination addresses only, so comparing the input with these filters one by one is not affordable
24 Conclusions
- SSA is a memory and power efficient solution to the multi-match classification problem
  - O(NM) complexity
  - Guaranteed to remove at least 50% of the intersections each time the filter set splits
- Compared to MUD
  - Uses a similar amount of TCAM memory
  - Yields a 75% to 95% reduction in power consumption
- Compared to the Geometric Intersection-based solution
  - Uses 90% less TCAM memory and power
  - Requires one additional TCAM lookup per packet