ECE 526 - PowerPoint PPT Presentation

Provided by: Ning86
Learn more at: https://www.engr.siu.edu

Transcript and Presenter's Notes

1
ECE 526 Network Processing Systems Design
  • Network Security: string matching algorithms
  • Chapter 17: George Varghese

2
Goal
  • Gain basic knowledge to improve network security
    from a network processing system design perspective

3
Outline
  • Signature-based IDSs
  • String matching algorithms
  • Boyer-Moore
  • Aho-Corasick
  • Bloom Filter
  • Approximate Searching
  • Approximate Searching Based on Bloom Filters
  • Summary

4
Internet Security
  • The Internet lacks security
  • Example?
  • What is Internet security?
  • Confidentiality: data is kept private
  • Integrity: data is protected from modification or
    destruction
  • Availability: data or service remains accessible
  • What are the current approaches?
  • Engineering?
  • Non-engineering?
  • Intrusion Detection Systems (IDSs)

5
Intrusion Detection Systems
  • Two types of Intrusion Detection Systems (IDSs)
  • Signature detection: based on matching events to
    the signatures of known attacks
  • Anomaly detection: based on statistical or
    learning theory to identify aberrant events
  • Three important tasks
  • String matching: searching for suspicious strings in
    packet payloads
  • Traceback: detecting an intruder who uses a forged
    source address
  • Detecting the onset of a new worm without prior knowledge
  • The problems of current IDSs
  • Very slow
  • Have a high false-positive rate
  • False positive: answering a membership query
    positively when the element is not in the set

6
Snort Rule Example
  • Snort
  • A lightweight, open-source intrusion detection system
  • www.snort.org
  • Snort rule example
  • alert tcp BAD 80 -> GOOD 90 \
  • (content: "perl.exe"; msg: "detected perl.exe";)
  • Looks for the string "perl.exe" in any TCP
    packet from IP BAD, port 80 to IP GOOD,
    port 90
  • Upon detection, generates an alert with the message "detected
    perl.exe"
  • Question: when a packet arrives, how do we check it?
  • Question: what about multiple rules?
  • String matching is the bottleneck

7
String Searching: brute force
  • An arbitrary string can be anywhere in the packet
  • Naive approach
  • Input: string of size m, packet of size n (assuming n
    > m)
  • For i = 0 to n-m do
  •   For j = 0 to m-1 do
  •     Compare string[j] with packet[i+j]
  •     If not equal, exit the inner loop
  • Complexity
  • Worst case: O(mn)
  • Best case: O(n)
  • Can we do better?
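The naive loop above can be sketched in Python as follows. This is a minimal illustration, not code from the lecture; the function name `naive_search` and the choice to return every match offset are assumptions made for the example.

```python
def naive_search(pattern: bytes, packet: bytes) -> list[int]:
    """Return every offset i where pattern occurs in packet (brute force)."""
    m, n = len(pattern), len(packet)
    matches = []
    for i in range(n - m + 1):          # i = 0 .. n-m
        for j in range(m):              # j = 0 .. m-1
            if pattern[j] != packet[i + j]:
                break                   # mismatch: exit the inner loop
        else:
            matches.append(i)           # inner loop finished: full match at i
    return matches


print(naive_search(b"perl.exe", b"GET /cgi-bin/perl.exe HTTP/1.0"))
```

The worst case O(mn) shows up on inputs such as a pattern `aaab` scanned against a long run of `a`, where almost every alignment compares m characters before failing.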

8
Boyer-Moore example
  • Improves on brute force by skipping over a larger number of
    characters and by comparing the last character first
  • How to build the skip table?

9
Boyer-Moore skip table
  • How far to skip when the last character does not
    match
  • For example
  • Pattern: CAB
  • Last: A B C D E
  • Skip: 1 0 2 3 3
  • Care is needed with repeated letters
  • For example
  • Pattern: ABBA
  • Last: A B C D E
  • Skip: 0 1 4 4 4
  • Skip[c] = distance of the last occurrence of c from the
    end of the pattern (the pattern length m if c does not occur)
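One way to build such a table is a single left-to-right pass over the pattern, letting later occurrences overwrite earlier ones. This is a sketch under the definition stated above; the names `build_skip_table` and `skip_for` are hypothetical, and characters absent from the pattern are assumed to skip the full pattern length m.

```python
def build_skip_table(pattern: str) -> dict[str, int]:
    """Map each pattern character to the distance of its LAST occurrence
    from the end of the pattern."""
    m = len(pattern)
    skip = {}
    for pos, ch in enumerate(pattern):
        skip[ch] = m - 1 - pos   # a later occurrence overwrites an earlier one
    return skip


def skip_for(table: dict[str, int], ch: str, m: int) -> int:
    """Characters that never occur in the pattern skip the whole length m."""
    return table.get(ch, m)


print(build_skip_table("CAB"))   # C, A, B
print(build_skip_table("ABBA"))  # repeated letters keep the last occurrence
```

For ABBA the overwrite is what handles the repeated letters: B ends up with distance 1 (its second occurrence), not 2.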

10
Boyer-Moore algorithm
  • Input: pattern of size m, packet of size n
  • i = 0
  • While i <= n-m do
  •   If pattern[m-1] = packet[i+m-1] then // last
      character first
  •     For j = 0 to m-1 do
  •       Compare pattern[j] with packet[i+j] // one by one,
          sequentially
  •     i = i+1
  •   Else i = i + skip[packet[i+m-1]] // skip
  • Complexity
  • Best case: O(n/m)
  • Worst case: still O(nm)
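The pseudocode above translates fairly directly into Python. This is a sketch of the simplified last-character variant shown on the slide, not the full Boyer-Moore algorithm with its good-suffix rule; the function name `boyer_moore_search` is an assumption, and the skip table is built inline so the example is self-contained.

```python
def boyer_moore_search(pattern: bytes, packet: bytes) -> list[int]:
    """Last-character-first search with a skip table (simplified Boyer-Moore)."""
    m, n = len(pattern), len(packet)
    if m == 0 or m > n:
        return []
    # Distance of the last occurrence of each byte from the end of the pattern.
    skip = {c: m - 1 - k for k, c in enumerate(pattern)}
    matches = []
    i = 0
    while i <= n - m:
        last = packet[i + m - 1]
        if last == pattern[m - 1]:           # last character first
            if packet[i:i + m] == pattern:   # then compare the whole window
                matches.append(i)
            i += 1
        else:
            i += skip.get(last, m)           # bytes not in the pattern skip m
    return matches


print(boyer_moore_search(b"CAB", b"ABCABCAB"))
```

The best case O(n/m) occurs when the byte under the last pattern position never appears in the pattern, so every step jumps a full m positions.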

11
Aho-Corasick
  • Failure pointer
  • Prevents restarting at the top of the trie when a failure
    occurs
  • A new attempt is made by shifting
  • What about multiple strings?

[Figure: example string BABAR]
12
Multiple String Trie Construction
Example: P = {he, she, his, hers}
13
Aho-Corasick Searching
  • Scanning the input stream only once
  • Complexity: linear time
[Figure: matching strings against the input stream h x h e r s]
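The trie construction and single-pass search from the last two slides can be sketched as below. This is an illustrative implementation using dictionaries per node rather than the 256-pointer arrays a hardware design might use; the names `build_automaton` and `search` are assumptions for the example.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie (goto), failure pointers (fail) and outputs (out)."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:                       # 1) insert every pattern in the trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())          # 2) breadth-first failure pointers
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]   # inherit matches ending here
            queue.append(t)
    return goto, fail, out


def search(text, automaton):
    """Scan text once; return (start offset, pattern) for every match."""
    goto, fail, out = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                      # follow failure pointers, no restart
        s = goto[s].get(ch, 0)
        hits.extend((i - len(p) + 1, p) for p in out[s])
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
print(search("hxhers", automaton))
```

On the input stream `hxhers` the automaton falls back to the root on `x`, then reports `he` and `hers` without rescanning any character.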
14
Aho-Corasick summary
  • Pros
  • Computational complexity: worst case O(n)
  • Can scan once and output all matches
  • Cons
  • Requires constructing a finite state machine
  • Failure pointers needed
  • Too big to fit on chip
  • Each node has up to 256 pointers

15
Hashing
  • An efficient set membership query mechanism
  • Programming: trivial
  • Query complexity: O(n) best case (n = size of
    packet)
  • Query accuracy: possible false positives
  • However, to handle collisions
  • Each hash entry contains a list of the IDs of all
    elements that share the hash value
  • Storage: minimal requirement is O(nw), where n = number of
    elements and w = minimal width of each element
  • Question: can we trade accuracy for storage
    using the hashing idea?
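A hash table with collision lists, as described above, might look like the following sketch. The class name `ChainedHashSet` and the use of `zlib.crc32` as the hash function are assumptions chosen for a deterministic example; storing every element in full is what drives the O(nw) storage.

```python
import zlib


class ChainedHashSet:
    """Set membership with one collision list per hash entry."""

    def __init__(self, num_buckets: int):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, item: bytes) -> int:
        return zlib.crc32(item) % len(self.buckets)

    def add(self, item: bytes) -> None:
        bucket = self.buckets[self._index(item)]
        if item not in bucket:
            bucket.append(item)      # full element stored: O(nw) bits overall

    def __contains__(self, item: bytes) -> bool:
        # Comparing against the stored elements removes false positives.
        return item in self.buckets[self._index(item)]


signatures = ChainedHashSet(num_buckets=8)
signatures.add(b"perl.exe")
print(b"perl.exe" in signatures, b"cmd.exe" in signatures)
```

Dropping the stored elements and keeping only bits is the accuracy-for-storage trade the closing question points toward.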

16
Bloom Filter
  • Data structure proposed by Burton Bloom
  • Randomized data structure
  • Strings are stored using multiple hash functions
    (programming)
  • A string's presence is checked based on multiple bits
    (querying)
  • Membership queries can result in false positives
  • Powerful tool for
  • Content networks
  • Route trace back
  • Network measurements
  • Intrusion Detection

17
Bloom Filter Programming
  • Instead of one hash function, use k independent
    hash functions
  • Instead of nw bits of storage, only an m-bit vector is
    required
  • Initially all bits are cleared
  • Programming: set one bit for each hash
    function
  • A bit simply remains set if two elements hash to the same
    position

18
Bloom Filter Querying
  • Procedure
  • String x is hashed by the k hash functions
  • Each hash function pinpoints one bit in the
    m-bit vector
  • The k selected bits of the m-bit vector are ANDed
  • If the result is 0,
  • x is not a member
  • else
  • x is reported as a member (possibly a false positive)
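Programming and querying can be sketched together in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the class name `BloomFilter` is hypothetical, and the k hash functions are simulated by prefixing one SHA-256 input with the function index, which is an assumption made for convenience.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)     # one byte per bit, for readability

    def _positions(self, item: bytes):
        # Simulate k hash functions by salting SHA-256 with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        """Programming: set the bit selected by each hash function."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        """Querying: AND the k selected bits."""
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=4)
bf.add(b"perl.exe")
print(b"perl.exe" in bf)   # a stored string always tests positive
```

A stored string can never be reported absent; the only possible error is a non-member whose k bits happen to have been set by other strings.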

19
Bloom Filter false positive rate
  • n = number of strings to be stored
  • k = number of hash functions
  • m = size of the bit array
  • The false positive probability (at the optimal k)
  • f = (1/2)^k
  • Optimal number of hash functions k
  • k = ln 2 · (m/n) ≈ 0.693 m/n
  • The false positive rate decreases exponentially with the
    number of hash functions, that is, with the memory per string m/n
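The standard derivation behind these two formulas is short. It assumes the k hash functions are independent and uniform over the m bit positions, and uses the usual approximation (1 - 1/m)^{kn} ≈ e^{-kn/m}.

```latex
\begin{align*}
\Pr[\text{a given bit is still } 0 \text{ after } n \text{ insertions}]
  &= \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m} \\
f &= \left(1 - e^{-kn/m}\right)^{k} \\
\text{minimizing over } k:\quad k &= \frac{m}{n}\ln 2 \approx 0.693\,\frac{m}{n},
  \qquad e^{-kn/m} = \frac{1}{2} \\
f &= \left(\frac{1}{2}\right)^{k} \approx (0.6185)^{m/n}
\end{align*}
```

For example, m/n = 10 bits per string gives k ≈ 7 hash functions and a false positive probability a little under 1%.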

20
Counting Bloom Filters
  • Member deletion
  • Deleting a member requires clearing all the
    related bits
  • A bit once set in the bit vector cannot be
    cleared easily
  • The bit may have been set by multiple members
  • Solution
  • Assume member deletion is a rare case
  • Counting Bloom filter
  • Update a per-bit counter when an element is added or deleted
  • Reset the bit in the m-bit vector when its counter reaches 0
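The counter idea can be sketched as follows. This is an illustrative version in which the counter array itself plays the role of the bit vector (a position counts as set while its counter is above 0); the class name `CountingBloomFilter` and the salted SHA-256 hashing are assumptions, as in any software sketch of this structure.

```python
import hashlib


class CountingBloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.counters = [0] * m      # counter > 0 means the bit is set

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: bytes) -> bool:
        """Decrement the item's counters; refuse if it tests as absent."""
        if item not in self:
            return False
        for pos in self._positions(item):
            self.counters[pos] -= 1  # the bit clears only when this hits 0
        return True

    def __contains__(self, item: bytes) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=1024, k=4)
cbf.add(b"perl.exe")
cbf.add(b"cmd.exe")
cbf.remove(b"perl.exe")
print(b"cmd.exe" in cbf)   # deleting one member leaves the other intact
```

Removing a string that was never added but happens to test positive would corrupt the counters, which is one reason deletion is treated as a rare, controlled operation.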

21
Approximate String Searching
  • Using Bloom filter

22
Approximate String Searching
John W. Lockwood et al., "Deep Packet Inspection
Using Parallel Bloom Filters"
23
Summary
Method       | Idea                  | Computation               | Storage             | Problem
Brute force  | Naïve                 | O(mn)                     |                     | Slow
Boyer-Moore  | Skip                  | O(mn) worst, O(n/m) best  | 0.1 MB (10K rules)  | Shift table needed
Aho-Corasick | Trie                  | O(n) worst case           | 50 MB (1500 rules)  | Storage demanding
Bloom filter | Approximate searching | O(n)                      | 0.1 MB (10K rules)  | False positives
24
For Next Class
  • Read Comer chapters 6 and 9
  • Final project (option 1)
  • Project group finalized
  • 9/19/07: group leader emails me the group
    members.
  • Each group has no more than 3 members.
  • Project topic finalized.
  • 9/28/07: group leader emails me the topic.
  • Paper presentation and final exam (option 2)
  • 9/19/07: group leader emails me the group
    members.
  • Each group has no more than 2 members.
  • Based on one or two assigned papers (< 20 min)