Title: Optimization of Regular Expression Pattern Matching Circuits on FPGA
1 Optimization of Regular Expression Pattern Matching Circuits on FPGA
Authors: Cheng-Hung Lin, Chih-Tsun Huang, Chang-Ping Jiang, and Shih-Chieh Chang
Publisher: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 15, No. 12, December 2007
Presenter: Chen-Rong Chang
Date: November 12, 2008
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, R.O.C.
2 Outline
- Introduction
- Implementation of NFA
- Regular expressions
- Sharing prefix common sub-patterns
- Sharing scheme for infix and suffix
- Flow of RE module generation
- The comparison
3 Introduction
- Regular expressions are widely used in network intrusion detection systems (NIDS) to represent attack patterns.
- In contrast to software-only NIDS, many studies have proposed hardware architectures for accelerating attack detection.
- Sidhu and Prasanna [1] proposed constructing an NFA (Nondeterministic Finite Automaton) from a regular expression to perform string matching. Hutchings,
- Clark et al. [3] achieved excellent area and throughput by adding predecoded wide parallel inputs to traditional NFA implementations.
4 Predecoder Scheme
5 Simple NFA and implementation in logic
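As a rough illustration of how an NFA for an exact string maps onto logic, the Python sketch below models one flip-flop per pattern character, where each stage is enabled by the previous stage and by a character-match signal. The function name `match_positions` and the per-character `decoded` dictionary (standing in for shared, predecoded match lines) are assumptions made for this sketch; it is not the circuit shown on the slides.

```python
def match_positions(pattern: str, text: str) -> list[int]:
    """Return the 0-based end index of every occurrence of `pattern` in `text`.

    state[i] plays the role of a flip-flop that is set when the first
    i + 1 characters of the pattern have just been seen.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")

    alphabet = set(pattern)
    state = [False] * len(pattern)
    hits = []
    for pos, ch in enumerate(text):
        # Predecoding: each distinct character is compared once, and every
        # stage that needs that character reuses the same match signal.
        decoded = {c: (c == ch) for c in alphabet}
        # All flip-flops update on the same clock edge, so the next state is
        # computed from the current state only. Stage 0 is always armed.
        state = [
            (True if i == 0 else state[i - 1]) and decoded[pattern[i]]
            for i in range(len(pattern))
        ]
        if state[-1]:
            hits.append(pos)
    return hits


print(match_positions("aba", "ababa"))
```

Because stage 0 is re-armed on every input character, overlapping occurrences are reported as well, which mirrors how the hardware NFA keeps several states active at once.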
6 Regular expressions for attack description
- Regular expressions are a common way to express attack patterns.
- In Snort, two types of regular expressions are used to describe attack patterns:
- 1. The first type defines exact string patterns, such as the pattern "Ahhhh My Mouth Is Open".
- 2. The second type consists of meta-characters ( , , , ... )
7 Regular expressions for attack description (cont.)
- Given a regular expression P = p1 p2 ... pm:
- A partial expression p1 p2 ... pk is a prefix of P if k < m.
- A partial expression pj ... pk is an infix of P if j > 1 and k < m.
- A partial expression pj ... pm is a suffix of P if j > 1.
- Ex: expression "networking"
- The partial expression "net" is a prefix, "work" is an infix, and "ing" is a suffix.
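The definitions above can be checked mechanically. The sketch below assumes a partial expression is identified by 1-based positions j and k inside an expression of m symbols, and that each character counts as one symbol; the function name `classify` and the extra "whole" label for the full expression are assumptions of this sketch.

```python
def classify(expression: str, j: int, k: int) -> str:
    """Classify the partial expression covering positions j..k (1-based, inclusive)."""
    m = len(expression)
    if not (1 <= j <= k <= m):
        raise ValueError("need 1 <= j <= k <= m")
    if j == 1 and k == m:
        return "whole"   # the full expression, not a proper part
    if j == 1:
        return "prefix"  # starts at the first symbol, ends before the last
    if k == m:
        return "suffix"  # starts after the first symbol, ends at the last
    return "infix"       # j > 1 and k < m


expr = "networking"
for j, k in [(1, 3), (4, 7), (8, 10)]:
    print(expr[j - 1:k], classify(expr, j, k))
```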
8 Sharing prefix common sub-patterns
9 An erroneous implementation to share infix "Dir"
- Input string: PassSysDirUserGate
- It may be mistaken as a match at the output of the upper blocks. This is called a false positive.
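The false positive can be reproduced with a small behavioral model. The two patterns below, "PassDirUser" and "SysDirGate", are hypothetical (the slide text does not give the patterns); only the input string is taken from the slide. The helper names `block`, `shift`, and `naive_shared` are likewise assumptions of this sketch, which feeds one shared "Dir" block from both prefixes and fans its output out to both suffixes without remembering which prefix enabled it.

```python
def block(segment: str, text: str, enable: list[bool]) -> list[bool]:
    """done[t] is True if `segment` ends at text[t] after starting at an enabled position."""
    done = [False] * len(text)
    for t in range(len(text)):
        if enable[t] and text.startswith(segment, t):
            done[t + len(segment) - 1] = True
    return done


def shift(done: list[bool]) -> list[bool]:
    """A block that finishes at t enables the next block at t + 1."""
    return ([False] + done)[:len(done)]


def naive_shared(text, p1=("Pass", "Dir", "User"), p2=("Sys", "Dir", "Gate")):
    """Naive sharing: the common infix block does not track which prefix entered it."""
    always = [True] * len(text)
    pre1 = block(p1[0], text, always)
    pre2 = block(p2[0], text, always)
    # Either prefix may enable the shared infix block.
    enter = [a or b for a, b in zip(shift(pre1), shift(pre2))]
    infix = block(p1[1], text, enter)
    # The shared output enables both suffix blocks.
    out1 = any(block(p1[2], text, shift(infix)))
    out2 = any(block(p2[2], text, shift(infix)))
    return out1, out2


text = "PassSysDirUserGate"
print(naive_shared(text))                            # outputs of the shared circuit
print("PassDirUser" in text, "SysDirGate" in text)  # true matches
```

In this model the first output fires because "Sys" enables the shared block and "User" follows "Dir", even though "PassDirUser" never appears in the input.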
10 Sharing common suffix
11 Sharing scheme for infix and suffix
12 Two patterns share common infix RC
Form: R1RcR1, R2RcR2
13 Example of critical section problem
Pattern 1: abcdefgh
Pattern 2: dedefpq
14 Cross-Subexpression
- Definition: An expression , is called the
cross-subexpression of if is not a
subexpression of and is a subexpression of
- Ex:
- R1 = abc, R2 = def
- Cross-subexpressions:
- cde, cdef, bcd, bcde, bcdef
15 Necessary Condition
- Theorem: If has the critical section problem,
either is a cross-subexpression of ,
or I is a cross-subexpression of
- Ex: R1 = abc
- R2 = cde
- RC = defgh
- As long as R1 or R2 is a cross-subexpression, the critical section problem will happen.
16 Sharing gain
- The sharing gain of a common sub-pattern is defined as the number of characters in the sub-pattern multiplied by the number of regular expressions containing the sub-pattern.
- For example, three regular expressions, "1Common1", "2Common2", and "3Common3", have the common sub-pattern "Common". The sharing gain of the common sub-pattern is 6 × 3 = 18.
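The sharing gain is simple enough to compute directly. The sketch below treats the sub-pattern as a literal string and counts one character per symbol; both are simplifying assumptions, and the function name `sharing_gain` is made up for this example.

```python
def sharing_gain(sub_pattern: str, expressions: list[str]) -> int:
    """Characters in the sub-pattern times the number of expressions containing it."""
    users = sum(1 for expr in expressions if sub_pattern in expr)
    return len(sub_pattern) * users


print(sharing_gain("Common", ["1Common1", "2Common2", "3Common3"]))  # 6 * 3 = 18
```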
17 Flow of regular expression module generation
18 Logical structures for the proposed meta-character components
19 Logical structures for the proposed meta-character components (cont.)
20 Logical structures for the proposed meta-character components (cont.)
21 Implementation of NFA
22 The comparison among different approaches on Snort rule sets