Title: Towards High Performance Network Defense
1Towards High Performance Network Defense
- Zhichun Li
- EECS Department
- Northwestern University
2Motivation
Attackers
Botnets
Professional attackers exploit networks for
profit
Worms
3Network Level Defense
- Network gateways/routers are the vantage points for detecting large-scale attacks
- Host-based detection/prevention alone is not enough
- Some users do not apply host-based schemes because of reliability, overhead, and conflicts
- Many users do not update or patch their systems on time
- E.g., the Conficker worm at the end of 2008 infected 9 to 15 million hosts
- Cannot rely only on end users for security protection
4Challenges
- Scalable to high-speed networks with a large number of users
- Highly accurate
- Adapt fast to emerging threats
- Have good attack coverage
5Network-based Intrusion Detection, Prevention, and Forensics System
[Figure: system framework. Packet streams feed four components: (I) sketch-based monitoring and detection, (II) polymorphic worm signature generation, (III) signature matching engines, (IV) network situational awareness. Diagram labels: Scalability; Accuracy, Scalability, Coverage; Accuracy, adapt fast.]
6High-speed Network Monitoring and Anomaly Detection
- Online traffic monitoring and recording
- SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007, INFOCOM 2008
- Reversible sketch for data streaming computation
- Record millions of flows (GB traffic) in a few hundred KB
- Small number of memory accesses per packet
- Scalable to large key space size (2^32 or 2^64)
- Online sketch-based flow-level anomaly detection
- IEEE ICDCS 2006, Journal of Computer Networks 2010, IEEE CGA, Security Visualization 2006
- Online stealthy botnet scan detection
- IEEE IWQoS 2007
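To illustrate why a sketch needs only a fixed amount of memory and a few memory accesses per packet, here is a minimal count-min-style sketch in Python. This is a swapped-in, simpler technique, not the reversible sketch from the cited papers (it cannot recover keys from the counters). The class name, the default sizes, and the flow key format are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table: rows * width counters, whatever the number of flows."""

    def __init__(self, rows=4, width=1024):
        self.rows = rows
        self.width = width
        self.table = [[0] * width for _ in range(rows)]

    def _index(self, row, key):
        # One independent hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(key.encode(), digest_size=8,
                                 salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key, value=1):
        # Exactly `rows` memory accesses per packet.
        for row in range(self.rows):
            self.table[row][self._index(row, key)] += value

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the best guess.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.rows))
```

Usage: call `update(flow_key, packet_bytes)` per packet and `estimate(flow_key)` to query; the estimate never underestimates the true total.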
7Network and Distributed System Diagnosis
- Overlay network monitoring and diagnosis: SIGCOMM IMC 2003, SIGCOMM 2004, ToN 2007, SIGCOMM 2006
- End-user network diagnosis: INFOCOM 2007 (2)
- Internet-scale Virtual Private Network (VPN) and backbone monitoring and diagnosis: INFOCOM 2009
- Internet-scale data center and distributed system profiling and diagnosis: NSDI 2010
8Polymorphic Worm Signature Generation
- Exploit invariant signature generation: IEEE Symposium on Security and Privacy 2006 (cited by 100; code and test cases released to Columbia U., UT Austin, Purdue, Georgia Tech, UC Davis, etc.)
- Vulnerability signature generation: IEEE ICNP 2007, ToN 2010
- NSF CyberTrust 06 Award
[Figure: network gateway between the Internet and our network]
9Online Protocol Parsing and Signature Matching
- NetShield: vulnerability signature based NIDS/NIPS (NSF CyberTrust 08 Award, under submission, patent filed)
- Interest from Cisco (IPS ruleset, site visit)
- Code release has been used by researchers at the University of Toronto
- Using failure information to detect enterprise zombies: SecureComm 09
- Spamming botnet detection: NSDI 09
10Network Situational Awareness
- Large-scale botnet and P2P misconfiguration event situation-aware forensics
- Botnet attack target/strategy inference: ASIACCS 09
- Root cause analysis of the P2P misconfiguration/poisoning traffic: INFOCOM 10
- Analysis of 2TB of data across 4 years over five /8 IP address blocks
11Current Work
- Data center management and configuration
- Internet emergency response
- AS topology study CoNEXT09
- Recovery via IXP Infocom10
- Network based web dynamic vulnerability defense
- Social network security
12NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense
13Outline
- Motivation
- High Speed Matching for Large Rulesets
- High Speed Parsing
- Evaluation
- Research Contributions
14NetShield Overview
- NIDS/NIPS (Network Intrusion Detection/Prevention System) operation
[Figure: packets enter the NIDS/NIPS, which outputs security alerts]
- Accuracy
- Speed
- Attack coverage
15State Of The Art
Regular expression (regex) based approaches
Used by Cisco IPS, Juniper IPS, open source Bro
Example: .*Abc.*\x90+de[^\r\n]{30}
- Pros
- Can efficiently match multiple sigs simultaneously, through DFA
- Can describe the syntactic context
16Cons of Regex
Limited expressive power: cannot describe semantic context, thus inaccurate
Theoretical perspective
Protocol grammar
Practical perspective
- HTTP chunk encoding
- DNS label pointers
17State Of The Art
Vulnerability signature [Wang et al. 04]
Blaster Worm (WINRPC) example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minors==1 && packed_drep==\x10\x00\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>40 && matchRE(stub.buffer, /\x5c\x00\x5c\x00/)
- Pros
- Directly describe semantic context
- Very expressive, can express the vulnerability condition exactly
- Accurate
- Cons
- Slow!
- Existing approaches all use sequential matching
- Require protocol parsing
18Motivation of NetShield
19Motivation
- Desired Features for Signature-based NIDS/NIPS
- Accuracy (especially for IPS)
- Speed
- Coverage: large ruleset

            Regular Expression   Vulnerability (Shield [SIGCOMM04])
Accuracy    Relatively poor      Much better
Speed       Good                 ??
Memory      OK                   ??
Coverage    Good                 ??

Callout on regular expressions: cannot capture the vulnerability condition well!
20Research Challenges and Solutions
- Challenges
- Matching thousands of vulnerability signatures simultaneously
- Sequential matching -> match multiple sigs. simultaneously
- High speed protocol parsing
- Solutions
- An efficient algorithm which matches multiple sigs simultaneously
- A tailored parsing design for high-speed signature matching
21Background
- Vulnerability signature basics
- Use protocol semantics to express vulnerabilities
- Defined on a sequence of PDUs, with one predicate for each PDU
- Example: ver==1 && method==put && len(buf)>300
- Data representations
- For all the vulnerability signatures we studied, we only need numbers and strings
- Number operators: ==, >, <, >=, <=
- String operators: ==, match_re(.,.), len(.)
Blaster Worm (WINRPC) example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minors==1 && packed_drep==\x10\x00\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>40 && matchRE(stub.buffer, /\x5c\x00\x5c\x00/)
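As a rough illustration of "one predicate for each PDU", the sketch below evaluates a signature against a sequence of already-parsed PDUs, one at a time (the sequential style that the talk later improves on). The dictionary representation of a PDU, the operator names such as `len>`, and the function names are assumptions for illustration, not NetShield's actual data model.

```python
import re

# Hypothetical operator table covering the number and string operators on the slide.
OPS = {
    "==": lambda value, operand: value == operand,
    ">": lambda value, operand: value > operand,
    "<": lambda value, operand: value < operand,
    "len>": lambda value, operand: len(value) > operand,
    "match_re": lambda value, operand: re.search(operand, value) is not None,
}


def pdu_matches(predicate, pdu):
    """True if every (field, op, operand) condition holds on the parsed PDU."""
    for field, op, operand in predicate:
        if field not in pdu or not OPS[op](pdu[field], operand):
            return False
    return True


def signature_matches(signature, pdus):
    """A signature is a list of predicates, one per PDU in the sequence."""
    if len(pdus) < len(signature):
        return False
    return all(pdu_matches(pred, pdu) for pred, pdu in zip(signature, pdus))


# The slide's single-PDU example: ver==1 && method==put && len(buf)>300
EXAMPLE = [[("ver", "==", 1), ("method", "==", "put"), ("buf", "len>", 300)]]
```

Checking every signature this way costs time proportional to the number of signatures, which is the motivation for matching many signatures simultaneously.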
22Outline
- Motivation
- High Speed Matching for Large Rulesets
- High Speed Parsing
- Evaluation
- Research Contributions
23Matching Problem Formulation
- Suppose we have n signatures, defined on k matching dimensions (matchers)
- A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
- Translate the n signatures into an n by k table
- This translation unlocks the potential of matching multiple signatures simultaneously
Rule 4: URI.Filename==fp40reg.dll && len(Headers[host])>300
RuleID  Method  Filename     Header: LEN
1       DELETE  *            *
2       POST    Header.php   *
3       *       awstats.pl   *
4       *       fp40reg.dll  name==host, len(value)>300
5       *       *            name==User-Agent, len(value)>544
(* = don't care)
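The translation into an n by k table can be sketched as follows, using the five example rules. The matcher names, the use of `None` for "don't care", and the tuple encoding of the header condition are assumptions for illustration only.

```python
# Hypothetical matcher (column) names; the header matcher stands for the
# four-tuple case on associative array elements.
MATCHERS = ("method", "filename", "header")

# Each signature lists only the conditions it actually requires.
SIGNATURES = {
    1: {"method": "DELETE"},
    2: {"method": "POST", "filename": "Header.php"},
    3: {"filename": "awstats.pl"},
    4: {"filename": "fp40reg.dll", "header": ("host", 300)},
    5: {"header": ("User-Agent", 544)},
}


def to_table(signatures, matchers):
    """Return rule_id -> row of k cells, with None marking a don't-care cell."""
    return {rule_id: [conditions.get(m) for m in matchers]
            for rule_id, conditions in signatures.items()}


TABLE = to_table(SIGNATURES, MATCHERS)
```

Each column can then be handed to one matcher that evaluates all rules on that field at once.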
24Matching Problem Formulation
- Challenges for Single PDU matching problem (SPM)
- Large number of signatures n
- Large number of matchers k
- Large number of "don't cares"
- Cannot reorder matchers arbitrarily (buffering constraint)
- Field dependency
- Arrays, associative arrays
- Mutually exclusive fields
25Difficulty of the SPM
- Bad news
- A well-known computational geometry problem can be reduced to this problem
- And that problem has a bad worst-case bound: O((log N)^(K-1)) time or O(N^K) space (worst-case ruleset)
- Good news
- Measurement study on Snort and Cisco rulesets
- The real-world rulesets are good: the matchers are selective
- With our design: O(K)
26Matching Algorithms
- Candidate Selection Algorithm
- Pre-computation: decide the rule order and matcher order
- Decomposition: match each matcher separately and iteratively combine the results efficiently
- Integer range checking -> balanced binary search tree
- String exact matching -> trie
- Regex -> DFA (XFA)
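The per-matcher data structures can be sketched as below: each matcher answers "which rules are satisfied by this field value?" in one lookup. The sorted list with `bisect` is a stand-in for the balanced binary search tree named on the slide, and all class names are assumptions for illustration.

```python
import bisect


class GreaterThanMatcher:
    """Integer matcher for conditions of the form value > threshold."""

    def __init__(self, thresholds):
        # thresholds: rule_id -> threshold, stored sorted by threshold.
        pairs = sorted((t, rule_id) for rule_id, t in thresholds.items())
        self.keys = [t for t, _ in pairs]
        self.rule_ids = [rule_id for _, rule_id in pairs]

    def match(self, value):
        # Every threshold strictly below `value` is satisfied.
        cut = bisect.bisect_left(self.keys, value)
        return set(self.rule_ids[:cut])


class Trie:
    """String matcher for exact-match conditions."""

    def __init__(self):
        self.children = {}
        self.rule_ids = set()

    def insert(self, word, rule_id):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.rule_ids.add(rule_id)

    def lookup(self, word):
        node = self
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.rule_ids)
```

For example, a `GreaterThanMatcher` built from `{4: 300, 5: 544}` returns rule 4 only for a header value of length 450.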
27Step 1 Pre-Computation
- Optimize the matcher order based on the buffering constraint and field arrival order
- Rule reorder
[Figure: rules 1..n reordered into groups: rules that require matcher 1; rules that require matcher 2 and don't care about matcher 1; rules that don't care about matchers 1 and 2]
28Step 2 Iterative Matching
PDU = {Method=POST, Filename=fp40reg.dll, Header: name=host, len(value)=450}
S1 = {2}: candidates after matching column 1 (Method)
[Figure: candidate sets merged column by column; diagram labels S2, B2, R1, R2, R3]
RuleID  Method  Filename     Header: LEN
1       DELETE  *            *
2       POST    Header.php   *
3       *       awstats.pl   *
4       *       fp40reg.dll  name==host, len(value)>300
5       *       *            name==User-Agent, len(value)>544
(* = don't care)
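The iterative matching step can be sketched as follows for the example PDU. This is a simplified reading of candidate selection, not the talk's implementation: a rule becomes a candidate at the first column it requires, and stays one only while later columns match or are don't-care. The function name and the precomputed per-column match sets are assumptions.

```python
# Rule table from the slide: None marks a don't-care cell.
TABLE = {
    1: ["DELETE", None, None],
    2: ["POST", "Header.php", None],
    3: [None, "awstats.pl", None],
    4: [None, "fp40reg.dll", ("host", 300)],
    5: [None, None, ("User-Agent", 544)],
}

# Per-column results for PDU {Method=POST, Filename=fp40reg.dll,
# Header: name=host, len(value)=450}, as the individual matchers would return them.
MATCHED = [{2}, {4}, {4}]


def candidate_selection(table, matched):
    """Return the candidate set S_i after each column i."""
    history = []
    candidates = set()
    for i, hits in enumerate(matched):
        # Existing candidates survive if column i matches or is don't-care.
        survivors = {rid for rid in candidates
                     if table[rid][i] is None or rid in hits}
        # A rule whose earlier columns are all don't-care enters on its first match.
        newcomers = {rid for rid in hits
                     if all(cell is None for cell in table[rid][:i])}
        candidates = survivors | newcomers
        history.append(candidates)
    return history


HISTORY = candidate_selection(TABLE, MATCHED)
```

Here the candidate set after column 1 is {2}, as on the slide, and the final set contains only rule 4. A real implementation would precompute which rules may enter at each column instead of testing the prefix at match time.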
29Complexity Analysis
Three HTTP traces: avg(|Si|) < 0.04. Two WINRPC traces: avg(|Si|) < 1.5
- Merging complexity
- Need k-1 merging iterations
- For each iteration
- Merge complexity is O(n) in the worst case, since Si can have O(n) candidates for worst-case rulesets
- For real-world rulesets, the number of candidates is a small constant; therefore O(1)
- For real-world rulesets: O(k), which is the best we can get
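The two bounds can be written out in one line. This assumes, as the slide suggests, that the cost of merging iteration i is proportional to the candidate set size |S_i|, and that c is the small constant bounding |S_i| on real-world rulesets.

```latex
T_{\mathrm{merge}} \;=\; \sum_{i=1}^{k-1} O\bigl(|S_i|\bigr)
\;\le\; (k-1)\cdot O(n) \;=\; O(nk) \quad \text{(worst case)},
\qquad
|S_i| \le c \;\Longrightarrow\; T_{\mathrm{merge}} = O(k) \quad \text{(real-world rulesets)}
```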
30Refinement and Extension
- SPM improvement
- Allow negative conditions
- Handle array cases
- Handle associative array cases
- Handle mutually exclusive cases
- Extend to Multiple PDU Matching (MPM)
- Allow checkpoints
31Outline
- Motivation
- High Speed Matching for Large Rulesets.
- High Speed Parsing
- Evaluation
- Research Contribution
32High Speed Parsing
General purpose vs. special purpose parsing
- Keep the whole parse tree in memory vs. parsing and matching on the fly
- Parse all the nodes in the tree vs. only signature-related fields (leaf nodes)
- Design a parsing state machine
- Build an automated parsing state machine generator
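A minimal sketch of the special-purpose idea: a byte-at-a-time state machine over an HTTP request line that emits only the fields a signature needs and never builds a parse tree. This is a hand-written toy, not the output of the automated generator mentioned above; the function name and field names are assumptions.

```python
def parse_request_line(data, wanted):
    """Yield (field, value) for wanted fields of 'METHOD SP URI SP VERSION CR'."""
    fields = ("method", "uri", "version")
    delimiters = (b" ", b" ", b"\r")
    state = 0                      # index of the field currently being read
    buf = bytearray()              # holds only the bytes of a wanted field
    for byte in data:
        ch = bytes([byte])
        if ch == delimiters[state]:
            if fields[state] in wanted:
                yield fields[state], bytes(buf)   # hand the field to matching
            buf.clear()
            state += 1
            if state == len(fields):
                return
        elif fields[state] in wanted:
            buf += ch              # unwanted fields are skipped, not stored
```

Usage: `list(parse_request_line(b"POST /fp40reg.dll HTTP/1.1\r\n", {"method", "uri"}))` yields the method and URI as soon as each is complete, so matching can proceed on the fly.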
33Outline
- Motivation
- High Speed Matching for Large Rulesets.
- High Speed Parsing
- Evaluation
- Research Contributions
34Evaluation Methodology
- Fully implemented prototype
- 12,000 lines of C and 3,000 lines of Python
- Released at www.nshield.org
- Deployed at a university DC with up to 106Mbps
- 26GB traces from Tsinghua Univ. (TH), Northwestern (NU) and DARPA
- Run on a P4 3.8GHz single-core PC w/ 4GB memory
- After TCP reassembly, with the PDUs preloaded in memory
- For HTTP we have 794 vulnerability signatures which cover 973 Snort rules
- For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules
35Parsing Results
Trace                               TH DNS  TH WINRPC  NU WINRPC  TH HTTP  NU HTTP  DARPA HTTP
Avg flow len (B)                    77      879        596        6.6K     55K      2.1K
Throughput (Gbps), Binpac           0.31    1.41       1.11       2.10     14.2     1.69
Throughput (Gbps), our parser       3.43    16.2       12.9       7.46     44.4     6.67
Speed-up ratio                      11.2    11.5       11.6       3.6      3.1      3.9
Max. memory per connection (bytes)  15      15         15         14       14       14
36Matching Results
Slide callout: 11.0, 8-core
Trace                                TH WINRPC  NU WINRPC  TH HTTP  NU HTTP  DARPA HTTP
Avg flow length (B)                  879        596        6.6K     55K      2.1K
Throughput (Gbps), sequential        10.68      9.23       0.34     2.37     0.28
Throughput (Gbps), CS matching       14.37      10.61      2.63     17.63    1.85
Matching-only time speed-up ratio    4          1.8        11.3     11.7     8.8
Avg # of candidates                  1.16       1.48       0.033    0.038    0.0023
Max. memory per connection (bytes)   27         27         20       20       20
37Scalability and Accuracy Results
Rule scaling results
[Figure: rule scaling results; performance decreases gracefully]
Accuracy
- Created two polymorphic WINRPC exploits which bypass the original Snort rules but are detected accurately by our scheme
- For a 10-minute clean HTTP trace, Snort reported 42 alerts and NetShield reported 0 alerts. We manually verified that the 42 alerts are false positives
38Research Contribution
Make vulnerability signature a practical solution for NIDS/NIPS

            Regular Expression  Existing Vul. IDS  NetShield
Accuracy    Poor                Good               Good
Speed       Good                Poor               Good
Memory      Good                ??                 Good
Coverage    Good                ??                 Good

- Multiple sig. matching -> candidate selection algorithm
- Parsing -> parsing state machine
Build a better Snort alternative!
39Future work
Client
Server
Network Security
Data Center Security
- Web/WebSecurity
- WebProphet [NSDI10]
- WebShield
41Observations
- PDU -> parse tree
- Leaf nodes are numbers or strings
[Figure: parse tree of a PDU, including an array node]
General purpose vs. special purpose parsing
- Keep the whole parse tree in memory vs. parsing and matching on the fly
- Parse all the nodes in the tree vs. only signature-related fields (leaf nodes)
42Efficient Parsing with State Machines
- Studied eight protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS) as well as their vulnerability signatures
- Common relationships among leaf nodes
- Pre-construct parsing state machines based on parse trees and vulnerability signatures
Automated parsing state machine generator: UltraPAC
43Example for WINRPC
- Rectangles are states
- Parsing variables R0 .. R4
- 0.61 instruction/byte for BIND PDU
44Experiences
- Work in progress
- In collaboration with MSR, apply semantic-rich analysis to cloud Web service profiling, to understand why services are slow and how to improve them
- Interdisciplinary research
- Student mentoring (three undergraduates, six junior graduate students)
45Future Work
- Near term
- Web security (browser security, web server security)
- Data center security
- High-speed network intrusion prevention system with hardware support
- Long-term research interests
- Combating professional profit-driven attackers will be a continuous arms race
- Online applications (including Web 2.0 applications) become more complex and vulnerable
- Network speed keeps increasing, which demands highly scalable approaches
46Research Contributions
- Demonstrate that vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS
- Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently
- Propose the parsing state machine for fast protocol parsing
- Implement NetShield
47Comparing With Regex
- Memory for 973 Snort rules: DFA 5.29GB (XFA, 863 rules: 1.08MB), NetShield 2.3MB
- Per-flow memory: XFA 36 bytes, NetShield 20 bytes
- Throughput: XFA 756Mbps, NetShield 1.9Gbps
- (XFA: SIGCOMM08, Oakland08)
48Measure Snort Rules
- Semi-manually classify the rules
- Group by CVE-ID
- Manually look at each vulnerability
- Results
- 86.7% of rules can be improved by protocol semantic vulnerability signatures
- Most of the remaining rules (9.9%) are web DHTML and script related, which are not suitable for a signature-based approach
- On average, 4.5 Snort rules are reduced to one vulnerability signature
- For binary protocols the reduction ratio is much higher than that of text-based ones
- For netbios.rules the ratio is 67.6
49Matcher order
Reduce |S_{i+1}|
Enlarge |S_{i+1}|
Merging overhead |S_i| (use a hash table to calculate in A_{i+1}, O(1))
fixed, put the matcher later, reduce B_{i+1}
50Matcher order optimization
- Worth buffering only if estmaxB(Mj) < MaxB
- For Mi in AllMatchers
- Try to clear all the Mj in the buffer for which estmaxB(Mj) < MaxB
- Buffer Mi if estmaxB(Mi) > MaxB
- When len(Buf) > Buflen, remove the Mj with minimum estmaxB(Mj)
52Backup Slides
53Motivation
- Network security has been recognized as the single most important attribute of their networks, according to a survey of 395 senior executives conducted by AT&T
- Many newly emerging threats make the situation even worse
54Candidate merge operation
55 A Vulnerability Signature Example
- Data representations
- For all the vulnerability signatures we studied, we only need numbers and strings
- Number operators: ==, >, <, >=, <=
- String operators: ==, match_re(.,.), len(.)
- Example signature for Blaster worm
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minors==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>40 && matchRE(stub.buffer, /\x5c\x00\x5c\x00/)
56System Framework
[Figure: system framework, with the component labels repeated across animation steps: Scalability; Accuracy, Scalability, Coverage; Accuracy, adapt fast]
57Example of Vulnerability Signatures
- At least 75% of vulnerabilities are due to buffer overflow
- Sample vulnerability signature
- Field length corresponding to the vulnerable buffer > certain threshold
- Intrinsic to the buffer overflow vulnerability and hard to evade
[Figure: a field of the protocol message overflows the vulnerable buffer]
58Old Slides
59Conclusions
- A novel network-based vulnerability signature matching engine
- Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/IPS
- Propose a parsing state machine for fast parsing
- Propose a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously
60Outline
- Motivation
- Feasibility study: a measurement approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for massive vulnerability signatures
- Evaluation
- Conclusions
65Limitations of Regular Expression Signatures
[Figure: traffic filtering at the gateway between the Internet and our network using signature 10.01; some attack traffic is not blocked]
Polymorphism! A polymorphic attack (worm/botnet) might not have an exact regular expression based signature
66What we do?
- Build a NIDS/NIPS with much better accuracy and similar speed compared with regular expression based approaches
- Feasibility: of the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures
- High speed parsing: 2.712 Gbps
- High speed matching
- Efficient algorithm for matching massive vulnerability rules
- HTTP, 791 vulnerability signatures at 1Gbps
67Problem Formulation
- Parsing problem formulation
- Given a PDU and the protocol specification as input, output the set of fields that are required by matching
68Publications
- Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP 2007.
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High Speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
- Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006.
- Zhichun Li, Yan Chen and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD 2006.
- Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS 2006.
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006.
69Current Status
- Part I: Sketch based monitoring and detection
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High Speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400 = 18%).
- Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536 = 14%) (alphabetical order).
- Part II: Polymorphic worm signature generation
- TOSG: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251 = 9%).
- LESG: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220 = 14%).
70Current Status
- Part III: Signature matching engines
- Work in progress, will be the focus of this talk
- Zhichun Li, Gao Xia, Yi Tang, Jian Chen, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching, in submission.
- Part IV: Network Situational Awareness
- Work in progress
- Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets, in preparation.
- Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic, in submission.
71Current Status
- Part I Sketch based monitoring detection
- Result in Infocom06,ToN,ICDCS06
- Part II Polymorphic worm signature generation
- Result in Oakland06,ICNP07
- Part III Signature matching engines
- Work in progress, will be focus of this talk
- Part IV Network Situational Awareness
- Work in progress
72Limitations of Exploit Based Signature
[Figure: traffic filtering at the gateway between the Internet and our network using signature 10.01; some attack traffic is not blocked]
Polymorphism! A polymorphic worm might not have an exact exploit based signature
73Vulnerability Signature
[Figure: vulnerability signature traffic filtering at the gateway between the Internet and our network; attacks that target the vulnerability are blocked]
- Works for polymorphic worms
- Works for all the worms which target the same vulnerability