Title: RAIDM: Router-based Anomaly/Intrusion Detection and Mitigation
1 RAIDM: Router-based Anomaly/Intrusion Detection and Mitigation
- Zhichun Li
- EECS Department
- Northwestern University
- 2008-04-29
Thesis Proposal
2 Outline
- Motivation
- RAIDM System Design
- Finished Work
- Proposed Work
- Research Plan
3 Motivation
Attackers
Botnets
Worms
4 Motivation
- Senior executives rank network security as the single most important attribute of their networks, according to an AT&T survey of 395 senior executives.
- Many newly emerging threats make the situation even worse.
RAIDM: Network-based attack defense system
5 Network Level Defense
- Network gateways/routers are the vantage points for detecting large-scale attacks.
- Host-based detection/prevention alone is not enough for modern enterprise networks.
- Enterprises might not want to rely only on their end users for security protection.
- Users might not want to stop their work to reboot machines or applications in order to apply patches.
7 Research Questions
- How can we achieve online anomaly detection for high-speed networks?
- How can we respond to zero-day polymorphic worms in their early stage?
- Given known vulnerabilities, how can we protect high-speed networks from exploits accurately and efficiently?
- How can we provide quality information for network situational awareness?
8 System Framework
9 Current Status
- Part I: Sketch-based monitoring and detection
- Results in INFOCOM 2006, ToN, ICDCS 2006
- Part II: Polymorphic worm signature generation
- Results in Oakland 2006, ICNP 2007
- Part III: Signature matching engines
- Work in progress; will be the focus of this talk
- Part IV: Network situational awareness
- Work in progress
11 Part I: Sketch-based monitoring and detection
- Reversible Sketches (included for completeness)
- Use intelligent hash function design to recover the aggregated value of a series of (key, value) updates for the popular keys.
- Publications
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400 = 18%)
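To make the (key, value) update model concrete, here is a minimal sketch using a plain count-min-style structure. It is only a stand-in: it does not implement the reversible hashing of the papers above, which additionally recovers the popular keys themselves. The class name, its parameters, and the use of SHA-256 for bucket selection are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size summary of a stream of (key, value) updates.

    Illustrative stand-in only: NOT the reversible sketch from the
    cited papers, which can also recover the heavy keys.
    """

    def __init__(self, rows=4, width=1024):
        self.rows = rows
        self.width = width
        self.table = [[0] * width for _ in range(rows)]

    def _bucket(self, row, key):
        # Derive one bucket per row by salting the key with the row index.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, value):
        # Add the value to one bucket in every row.
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value

    def estimate(self, key):
        # With non-negative updates the minimum never underestimates.
        return min(self.table[r][self._bucket(r, key)] for r in range(self.rows))


sketch = CountMinSketch()
sketch.update("10.0.0.1", 5)
sketch.update("10.0.0.1", 3)
sketch.update("10.0.0.2", 1)
```

Memory stays constant regardless of how many distinct keys appear, which is what makes sketches attractive on high-speed links.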
12 Part I: Sketch-based monitoring and detection
- Sketch-based Anomaly Detection
- Build anomaly detection engines based on reversible sketches to detect horizontal scans, vertical scans, and TCP SYN flooding attacks.
- Further proposed 2D sketches to differentiate the different types of attacks.
- Publications
- Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536 = 14%) (alphabetical order)
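One way to picture how a single counting scheme can separate the three attack types is to track unanswered SYNs under different keys. The sketch below uses exact counters instead of sketches, and the three key choices are assumptions for illustration; the slide does not state which keys the detection engines use.

```python
from collections import Counter


def unanswered_syn_counts(events):
    """Count SYNs minus SYN/ACKs under three different keys.

    events: iterable of (kind, client_ip, server_ip, server_port), where
    kind is "SYN" or "SYNACK" and addresses follow the orientation of the
    connection attempt. The keys below are illustrative assumptions.
    """
    by_src_port = Counter()  # (client, port): grows under horizontal scans
    by_src_dst = Counter()   # (client, server): grows under vertical scans
    by_dst_port = Counter()  # (server, port): grows under SYN flooding
    for kind, client, server, port in events:
        delta = 1 if kind == "SYN" else -1
        by_src_port[(client, port)] += delta
        by_src_dst[(client, server)] += delta
        by_dst_port[(server, port)] += delta
    return by_src_port, by_src_dst, by_dst_port


# Host "a" probes port 445 on five servers and gets no replies;
# host "b" completes a normal handshake.
events = [("SYN", "a", f"10.0.0.{i}", 445) for i in range(5)]
events += [("SYN", "b", "10.0.0.9", 80), ("SYNACK", "b", "10.0.0.9", 80)]
hscan, vscan, flood = unanswered_syn_counts(events)
```

In a real deployment the exact counters would be replaced by fixed-size sketches so that memory does not grow with the number of flows.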
13 Part II: Polymorphic worm signature generation
- TOSG (Token-Based Signature Generation)
- Use a conjunction of tokens (substrings) as the signature for polymorphic worms
- Advantages
- Does not require protocol knowledge or information about the vulnerable program
- Fast and noise tolerant
- Has an analytical attack resilience bound under certain assumptions.
- Limitation
- Does not have good resilience to the deliberate noise injection attack (Perdisci 2006)
- Publication: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251 = 9%)
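A token-conjunction signature is simple to apply once it has been generated: a payload matches only if every token occurs in it. The following sketch shows the matching step only, not signature generation; the function name and the example tokens are hypothetical.

```python
from typing import Sequence


def matches_token_conjunction(payload: bytes, tokens: Sequence[bytes]) -> bool:
    """Return True only if every token occurs somewhere in the payload."""
    return all(token in payload for token in tokens)


# Hypothetical signature: invariant substrings shared by all worm variants.
signature = [b"GET ", b"HTTP/1.1", b"\xff\xbf"]

variant = b"GET /junk123 HTTP/1.1\r\nHost: x\r\n\r\n\x90\x90\xff\xbf\x41"
benign = b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n"
```

Because the tokens may appear anywhere and in any order, the signature tolerates the random filler that polymorphic engines insert between invariant parts.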
14 Part II: Polymorphic worm signature generation
- LESG (Length-Based Signature Generation)
- Propose to use a set of field lengths of the vulnerable program's protocol as signatures.
- Mainly works for buffer overflow worms
- Advantages
- Fast and noise tolerant
- Has an analytical attack resilience bound under certain assumptions
- The bound holds under all the recently proposed attacks.
- Publication: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220 = 14%)
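As a rough illustration of how a length-based signature could be checked, the sketch below assumes a signature is a mapping from field name to a length bound and that a message matches when any listed field exceeds its bound. That representation, the field names, and the bounds are assumptions; the slide only says that field lengths serve as signatures.

```python
def matches_length_signature(field_lengths, signature):
    """Return True if any signature field is longer than its bound.

    field_lengths: {field_name: observed length in bytes}
    signature:     {field_name: longest length considered benign}
    Both layouts are hypothetical, chosen for illustration.
    """
    return any(
        field_lengths.get(field, 0) > bound
        for field, bound in signature.items()
    )


# Hypothetical signature for a buffer overflow in one protocol field.
signature = {"hostname": 255}

overflow_attempt = {"hostname": 1400, "path": 12}
normal_request = {"hostname": 18, "path": 12}
```

The check is independent of the payload bytes, which is why it is unaffected by polymorphic re-encoding of the exploit content.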
16 Proposed Work
- Part III: Signature Matching Engine
- NetShield, a protocol-semantic vulnerability signature matching engine (the focus of this talk)
- Report: Zhichun Li, Gao Xia, Yi Tang, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching
17 Proposed Work
- Part IV: Network Situational Awareness
- Botnet Inference
- Infer scan properties from honeynet traffic: trend, uniformity, hitlist, and collaboration
- Extrapolate the global scan scope and the global number of bots from limited local observations. Can be used to detect targeted attacks.
- Report: Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets
- P2P Misconfiguration Diagnosis
- Found that P2P misconfiguration traffic is one of the major sources of Internet background radiation
- eMule P2P misconfiguration is due to byte ordering
- For BitTorrent, we found that anti-P2P companies deliberately inject bogus peers
- Report: Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic
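The byte-ordering finding can be pictured with a small example: if a peer's IPv4 address is read with its four bytes reversed, traffic is sent to an unrelated address. The slide does not say which field or code path in eMule is affected, so the helper below only illustrates the general effect and its name is hypothetical.

```python
import ipaddress


def swap_byte_order(ip: str) -> str:
    """Return the IPv4 address obtained by reversing the four address bytes."""
    packed = ipaddress.IPv4Address(ip).packed
    return str(ipaddress.IPv4Address(packed[::-1]))


intended_peer = "1.2.3.4"
contacted_peer = swap_byte_order(intended_peer)
```

Here `contacted_peer` is `"4.3.2.1"`: the unintended recipient sees unsolicited P2P traffic, which is how such errors can show up as background radiation.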
18 NetShield Overview
- Goal
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Preliminary Evaluation
- Discussion
19 Signature Matching Engine
- Accuracy (especially for IPS)
- False positives
- False negatives
- Speed
- Coverage: large ruleset

            Regular Expression   Vulnerability
Accuracy    Poor                 Much Better
Speed       Good                 Good
Coverage    Good                 Good
20 Reason
[Figure: RE cannot express the exact vulnerability condition; Shield can express the exact condition]
- Regular expressions are not powerful enough to capture the exact vulnerability condition!
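A small example may help show the gap. Suppose a vulnerability is triggered whenever one integer field exceeds another. A regular expression derived from one observed exploit matches only those bytes, while a check on the parsed fields covers every triggering input. The 4-byte header layout, the field names, and the condition below are hypothetical, not taken from any real protocol.

```python
import re
import struct

# Hypothetical regex signature: the byte pattern of one observed exploit.
EXPLOIT_RE = re.compile(rb"^\x00\x10\x00\x20")


def is_vulnerable(pdu: bytes) -> bool:
    """Semantic check on a hypothetical header of two big-endian uint16 fields.

    The assumed vulnerability condition is auth_len > frag_len.
    """
    frag_len, auth_len = struct.unpack_from(">HH", pdu, 0)
    return auth_len > frag_len


observed_exploit = b"\x00\x10\x00\x20"  # frag_len=16, auth_len=32
new_variant = b"\x00\x08\x01\x00"       # frag_len=8,  auth_len=256
benign = b"\x00\x20\x00\x10"            # frag_len=32, auth_len=16
```

The regular expression catches only `observed_exploit`, whereas the semantic check flags both exploit variants and leaves the benign header alone.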
21 Feasibility Study
- Protocol semantics can help (Shield project, SIGCOMM 2004)
- How much for NIDS/IPS?
- Given that a NIDS/NIPS has a large ruleset
- What percentage of the rules can be improved by using protocol-semantic vulnerability signatures?
22 Measure Snort rules
- Semi-manually classify the rules.
- First by CVE ID
- Manually look at each vulnerability
- Results
- 86.7% of rules can be improved by protocol-semantic vulnerability signatures.
- 9.9% of rules are related to web DHTML and scripts, which are not suitable for a signature-based approach.
- On average, 4.5 Snort rules reduce to one vulnerability signature.
- Binary protocols have a larger reduction ratio than text-based protocols.
23 Towards high speed parsing
- Protocol parsing problem formulation
- Given a PDU and the states from previous PDUs, output the set of fields required by matching.
- Observation
- Parsing State Machine
24 Observation
- PDU → parse tree
- Leaf nodes (basic fields) are integers or strings
- Vulnerability signatures are mostly based on basic fields
Only need to parse out the fields related to signatures
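The idea of parsing out only the fields that signatures need can be sketched on a toy format. The type-length-value layout below (1-byte type, 2-byte big-endian length, then the value) is hypothetical and stands in for a real protocol; the point is that unneeded fields are skipped by advancing an offset rather than being copied or decoded.

```python
import struct


def parse_needed_fields(pdu: bytes, needed: set) -> dict:
    """Extract only the fields whose type is in `needed` from a toy TLV PDU."""
    fields = {}
    offset = 0
    while offset + 3 <= len(pdu):
        field_type, field_len = struct.unpack_from(">BH", pdu, offset)
        offset += 3
        if field_type in needed:
            # Copy the value only for fields that some signature refers to.
            fields[field_type] = pdu[offset:offset + field_len]
        offset += field_len  # Skip over the value either way.
    return fields


pdu = (
    struct.pack(">BH", 1, 3) + b"abc"
    + struct.pack(">BH", 2, 2) + b"xy"
    + struct.pack(">BH", 3, 1) + b"z"
)
wanted = parse_needed_fields(pdu, {2})
```

Only field type 2 is materialized; the other two fields cost one header read and an offset increment each.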
25 Parsing State Machine
- Studied eight popular protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS) and their vulnerability signatures.
- Protocol semantics are context sensitive
- Common relationships among basic fields.
26 Example for WINRPC
- Nodes
- States S1 .. Sn
- 0.61 instructions/byte for BIND PDU
27 High speed matching
- Problem formulation
- Observation
- Candidate Selection Algorithm
- Algorithm Refinement
28 Matching Problem Formulation
- Data representation
- For all the vulnerability signatures we studied, we need only integers and strings
- Integer operators: ==, >, <
- String operators: ==, match_re(.,.), len(.)
- Buffer constraint
- The string fields could be too long to buffer.
- Influences whether we can change the matching order
- Field dependency
- Array (e.g., DNS_questions, or RR records)
- Associative array (e.g., HTTP headers)
- Mutually exclusive fields.
29 Matching Problem Formulation (2)
- PDU level protocol state machine
- For complex stateful protocols
- For most stateful protocols the state machine is
quite simple
WINRPC example
30 Matching problems (cont.)
- Example signature for Blaster worm
- Single PDU matching problem (SPM)
- Multiple PDU matching problem (MPM)
31 Single PDU Matching
- Suppose we have n signatures, each defined on k matching dimensions (matchers)
- A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements.
- For example
- (Filename, RE)
- (Version, Range_check)
- Version > 3
- Version == 1
- k is the number of all possible matchers for the n signatures.
32 Table Representation
- We use an n × k table to represent the rules.
[Figure: table with n rows (signatures) and k columns (matchers); the entry in row i, column j is the condition of Sig i on matcher j]
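The table of signatures against matchers can be written down directly. In the sketch below each matcher is a (field, operation) tuple, each signature lists conditions only for the matchers it uses, and every other cell becomes a don't care (`None`). All matcher names, signature names, and conditions are hypothetical examples.

```python
# Hypothetical matchers: each column is a (field, operation) two-tuple.
MATCHERS = [("Version", "range_check"), ("Filename", "RE"), ("Method", "exact")]

# Hypothetical signatures: conditions only for the matchers each one needs.
SIGNATURES = {
    "sig_a": {("Version", "range_check"): (4, 255)},
    "sig_b": {("Filename", "RE"): r".*\.ida$", ("Method", "exact"): "GET"},
    "sig_c": {("Version", "range_check"): (1, 1), ("Method", "exact"): "POST"},
}


def build_table(signatures, matchers):
    """Return {signature: [condition or None]} with one column per matcher."""
    return {
        name: [conditions.get(matcher) for matcher in matchers]
        for name, conditions in signatures.items()
    }


TABLE = build_table(SIGNATURES, MATCHERS)
# Count the don't-care cells (None) across the whole table.
DONT_CARES = sum(cell is None for row in TABLE.values() for cell in row)
```

Even in this three-by-three toy table, four of the nine cells are don't cares; with hundreds of matchers the table becomes very sparse.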
33 Requirements for SPM
- Large number of signatures n
- Large number of matchers k
- Large number of don't cares
- Cannot reorder the matchers arbitrarily (buffer constraint)
- Field dependency
- Array
- Associative array
- Mutually exclusive fields.
34 Comparison with packet classification
- Similarity: both problems are defined on k matching dimensions and allow wildcards
- Differences
- Large k and large number of don't cares
- Buffer constraint
- Regular expression matcher
- Field dependency
- Related work on packet classification
- Exhaustive search
- Decision tree
- Tuple space
- Divide and Conquer (Decomposition)
35 Difficulty
- A more complex problem than packet classification
- Packet classification: theoretical worst-case bound
- Based on computational geometry
- $O((\log N)^{k-1})$ worst-case time or $O(N^k)$ worst-case memory
- Solution: use the characteristics of real traces
36 Observation
- Observation 1: most matchers are good.
- After matching against them, only a small number of signatures can pass (candidates).
- String matchers are all good; most integer matchers are good.
- We can buffer the bad matchers to change the matching order
- Observation 2: real-world traffic mostly does not match any signature. An even stronger statement holds: in most cases, no matcher will match any rule.
- Observation 3: the NIDS/IPS reports all the matched rules regardless of the ordering. This differs from firewall rules.
37 Basic idea
- Decide the matcher order at pre-computation; buffer the bad ones to the end if possible
- When a PDU arrives, match it against each matcher (column) for all the signatures simultaneously and get the possible candidates for the next step
- Combine the candidate sets to get the final matched signatures
38 Match single matcher
- Integer range checking: binary search tree
- String exact matching: trie
- String regular expression matching: DFA
- String length checking: binary search tree
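For the integer range case, one column can be matched for all signatures at once with a single logarithmic lookup. The sketch below swaps the binary search tree named on the slide for a sorted array searched with `bisect`, which gives the same lookup cost and is shorter to write. Function names and the example ranges are hypothetical.

```python
import bisect


def build_range_index(ranges):
    """Precompute, for each elementary interval, the signatures covering it.

    ranges: {signature: (lo, hi)} with inclusive integer bounds.
    """
    points = sorted(
        {lo for lo, _ in ranges.values()} | {hi + 1 for _, hi in ranges.values()}
    )
    covering = [
        {sig for sig, (lo, hi) in ranges.items() if lo <= start <= hi}
        for start in points
    ]
    return points, covering


def lookup(index, value):
    """Return the set of signatures whose range contains `value`."""
    points, covering = index
    i = bisect.bisect_right(points, value) - 1
    return covering[i] if i >= 0 else set()


# Hypothetical range conditions of three signatures on one integer field.
index = build_range_index({"a": (4, 255), "b": (1, 1), "c": (0, 10)})
```

The set of covering signatures is constant between consecutive boundary points, so one binary search returns every signature that the value satisfies for this matcher.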
39 Candidate Selection for SPM
- Basic algorithm pre-computation
40 Matching Illustration
[Figure: A2 candidates, B2 candidates]
41 Matching Illustration
- Compute the operations
- Explicit calculation
- Based on an n × k bitmap, decide whether an element in S_i requires the next matcher.
- For those that require the next matcher, check whether they are also in A_{i+1}
- Implicit calculation (for bad matchers)
- Do not calculate A_{i+1}, since it could be large
- Check sequentially whether the candidates in S_i can match matcher (i+1)
- When bad matchers are buffered to the end, B will be small.
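The column-by-column candidate update can be sketched as follows. This is one reading of the slides, not the exact NetShield algorithm: a candidate survives a column if it has a don't care there or appears in that column's match set, and a signature joins the candidates at the first matcher it requires. The function name and data layout are assumptions, and every signature is assumed to require at least one matcher.

```python
def select_candidates(table, matched_per_column):
    """Return the signatures whose required matchers all matched.

    table: {signature: [condition or None per matcher]}, None = don't care.
    matched_per_column: list of sets; entry j holds the signatures whose
        condition on matcher j is satisfied by the current PDU.
    """
    # Index of the first matcher each signature actually requires.
    first_required = {
        sig: next(j for j, cond in enumerate(row) if cond is not None)
        for sig, row in table.items()
    }
    candidates = set()
    for j, matched in enumerate(matched_per_column):
        # Keep candidates that do not care about matcher j or that matched it.
        survivors = {s for s in candidates if table[s][j] is None or s in matched}
        # Admit signatures for which matcher j is the first requirement.
        newcomers = {s for s in matched if first_required[s] == j}
        candidates = survivors | newcomers
    return candidates


# True marks "has a condition here"; the condition values are irrelevant now.
table = {
    "s1": [True, None, True],
    "s2": [None, True, None],
    "s3": [True, True, None],
}
matched_per_column = [{"s1", "s3"}, {"s2"}, {"s1"}]
result = select_candidates(table, matched_per_column)
```

The work per column is proportional to the current candidate set plus that column's match set, which stays small when most traffic matches nothing.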
42 Refinement
- SPM improvement
- Allow negative conditions
- Handle array case
- Handle associative array case
- Handle mutually exclusive case
- Report the matched rules as early as possible
- Extend to MPM
- Allowing checkpoints.
43 Results
- Traces from Tsinghua Univ. (TH) and Northwestern Univ. (NU)
- PDUs are preloaded into memory after TCP reassembly
- For DNS we only evaluate parsing.
- For WINRPC we have 45 vulnerability signatures, which cover 3,519 Snort rules
- For HTTP we have 791 vulnerability signatures, which cover 941 Snort rules.
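The reduction ratios implied by these counts can be worked out directly. The calculation below uses only the figures on this slide; the variable names are for illustration.

```python
# Snort rules covered per vulnerability signature, from the counts above.
winrpc_rules, winrpc_sigs = 3519, 45
http_rules, http_sigs = 941, 791

winrpc_ratio = winrpc_rules / winrpc_sigs  # about 78.2 rules per signature
http_ratio = http_rules / http_sigs        # about 1.19 rules per signature
```

WINRPC, a binary protocol, collapses roughly 78 Snort rules into each vulnerability signature, while HTTP, a text protocol, stays close to one rule per signature.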
44 Discussion
- So far, we have found that the candidate selection algorithm works well in practice
- Further thoughts
- How to rely more on hardware assistance?
- TCAM?
- Use bitmaps to express set operations?
- Can we use traffic statistics to further improve efficiency?
46 Publications
- Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP 2007.
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
- Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006.
- Zhichun Li, Yan Chen and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD 2006.
- Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS 2006.
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006.
47 Research Time Plan
- Apr 2008 to Jun 2008
- Finish remaining experiments on network situational awareness
- Sep 2008 to Mar 2009
- Refine the vulnerability signature matching algorithm
- Fully implement, deploy and evaluate the NetShield prototype
- Prepare job applications and interviews
- Apr 2009 to Jun 2009
- PhD dissertation writing
- Thesis defense
49 Backup
50 Outline
- Motivation
- Feasibility Study: a measurement approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for massive vulnerability signatures
- Evaluation
- Conclusions
55 Limitations of Regular Expression Signatures
[Figure: traffic filtering between the Internet and our network using Signature 10.01; labels: X, X, Polymorphism!]
Polymorphic attacks (worms/botnets) might not have an exact regular-expression-based signature
56 What do we do?
- Build a NIDS/NIPS with much better accuracy than regular-expression-based approaches and similar speed
- Feasibility: of the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures.
- High-speed parsing: 2.712 Gbps
- High-speed matching
- Efficient algorithm for matching massive vulnerability rules
- HTTP, 791 vulnerability signatures at 1 Gbps
57 Problem Formulation
- Parsing problem formulation
- Given a PDU and the protocol specification as input, output the set of fields required by matching.
59 Current Status
- Part I: Sketch-based monitoring and detection
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007
- Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400 = 18%)
- Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536 = 14%) (alphabetical order)
- Part II: Polymorphic worm signature generation
- TOSG: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251 = 9%)
- LESG: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220 = 14%)
60 Current Status
- Part III: Signature matching engines
- Work in progress; will be the focus of this talk
- Zhichun Li, Gao Xia, Yi Tang, Jian Chen, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching, in submission
- Part IV: Network Situational Awareness
- Work in progress
- Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets, in preparation
- Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic, in submission