Network-based Intrusion Detection, Mitigation and Forensics System

Slides: 35
Provided by: zhich

1
Network-based Intrusion Detection, Mitigation and
Forensics System
  • Yan Chen
  • Department of Electrical Engineering and Computer
    Science
  • Northwestern University
  • Lab for Internet Security Technology (LIST)
  • http://list.cs.northwestern.edu

2
The Spread of Sapphire/Slammer Worms
3
Current Intrusion Detection Systems (IDS)
  • Mostly host-based and not scalable to high-speed
    networks
  • Slammer worm infected 75,000 machines in <10 mins
  • Host-based schemes inefficient and user dependent
  • Have to install IDS on all user machines!
  • Mostly simple signature-based
  • Cannot recognize unknown anomalies/intrusions
  • New viruses/worms, polymorphism

4
Current Intrusion Detection Systems (II)
  • Statistical detection
  • Unscalable for flow-level detection
  • IDS vulnerable to DoS attacks
  • Overall traffic based inaccurate, high false
    positives
  • Cannot differentiate malicious events from
    unintentional anomalies
  • Anomalies can be caused by network element faults
  • E.g., router misconfiguration, link failures, etc.

5
Network-based Intrusion Detection, Mitigation,
and Forensics System
  • Online traffic recording
  • SIGCOMM IMC 2004, INFOCOM 2006, ToN to appear
  • Reversible sketch for data streaming computation
  • Record millions of flows (GB traffic) in a few
    hundred KB
  • Small number of memory accesses per packet
  • Scalable to large key space size ($2^{32}$ or $2^{64}$)
  • Online sketch-based flow-level anomaly detection
  • IEEE ICDCS 2006, IEEE CGA, Security
    Visualization 06
  • Adaptively learn the traffic pattern changes
  • As a first step, detect TCP SYN flooding,
    horizontal and vertical scans even when mixed
  • Online stealthy spreader (botnet scan) detection
  • IWQoS 2007
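
The transcript does not show how the reversible sketch works. As a rough illustration of the general idea (recording many flows in a small fixed-size table with a few memory accesses per packet), here is a minimal sketch of a plain count-min sketch. This is a simpler, non-reversible structure swapped in for illustration only. The class name, table dimensions, and hashing scheme are assumptions and are not taken from the presentation.

```python
import hashlib


class CountMinSketch:
    """Plain count-min sketch (illustrative; NOT the reversible sketch from the slides)."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth    # number of hash rows (one memory access per row per update)
        self.width = width    # counters per row
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Salting the hash with the row number gives one hash function per row.
        digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key, value=1):
        """Add `value` to the counter selected by `key` in every row."""
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += value

    def estimate(self, key):
        """Return the minimum counter over all rows; this never underestimates."""
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))
```

Memory use is `depth * width` counters regardless of how many distinct keys are seen. Unlike the reversible sketch named on the slide, this structure cannot recover the keys behind a heavy counter.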

6
Network-based Intrusion Detection, Mitigation,
and Forensics System (II)
  • Integrated approach for false positive reduction
  • Polymorphic worm signature generation and detection
  • IEEE Symposium on Security and Privacy 2006
  • IEEE ICNP 2007 to appear
  • Accurate network diagnostics
  • ACM SIGCOMM 2006, IEEE INFOCOM 2007
  • Scalable distributed intrusion alert fusion w/
    DHT
  • SIGCOMM Workshop on Large Scale Attack Defense
    2006
  • Large-scale botnet event forensics using honeynet
  • work in progress

7
System Architecture
[Figure: system architecture. Labels: streaming packet data; remote aggregated sketch records; Part II: per-flow monitoring and detection]
8
System Deployment
  • Attached to a router/switch as a black box
  • Edge network detection particularly powerful

[Figure: deployment configurations. Panels: original configuration; monitor each port separately; monitor aggregated traffic from all ports]
9
Detecting Stealthy Spreaders Using Online
Outdegree Histograms
  • Yan Gao¹, Yao Zhao¹, Robert Schweller¹,
  • Shobha Venkataraman², Yan Chen¹,
  • Dawn Song² and Ming-Yang Kao¹

¹ Northwestern University, ² Carnegie Mellon University
10
Outline
  • Motivation
  • Problem definition
  • System design
  • Evaluation
  • Conclusion

11
Motivation
  • High-speed network monitoring
  • Small amount of memory usage
  • Small number of memory accesses per packet
  • Superspreaders vs. stealthy spreaders
  • Superspreaders: sources that connect to a large
    number of distinct destinations
  • e.g. a compromised host doing fast scanning for
    worm propagation
  • Stealthy spreaders: a number of sources that each
    send more than a certain number of (unsuccessful)
    connections to distinct destinations
  • e.g. botnet scans or moderate worm propagation
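
The two definitions above can be made concrete with a small exact (non-streaming) sketch. It assumes the input pairs are already restricted to unsuccessful connection attempts. The function names and all three thresholds (`super_k`, `stealthy_k`, `stealthy_min_sources`) are hypothetical values chosen for illustration; the slides do not give numbers.

```python
from collections import defaultdict


def outdegrees(pairs):
    """Map each source to its number of distinct destinations (exact, not streaming)."""
    dests = defaultdict(set)
    for src, dst in pairs:
        dests[src].add(dst)
    return {src: len(d) for src, d in dests.items()}


def classify(pairs, super_k=100, stealthy_k=5, stealthy_min_sources=3):
    """Return (superspreaders, stealthy_group) under hypothetical thresholds."""
    deg = outdegrees(pairs)
    # Superspreader: a single source with a very large outdegree.
    supers = {s for s, d in deg.items() if d >= super_k}
    # Stealthy spreaders: several sources, each with a moderate outdegree.
    moderate = {s for s, d in deg.items() if stealthy_k <= d < super_k}
    stealthy = moderate if len(moderate) >= stealthy_min_sources else set()
    return supers, stealthy
```

This baseline keeps one set per source, so its memory grows with the traffic. That cost is what the streaming approach in the rest of the talk aims to avoid.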

12
Existing Data Streaming Algorithms
  • Online entropy estimation approaches
  • Chakrabarti et al. STACS 06 and Guha et al.
    ACM SODA 06
  • Pros: detect unexpected changes in the network
    traffic
  • Cons: lose some concrete distribution information
  • Online histogram estimation algorithms
  • Gibbons et al. VLDB 97 and Gilbert et al.
    STOC 02
  • Pros: provide more information on the features of
    network traffic
  • Cons: cannot record the number of unique items
  • Superspreader detection schemes
  • Venkataraman et al. NDSS 05 and Zhao et al.
    IMC 05
  • Pros: detect sources with a very large outdegree
  • Cons: memory usage unscalable to small/medium
    outdegrees such as bot scans
  • Superspreader detection is a special case of
    spreader detection

13
Outline
  • Motivation
  • Problem definition
  • System design
  • Evaluation
  • Conclusion

14
Problem Definitions
  • Two high-level problems
  • Construct an approximation of the outdegree
    histogram online
  • Directly detect the presence of stealthy
    spreaders without constructing the complete
    outdegree histogram

15
Problem Definition
  • Input: stream S of (Src, Dst) pairs
  • Output:

z: the base whose powers define the buckets of the
histogram ($z = 2$)
16
Problem Definition
  • Input: stream S of (SIP, DIP) pairs
  • Output:

$W_i$: the set of sources. A source s is in $W_i$ if and only if the
number of unique destinations that s connects to
is in the range $[z^i, z^{i+1})$

[Figure: histogram. x-axis: number of unique destinations ($2^0, 2^1, \ldots, 2^7$); y-axis: number of sources]
17
Problem Definition
  • Input: stream S of (SIP, DIP) pairs
  • Output:

$m_i = |W_i|$. Creating an approximate histogram means
estimating $m_i$ for each bucket

[Figure: histogram. x-axis: number of unique destinations ($2^0, 2^1, \ldots, 2^7$); y-axis: number of sources]
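
For reference, the quantity being approximated can be computed exactly offline. The sketch below is an exact baseline, not the authors' streaming algorithm: it builds the outdegree histogram with buckets at powers of z, so bucket i counts the sources whose number of unique destinations falls between z^i (inclusive) and z^(i+1) (exclusive). The function name is an assumption.

```python
from collections import defaultdict


def outdegree_histogram(pairs, z=2):
    """Return {i: m_i}, where m_i counts sources with outdegree in [z**i, z**(i+1))."""
    dests = defaultdict(set)
    for src, dst in pairs:
        dests[src].add(dst)          # duplicates of the same (src, dst) count once

    hist = defaultdict(int)
    for dsts in dests.values():
        degree = len(dsts)
        i = 0
        while z ** (i + 1) <= degree:  # integer arithmetic avoids log rounding issues
            i += 1
        hist[i] += 1
    return dict(hist)
```

For example, with z = 2 a source that contacts 3 distinct destinations lands in bucket 1 and a source that contacts 4 lands in bucket 2.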
18
Contribution
  • Study the problem of detecting stealthy spreaders
    online
  • With a small, constant amount of memory
  • With a small number of memory accesses per packet
  • Design the algorithm to detect stealthy spreaders
    online by approximating the outdegree histogram
  • Data recording phase
  • Sampling and coupon collection-based algorithms
  • Spreader detection phase
  • Linear regression to find bins where attacks
    happen
  • Show that a change in the approximated histogram
    reveals the presence of anomalies

19
Outline
  • Motivation
  • Problem definition
  • System design
  • Evaluation
  • Conclusion

20
Recording Phase: Sampling Algorithm
  • Fast: updates a smaller number of counters
    per packet

[Figure: sampling algorithm. A packet (src, dst) is hashed on src; example range shown: h(src) between $2^{-3}$ and $2^{-2}$]
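
The diagram for this slide is lost, but its surviving label places h(src) between 2^-3 and 2^-2. One plausible reading, stated here as an assumption, is that h maps each source to the unit interval and the source is assigned to the level i whose range, from 2^-(i+1) up to 2^-i, contains h(src). Level i then receives roughly a 2^-(i+1) fraction of all sources. The sketch below shows only that level assignment; `unit_hash`, `level`, and `max_level` are hypothetical names, and how the authors update counters per level is not recoverable from the transcript.

```python
import hashlib


def unit_hash(src):
    """Hash a source identifier to a float in [0, 1); hypothetical stand-in for h(src)."""
    digest = hashlib.sha256(src.encode()).digest()
    value = int.from_bytes(digest[:8], "big") >> 11   # keep 53 bits, exact in a float
    return value / 2 ** 53


def level(src, max_level=31):
    """Return i such that 2**-(i+1) <= h(src) < 2**-i, capped at max_level."""
    h = unit_hash(src)
    i = 0
    while i < max_level and h < 2.0 ** -(i + 1):
        i += 1
    return i
```

Under this reading, about half of all sources fall in level 0, a quarter in level 1, and so on, so deeper levels act as progressively sparser samples of the source population.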
21
Recording Phase: Coupon Collecting Algorithm
  • Accurate: creates a better approximation in the
    interim structure

Uniform random hash function for hashing dst to
an integer in $[1, 2^i]$

[Figure: coupon collecting algorithm. A packet (src, dst) is hashed on src; example range shown: h(src) between $2^{-3}$ and $2^{-2}$]
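
The name refers to the classic coupon collector problem. If each distinct destination is hashed uniformly to one of n coupons (here n = 2^i), the expected number of distinct destinations needed before all n coupons have appeared is n times the n-th harmonic number. The sketch below computes that expectation and checks it by simulation. It illustrates the underlying probability fact only; how the authors' interim structure uses it is not shown in the transcript, and the function names are assumptions.

```python
import random


def expected_draws(n):
    """Expected number of uniform draws to see all n coupons: n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_draws(n, rng):
    """Draw uniformly from 1..n until every coupon has appeared; return the draw count."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randint(1, n))   # randint is inclusive on both ends
        draws += 1
    return draws
```

Because the expected count grows with n, a source that has collected all coupons at a larger n has, with high probability, contacted more distinct destinations.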
22
Spreader Detection Phase
  • Outdegree histogram construction
  • Interim data structure -> final outdegree
    histogram
  • Using linear programming method
  • Build a convex hull
  • Other constraints
  • Find the lower and upper bounds for $m_i$
  • Solution
  • Directly use the interim data structure

Pros: obtains a reasonably accurate histogram
for normal network traffic
Cons: fails to accurately estimate the outdegree
histogram for anomalous traffic
23
System Design
  • Change detection
  • The change of the interim data structure of two
    time intervals
  • Stealthy spreader detection
  • ki > ch (threshold)
  • System architecture
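
The change-detection step can be sketched as a per-bucket comparison of two time intervals. The transcript does not define ki or ch, so the reading used here is an assumption: the change for bucket i is the increase in that bucket's value from the previous interval to the current one, and a bucket is flagged when the increase exceeds a fixed threshold. The function name and the dictionary representation of the interim structure are hypothetical.

```python
def changed_buckets(previous, current, threshold):
    """Return {bucket: change} for buckets whose increase exceeds the threshold."""
    changes = {}
    for bucket in sorted(set(previous) | set(current)):
        # A bucket missing from one interval is treated as having value 0.
        delta = current.get(bucket, 0) - previous.get(bucket, 0)
        if delta > threshold:
            changes[bucket] = delta
    return changes
```

With `previous = {0: 100, 1: 50, 2: 10}`, `current = {0: 102, 1: 51, 2: 40}` and a threshold of 10, only bucket 2 is flagged, with a change of 30.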

24
Spreader Detection Phase
  • The real scan event

[Figure: outdegree histogram of the real scan event. x-axis: number of distinct destinations; y-axis: number of scanners. Annotations: one peak; close to 0]
25
Spreader Detection Phase
  • Linear regression for coupon collecting algorithm
  • Mean squared error as the fitting metric

[Figure: example of linear regression. x-axis: bucket; y-axis: value of counting]
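
As a generic illustration of the fitting step, the sketch below performs an ordinary least-squares line fit over (bucket, value) points and reports the mean squared error of the fit. It is standard linear regression, not a reconstruction of the authors' detector: the transcript does not say which points are fitted or how the fit is mapped to attack bins. The function name is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b; returns (a, b, mse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)          # requires at least two distinct xs
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                                     # slope
    b = mean_y - a * mean_x                           # intercept
    mse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / n
    return a, b, mse
```

Points that lie on a straight line give an error near zero, while a bump at one bucket raises it. That difference is one way a fitting metric can separate regular from irregular bucket values.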
26
Outline
  • Motivation
  • Problem definition
  • System design
  • Evaluation
  • Conclusion

27
Evaluation Methodology
  • Traffic traces
  • OC-48 CAIDA data on Aug. 14th, 2002
  • Average packet rate: 191K/s
  • Average flow rate: 3.75K/s
  • A real scanning event collected from one class B
    honeynet on Jan 7th, 2007
  • Port 23
  • 2.5 hours
  • 1,607 unique sources
  • 1,700,236 scan sessions
  • Synthetic scanning traces

28
Simulation Results
  • Synthetic stealthy scan

[Figure: detection results for the synthetic stealthy scan. Axis labels: attack intensity; percentage of detection results; estimate ratio (the estimate ratio of scan outdegree)]
Annotation 1: false negative 0%; estimation error within 20%: 76.1%
Annotation 2: false negative 17.8%; estimation error within 20%: 33.9%
29
Simulation Results
  • Synthetic stealthy scan

[Figure: CDF of estimate ratio for spreader intensity estimation. x-axis: estimate ratio; y-axis: cumulative percentage (%). Marked values: 80, 35]
30
Simulation Results
  • Real stealthy scan

Estimation: 90; ground truth: 87
[Figure: histogram of the outdegree of scanners collected in the honeynet. x-axis: number of distinct destinations; y-axis: number of scanners]
31
Simulation Results
  • Real stealthy scan

Mix the 5-min data of a real scanning event with
5-min normal traffic of CAIDA data (distribution
over 30 such intervals)
[Figure: CDF of estimate ratios of scan outdegree estimation. x-axis: estimate ratio; y-axis: cumulative percentage (%). Marked value: 80]
32
Online Performance
  • Memory consumption
  • Our method: O(c log(m))
  • Constant memory: 241KB 24KB
  • Superspreader:
  • When k is small, the memory usage is closer to
    the size of the entire data stream N.
  • Memory access per packet
  • Single memory access per packet for each distinct
    counting structure
  • Speed up processing in parallel or in pipeline
  • Speed
  • 3.2GHz Pentium 4 computer
  • Recording: 200 seconds for each 5-min CAIDA data
    interval
  • Detection: less than 0.1 second
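
The speed figures can be put in perspective with a short calculation. It assumes the 191K/s average packet rate from the evaluation slide applies to every 5-minute interval, which the slides do not state explicitly; the constant names are illustrative.

```python
PACKET_RATE = 191_000        # average packets per second (from the evaluation slide)
INTERVAL_SECONDS = 5 * 60    # one 5-minute trace interval
RECORDING_SECONDS = 200      # reported recording time per interval

# Packets in one interval, assuming the average rate holds throughout.
packets_per_interval = PACKET_RATE * INTERVAL_SECONDS

# Packets the recording phase handles per second of processing time.
processing_rate = packets_per_interval / RECORDING_SECONDS

# How much faster than real time the recording phase runs.
speedup = INTERVAL_SECONDS / RECORDING_SECONDS
```

Under that assumption an interval holds about 57.3 million packets, the recording phase processes about 286,500 packets per second, and recording runs 1.5 times faster than real time on the reported hardware.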

33
Conclusion
  • Propose the stealthy spreader detection problem
  • Design an online outdegree-histogram-based
    stealthy spreader detection algorithm
  • Propose two randomized algorithms for the recording
    phase
  • Propose the linear-regression-based approach for
    stealthy spreader detection

34
  • ? ? ?