1
Worm Origin Identification Using Random Moonwalks
  • Yinglian Xie, V. Sekar, D. A. Maltz, M. K.
    Reiter, Hui Zhang
  • 2005 IEEE Symposium on Security and Privacy

Presented by Anup Goyal and Edward Merchant
2
Outline
  • Motivation/Introduction
  • Problem Formulation
  • The Random Moonwalk Algorithm
  • Evaluation Methodology
  • Analytical Model
  • Real Trace Study
  • Simulation Study
  • Deployment and Future Work

4
Motivation
  • There is little automated support for identifying
    the location from which an attack is launched.
  • Knowledge of the origin supports law enforcement.
  • Knowledge of the causal flows that advance an
    attack supports diagnosis of how network defenses
    were breached.

5
Introduction
  • We craft an algorithm that determines the origin
    of epidemic-spreading attacks.
  • Identify the "patient zero" of the epidemic
  • Reconstruct the sequence of spreading

6
Introduction (cont'd)
  • The random moonwalk algorithm finds the origin
    and propagation paths of a worm attack.
  • It performs post-mortem analysis on the traffic
    records logged by the network.
  • It depends on the assumption that worm
    propagation occurs in a tree-like structure.

8
Problem Formulation
9
Problem Formulation (cont'd)
  • A directed host contact graph G = (V, E)
  • V = H × T
  • H is the set of all hosts in the network
  • T is time
  • Each directed edge represents a network flow
    between two end hosts at a certain time.
  • A flow has a finite duration and involves the
    transfer of one or more packets.
  • e = (u, v, ts, te), where u is the source host, v
    is the destination host, and ts and te are the
    start and end times of the flow
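
As a minimal sketch, the edge tuple (u, v, ts, te) can be held in a small record type. The class name Flow, the helper incoming, and the sample records are hypothetical and only illustrate the notation above.

```python
from typing import NamedTuple

class Flow(NamedTuple):
    """One directed edge of the host contact graph."""
    u: str      # host that initiates the flow (source)
    v: str      # host that receives the flow (destination)
    t_s: float  # start time of the flow
    t_e: float  # end time of the flow

# Hypothetical flow records, for illustration only.
flows = [
    Flow("A", "B", 0.0, 1.0),
    Flow("B", "C", 2.0, 3.0),
    Flow("D", "C", 2.5, 2.8),
]

def incoming(flows, host):
    """Flows whose destination is `host`, ordered by end time."""
    return sorted((f for f in flows if f.v == host), key=lambda f: f.t_e)

print(incoming(flows, "C"))
```

Grouping flows by destination host is the lookup a backward walk needs, since each step asks which flows arrived at a given host earlier in time.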

10
Problem Formulation (cont'd)
  • Normal edge
  • The flow does not carry an infectious payload.
  • Attack edge
  • The flow carries attack traffic, whether or not
    the flow is successful.
  • Causal edge
  • The flow that actually infects its destination.
  • Goal: identify a set of edges from the top level
    of the causal tree.

12
Random Moonwalk Algo.
  • Infers causal relationships between flows by
    exploiting the global structure of worm attacks
  • No use of attack content, attack packet size, or
    port numbers
  • For an attack to progress, there has to be
    communication between the source of the attack
    and the compromised nodes
  • These infection-causing communication flows form
    a causal tree, rooted at the source of the attack.
  • Find the tree; its root is the source of the attack
  • Find causal flows and attack flows

13
Random Moonwalk Algo.
  • Basic Algorithm
  • Walk backward in time from a randomly chosen flow
    for up to a certain distance.
  • At each node, choose only among the flows that
    arrived within a certain time limit.
  • Repeat this for W walks.
  • Find the Z edges with the highest frequency.
  • Create a tree from these flows.
  • Most probably this is the causal tree, and its
    root is the source of the attack.

14
Random Moonwalk Algo. (cont'd)
  • Sampling process controlled by three parameters
  • W: the number of walks (samples) performed.
  • d: maximum length of the path traversed.
  • Δt: sampling window size, the maximum time allowed
    between two consecutive edges
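
The sampling process can be sketched as follows. This is an illustrative reading of the slides, not the authors' implementation: flows are assumed to be (u, v, ts, te) tuples, a predecessor is assumed to be any flow that arrived at the current flow's source no more than the window dt before it started, and the function name random_moonwalk, the seed argument, and the sample trace are hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_moonwalk(flows, W, d, dt, seed=0):
    """Count how often each flow is traversed by W backward random walks."""
    rng = random.Random(seed)
    arrivals = defaultdict(list)  # destination host -> flows that arrive there
    for f in flows:
        arrivals[f[1]].append(f)
    counts = Counter()
    for _walk in range(W):
        edge = rng.choice(flows)  # each walk starts at a randomly chosen flow
        for _hop in range(d):     # a walk traverses at most d edges
            counts[edge] += 1
            u, _v, t_s, _t_e = edge
            # Assumed predecessor rule: flows that reached u within dt
            # before the current flow started.
            preds = [p for p in arrivals[u] if p[3] < t_s and t_s - p[3] <= dt]
            if not preds:
                break             # nothing earlier to step back to
            edge = rng.choice(preds)
    return counts

# Hypothetical trace: a worm chain A -> B -> C -> D plus two unrelated flows.
flows = [
    ("A", "B", 0, 1), ("B", "C", 2, 3), ("C", "D", 4, 5),
    ("E", "F", 1, 2), ("G", "H", 3, 4),
]
counts = random_moonwalk(flows, W=1000, d=10, dt=2)
top = counts.most_common(2)  # the two most frequently traversed flows
print(top)
```

In this toy trace every walk that starts on the worm chain ends at the flow A -> B, so the earliest flows of the chain collect the highest counts, while the unrelated flows are only counted when a walk happens to start on them.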

15
Random Moonwalk Algo. (cont'd)
  • Why does this algorithm work?
  • To propagate, some time after infection a worm
    creates new flows to other hosts.
  • This forms a chain of flows from the source to
    the last victim.
  • Traverse this chain backward to find the source.
  • An infected host generally originates more flows
    than it receives.
  • The originators of normal flows in the host
    contact graph are mostly clients, so most normal
    edges have no predecessor within Δt.
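
The predecessor argument can be checked on a toy trace. The sketch below assumes flows are (u, v, ts, te) tuples and that a predecessor is a flow that arrived at the current flow's source within the window dt before it started; the helper name predecessors and the trace are hypothetical.

```python
def predecessors(flows, edge, dt):
    """Flows that arrived at the source of `edge` within dt before it started."""
    u, _v, t_s, _t_e = edge
    return [p for p in flows if p[1] == u and p[3] < t_s and t_s - p[3] <= dt]

# Hypothetical trace: two clients contact a server; a worm spreads A -> B -> C -> D.
flows = [
    ("c1", "s", 0, 1), ("c2", "s", 1, 2),                  # normal client flows
    ("A", "B", 0, 1), ("B", "C", 2, 3), ("C", "D", 4, 5),  # worm flows
]
# Flows at which a backward walk would stop immediately.
dead_ends = [f for f in flows if not predecessors(flows, f, dt=2)]
print(dead_ends)
```

Here the client flows and the very first worm flow have no predecessor, so a backward walk stops on them at once, whereas the later worm flows always lead one step closer to the origin.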

17
Outline
  • Evaluation Methodology
  • Analytical Model
  • Assumptions
  • Edge Probability Distribution
  • False Positives and False Negatives
  • Parameter Selection
  • Real Trace Study
  • Simulation Study

18
Analytical Model (Assumptions)
  • The host contact graph is known.
  • |E| edges and |H| hosts
  • Discretize time into units. Every flow has a
    length of one unit and fits into one unit.

19
Analytical Model (Probability)
20
Analytical Model (FP & FN)
(42 malicious edges at k = 1.)
(Total of 10^5 hosts.)
21
Outline
  • Evaluation Methodology
  • Analytical Model
  • Real Trace Study
  • Detect the Existence of an Attack
  • Identify Causal Edges & Initial Infected Hosts
  • Reconstruct the Top Level Causal Tree
  • Parameter Selection
  • Performance
  • Simulation Study

22
Real Trace Study
  • Background Traffic
  • The traffic trace was collected over a 4-hour
    period at the backbone of a class-B university
    network.
  • Only intra-campus flows were collected (1.4
    million), involving 8040 hosts.
  • Added Attack Traffic
  • Flow records were added to represent worm-like
    traffic with varying scanning rates.
  • The vulnerable hosts were selected randomly.
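
One way to picture the added worm-like records is a small generator like the one below. It is only a sketch under stated assumptions, not the procedure used in the study: every infected host sends one scan per scan_interval to a uniformly random host, each flow lasts one time unit, and a vulnerable host becomes infected when the first attack flow reaching it ends. The function name inject_worm and all parameter values are hypothetical.

```python
import random

def inject_worm(hosts, n_vulnerable, scan_interval, duration, seed=0):
    """Generate synthetic attack flows; return (origin, attack flows, causal flows)."""
    rng = random.Random(seed)
    vulnerable = set(rng.sample(hosts, n_vulnerable))  # randomly chosen vulnerable hosts
    origin = rng.choice(sorted(vulnerable))            # the initially infected host
    infected_at = {origin: 0.0}                        # host -> time of infection
    attack, causal = [], []
    t = 0.0
    while t < duration:
        # Hosts already infected at time t each send one scan.
        for u in [h for h, ti in infected_at.items() if ti <= t]:
            v = rng.choice(hosts)
            if v == u:
                continue
            flow = (u, v, t, t + 1.0)  # assumed flow duration of one time unit
            attack.append(flow)
            if v in vulnerable and v not in infected_at:
                infected_at[v] = t + 1.0
                causal.append(flow)    # this flow actually infected v
        t += scan_interval             # a larger interval means a lower scanning rate
    return origin, attack, causal

hosts = [f"h{i}" for i in range(50)]
origin, attack, causal = inject_worm(hosts, n_vulnerable=10,
                                     scan_interval=10.0, duration=600.0)
print(origin, len(attack), len(causal))
```

The returned attack flows would then be merged with the background trace, and the causal list serves as ground truth when scoring the edges an algorithm returns.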

23
Real Trace Study (Existence)
24
Real Trace Study (Identify)
(800 causal edges from 1.5 × 10^6 flows) (The
scanning rate of Trace-50 is lower than that of Trace-10.)
25
Real Trace Study (Identify)
  • Most frequently sampled edges vs. actual initial edges

(Total of 800 causal edges; the initial 10% are the
first 80 edges.) (The scanning rate of Trace-50 is
lower than that of Trace-10.)
26
[Figure: top 60 edges, Trace-50, 10^4 walks; labels: Original Attacker, Blaster Worm scan]
27
Real Trace Study (Parameter)
  • d and Δt

28
Real Trace Study (Performance)
  • Random moonwalk
  • Z = 100, 10^4 walks
  • Heavy-hitter
  • Find the 800 hosts with the largest number of
    flows in the trace, then randomly pick 100 flows
  • Super-spreader
  • Find the 800 hosts that contacted the largest
    number of destinations, then randomly pick 100 flows
  • Oracle
  • With a zero false positive rate, randomly select
    100 flows between infected hosts
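
The two baseline selections can be sketched as below. The slides do not say whether "number of flows" counts flows a host originates or all flows it takes part in, nor exactly which flows are sampled afterwards; this sketch assumes originated flows and samples among flows whose source is a selected host. All function names and the toy trace are hypothetical.

```python
import random
from collections import Counter, defaultdict

def heavy_hitters(flows, n_hosts):
    """The n_hosts hosts that originate the most flows (assumed reading)."""
    per_host = Counter(f[0] for f in flows)
    return {h for h, _count in per_host.most_common(n_hosts)}

def super_spreaders(flows, n_hosts):
    """The n_hosts hosts that contact the most distinct destinations."""
    dests = defaultdict(set)
    for u, v, _t_s, _t_e in flows:
        dests[u].add(v)
    ranked = sorted(dests, key=lambda h: len(dests[h]), reverse=True)
    return set(ranked[:n_hosts])

def pick_flows(flows, suspects, z, seed=0):
    """Randomly pick up to z flows whose source is a suspect host."""
    rng = random.Random(seed)
    pool = [f for f in flows if f[0] in suspects]
    return rng.sample(pool, min(z, len(pool)))

# Toy trace in (u, v, ts, te) form.
flows = [
    ("a", "x", 0, 1), ("a", "x", 1, 2), ("a", "x", 2, 3),  # many flows, one destination
    ("b", "x", 0, 1), ("b", "y", 1, 2),                    # fewer flows, two destinations
]
print(heavy_hitters(flows, 1), super_spreaders(flows, 1))
```

The toy trace shows how the two heuristics can disagree: host a sends the most flows, while host b reaches the most distinct destinations.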

29
Real Trace Study (Performance)
30
Real Trace Study (Performance)
  • Scanning Method
  • Smart worm (always scans valid hosts), R?
  • Scanning random addresses

(Legend: C = causal edge, A = attack edge; 100 means Z = 100, 500 means Z = 500)
31
Outline
  • Evaluation Methodology
  • Analytical Model
  • Real Trace Study
  • Simulation Study

32
Simulation Study
  • Simulate different background traffic
  • Realistic host contact graphs tend to be much
    sparser, meaning the chance of communication
    between two arbitrary hosts is very low.

P.S. In the campus network, the accuracy is about 0.7.
34
Deployment and Future Work
  • This approach assumes the availability of
    complete data.
  • Future work: the impact of missing data on performance
  • Future work: the deployment of the algorithm

35
Questions?
  • Thank You