Title: Detectability of Traffic Anomalies in Two Adjacent Networks
2Detectability of Traffic Anomalies in Two Adjacent Networks
Augustin Soule, Haakon Ringberg, Fernando Silveira, Jennifer Rexford, Christophe Diot
3Anomaly detection in large networks
- Anomaly detection is complex for large networks
- Network-wide analysis (Lakhina 04) is promising
- Validated against multiple networks at different times
- Abilene 03, Geant 04, Sprint Europe 03
- The features impacting anomaly detection are not yet known
- Compare the anomalies observed between two networks
4Using entropy for anomaly detection
- Hypothesis: the distribution changes during an anomaly
- Entropy is a measure of the dispersion of the distribution
- Minimum if the distribution is concentrated
- Maximum if the distribution is spread
- Four features
- Source IP distribution
- Destination IP distribution
- Source Port distribution
- Destination port distribution
[Figure panels: Normal; During a DoS attack]
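The entropy measure described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `entropy` and the sample addresses are hypothetical, and the Shannon entropy in bits is assumed as the dispersion measure.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical destination IPs seen in one time bin (illustration only).
normal = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 25
dos = normal + ["10.0.0.9"] * 900  # one victim receives most of the packets

h_normal = entropy(normal)  # traffic spread evenly: entropy is high
h_dos = entropy(dos)        # traffic concentrated on one host: entropy is lower
```

The same function applies unchanged to the other three features (source IP, source port, destination port), since it only counts how often each value occurs.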
5Detecting anomalies
- Kalman filter method (Soule 05)
- Method overview
- Use a model to predict the traffic
- Innovation = prediction error
- A high threshold avoids false positives
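The predict / innovation / threshold loop above might look like the sketch below. It is a deliberately simplified stand-in, not the method of Soule 05: it assumes a scalar random-walk state model, and the noise variances `q` and `r`, the threshold of 6 standard deviations, and the synthetic series are all hypothetical choices made for illustration.

```python
def kalman_innovations(series, q=1e-3, r=1e-2):
    """Return (innovation, innovation variance) for each sample after the first.

    Assumed model (not from the slides): x[t] = x[t-1] + w, y[t] = x[t] + v,
    with process noise variance q and measurement noise variance r.
    """
    x, p = series[0], 1.0  # initial state estimate and its variance
    out = []
    for y in series[1:]:
        p_pred = p + q              # predict: state unchanged, uncertainty grows
        innovation = y - x          # prediction error
        s = p_pred + r              # variance of the innovation
        k = p_pred / s              # Kalman gain
        x = x + k * innovation      # correct the state estimate
        p = (1 - k) * p_pred        # correct the estimate variance
        out.append((innovation, s))
    return out

def detect(series, threshold=6.0, **kw):
    """Indices whose normalized innovation exceeds `threshold` standard deviations."""
    return [i + 1 for i, (e, s) in enumerate(kalman_innovations(series, **kw))
            if abs(e) / s ** 0.5 > threshold]

# Hypothetical entropy time series with a sharp drop at index 30.
series = [5.0 + 0.01 * ((i * 7) % 5 - 2) for i in range(60)]
series[30] -= 2.0
alarms = detect(series)
```

Raising `threshold` trades missed anomalies for fewer false alarms, which is the trade-off the last bullet refers to.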
6Collected dataset
- Abilene and Geant monitoring
- Collected three months of data
- BGP
- IS-IS
- NetFlow
- Isolate twenty consecutive days of complete measurement
- Connected through two peering links
Network   Sampling   Temporal aggregation   Anonymization
Abilene   1/100      5 min                  11 bits
Geant     1/1000     15 min                 0 bits
7Abilene and Geant
- Use routing information to isolate
- Traffic from Abilene to Geant
- Traffic from Geant to Abilene
- Detect anomalies inside each dataset using the same threshold parameter, but different data-reduction parameters
8Anomalies detected
- Compare the anomalies sent versus the anomalies observed
- Expected for G2A and A
- Surprising for G and A2G
- Amount of traffic?
- Sampling?
- Anonymization?
- Threshold?
- Method?
- Model?
[Figure labels: 58 anomalies; 78 anomalies; 14 anomalies; 10 anomalies]
9Undetected anomalies
- Examples of anomalies detected in one network but undetected in the other
- Impact of the sampling method
- Impact of the customers' traffic mix
- Impact of anonymization
10Example 1 attack over Port 22
Sampling affects the perception of an anomaly. The effect depends on the type of anomaly.
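A back-of-the-envelope sketch of why the effect depends on the type of anomaly: under the assumption that each packet is kept independently with the sampling probability (the slides do not state the exact sampling scheme), a flow of n packets is seen at all with probability 1 - (1 - p)^n. The flow sizes below are hypothetical; the two rates are the ones listed for Abilene and Geant.

```python
def p_seen(packets, rate):
    """Probability that at least one of `packets` packets is sampled,
    assuming each packet is kept independently with probability `rate`."""
    return 1.0 - (1.0 - rate) ** packets

ABILENE_RATE = 1 / 100    # sampling rates from the dataset slide
GEANT_RATE = 1 / 1000

# Hypothetical flow sizes: a tiny probe flow versus one large transfer.
small_flow, large_flow = 3, 50_000

for name, rate in (("Abilene", ABILENE_RATE), ("Geant", GEANT_RATE)):
    print(name, round(p_seen(small_flow, rate), 4), round(p_seen(large_flow, rate), 4))
```

Under this assumption an anomaly made of many tiny flows is thinned far more at 1/1000 than at 1/100, while a single large flow remains visible at both rates.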
11Example 2 Alpha Flow
- Large file transfer between two hosts
- Observed in Geant
- Undetectable in Abilene
- In Abilene the traffic is already concentrated by Web traffic
- The anomaly detectability is impacted by the traffic mix
[Figure: destination IP entropy; panels: Geant, Abilene]
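The traffic-mix effect can be illustrated with a toy calculation. All byte counts below are made up, and the helper names `entropy` and `entropy_shift` are hypothetical; the sketch only shows that the same alpha flow shifts the destination IP entropy far less when the baseline is already concentrated.

```python
import math

def entropy(weights):
    """Shannon entropy (bits) of a distribution given as non-negative weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def entropy_shift(baseline, alpha_bytes):
    """Entropy change when an alpha flow adds `alpha_bytes` to the last destination."""
    with_alpha = baseline[:-1] + [baseline[-1] + alpha_bytes]
    return entropy(with_alpha) - entropy(baseline)

# Made-up per-destination byte counts, for illustration only.
spread_mix = [10] * 100               # traffic spread evenly over destinations
web_heavy_mix = [5000] + [10] * 99    # one popular web server already dominates

shift_spread = entropy_shift(spread_mix, 1000)
shift_web = entropy_shift(web_heavy_mix, 1000)
```

In this toy setup the shift against the web-heavy baseline is several times smaller in magnitude, so a fixed detection threshold is less likely to be crossed.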
12Example 3 Scan over an IP subnet
- Attacker doing a subnet scan
- One source host
- Multiple destination hosts
- Concentration of source IP
- Dispersion of destination IP
- But we observe concentration in the destination IP entropy
- Anonymization can
- Help to detect anomalies
- Impact the anomaly identification
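One plausible reading of this observation, sketched under stated assumptions: if anonymization zeroes the 11 low-order bits of each address (the "11 bits" listed for Abilene; the exact scheme is an assumption here), every host of a scanned subnet maps to the same anonymized address, so the scan shows up as concentration rather than dispersion. The addresses and the helper `anonymize` are hypothetical.

```python
import ipaddress
import math
from collections import Counter

def anonymize(ip, bits=11):
    """Zero the `bits` low-order bits of an IPv4 address (assumed scheme)."""
    mask = ~((1 << bits) - 1) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(ip)) & mask))

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical background traffic: 64 destinations in distinct networks.
background = [f"10.{i}.0.1" for i in range(64)] * 4
# Hypothetical scan: one probe to each host of 192.0.2.0/24.
scan = [f"192.0.2.{i}" for i in range(1, 255)]

# Entropy change caused by the scan, without and with anonymization.
raw_shift = entropy(background + scan) - entropy(background)
anon = [anonymize(ip) for ip in background]
anon_shift = entropy(anon + [anonymize(ip) for ip in scan]) - entropy(anon)
```

With raw addresses the scan raises the destination IP entropy; after masking it lowers it, which changes how the anomaly would be classified even though it is still detected.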
13Summary
- First synchronized observation of two networks for anomaly detection
- Identification of various features impacting anomaly detection
- Sampling
- Traffic mix
- Anonymization
- Two anomalies are impacted differently by each feature
- What impacts detectability?
14Thanks for listening !
15backup
16Collecting data from two networks
- GEANT and Abilene connected through two peering links (Nov. 05)
- 20 consecutive days of traffic and routing data from GEANT and Abilene
- Using a Kalman filter on the entropy of the distribution of IP addresses and ports
- Entropy increase => spread of the distribution
- Entropy decrease => concentration in the distribution
17What impacts the detection of anomalies ?
- Objective: identify important elements that influence anomaly detection
- Understand the source of false negatives
- Idea: observe the same anomalies on different networks using the same method parameters
18Previous work
- Multiple methods detecting statistical traffic anomalies
- Anomaly a priori: no a priori
- Input data: NetFlow records
- Complexity: low complexity
- Model type: network-wide diagnosis
19Impact of the anonymization (contd)
20Power of Network-Wide Analysis
- Introduced by Lakhina 04 and Soule 05
- Learn and use natural correlation between links to model expected behavior
- Pros
- Accurately detects small or distributed events
- False positive rate typically 10
- This paper
- Understand why some anomalies are not detected
- Still to be done
- Automatic method calibration
21Power of Network-Wide Analysis
The attack found by the methods is hard to isolate manually
[Figure: traffic (# of bytes) over time, with the attacker marked; peak rate 300 Mbps, attack rate 19 Mbps/flow. Figures from Lakhina 05]
Learn and use the existing correlations between flows
22Example 3 Scan over an IP subnet
- Attacker doing a subnet scan
- One source host
- Multiple destination hosts
- Concentration of source IP
- Dispersion of destination IP
- But we observe concentration in the destination IP entropy
- Anonymization can
- Help to detect anomalies
- Impact the anomaly identification
[Figure panels: Source IP entropy; Destination IP entropy]
23Two neighbor observation
- Differences can be explained by
- Network Operation Center
- Sampling rate
- Anonymization
- Anomaly detection threshold
- Customers' traffic
- Traffic mix