Title: BGP-lens: Patterns and Anomalies in Internet Routing Updates
1. BGP-lens: Patterns and Anomalies in Internet Routing Updates
- B. Aditya Prakash (1), Nicholas Valler (2), David Andersen (1), Michalis Faloutsos (2), Christos Faloutsos (1)
- (1) Carnegie Mellon University
- (2) UC-Riverside
- KDD 2009, Paris
2. Introduction
- Border Gateway Protocol (BGP)
- Internet routing protocol
- Routers send messages to each other
- Keeps path information up-to-date
- Ideal setting: no BGP updates
- In reality: many updates
- link failures, router restarts, malicious behavior
Each row is an update:
Time                 peerAS   originAS  prefix
2005-02-17 12:39:42  ATT      SPRINT    204.29.119.0/24
2005-02-17 12:39:43  VERIZON  AOL       204.29.80.0/24
2005-02-17 12:39:46  WASH     ATLA      204.29.79.0/24
...                  ...      ...       ...
3. Introduction contd.
- Question: Can we find patterns/anomalies?
- Challenges
- Millions of updates sent over the network
- Data has multiple dimensions
- Noisy measurements
- Impossible for a human to sift through the updates
Automated tool needed!
4. The Data
Time                 peerAS   originAS  prefix
2005-02-17 12:39:42  ATT      SPRINT    204.29.119.0/24
2005-02-17 12:39:43  VERIZON  AOL       204.29.80.0/24
2005-02-17 12:39:46  WASH     ATLA      204.29.79.0/24
...                  ...      ...       ...
- Data from Datapository.net
- Abilene Network
18 million update messages over two years!
5. Our Approach
- Look at a simple time-series
- Focus on just the time
- Number of updates received every b seconds (bin size)
- Specific problem we are tackling
- Given such a time-series
- Report patterns and anomalies
- Also find suspicious entities (paths, ASes etc.)
[Figure: the Time column is extracted from the update table (Time, peerAS, originAS, prefix) and placed on a time axis divided into bins of b secs each; in the example, bins 0, 1 and 2 have counts 4, 2 and 6.]
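The binning step described on this slide can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `bin_updates` is hypothetical, bins are assumed to start at the earliest update, and the timestamps are taken to be HH:MM:SS times of day (an assumption about the slide's example rows).

```python
from collections import Counter
from datetime import datetime


def bin_updates(timestamps, bin_size_s):
    """Count updates per bin of `bin_size_s` seconds.

    Bin 0 starts at the earliest timestamp (an assumption of this sketch).
    Empty bins between updates are reported with a count of 0.
    """
    if not timestamps:
        return []
    times = sorted(timestamps)
    start = times[0]
    counts = Counter(
        int((t - start).total_seconds() // bin_size_s) for t in times
    )
    n_bins = max(counts) + 1
    return [counts.get(i, 0) for i in range(n_bins)]


# Four update times modeled on the slide's example rows.
stamps = [
    datetime(2005, 2, 17, 12, 39, 42),
    datetime(2005, 2, 17, 12, 39, 43),
    datetime(2005, 2, 17, 12, 39, 46),
    datetime(2005, 2, 17, 12, 40, 1),
]
series = bin_updates(stamps, bin_size_s=10)
print(series)
```

With b = 10 seconds, the first three updates fall into bin 0 and the fourth into bin 1. Changing `bin_size_s` changes the shape of the resulting series, which is why the choice of bin size matters for the analysis that follows.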
6. Real data: Washington Router
Bin size: 600s
[Figure: # of updates vs. bin number (time)]
Very bursty!
Traditional tools like FFT and auto-regression don't work.
7. Outline
- Introduction and Problem Statement
- Techniques
- Temporal Analysis
- Frequency Analysis
- BGP-lens at work
- Conclusions
8. Temporal Analysis
Bin size: 10s
- First cut: take a log-linear plot
- emphasizes small values over high values
9. But bin size is important!
10. Bin size: 600s
Clotheslines
11. Clotheslines
- Q1: Why clotheslines?
- Near-consecutive updates
- over a long time period
- Can be route flapping
- advertising/withdrawing the same path frequently
- important to identify
- Q2: How to automate this discovery?
12. Proposal: Marginals to the Rescue
- PDF of the volume of updates
- Number of time-bins with a given volume
Extremes: height of the clotheslines!
13. Marginals to the Rescue
- PDF of the volume of updates
- Number of time-bins with a given volume
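The marginal described here is just a histogram over volumes: for every update count, how many time-bins had exactly that count. A minimal sketch follows, using a made-up series and a hypothetical function name `marginal`. A clothesline at height 50 shows up as an unusually large number of bins sharing the volume 50.

```python
from collections import Counter


def marginal(series):
    """Map each update volume to the number of time-bins with that volume."""
    return dict(sorted(Counter(series).items()))


def marginal_pdf(series):
    """Same as `marginal`, normalized so the values sum to 1."""
    n = len(series)
    return {volume: bins / n for volume, bins in marginal(series).items()}


# Made-up binned series: mostly quiet, with repeated bins of volume 50.
series = [0, 1, 0, 50, 50, 2, 50, 0, 1, 50]
m = marginal(series)
print(m)
print(marginal_pdf(series))
```

In this toy series, four of the ten bins share the volume 50, so that volume stands out in the marginal even though nothing in the time order of the bins was used.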
14. Algorithm: Clotheslines
Details!
- For the marginals plot, use the median filtering approach to determine outliers
- For each time interval found, report the most consistent IPs/ASes etc.
High-level idea only; details in the paper!
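The slide gives only the high-level idea, so the following is a sketch under stated assumptions rather than the paper's procedure. It assumes the marginal is given as a dense list indexed by volume, uses a running median with a fixed window, and flags a volume when its bin count exceeds `factor` times the local median plus one. The window size, the factor, the function names and the entity labels are all hypothetical.

```python
from collections import Counter
from statistics import median


def median_filter_outliers(counts, window=5, factor=3.0):
    """Return volumes whose bin count sticks out above the running median.

    counts[v] is the number of time-bins with volume v. The threshold rule
    (count > factor * (local median + 1)) is an assumption of this sketch.
    """
    half = window // 2
    outliers = []
    for v, c in enumerate(counts):
        lo, hi = max(0, v - half), min(len(counts), v + half + 1)
        local = median(counts[lo:hi])
        if c > factor * (local + 1):
            outliers.append(v)
    return outliers


def most_consistent(bin_entities, flagged_bins, top=1):
    """Rank entities by how many of the flagged bins they appear in.

    bin_entities maps a bin index to the set of entities (e.g. prefixes
    or ASes) seen in that bin.
    """
    seen = Counter()
    for b in flagged_bins:
        seen.update(bin_entities.get(b, ()))
    return seen.most_common(top)


# Made-up marginal: smooth decay plus a spike of 30 bins at volume 14.
counts = [100, 60, 40, 25, 15, 10, 6, 4, 3, 2, 1, 1, 0, 0, 30, 0, 0]
print(median_filter_outliers(counts))

# Made-up entities for three bins that sit on the flagged clothesline.
bin_entities = {3: {"AS-A", "AS-B"}, 4: {"AS-A"}, 6: {"AS-A", "AS-C"}}
print(most_consistent(bin_entities, flagged_bins=[3, 4, 6]))
```

Comparing each value to a local median rather than to a global mean keeps the naturally large counts at low volumes from being flagged, while an isolated spike far out in the tail is still caught.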
15. Outline
- Introduction and Problem Statement
- Techniques
- Temporal Analysis
- Frequency Analysis
- BGP-lens at work
- Conclusions
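16. [Figure: a signal over time and its scalogram, with the frequency axis running from low freq. to high freq. and regions of high energy and low energy marked. The tornado does not touch down.]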
17. In real data
18. E2
20,000 updates!
19. Why Prolonged Spike?
- Bursts of short duration
- Can represent malicious behavior
- Or simple router restarts!
- Exact cause hard to find but important for system administrators
20. Algorithm: Prolonged Spikes
Details!
- Basic idea: find tornados from the scalogram
- Find a suitable starting point at higher levels
- Extend downward as much as possible
- The finest scale where the tornado stops gives the shortest time period to look for a prolonged spike
- Again, details in the paper!
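The steps above can be sketched as follows. Since the slide defers details to the paper, this is an illustration only: it assumes a Haar wavelet decomposition as the scalogram, a fixed magnitude threshold, and a greedy descent that follows the stronger child coefficient. All function names and the threshold value are hypothetical.

```python
import math


def haar_details(signal):
    """Haar detail coefficients per level, finest level first.

    len(signal) must be a power of two.
    """
    approx = list(signal)
    levels = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        levels.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return levels


def track_tornado(levels, start_level, threshold):
    """Follow a tornado from a coarse level down toward finer levels.

    Starts at the strongest coefficient of `start_level` and descends to
    the stronger child while its magnitude stays >= threshold.
    Returns (finest level reached, position), or None if nothing starts.
    """
    level = start_level
    row = levels[level]
    pos = max(range(len(row)), key=lambda i: abs(row[i]))
    if abs(row[pos]) < threshold:
        return None
    while level > 0:
        finer = levels[level - 1]
        best = max((2 * pos, 2 * pos + 1), key=lambda i: abs(finer[i]))
        if abs(finer[best]) < threshold:
            break  # the tornado does not reach this finer level
        level, pos = level - 1, best
    return level, pos


def span(level, pos):
    """First and last sample index covered by a coefficient."""
    width = 2 ** (level + 1)
    return pos * width, (pos + 1) * width - 1


# Made-up series: a sustained burst of 4 bins inside 16 bins.
signal = [0] * 8 + [8, 8, 8, 8] + [0] * 4
levels = haar_details(signal)
hit = track_tornado(levels, start_level=len(levels) - 1, threshold=4.0)
print(hit, span(*hit))
```

In this toy input the burst is flat, so the two finest levels carry no energy and the descent stops at level 2: the tornado does not touch down, and the reported window is 8 bins wide and contains the burst.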
21. Scalability
22. BGP-lens User Interface
Optional:
- number of suspicious events the sysadmin wants to check
- duration: length of events to be checked (think daily vs. weekly vs. monthly)
23. Outline
- Introduction and Problem Statement
- Techniques
- Temporal Analysis
- Frequency Analysis
- BGP-lens at work
- Conclusions
24. BGP-lens at Work
- We found real events too. Examples:
- Event 1
- 50-clothesline
- Prefix and origin AS pointed to Alabama Supercomputing Net
- When contacted, sysadmins attributed the changes to route flapping
- "the route for 207.157.115.0/24 was appearing and disappearing in the IGP routing table ... which may have caused BGP to flap."
- Anomaly went undetected and unresolved for 30 days!
25. Results from real data
- Event 2
- Prolonged spike
- May 12th, 2006: 8hr spike
- Most persistent IPs/ASes
- Primary and middle schools in a large district in a country
- Two more spikes: Jan 18-19, 2006 and Aug 1
26. Conclusions
- Studied huge real data (18 million updates)
- Developed two new techniques
- effective: spot subtle phenomena like clotheslines and prolonged spikes
- scalable
- BGP-lens: a user-friendly tool
- provides reasonable defaults
- provides easy-to-use knobs
- provides leads like IPs/ASes
27. Thank You!
- Any questions?
- www.cs.cmu.edu/badityap
- We thank NSF, USA for their support.
- Author-Reel!
28. Extra - Frequency Analysis
- Data is self-similar!
- we used the entropy-plot measure
- also called the b-model [26]
- Corresponds to a b-model of 75-25
- Multi-resolution techniques needed!
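To make the entropy-plot measure concrete, the sketch below generates a synthetic b-model series with a 75-25 split and computes the entropy of its volume distribution at each resolution. It is an illustration on generated data, not the slide's Abilene measurements, and the function names are hypothetical. For a b-model with bias b, the entropy grows linearly with the resolution level, with slope H(b) = -b log2(b) - (1-b) log2(1-b), which is about 0.811 bits for b = 0.75.

```python
import math


def b_model(total, bias, levels):
    """Deterministic b-model: each interval gives `bias` of its volume
    to its left half and the rest to its right half.

    (Randomizing which half gets the larger share would not change
    the entropy values.)
    """
    series = [float(total)]
    for _ in range(levels):
        series = [part for v in series for part in (v * bias, v * (1 - bias))]
    return series


def entropy_plot(series):
    """Entropy in bits at each resolution, coarsest (1 interval) first.

    len(series) must be a power of two.
    """
    total = sum(series)
    current = list(series)
    points = []
    while True:
        probs = [v / total for v in current if v > 0]
        points.append(-sum(p * math.log2(p) for p in probs))
        if len(current) == 1:
            break
        # Aggregate adjacent intervals to move to the next coarser level.
        current = [left + right
                   for left, right in zip(current[0::2], current[1::2])]
    return points[::-1]


series = b_model(total=1024, bias=0.75, levels=6)
ep = entropy_plot(series)
slopes = [ep[k + 1] - ep[k] for k in range(len(ep) - 1)]
print([round(s, 4) for s in slopes])
```

A constant slope across levels is the signature of self-similarity. A uniform series would give a slope of 1 bit per level, so a slope well below 1 indicates bursty traffic.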
29. Extra - FFT
30. Extra - Marginals for 10sec
31. Extra - Prolonged Spike Algorithm