Automatically Inferring Patterns of Resource Consumption in Network Traffic

Description:

Automatically Inferring Patterns of Resource Consumption in Network Traffic ... University of California, San Diego. Traffic Clusters - 2003. 2. Who is using my link? ... –

1
Automatically Inferring Patterns of Resource
Consumption in Network Traffic

Cristian Estan, Stefan Savage, George Varghese
University of California, San Diego

2
Who is using my link?
3
Looking at the traffic
Too much data for a human
Do something smarter!
4
Looking at traffic aggregates
Rank  Destination IP        Traffic
1     jeff.dorm.bigU.edu    11.9%
2     tracy.dorm.bigU.edu   3.12%
3     risc.cs.bigU.edu      2.83%

Aggregating on individual packet header fields
gives useful results, but:
Traffic reports are not always at the right
granularity (e.g. individual IP address, subnet,
etc.)
They cannot show aggregates defined over multiple
fields (e.g. which network uses which
application)
The traffic analysis tool should automatically
find aggregates over the right fields at the
right granularity

Which network uses web and which one Kazaa?
Rank  Source port  Traffic
1     Web          42.1%
2     Kazaa        6.7%
3     Ssh          6.3%
Rank  Destination network  Traffic
1     library.bigU.edu     27.5%
2     cs.bigU.edu          18.1%
3     dorm.bigU.edu        17.8%
Where does the traffic come from?
What apps are used?
Most traffic goes to the dorms
5
Ideal traffic report
Traffic aggregate                                            Traffic
Web traffic                                                  42.1%
Web traffic to library.bigU.edu                              26.7%
Web traffic from www.schwarzenegger.com                      13.4%
ICMP traffic from sloppynet.badU.edu to jeff.dorm.bigU.edu   11.9%
Web is the dominant application
This is a Denial of Service attack!!
The library is a heavy user of web
That's a big flash crowd!
This paper is about giving the network
administrator insightful traffic reports
6
Contributions of this paper

Approach
Definitions
Algorithms
System
Experience

7
Approach

Characterize the traffic mix by describing all
important traffic aggregates
Multidimensional aggregates (e.g. a flash crowd
described by protocol, port number and IP
address)
Aggregates at the right level of granularity
(e.g. computer, subnet, ISP)
Traffic analysis is automated: it finds insightful
data without human guidance

8
Definition: traffic clusters

Traffic clusters are the multidimensional traffic
aggregates identified by our reports
A cluster is defined by a range for each field
The ranges come from natural hierarchies (e.g. the IP
prefix hierarchy), so clusters are meaningful aggregates
Example
Traffic aggregate: incoming web traffic for the CS
Dept.
Traffic cluster: ( SrcIP=*, DestIP in
132.239.64.0/21, Proto=TCP, SrcPort=80, DestPort
in [1024,65535] )
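
A cluster of this kind can be sketched as one range per header field plus a membership test. The snippet below is only an illustration of the definition, not code from AutoFocus; the class name `TrafficCluster`, its field names, and the `matches` method are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

ANY_NET = ipaddress.ip_network("0.0.0.0/0")   # wildcard for an IP field
ANY_PORT = range(0, 65536)                     # wildcard for a port field


@dataclass(frozen=True)
class TrafficCluster:
    """Hypothetical container: one range per packet header field."""
    src_net: ipaddress.IPv4Network
    dst_net: ipaddress.IPv4Network
    proto: Optional[str]          # None stands for "any protocol"
    src_ports: range
    dst_ports: range

    def matches(self, src_ip, dst_ip, proto, src_port, dst_port):
        """Return True if a packet header falls inside every range."""
        return (
            ipaddress.ip_address(src_ip) in self.src_net
            and ipaddress.ip_address(dst_ip) in self.dst_net
            and (self.proto is None or self.proto == proto)
            and src_port in self.src_ports
            and dst_port in self.dst_ports
        )


# The example cluster from the slide: incoming web traffic for the CS Dept.
cs_web_in = TrafficCluster(
    src_net=ANY_NET,
    dst_net=ipaddress.ip_network("132.239.64.0/21"),
    proto="TCP",
    src_ports=range(80, 81),
    dst_ports=range(1024, 65536),
)
```

A call such as `cs_web_in.matches("1.2.3.4", "132.239.65.10", "TCP", 80, 2000)` tests a single packet header against the cluster.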

9
Definition: traffic report

Traffic reports give the volume of chosen traffic
clusters
To keep report size manageable, describe only
clusters above a threshold (e.g. H = total
traffic/20)
To avoid redundant data, compress by omitting
clusters whose traffic can be inferred (up to
error H) from non-overlapping more specific
clusters in the report
To highlight non-obvious aggregates, prioritize by
using an unexpectedness label
Example
50% of all traffic is web
Prefix B receives 20% of all traffic
The web traffic received by prefix B is 15%
instead of 50% * 20% = 10%, so the unexpectedness label is
15%/10% = 150%
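
The arithmetic of the example can be written out as a short sketch. It assumes, as the example does, that the expected share of a multidimensional cluster is the product of the shares of its single-field aggregates; the function name `unexpectedness` is hypothetical and the paper may compute the label in a more refined way.

```python
def unexpectedness(cluster_share, field_shares):
    """Ratio of the observed traffic share to the share expected
    if the fields were independent (assumption of this sketch)."""
    expected = 1.0
    for share in field_shares:
        expected *= share
    return cluster_share / expected


# Web is 50% of traffic, prefix B receives 20%, web to prefix B is 15%.
label = unexpectedness(0.15, [0.50, 0.20])
print(f"{label:.0%}")
```

A label above 100% means the cluster carries more traffic than its single-field aggregates would suggest; a label below 100% means less.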

10
Contributions of this paper

Approach
Definitions
Algorithms
System
Experience

11
Algorithms and theory

Algorithms and theoretical bounds in the paper
Unidimensional reports are easy to compute
Multidimensional reports are exponentially harder
as we add more fields
Next few slides:
Example of unidimensional compression
Example for the structure of the multidimensional
cluster space

12
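Unidimensional report example
Hierarchy
Threshold = 100
[Figure: IP prefix tree with the traffic of each node. Leaves: 10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5, 10.0.0.8, 10.0.0.9, 10.0.0.10, 10.0.0.14. Labeled internal nodes: 10.0.0.12/30, 10.0.0.14/31. Traffic values shown: 40, 35, 15, 35, 30, 160, 110, 75.]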
13
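Unidimensional report example
Compression
Source IP      Traffic
10.0.0.0/29    120
10.0.0.8/29    380
10.0.0.8       160
10.0.0.9       110
[Figure: prefix tree annotations. 10.0.0.0/29 (120) and 10.0.0.8/29 (380) stay in the report together with 10.0.0.8 (160) and 10.0.0.9 (110), because 380 - 270 >= 100, where 270 = 160 + 110. A node with traffic 305 is omitted because 305 - 270 < 100.]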
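
The compression rule in this example can be sketched as a bottom-up pass over the prefix hierarchy: a prefix is reported only if its traffic exceeds, by at least the threshold, the traffic already explained by reported, more specific prefixes. This is a minimal illustration under stated assumptions, not the algorithm from the paper; the function name `compressed_report` is hypothetical, and the assignment of traffic values to the leaves without a confirmed value is an assumption chosen to match the totals on the slides.

```python
import ipaddress
from collections import defaultdict


def compressed_report(leaf_traffic, threshold):
    """Return {prefix: traffic} for a compressed unidimensional report."""
    # Total traffic of every prefix that contains at least one leaf.
    totals = defaultdict(int)
    for addr, traffic in leaf_traffic.items():
        for plen in range(33):
            net = ipaddress.ip_network(f"{addr}/{plen}", strict=False)
            totals[net] += traffic

    explained = defaultdict(int)  # traffic covered by reported descendants
    report = {}
    # Longest prefixes first, so children are handled before parents.
    for net in sorted(totals, key=lambda n: n.prefixlen, reverse=True):
        total = totals[net]
        if total - explained[net] >= threshold:
            report[net] = total
            covered = total
        else:
            covered = explained[net]
        if net.prefixlen > 0:
            explained[net.supernet()] += covered
    return report


# Leaf values: only 10.0.0.8 and 10.0.0.9 are confirmed by the slides;
# the rest are an assumed assignment consistent with the /29 totals.
LEAVES = {
    "10.0.0.2": 15, "10.0.0.3": 35, "10.0.0.4": 30, "10.0.0.5": 40,
    "10.0.0.8": 160, "10.0.0.9": 110, "10.0.0.10": 35, "10.0.0.14": 75,
}

for prefix, traffic in compressed_report(LEAVES, 100).items():
    print(prefix, traffic)
```

Under these assumed leaf values the sketch reports the same four entries as the table above; host addresses are printed as /32 prefixes.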
14
Multidimensional structure ex.
Nodes (clusters) have multiple parents
Nodes (clusters) overlap
US
CA
Web
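
The "multiple parents" property can be illustrated with a small sketch. It assumes a two-field cluster made of a source prefix and a destination prefix (the slide itself uses US, CA and Web as labels); generalizing either field by one step yields a different parent. The function name `parents` is hypothetical.

```python
import ipaddress


def parents(src_prefix, dst_prefix):
    """List the parents of a two-field cluster: one per field that
    can still be generalized by one step in its prefix hierarchy."""
    src = ipaddress.ip_network(src_prefix)
    dst = ipaddress.ip_network(dst_prefix)
    result = []
    if src.prefixlen > 0:
        result.append((str(src.supernet()), str(dst)))
    if dst.prefixlen > 0:
        result.append((str(src), str(dst.supernet())))
    return result


print(parents("10.0.0.0/8", "192.168.1.0/24"))
```

With d fields a cluster can have up to d parents, so the cluster space is a lattice rather than a tree, and two clusters can overlap without one containing the other.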
15
Contributions of this paper

Approach
Definitions
Algorithms
System
Experience

16
System: AutoFocus
[Figure: system components: packet header trace, traffic parser, cluster miner, grapher, web based GUI]
20
Contributions of this paper

Approach
Definitions
Algorithms
System
Experience

21
Structure of regular traffic mix

Backups from CAIDA to tape server
Semi-regular time pattern
FTP from SLAC Stanford
Scripps web traffic
Web Squid servers
Large ssh traffic
Steady ICMP probing from CAIDA

[Figure label: SD-NAP]
22
Analysis of unusual events

UCSD to UCLA route change
Sapphire/SQL Slammer worm

Site 2
23
Conclusions
[Figure: a screen full of raw binary digits]
24
Conclusions

Multidimensional traffic clusters using natural
hierarchies describe traffic aggregates
Traffic reports using thresholding automatically
identify conspicuous resource consumption at
the right granularity
Compression produces compact traffic reports, and
unexpectedness labels highlight non-obvious
aggregates
Our prototype system, AutoFocus, provides
insights into the structure of regular traffic
and unexpected events

25
Thank you!

Alpha version of AutoFocus downloadable from
http://ial.ucsd.edu/AutoFocus/
Any questions?
Acknowledgements: NIST, NSF, Vern Paxson, David
Moore, Liliana Estan, Jennifer Rexford, Alex
Snoeren, Geoff Voelker

26
Bounds and running times
Report            Report size        Running time   Memory usage
unc. 1dim. rep.   1(d-1)T/H          O(nm(d-1))     O(m(d-1))
1dim. report      T/H                linear         linear
1dim. ? report    T1/HT2/H           linear
unc. dim. rep.    T/H ?di            resultn        O(mresult)
dim. rep.         T/H ?di/max(di)
dim. ? report                        eresult
27
Open questions

Are there tighter bounds for the size of the
reports?
Are there algorithms that produce smaller
results?
Are there algorithms that compute traffic reports
more efficiently? In streaming fashion?

28
Delta reports

Why repeat the same traffic report if the traffic
doesn't change from one day to the next?
Delta reports describe the clusters that
increased or decreased by more than the threshold
from one interval to the next
On related traffic mixes, delta reports are much
smaller than traffic reports
Multidimensional compression is very hard for delta
reports
We have only an exponential algorithm for the
cluster delta
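
The basic idea of a delta report, before any compression, can be sketched as a comparison of per-cluster traffic in two intervals. This is an uncompressed illustration only; the function name `delta_report`, the cluster labels, and the traffic numbers are made up for the example.

```python
def delta_report(before, after, threshold):
    """Return {cluster: change} for clusters whose traffic increased
    or decreased by more than the threshold between two intervals."""
    clusters = set(before) | set(after)
    changes = {c: after.get(c, 0) - before.get(c, 0) for c in clusters}
    return {c: d for c, d in changes.items() if abs(d) > threshold}


# Hypothetical per-cluster traffic for two consecutive intervals.
monday = {"web": 400, "ssh": 60, "kazaa": 70}
tuesday = {"web": 410, "ssh": 200, "icmp": 150}

print(delta_report(monday, tuesday, 100))
```

Clusters missing from one interval are treated as carrying zero traffic there, so a cluster that newly appears or disappears is reported when its volume exceeds the threshold.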

29
Greedy compression algorithm
30
Multidimensional report example
Thresholding
Compression
31
System details
Part     Language           LoC    Status
Backend  C                  5400   stable
GUI      HTML, JavaScript   1000   functional
Glue     Perl               350    evolving
