Automatically Inferring Patterns of Resource Consumption in Network Traffic

About This Presentation
Description:

Automatically Inferring Patterns of Resource Consumption in Network Traffic ... University of California, San Diego. Traffic Clusters - 2003. 2. Who is using my link? ... –



1
Automatically Inferring Patterns of Resource
Consumption in Network Traffic
  • Cristian Estan, Stefan Savage, George Varghese
  • University of California, San Diego

2
Who is using my link?
3
Looking at the traffic
Too much data for a human
Do something smarter!
4
Looking at traffic aggregates
Rank  Destination IP        Traffic (%)
1     jeff.dorm.bigU.edu    11.9
2     tracy.dorm.bigU.edu   3.12
3     risc.cs.bigU.edu      2.83
  • Aggregating on individual packet header fields
    gives useful results, but:
  • Traffic reports are not always at the right
    granularity (e.g. individual IP address, subnet,
    etc.)
  • They cannot show aggregates defined over multiple
    fields (e.g. which network uses which
    application)
  • The traffic analysis tool should automatically
    find aggregates over the right fields at the
    right granularity
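As a rough illustration of single-field aggregation, the sketch below ranks the values of one header field by their share of total bytes. The packet records, the `bytes` key and the function name `rank_by_field` are assumptions made for this example, not part of the presented system.

```python
from collections import Counter

def rank_by_field(packets, field):
    """Rank the values of one header field by their share of total bytes."""
    total = sum(p["bytes"] for p in packets)
    volume = Counter()
    for p in packets:
        volume[p[field]] += p["bytes"]
    # most_common() sorts by volume, largest first
    return [(value, 100.0 * b / total) for value, b in volume.most_common()]

# Hypothetical packet records; the byte counts are made up for illustration.
packets = [
    {"dst": "jeff.dorm.bigU.edu", "src_port": 80, "bytes": 600},
    {"dst": "tracy.dorm.bigU.edu", "src_port": 80, "bytes": 300},
    {"dst": "risc.cs.bigU.edu", "src_port": 22, "bytes": 100},
]
for rank, (value, share) in enumerate(rank_by_field(packets, "dst"), start=1):
    print(rank, value, share)
```

A report like this answers only one question per field, which is the limitation the bullets above describe: it cannot say which destination uses which port.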

Which network uses Web and which one Kazaa?
Rank  Source port  Traffic (%)
1     Web          42.1
2     Kazaa        6.7
3     SSH          6.3

Rank  Destination network  Traffic (%)
1     library.bigU.edu     27.5
2     cs.bigU.edu          18.1
3     dorm.bigU.edu        17.8
Where does the traffic come from?
What apps are used?
Most traffic goes to the dorms
5
Ideal traffic report
Traffic aggregate                                           Traffic (%)
Web traffic                                                 42.1
Web traffic to library.bigU.edu                             26.7
Web traffic from www.schwarzenegger.com                     13.4
ICMP traffic from sloppynet.badU.edu to jeff.dorm.bigU.edu  11.9
Web is the dominant application
This is a Denial of Service attack!
The library is a heavy user of web
That's a big flash crowd!
This paper is about giving the network
administrator insightful traffic reports
6
Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

7
Approach
  • Characterize traffic mix by describing all
    important traffic aggregates
  • Multidimensional aggregates (e.g. flash crowd
    described by protocol, port number and IP
    address)
  • Aggregates at the right level of granularity
    (e.g. computer, subnet, ISP)
  • Traffic analysis is automated: it finds
    insightful data without human guidance

8
Definition: traffic clusters
  • Traffic clusters are the multidimensional traffic
    aggregates identified by our reports
  • A cluster is defined by a range for each field
  • The ranges come from natural hierarchies (e.g. the
    IP prefix hierarchy), which yields meaningful
    aggregates
  • Example:
  • Traffic aggregate: incoming web traffic for the
    CS Dept.
  • Traffic cluster: (SrcIP = *, DestIP in
    132.239.64.0/21, Proto = TCP, SrcPort = 80,
    DestPort in [1024, 65535])
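One way to picture "a range for each field" is a small record type with a membership test. The sketch below is only an illustration: the class names, field names and the sample packets are assumptions, and it is not the data structure used by the authors' system.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int

@dataclass(frozen=True)
class TrafficCluster:
    # Each field holds a range; the defaults mean "any value".
    src_net: str = "0.0.0.0/0"
    dst_net: str = "0.0.0.0/0"
    proto: Optional[str] = None            # None matches any protocol
    src_ports: Tuple[int, int] = (0, 65535)
    dst_ports: Tuple[int, int] = (0, 65535)

    def matches(self, p: Packet) -> bool:
        """True if every field of the packet falls inside the cluster's range."""
        return (
            ipaddress.ip_address(p.src_ip) in ipaddress.ip_network(self.src_net)
            and ipaddress.ip_address(p.dst_ip) in ipaddress.ip_network(self.dst_net)
            and (self.proto is None or self.proto == p.proto)
            and self.src_ports[0] <= p.src_port <= self.src_ports[1]
            and self.dst_ports[0] <= p.dst_port <= self.dst_ports[1]
        )

# The example cluster from the slide: incoming web traffic for the CS Dept.
cs_web = TrafficCluster(dst_net="132.239.64.0/21", proto="TCP",
                        src_ports=(80, 80), dst_ports=(1024, 65535))
# Hypothetical packet used only to exercise the test.
print(cs_web.matches(Packet("192.0.2.7", "132.239.65.10", "TCP", 80, 40000)))
```

Widening any one range (a shorter prefix, a larger port range) gives a more general cluster, which is how the hierarchies on each field combine.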

9
Definition: traffic report
  • Traffic reports give the volume of chosen traffic
    clusters
  • To keep the report size manageable, describe only
    clusters above a threshold (e.g. H = total
    traffic/20)
  • To avoid redundant data, compress by omitting
    clusters whose traffic can be inferred (up to
    error H) from non-overlapping more specific
    clusters in the report
  • To highlight non-obvious aggregates, prioritize
    using an unexpectedness label
  • Example:
  • 50% of all traffic is web
  • Prefix B receives 20% of all traffic
  • The web traffic received by prefix B is 15%
    instead of 50% × 20% = 10%, so the unexpectedness
    label is 15/10 = 150%
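The arithmetic in this example can be written out as a short sketch. It assumes, as the example suggests, that the expected share of a multi-field cluster is the product of the shares of its single-field projections. The function names are hypothetical.

```python
def report_threshold(total_traffic, fraction=1 / 20):
    """Threshold H as a fixed fraction of total traffic (the slide's example: total/20)."""
    return total_traffic * fraction

def unexpectedness(actual_share, *marginal_shares):
    """Ratio of a cluster's actual share to the share expected if fields were independent."""
    expected = 1.0
    for share in marginal_shares:
        expected *= share
    return actual_share / expected

# Web is 50% of traffic, prefix B receives 20%, web traffic to prefix B is 15%.
label = unexpectedness(0.15, 0.50, 0.20)
print(f"{label:.0%}")
```

A label well above 100% marks a cluster that carries more traffic than its parts would suggest, and a label well below 100% marks one that carries less.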

10
Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

11
Algorithms and theory
  • Algorithms and theoretical bounds are in the paper
  • Unidimensional reports are easy to compute
  • Multidimensional reports are exponentially harder
    as we add more fields
  • Next few slides:
  • Example of unidimensional compression
  • Example of the structure of the multidimensional
    cluster space

12
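Unidimensional report example: hierarchy
Threshold = 100
[Figure: IP prefix hierarchy over the source addresses
10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5, 10.0.0.8,
10.0.0.9, 10.0.0.10 and 10.0.0.14, including the
intermediate nodes 10.0.0.12/30 and 10.0.0.14/31.
Traffic values shown in the figure: 40, 35, 15, 35,
30, 160, 110, 75]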
13
Unidimensional report example: compression
Source IP      Traffic
10.0.0.0/29    120
10.0.0.8/29    380
10.0.0.8       160
10.0.0.9       110
[Figure: the same hierarchy after compression.
10.0.0.8/29 stays in the report because
380 - 270 ≥ 100; the node with traffic 305 is omitted
because 305 - 270 < 100]
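The compression rule in this example can be sketched as a bottom-up pass over the prefix hierarchy: a prefix is reported only if its traffic exceeds what the already-reported, more specific prefixes explain by at least the threshold. This is a minimal illustration, not the authors' implementation. The function name, the /28 root and the per-address volumes for 10.0.0.2 through 10.0.0.5 are assumptions; the slides constrain only the sums 120, 305 and 380.

```python
import ipaddress
from collections import defaultdict

def compressed_report(leaf_traffic, threshold, root_prefix_len=28):
    """Return {prefix: traffic} for a thresholded, compressed 1-D report."""
    # Add each address's traffic to every enclosing prefix up to the root.
    traffic = defaultdict(int)
    for addr, volume in leaf_traffic.items():
        net = ipaddress.ip_network(addr)   # a bare address becomes a /32
        while True:
            traffic[net] += volume
            if net.prefixlen <= root_prefix_len:
                break
            net = net.supernet()

    reported = {}
    covered = defaultdict(int)   # traffic explained by reported descendants
    # Visit the most specific prefixes first so children precede parents.
    for net in sorted(traffic, key=lambda n: n.prefixlen, reverse=True):
        explained = covered[net]
        if traffic[net] - explained >= threshold:
            reported[net] = traffic[net]
            explained = traffic[net]
        if net.prefixlen > root_prefix_len:
            covered[net.supernet()] += explained
    return reported

# Assumed per-address volumes, chosen to match the sums shown on the slides.
leaves = {"10.0.0.2": 15, "10.0.0.3": 35, "10.0.0.4": 30, "10.0.0.5": 40,
          "10.0.0.8": 160, "10.0.0.9": 110, "10.0.0.10": 35, "10.0.0.14": 75}
for net, volume in sorted(compressed_report(leaves, 100).items()):
    print(net, volume)
```

With these inputs the sketch reports the same four prefixes as the table above; 10.0.0.8/31 (270) and 10.0.0.8/30 (305) are dropped because the two reported addresses already account for 270 of their traffic.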
14
Multidimensional structure example
Nodes (clusters) have multiple parents
Nodes (clusters) overlap
[Figure: overlapping clusters labeled US, CA and Web]
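To see why a cluster has multiple parents once more than one field is involved, the sketch below generalizes a two-field cluster one step along each field. The port hierarchy used here (exact port, then "low"/"high", then "*") is an assumption for illustration, as are the function names.

```python
import ipaddress

def ip_parent(prefix):
    """One step up the IP prefix hierarchy, or None at the root."""
    net = ipaddress.ip_network(prefix)
    return None if net.prefixlen == 0 else str(net.supernet())

def port_parent(port):
    """One step up an assumed port hierarchy: port -> low/high -> *."""
    if port == "*":
        return None
    if port in ("low", "high"):
        return "*"
    return "low" if int(port) < 1024 else "high"

def parents(cluster):
    """All clusters obtained by generalizing exactly one field by one step."""
    prefix, port = cluster
    result = []
    up_ip = ip_parent(prefix)
    if up_ip is not None:
        result.append((up_ip, port))
    up_port = port_parent(port)
    if up_port is not None:
        result.append((prefix, up_port))
    return result

print(parents(("10.0.0.8/31", 80)))
```

Because each field can be generalized independently, the cluster space is a lattice rather than a tree, and two clusters can overlap without either one containing the other.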
15
Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

16
System: AutoFocus
[Figure: AutoFocus system diagram with the components
packet header trace, traffic parser, cluster miner,
grapher and web-based GUI]
20
Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

21
Structure of regular traffic mix
  • Backups from CAIDA to tape server
  • Semi-regular time pattern
  • FTP from SLAC Stanford
  • Scripps web traffic
  • Web Squid servers
  • Large ssh traffic
  • Steady ICMP probing from CAIDA

[Figure label: SD-NAP]
22
Analysis of unusual events
  • UCSD to UCLA route change
  • Sapphire/SQL Slammer worm

[Figure label: Site 2]
24
Conclusions
  • Multidimensional traffic clusters using natural
    hierarchies describe traffic aggregates
  • Traffic reports using thresholding automatically
    identify conspicuous resource consumption at
    the right granularity
  • Compression produces compact traffic reports and
    unexpectedness labels highlight non-obvious
    aggregates
  • Our prototype system, AutoFocus, provides
    insights into the structure of regular traffic
    and unexpected events

25
Thank you!
  • An alpha version of AutoFocus is downloadable from
  • http://ial.ucsd.edu/AutoFocus/
  • Any questions?
  • Acknowledgements: NIST, NSF, Vern Paxson, David
    Moore, Liliana Estan, Jennifer Rexford, Alex
    Snoeren, Geoff Voelker

26
Bounds and running times
Report type                    Report size            Running time  Memory usage
Uncompressed 1-dim. report     1 + (d-1)T/H           O(nm(d-1))    O(m(d-1))
1-dim. report                  T/H                    linear        linear
1-dim. delta report            T1/H + T2/H            linear
Uncompressed multidim. report  (T/H) Π di             resultn       O(mresult)
Multidim. report               (T/H) Π di / max(di)
Multidim. delta report         eresult
27
Open questions
  • Are there tighter bounds for the size of the
    reports?
  • Are there algorithms that produce smaller
    results?
  • Are there algorithms that compute traffic reports
    more efficiently? In streaming fashion?

28
Delta reports
  • Why repeat the same traffic report if the traffic
    doesn't change from one day to the next?
  • Delta reports describe the clusters that
    increased or decreased by more than the threshold
    from one interval to the next
  • On related traffic mixes, delta reports are much
    smaller than traffic reports
  • Multidimensional compression is very hard for
    delta reports
  • We have only an exponential algorithm for the
    cluster delta
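The thresholding step of a delta report can be sketched as a comparison of per-cluster volumes from two intervals. This illustration covers thresholding only and makes no attempt at the compression step that the slide describes as hard. The function name and the sample volumes are hypothetical.

```python
def delta_report(before, after, threshold):
    """Clusters whose traffic changed by more than the threshold between two intervals."""
    changes = {}
    for cluster in set(before) | set(after):
        delta = after.get(cluster, 0) - before.get(cluster, 0)
        if abs(delta) > threshold:
            changes[cluster] = delta
    return changes

# Made-up volumes for two consecutive intervals.
day1 = {"10.0.0.0/29": 120, "10.0.0.8/29": 380, "10.0.0.8": 160}
day2 = {"10.0.0.0/29": 125, "10.0.0.8/29": 150, "10.0.0.9": 130}
print(delta_report(day1, day2, 100))
```

When two intervals carry a similar traffic mix, most clusters change by less than the threshold and drop out, which is why delta reports tend to be much shorter than full reports.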

29
Greedy compression algorithm
30
Multidimensional report example
Thresholding
Compression
31
System details
Part     Language          LoC   Status
Backend  C                 5400  stable
GUI      HTML, JavaScript  1000  functional
Glue     Perl              350   evolving