Cristian Estan, George Varghese - PowerPoint PPT Presentation

About This Presentation
Title:

Cristian Estan, George Varghese

Description:

Traffic reports. 20% is Kazaa from Steve's PC. 50% is Kazaa from ... Specific open problem: computing traffic cluster reports in streaming fashion. June 8, 2003 ... – PowerPoint PPT presentation


Transcript and Presenter's Notes


1
Data Streaming in Computer Networking
  • Cristian Estan, George Varghese
  • University of California, San Diego

2
Talk structure
  • Traditional streaming in networking
  • Rules of the game
  • Iteration paradigm: packet scheduling example
  • New streaming problems
  • Detecting malicious traffic
  • Understanding network workloads

3
Internet service model
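[Figure: a packet made of a header and data crossing the Internet. Header fields shown: destination IP address, source IP address, destination port, source port. The figure also carries the label "Flow".]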
4
Traditional router functions
IP Lookup
[Figure: router with three incoming links (Incoming 1-3) and three outgoing links (Outgoing 1-3); an arriving packet is marked "?"]
5
Traditional router functions
IP Lookup
[Figure: the same router with three incoming and three outgoing links; the packet is now marked "Out2"]
6
Traditional router functions
Switching
[Figure: the same router; packets marked Out1, Out2 and Out3 are shown at the incoming links]
7
Traditional router functions
Scheduling
[Figure: the same router; Flow 1, Flow 2 and Flow 3 are shown between the incoming and outgoing links]
8
Traditional router functions
Scheduling
[Figure: the same router; Flow 1, Flow 3 and Flow 2 are shown in a different order]
9
Rules of the game
  • Wire speed processing
      • At 40 gigabits/s: 8 nanoseconds per packet, so
        fast SRAM is needed
      • Limited SRAM (say 32 megabits) but millions of
        flows
  • What does this mean for algorithms?
      • Low worst case complexity bounds
      • Low bounds on the amount of memory used
  • Differences from databases
      • One pass vs. multiple passes
      • Worst case vs. average case
      • Small constants vs. asymptotic complexity
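
The per-packet and per-flow budgets behind these rules can be checked with a few lines of arithmetic. This is only a sketch: the 40-byte packet size and the figure of one million flows are assumptions chosen to be consistent with the slide, not values stated on it.

```python
# Back-of-the-envelope budget for wire-speed processing.
LINK_RATE_BPS = 40e9      # 40 gigabits/s, from the slide
MIN_PACKET_BYTES = 40     # assumption: a minimum-size packet
SRAM_BITS = 32e6          # 32 megabits, from the slide
ACTIVE_FLOWS = 1e6        # assumption: "millions of flows" taken as one million


def time_per_packet_ns(link_rate_bps, packet_bytes):
    """Time available to process one packet, in nanoseconds."""
    return packet_bytes * 8 / link_rate_bps * 1e9


def sram_bits_per_flow(sram_bits, flows):
    """SRAM available per flow if every flow kept its own state."""
    return sram_bits / flows


print(time_per_packet_ns(LINK_RATE_BPS, MIN_PACKET_BYTES))  # about 8 ns
print(sram_bits_per_flow(SRAM_BITS, ACTIVE_FLOWS))          # about 32 bits
```

Under these assumptions a single 32-bit counter per flow would already use up the SRAM, which is why the memory bound matters as much as the time bound.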

10
Talk structure
  • Traditional streaming in networking
  • Rules of the game
  • Iteration paradigm: packet scheduling example
  • New streaming problems
  • Detecting malicious traffic
  • Understanding network workloads

11
Iteration paradigm
  • Many networking algorithms use iteration in time
  • A way to allow multi-pass algorithms without
    storing the input, by assuming inputs do not change
    quickly
  • Many examples (MULTOPS for DoS detection [Gil01],
    CSFQ for scheduling [Stoica98])
  • It would be nice to formalize the tradeoff between
    quality of results and drift rate of the input

12
Example: Core Stateless FQ
[Figure: packets labeled with rate R]
  • Mark rate R
  • If R > F, drop with probability 1 - F/R
  • Iteratively compute fair share F
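
The drop rule and the iterative fair-share computation might be sketched as follows. This is a simplified illustration, not the CSFQ algorithm as published: it assumes the flow rates are known exactly and do not change between rounds, and the function names and the update rule F <- F * C / accepted are choices made for the example.

```python
def drop_probability(rate, fair_share):
    """Probability of dropping a packet from a flow marked with `rate`."""
    if rate <= fair_share:
        return 0.0
    return 1.0 - fair_share / rate


def estimate_fair_share(rates, capacity, rounds=100):
    """Iterate on F until the accepted traffic matches the link capacity.

    Assumes `rates` is a non-empty list of positive numbers.
    """
    if sum(rates) <= capacity:
        return max(rates)  # link not congested: nothing has to be dropped
    fair_share = max(rates)
    for _ in range(rounds):
        # Traffic that would get through if every flow were capped at F.
        accepted = sum(min(r, fair_share) for r in rates)
        fair_share *= capacity / accepted
    return fair_share


rates = [1.0, 2.0, 10.0]   # hypothetical flow rates
F = estimate_fair_share(rates, capacity=6.0)
print(F, [drop_probability(r, F) for r in rates])
```

With these hypothetical rates the iteration settles near F = 3: the two small flows pass untouched and the large flow is dropped with probability close to 0.7. Each round plays the role of one pass over the input, which is the sense in which iteration in time replaces multiple passes.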
13
Talk structure
  • Traditional streaming in networking
  • Rules of the game
  • Iteration paradigm: packet scheduling example
  • New streaming problems
  • Detecting malicious traffic
  • Understanding network workloads

14
New streaming problems
  • Detecting malicious activity
      • Flooding (denial of service attacks)
      • Worms
      • Scans looking for vulnerable servers
  • Understanding workloads
      • Billing
      • Planning network growth
      • Application mix

15
Detecting malicious traffic
  • Well defined building blocks
      • Detecting large aggregates (similar to iceberg
        queries)
      • Counting active flows in an aggregate (similar to
        counting distinct values)
  • Many open problems, e.g. detecting worms and DoS
    attacks (not clear what the right formal problem
    statement is)
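
Both building blocks have well-known small-memory streaming sketches. The example below uses two textbook techniques swapped in for illustration, not methods taken from the talk: the Misra-Gries summary for large aggregates and a linear-counting bitmap for distinct flows. The keys, the counter budget k and the bitmap size are hypothetical.

```python
import hashlib
import math


def misra_gries(stream, k):
    """Keep at most k - 1 counters; any key that occurs more than
    len(stream) / k times is guaranteed to survive in the result."""
    counters = {}
    for key in stream:
        if key in counters:
            counters[key] += 1
        elif len(counters) < k - 1:
            counters[key] = 1
        else:
            # No free counter: decrement all and discard those reaching 0.
            for other in list(counters):
                counters[other] -= 1
                if counters[other] == 0:
                    del counters[other]
    return counters


def estimate_distinct(stream, bits=8192):
    """Linear counting: hash each key to one bit, then estimate the
    number of distinct keys from the fraction of bits still zero."""
    bitmap = [False] * bits
    for key in stream:
        digest = hashlib.sha1(key.encode()).digest()
        bitmap[int.from_bytes(digest[:8], "big") % bits] = True
    zeros = bitmap.count(False)
    if zeros == 0:
        return float("inf")  # bitmap saturated, estimate unusable
    return -bits * math.log(zeros / bits)


# Hypothetical packet stream keyed by destination.
packets = ["A"] * 50 + ["B"] * 30 + [f"x{i}" for i in range(20)]
print(sorted(misra_gries(packets, k=4)))
print(round(estimate_distinct([f"flow-{i}" for i in range(1000)] * 3)))
```

Both routines touch each packet once and use memory fixed in advance, matching the one-pass and bounded-memory rules stated earlier in the talk.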

16
Talk structure
  • Traditional streaming in networking
  • Rules of the game
  • Iteration paradigm: packet scheduling example
  • New streaming problems
  • Detecting malicious traffic
  • Understanding network workloads

17
Informal problem definition
Analysis
Traffic reports
Applications: 50% of traffic is Kazaa
Sources: 20% of traffic comes from Steve's PC
Terabytes of measurement data
18
Informal problem definition
Analysis
Traffic reports
20% is Kazaa from Steve's PC
50% is Kazaa from the dorms
Terabytes of measurement data
19
Formal problem definition
  • Define clusters
      • Atoms: fields 1 to n, with hierarchies in each
        field including
      • Cluster: intersection of one set from each field
        hierarchy
      • Example: Source = *, Destination = CS Net, App = Email
  • Threshold clusters
      • Report traffic clusters above threshold T (e.g.
        1% of traffic)
  • Omit redundant clusters
      • Compression rule: remove a general cluster from the
        report when its traffic can be inferred (up to
        error T) from non-overlapping, more specific
        clusters
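
The cluster definition and the threshold step can be illustrated with a brute-force sketch over two fields. Everything concrete here is an assumption made for the example: the flow records, the three-level destination hierarchy (/32, /24, everything) and the two-level application hierarchy (a specific application or "*"). The compression rule is left out.

```python
from collections import Counter
from ipaddress import ip_network

# Hypothetical flow records: (destination address, application, bytes).
RECORDS = [
    ("10.1.1.5", "email", 400),
    ("10.1.1.9", "email", 300),
    ("10.1.1.9", "web", 100),
    ("10.2.7.1", "web", 200),
]

DEST_PREFIX_LENGTHS = (32, 24, 0)  # assumed hierarchy for the destination field


def generalizations(dest, app):
    """Yield every cluster (one set per field) that contains this record."""
    for length in DEST_PREFIX_LENGTHS:
        prefix = ip_network(f"{dest}/{length}", strict=False)
        for app_set in (app, "*"):
            yield (str(prefix), app_set)


def threshold_report(records, fraction):
    """Return the clusters carrying at least `fraction` of the total traffic."""
    total = sum(size for _, _, size in records)
    traffic = Counter()
    for dest, app, size in records:
        for cluster in generalizations(dest, app):
            traffic[cluster] += size
    return {c: v for c, v in traffic.items() if v >= fraction * total}


for cluster, volume in sorted(threshold_report(RECORDS, 0.5).items()):
    print(cluster, volume)
```

Each record updates one counter per cluster that contains it, so the state grows with the number of distinct clusters seen. This brute-force approach is the kind of cost a streaming algorithm would need to avoid.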

20
Solution status
  • The good
      • Offline tool: AutoFocus (SIGCOMM 2003 paper)
      • Detected worm, busy servers, Squid cache, etc.
      • Network managers like it
  • The bad
      • Takes long: 3 hours at T = 0.5% for a one-day trace
      • Needs much memory: 300 Mbytes
  • The wanted
      • Streaming algorithm - we invite improvements

21
Conclusions
  • New rules: strict constraints on algorithms
    running in routers
  • Iteration in time can give simple algorithms,
    but needs more formalization as to quality of
    results
  • General open problems: many challenges in
    detecting malicious traffic such as worms and DoS
    attacks
  • Specific open problem: computing traffic cluster
    reports in streaming fashion

22
Thank you!
Algorithms
?
Databases
Networking
23
Unidimensional clusters
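[Figure: traffic volumes of eight source addresses: 10.8.0.2, 10.8.0.3, 10.8.0.4, 10.8.0.5, 10.8.0.8, 10.8.0.9, 10.8.0.10 and 10.8.0.14. The volumes shown are 40, 35, 15, 35, 30, 160, 110 and 75 (500 in total); the extracted text does not preserve which volume belongs to which address.]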
24
Unidimensional clusters
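[Figure: the eight source addresses arranged as leaves of a prefix tree, each prefix labeled with the total traffic below it.
  Root: 10.8.0.0/28 = 500
  /29 level: 10.8.0.0/29 = 120, 10.8.0.8/29 = 380
  /30 level: 10.8.0.0/30 = 50, 10.8.0.4/30 = 70, 10.8.0.8/30 = 305, 10.8.0.12/30 = 75
  /31 level: 10.8.0.2/31 = 50, 10.8.0.4/31 = 70, 10.8.0.8/31 = 270, 10.8.0.10/31 = 35, 10.8.0.14/31 = 75
  Leaves: the addresses and volumes of slide 23]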
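
The per-prefix totals in this hierarchy can be reproduced by adding each address's traffic to every prefix that contains it. The sketch below rests on two assumptions. First, the pairing of volumes to the addresses 10.8.0.2 to 10.8.0.5 is not recoverable from the slide text, so one pairing consistent with the prefix totals is used. Second, the threshold of 100 is chosen for illustration.

```python
from collections import Counter
from ipaddress import ip_network

# Assumed pairing of volumes to addresses (consistent with the prefix totals).
LEAF_TRAFFIC = {
    "10.8.0.2": 15, "10.8.0.3": 35, "10.8.0.4": 30, "10.8.0.5": 40,
    "10.8.0.8": 160, "10.8.0.9": 110, "10.8.0.10": 35, "10.8.0.14": 75,
}


def prefix_traffic(leaves, shortest=28):
    """Sum traffic for every prefix from /`shortest` down to /32."""
    totals = Counter()
    for address, volume in leaves.items():
        for length in range(shortest, 33):
            prefix = ip_network(f"{address}/{length}", strict=False)
            totals[str(prefix)] += volume
    return totals


def above_threshold(totals, threshold):
    """Keep only the prefixes whose traffic reaches the threshold."""
    return {p: v for p, v in totals.items() if v >= threshold}


totals = prefix_traffic(LEAF_TRAFFIC)
for prefix, volume in sorted(above_threshold(totals, 100).items()):
    print(prefix, volume)
```

With a threshold of 100, seven clusters remain: the /28 root, both /29 prefixes, 10.8.0.8/30, 10.8.0.8/31 and the two addresses 10.8.0.8 and 10.8.0.9.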
25
Unidimensional clusters
[Figure: the same prefix tree, with the same prefixes and traffic values, as on slide 24]
26
Unidimensional clusters
[Figure: pruned prefix tree containing only these nodes.
  10.8.0.0/28 = 500
  10.8.0.0/29 = 120, 10.8.0.8/29 = 380
  10.8.0.8/30 = 305
  10.8.0.8/31 = 270
  10.8.0.8 = 160, 10.8.0.9 = 110]
27
Unidimensional clusters
[Figure: the same pruned prefix tree, with the same nodes and traffic values, as on slide 26]
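
One way to read the compression rule on this pruned tree: walk from the most specific clusters upward and keep a cluster only when the clusters already kept beneath it fail to account for its traffic to within T. The sketch below follows that reading with an assumed T of 100. It is an interpretation for illustration, not the AutoFocus implementation.

```python
from ipaddress import ip_network

# Clusters and traffic values as listed on the slide.
CLUSTERS = {
    "10.8.0.0/28": 500, "10.8.0.0/29": 120, "10.8.0.8/29": 380,
    "10.8.0.8/30": 305, "10.8.0.8/31": 270,
    "10.8.0.8/32": 160, "10.8.0.9/32": 110,
}


def compress(clusters, threshold):
    """Drop clusters whose traffic can be inferred, up to `threshold`,
    from non-overlapping, more specific clusters that were kept."""
    nets = {ip_network(p): v for p, v in clusters.items()}
    kept = {}
    # Most specific first, so descendants are decided before ancestors.
    for net in sorted(nets, key=lambda n: n.prefixlen, reverse=True):
        inside = [k for k in kept if k.subnet_of(net)]
        # Non-overlapping: use only the largest kept clusters inside `net`.
        maximal = [k for k in inside
                   if not any(k != o and k.subnet_of(o) for o in inside)]
        inferred = sum(kept[k] for k in maximal)
        if nets[net] - inferred >= threshold:
            kept[net] = nets[net]
    return {str(n): v for n, v in kept.items()}


for prefix, volume in sorted(compress(CLUSTERS, 100).items()):
    print(prefix, volume)
```

Under this reading the report shrinks from seven clusters to four: 10.8.0.8 (160), 10.8.0.9 (110), 10.8.0.0/29 (120) and 10.8.0.8/29 (380). The root, for instance, is omitted because the two /29 clusters already add up to its 500.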
28
Multidimensional clusters
  • Two dimensions
      • Source network
      • Protocol (traffic type)
  • Trees turn into lattice
      • Multiple parents
      • Nodes overlap
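
A small sketch can show why two dimensions give a lattice rather than a tree. A node is modeled here as a (source prefix, protocol) pair, and the protocol hierarchy is assumed to have only two levels, a specific protocol or "*" for any. Both choices are assumptions made for the example.

```python
from ipaddress import ip_network


def parents(node):
    """Generalize one field at a time; most nodes get two parents."""
    source, protocol = node
    result = []
    net = ip_network(source)
    if net.prefixlen > 0:
        # Parent along the source dimension: one bit shorter prefix.
        result.append((str(net.supernet()), protocol))
    if protocol != "*":
        # Parent along the protocol dimension: any protocol.
        result.append((source, "*"))
    return result


def overlaps(a, b):
    """True if some traffic can belong to both clusters."""
    same_source = ip_network(a[0]).overlaps(ip_network(b[0]))
    same_protocol = a[1] == b[1] or "*" in (a[1], b[1])
    return same_source and same_protocol


print(parents(("10.8.0.8/31", "tcp")))
print(overlaps(("10.8.0.8/31", "*"), ("10.8.0.0/28", "tcp")))
```

The second call shows two clusters that share traffic although neither contains the other. This cannot happen in a single prefix tree, and it is why double counting has to be handled when reports are compressed in more than one dimension.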

29
Offline solution
30
Sample report