Experiences With Internet Traffic Measurement and Analysis

1
Experiences With Internet Traffic Measurement
and Analysis
  • Vern Paxson
  • ICSI Center for Internet Research
  • International Computer Science Institute
  • and
  • Lawrence Berkeley National Laboratory
  • vern@icir.org
  • March 5th, 2004

2
Outline
  • The 1990s: How is the Internet used?
  • Growth and diversity
  • Fractal traffic, heavy tails
  • End-to-end dynamics
  • Difficulties with measurement & analysis
  • The 2000s: How is the Internet abused?
  • Prevalence of misuse
  • Detecting attacks
  • Worms

3
The 1990s
  • How is the Internet Used?

4
80% growth/year
Data courtesy of Rick Adams
5
Internet Growth: Exponential
  • Growth of 80%/year
  • Sustained for at least ten years
  • before the Web even existed.
  • Internet is always changing. You do not have a
    lot of time to understand it.
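
A quick illustrative calculation (simple compound growth; the 80% figure is from the slide above):

    # Hedged sketch: what sustained 80%/year growth compounds to.
    rate = 0.80
    for years in (1, 5, 10):
        print(f"{years:2d} years at 80%/yr -> x{(1 + rate) ** years:,.0f}")
    # Ten years of 80%/yr growth is roughly a 357-fold increase.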

6
(No Transcript)
7
Characterizing Site Traffic
  • Methodology passively record traffic in/out of a
    site
  • Danzig et al (1992)
  • 3 sites, 24 hrs, all packet headers
  • Paxson (1994)
  • TCP SYN/FIN/RST control packets
  • Gives hosts, sizes, start time, duration,
    application
  • Large filtering win (10-100:1 in packets, 1000s:1
    in bytes)
  • 7 month-long traces at Lawrence Berkeley Natl.
    Laboratory
  • 8 day-long traces from 6 other sites

8
Findings from Site Studies
  • Traffic mix (which protocols are used & how many
    connections/bytes they contribute) varies widely
    from site to site.
  • Mix also varies at the same site over time.
  • Most connections have much heavier traffic in one
    direction than the other
  • Even interactive login sessions (20:1)

9
Findings from Site Studies, cont
  • Many random variables associated with connection
    characteristics (sizes, durations) are best
    described with log-normal distributions
  • But often these are not particularly good fits
  • And often their parameters vary significantly
    between datasets
  • The largest connections in bulk transfers are
    very large
  • Tail behavior is unpredictable
  • Many of these findings differ from assumptions
    used in 1990s traffic modeling

10
Theory vs. Measured Reality
  • Scaling behavior in Internet Traffic

11
Burstiness
  • Long-established framework: Poisson modeling
  • Central idea: network events (packet arrivals,
    connection arrivals) are well-modeled as
    independent
  • In simplest form, there's just a rate parameter,
    λ
  • It then follows that the time between calls
    (events) is exponentially distributed, and the #
    of calls in an interval is Poisson
  • Implications (if assumptions correct)
  • Aggregated traffic will smooth out quickly
  • Correlations are fleeting, bursts are limited

12
Burstiness Theory vs. Measurement
  • For Internet traffic, Poisson models have a
    fundamental problem: they greatly underestimate
    burstiness
  • Consider an arrival process: Xk gives the # of
    packets arriving during the kth interval of length T.
  • Take 1-hour trace of Internet traffic (1995)
  • Generate (batch) Poisson arrivals with same mean
    and variance
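
A rough sketch of that comparison (illustrative code, not the original analysis: it matches only the mean, and uses a synthetic heavy-tailed series in place of a real trace):

    import numpy as np

    rng = np.random.default_rng(0)

    def burstiness(counts):
        """Coefficient of variation of per-interval packet counts."""
        return counts.std() / counts.mean()

    def aggregate(counts, k):
        """Sum consecutive groups of k intervals (i.e., interval length k*T)."""
        n = len(counts) // k * k
        return counts[:n].reshape(-1, k).sum(axis=1)

    # Stand-in for a measured per-interval packet-count series; heavy-tailed
    # per-interval counts keep the aggregate bursty.
    trace = rng.pareto(1.2, size=3600) * 100

    # Poisson counts with the same mean.
    poisson = rng.poisson(trace.mean(), size=len(trace))

    for k in (1, 10, 100, 600):
        print(f"x{k:4d}  real CV={burstiness(aggregate(trace, k)):.2f}  "
              f"Poisson CV={burstiness(aggregate(poisson, k)):.2f}")
    # The Poisson CV shrinks like 1/sqrt(k); the bursty series smooths far more slowly.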

13
(No Transcript)
14
Previous Region
×10
15
×100
16
×600
17
(No Transcript)
18
Burstiness Over Many Time Scales
  • Real traffic has strong, long-range correlations
  • Power spectrum
  • Flat for Poisson processes
  • For measured traffic, diverges to ∞ as frequency → 0
  • To build Poisson-based models that capture this
    characteristic takes many parameters
  • But due to great variation in Internet traffic,
    we are desperate for parsimonious models (few
    parameters)

19
Describing Traffic with Fractals
  • Landmark 1993 paper by Leland et al proposed
    capturing such characteristics (in Ethernet
    traffic) using self-similarity, a form of
    fractal-based modeling
  • Parameterized by mean, variance, and Hurst
    parameter
  • Models predict burstiness on all time scales
  • Queueing delays / drop probabilities much higher
    than predicted by Poisson-based models
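
One standard way to estimate the Hurst parameter is the variance-time plot; a hedged sketch (my illustration, not necessarily the estimator used in the 1993 paper):

    import numpy as np

    def hurst_variance_time(counts, block_sizes=(1, 2, 4, 8, 16, 32, 64, 128)):
        """Variance-time estimate of H: for self-similar traffic,
        Var(mean over blocks of size m) ~ m**(2H - 2)."""
        ms, variances = [], []
        for m in block_sizes:
            n = len(counts) // m * m
            block_means = counts[:n].reshape(-1, m).mean(axis=1)
            if len(block_means) > 1:
                ms.append(m)
                variances.append(block_means.var())
        slope, _ = np.polyfit(np.log(ms), np.log(variances), 1)
        return 1 + slope / 2   # slope = 2H - 2

    # Independent counts should give H near 0.5; long-range-dependent
    # traffic gives H substantially above 0.5.
    rng = np.random.default_rng(1)
    print(hurst_variance_time(rng.poisson(100, size=100_000)))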

20
(No Transcript)
21
Heavy Tails
  • Key prediction from fractal modeling
  • One way fractal traffic can arise in aggregate is
    if individual connections have activity periods
    (durations, sizes) whose distribution has
    infinite variance.
  • Infinite variance manifests in the distribution's
    upper tail
  • Consider the Pareto distribution: P(X > x) = (x/a)^-α
  • If α < 2, then the distribution has infinite variance
  • Can test for Pareto fit by plotting log P(X > x) vs.
    log x
  • Straight line ⇒ Pareto distribution; slope
    estimates -α
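
A hedged sketch of that log-log tail check (illustrative code on synthetic data; the slope of the empirical complementary CDF estimates -α):

    import numpy as np

    def ccdf_loglog_slope(sizes, tail_fraction=0.1):
        """Fit a line to log P(X > x) vs. log x over the upper tail;
        a roughly straight line suggests a Pareto tail, -slope estimates alpha."""
        x = np.sort(np.asarray(sizes, dtype=float))
        ccdf = 1.0 - np.arange(1, len(x) + 1) / len(x)
        k = int(len(x) * (1 - tail_fraction))
        xs, ps = x[k:-1], ccdf[k:-1]        # drop the last point, where the CCDF hits 0
        slope, _ = np.polyfit(np.log(xs), np.log(ps), 1)
        return -slope

    # Synthetic Pareto(alpha = 1.3) sample: the estimate should come out
    # near 1.3, i.e., alpha < 2 and hence infinite variance.
    rng = np.random.default_rng(2)
    print(ccdf_loglog_slope(1 + rng.pareto(1.3, size=50_000)))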

22
Web connection sizes (226,386 observations)
  • 28,000 observations
  • α ≈ 1.3
  • ⇒ Infinite Variance

23
Self-Similarity & Heavy Tails, cont.
  • We find heavy-tailed sizes in many types of
    network traffic. Just a few extreme connections
    dominate the entire volume.

24
(No Transcript)
25
Self-Similarity & Heavy Tails, cont.
  • We find heavy-tailed sizes in many types of
    network traffic. Just a few extreme connections
    dominate the entire volume.
  • Theorems then give us that this traffic
    aggregates to self-similar behavior.
  • While self-similar models are parsimonious, they
    are not (alas) simple.
  • You can have self-similar correlations for which
    the magnitude of variations is small ⇒ still
    possible to have a statistical multiplexing gain,
    especially at very high aggregation
  • Smaller time scales behave quite differently.
  • When very highly aggregated, they can appear
    Poisson!

26
End-to-End Internet Dynamics
  • Routing & Packets

27
End-to-End Dynamics
  • Ultimately what the user cares about is not
    what's happening on a given link, but the
    concatenation of behaviors along all of the hops
    in an end-to-end path.
  • Measurement methodology: deploy measurement
    servers at numerous Internet sites, measure the
    paths between them
  • Exhibits N² scaling: as the # of sites grows, the
    # of paths between them grows rapidly.
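
A back-of-the-envelope illustration of the N² effect (my arithmetic; the ~900 paths actually studied fall somewhat short of the full set of ordered pairs):

    # With N measurement sites, there are N*(N-1) ordered source->destination paths.
    for n_sites in (10, 37, 100):
        print(n_sites, "sites ->", n_sites * (n_sites - 1), "ordered paths")
    # 37 sites -> 1,332 ordered paths.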

28
Measurement Infrastructure: Sites in the 1994-1995
End-to-End Dynamics Study
29
Paths in the Study: the N² Scaling Effect
30
End-to-End Routing Dynamics
  • Analysis of 40,000 traceroute measurements
    between 37 sites, 900 end-to-end paths.
  • Route prevalence
  • most end-to-end paths through the Internet
    dominated by a single route.
  • Route persistence
  • 2/3 of routes remain unchanged for days/weeks
  • 1/3 of routes change on time scales of seconds to
    hours
  • Route symmetry
  • More than half of all routes visited at least one
    different city in each direction
  • Very important for tracking connection state
    inside network!

31
End-to-End Packet Dynamics
  • Analysis of 20,000 TCP bulk transfers of 100 KB
    between 36 sites
  • Each traced at both ends using tcpdump
  • Benefits of using TCP
  • Real-world traffic
  • Can probe fine-grained time scales, while still
    using congestion control
  • Drawbacks to using TCP
  • Endpoint TCP behavior a major analysis headache
  • TCP's loading of the transfer path also
    complicates analysis

32
End-to-End Packet Dynamics Unusual Behavior
  • Out-of-order delivery
  • Not uncommon. 0.6-2% of all packets.
  • Strongly site-specific.
  • Generally little impact on performance.
  • Replicated packets
  • Very rare, but does occur (e.g., 1 packet in, 22
    out)
  • Corrupted packets (bad checksum)
  • Overall, 1 in 5,000 (!)
  • Stone/Partridge (2000): between 1 in 1,100 and 1
    in 32,000
  • Undetected: between 1 in 16 million and 1 in 10
    billion

33
End-to-End Packet Dynamics Loss
  • Half of all 100 KB transfers experienced no loss
  • 2/3 of paths within the U.S.
  • The other half experienced significant loss
  • Average 4-9%, but with wide variation
  • TCP loss is not well described as independent
  • Losses dominated by a few long-lived outages
  • (Keep in mind this is 1994-1995!)
  • Subsequent studies
  • Loss rates have gotten much better
  • Loss episodes well described as independent
  • Same holds for regions of stable delay,
    throughput
  • Time scales of constancy: minutes or more

34
Issues / Difficulties for Analyzing Internet
Traffic
  • Measurement, Simulation & Analysis

35
There is No Such Thing as Typical
  • Heterogeneity in
  • Traffic mix
  • Range of network capabilities
  • Bottleneck bandwidth (orders of magnitude)
  • Round-trip time (orders of magnitude)
  • Dynamic range of network conditions
  • Congestion / degree of multiplexing / available
    bandwidth
  • Proportion of traffic that is adaptive/rigid/attack
  • Immense size growth
  • Rare events will occur
  • New applications explode on the scene

36
Doubling every 7-8 weeks for 2 years
37
There is No Such Thing as Typical, cont
  • New applications explode on the scene
  • Not just the Web, but Mbone, Napster, KaZaA
    etc., IM
  • Even robust statistics fail.
  • E.g., median size of FTP data transfer at LBL
  • Oct. 1992 4.5 KB (60,000 samples)
  • Mar. 1993 2.1 KB
  • Mar. 1998 10.9 KB
  • Dec. 1998 5.6 KB
  • Dec. 1999 10.9 KB
  • Jun. 2000 62 KB
  • Nov. 2000 10 KB
  • Danger: if you misassume that something is
    typical, nothing tells you that you are wrong!

38
The Search for Invariants
  • In the face of such diversity, identifying things
    that don't change has immense utility
  • Some Internet traffic invariants
  • Daily and weekly patterns
  • Self-similarity on time scales of 100s of msec
    and above
  • Heavy tails
  • both in activity periods and elsewhere, e.g.,
    topology
  • Poisson user session arrivals
  • Log-normal sizes (excluding tails)
  • Keystroke interarrival times have a Pareto distribution

39
The Danger of Mental Models
Exponential plus a constant offset
40
Not exponential - Pareto! Heavy tail, α ≈ 1.0
41
Versus the Power of Modeling to Open Our Eyes
  • Fowler & Leland, 1991
  • Traffic spikes (which cause actual losses) ride
    on longer-term ripples, that in turn ride on
    still longer-term swells

42
(No Transcript)
43
Versus the Power of Modeling to Open Our Eyes
  • Fowler & Leland, 1991
  • Traffic spikes (which cause actual losses) ride
    on longer-term ripples, that in turn ride on
    still longer-term swells
  • Lacked vocabulary that came from self-similar
    modeling (1993)
  • Similarly, 1993 self-similarity paper
  • We did so without first studying and modeling the
    behavior of individual Ethernet users (sources)
  • Modeling led to suggestion to investigate heavy
    tails

44
Measurement Soundness
  • How well-founded is a given Internet measurement?
  • We can often use additional information to help
    calibrate.
  • One source: protocol structure
  • E.g., was a packet dropped by the network, or
    by the measurement device?
  • For TCP, can check: did the receiver acknowledge
    it? (sketch after this list)
  • If Yes, then dropped by measurement device
  • If No, then dropped by network
  • Can also calibrate using additional information
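
A hedged sketch of that acknowledgment check (simplified: ignores SACK, retransmissions, and sequence-number wraparound; the names are mine):

    def classify_missing_segment(seg_end, acks_from_receiver):
        """Classify a data segment the sender transmitted but that never
        shows up at the measurement point.

        seg_end            -- sequence number just past the segment's last byte
        acks_from_receiver -- ACK numbers seen from the receiver afterwards
        """
        if any(ack >= seg_end for ack in acks_from_receiver):
            # Receiver acknowledged the data, so it arrived: the missing copy
            # was dropped by the measurement device, not the network.
            return "dropped by measurement device"
        return "dropped by network (never acknowledged)"

    # Example: segment covering bytes [1000, 1460) -> seg_end = 1460
    print(classify_missing_segment(1460, acks_from_receiver=[1000, 1460, 2920]))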

45
Calibration Using Additional Information: Packet
Timings
Routing change?
Clock adjustment
46
Reproducibility of Results (or lack thereof)
  • It is rare, though sometimes occurs, that raw
    measurements are made available to other
    researchers for further analysis or for
    confirmation.
  • It is more rare that analysis tools and scripts
    are made available, particularly in a coherent
    form that others can actually get to work.
  • It is even rarer that measurement glitches,
    outliers, analysis fudge factors, etc., are
    detailed.
  • In fact, often researchers cannot reproduce their
    own results.

47
Towards Reproducible Results
  • Need to ensure a systematic approach to data
    reduction and analysis
  • I.e., a paper trail for how analysis was
    conducted, particularly when bugs are fixed
  • A methodology to do this
  • Enforce discipline of using a single (master)
    script that builds all analysis results from the
    raw data
  • Maintain all intermediary/reduced forms of the
    data as explicitly ephemeral
  • Maintain a notebook of what was done and to what
    effect.
  • Use version control for scripts & notebook.
  • But also really need ways to visualize what's
    changed in analysis results after a re-run.
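
A hedged sketch of what such a master script can look like (the file names and analysis steps here are invented for illustration):

    #!/usr/bin/env python3
    """Master analysis script: rebuilds every result from the raw data, so
    intermediate files stay ephemeral and the whole chain is reproducible."""
    import pathlib
    import shutil
    import subprocess

    RAW = pathlib.Path("raw_traces")    # only this directory is authoritative
    WORK = pathlib.Path("derived")      # ephemeral, safe to delete
    OUT = pathlib.Path("results")

    def run(step, *cmd):
        print(f"== {step}")
        subprocess.run(cmd, check=True)  # fail loudly; keep the paper trail

    if __name__ == "__main__":
        shutil.rmtree(WORK, ignore_errors=True)   # never trust stale reductions
        WORK.mkdir()
        OUT.mkdir(exist_ok=True)
        run("reduce", "python", "reduce_traces.py", str(RAW), str(WORK))
        run("analyze", "python", "analyze.py", str(WORK), str(OUT))
        run("plot", "python", "make_figures.py", str(OUT))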

48
The 2000s
  • How is the Internet Abused?

49
(No Transcript)
50
(No Transcript)
51
Magnitude of Internet Attacks
  • As seen at Lawrence Berkeley National Laboratory,
    on a typical day in 2004
  • > 70% of Internet connections (20 million out of
    28 million) reflect clear attacks.
  • 60 different remote hosts scan one of LBL's two
    blocks of 65,536 addresses in its entirety
  • More than 10,000 remote hosts engage in scanning
    activity
  • Much of this activity reflects worms
  • Much of the rest reflects automated
    scan-and-exploit tools

52
How is the Internet Abused?
  • Detecting Network Attacks

53
Design Goals for the Bro Intrusion Detection
System
  • Monitor traffic in a very high performance
    environment
  • Real-time detection and response
  • Separation of mechanism from policy
  • Ready extensibility of both mechanism and policy
  • Resistant to evasion

54
How Bro Works
  • Taps GigEther fiber link passively, sends up a
    copy of all network traffic.

Network
55
How Bro Works
Filtered Packet Stream
Tcpdump Filter
  • Kernel filters down high-volume stream via
    standard libpcap packet capture library.

libpcap
Packet Stream
Network
56
How Bro Works
Event Stream
Event Control
  • Event engine distills filtered stream into
    high-level, policy-neutral events reflecting
    underlying network activity
  • E.g., connection_attempt, http_reply,
    user_logged_in

Event Engine
Filtered Packet Stream
Tcpdump Filter
libpcap
Packet Stream
Network
57
How Bro Works
Real-time Notification Record To Disk
Policy Script
  • Policy script processes event stream,
    incorporates
  • Context from past events
  • Site's particular policies

Policy Script Interpreter
Event Stream
Event Control
Event Engine
Filtered Packet Stream
Tcpdump Filter
libpcap
Packet Stream
Network
58
How Bro Works
Real-time Notification Record To Disk
Policy Script
  • Policy script processes event stream,
    incorporates
  • Context from past events
  • Site's particular policies
  • and takes action
  • Records to disk
  • Generates alerts via syslog, paging
  • Executes programs as a form of response

Policy Script Interpreter
Event Stream
Event Control
Event Engine
Filtered Packet Stream
Tcpdump Filter
libpcap
Packet Stream
Network
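
An illustrative sketch of the mechanism/policy split (plain Python, not Bro's actual policy language or event API):

    from collections import defaultdict

    handlers = defaultdict(list)          # event name -> policy handlers

    def on(event):                        # the "policy script" registers interest
        def register(fn):
            handlers[event].append(fn)
            return fn
        return register

    def raise_event(event, **kw):         # the "event engine" emits policy-neutral events
        for fn in handlers[event]:
            fn(**kw)

    failed_attempts = defaultdict(int)    # context carried across events

    @on("connection_attempt")
    def maybe_block_scanner(src, dst, port):
        failed_attempts[src] += 1
        if failed_attempts[src] > 20:     # the site's particular policy threshold
            print(f"ALERT: blocking {src} (scanning)")   # stand-in for a real response

    raise_event("connection_attempt", src="10.0.0.1", dst="192.0.2.5", port=80)
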
59
Experiences with Bro
  • Exciting research because used operationally
    (24x7) at several open sites (LBL, UCB, TUM)
  • Key enabler: the site's threat model
  • Occasional break-ins are tolerable
  • Jewels are additionally protected (e.g.,
    firewalls)
  • Significant real-world concern: policy management
  • Dynamic blocking critical to success
  • Currently, 100-200 blocks/day

60
The Problem of Evasion
  • Fundamental problem with passively measuring
    traffic on a link: network traffic is inherently
    ambiguous
  • Generally not a significant issue for traffic
    characterization
  • But it is in the presence of an adversary:
    attackers can craft traffic to confuse/fool the monitor

61
Evading Detection Via Ambiguous TCP Retransmission
62
The Problem of Crud
  • There are many such ambiguities attackers can
    leverage.
  • Unfortunately, they occur in benign traffic, too
  • Legitimate tiny fragments, overlapping fragments
  • Receivers that acknowledge data they did not
    receive
  • Senders that retransmit different data than
    originally
  • In a diverse traffic stream, you will see these
  • Approaches for defending against evasion
  • Traffic normalizers that actively remove
    ambiguities
  • Mapping of local hosts to determine their
    behaviors
  • Active participation by local hosts in intrusion
    detection

63
How is the Internet Abused?
  • The Threat of Internet Worms

64
What is a Worm?
  • Self-replicating/self-propagating code.
  • Spreads across a network by exploiting flaws in
    open services.
  • As opposed to viruses, which require user action
    to quicken/spread.
  • Not new --- Morris Worm, Nov. 1988
  • 6-10% of all Internet hosts infected
  • Many more since, but none on that scale ...
  • until ...

65
Code Red
  • Initial version released July 13, 2001.
  • Exploited known bug in Microsoft IIS Web servers.
  • 1st through 20th of each month: spread. 20th
    through end of each month: attack.
  • Payload: web site defacement.
  • Spread via random scanning of 32-bit IP address
    space.
  • But failure to seed random number generator ⇒
    linear growth.

66
Code Red, cont
  • Revision released July 19, 2001.
  • Payload: flooding attack on
    www.whitehouse.gov.
  • Bug led to it dying for dates ≥ 20th of the
    month.
  • But this time random number generator correctly
    seeded. Bingo!

67
(No Transcript)
68
Network Telescopes
  • Idea: monitor a cross-section of the IP address
    space to measure network traffic involving random
    addresses (flooding backscatter, worm scanning)
  • LBL's cross-section: 1/32,768 of the Internet.
  • UCSD's cross-section: 1/256.
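
The arithmetic behind those fractions (my calculation; the UCSD /8 is an assumption consistent with the 1/256 figure):

    # Fraction of the 32-bit IPv4 address space covered by each telescope.
    total = 2 ** 32
    lbl = 2 * 65_536          # two /16 blocks, per the earlier slide
    ucsd = 2 ** 24            # a /8 block (assumption)
    print("LBL:  1 /", total // lbl)    # -> 1 / 32768
    print("UCSD: 1 /", total // ucsd)   # -> 1 / 256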

69
Spread of Code Red
  • Network telescopes give lower bound on infected
    hosts: 360K.
  • Course of infection fits the classic logistic
    (sketch below).
  • Note: the larger the vulnerable population, the
    faster the worm spreads.
  • That night (as the date rolls to the 20th), the worm dies
  • except for hosts with inaccurate clocks!
  • It just takes one of these to restart the worm on
    August 1st
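
A hedged sketch of the classic logistic (simple-epidemic) model behind that fit: the infected fraction i satisfies di/dt = K·i·(1-i), and a larger vulnerable population raises the contact rate K. Parameters below are illustrative:

    def logistic_infection(K, i0=1e-5, t_max=24.0, dt=0.01):
        """Integrate di/dt = K * i * (1 - i): random-scanning worm growth
        in a fixed vulnerable population."""
        t, i, out = 0.0, i0, []
        while t <= t_max:
            out.append((t, i))
            i += K * i * (1 - i) * dt
            t += dt
        return out

    # Higher contact rate K -> the sigmoid rises faster.
    for K in (1.0, 2.0, 4.0):
        t_half = next(t for t, i in logistic_infection(K) if i >= 0.5)
        print(f"K={K}: half the vulnerable population infected after ~{t_half:.1f} time units")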

70
(No Transcript)
71
Striving for Greater Virulence: Code Red 2
  • Released August 4, 2001.
  • Comment in code: "Code Red 2."
  • But in fact completely different code base.
  • Payload: a root backdoor, resilient to reboots.
  • Bug: crashes NT, only works on Windows 2000.
  • Localized scanning: prefers nearby addresses.
  • Kills Code Red I.
  • Safety valve: programmed to die Oct 1, 2001.

72
Striving for Greater Virulence: Nimda
  • Released September 18, 2001.
  • Multi-mode spreading
  • attack IIS servers via infected clients
  • email itself to address book as a virus
  • copy itself across open network shares
  • modifying Web pages on infected servers w/ client
    exploit
  • scanning for Code Red II backdoors (!)
  • worms form an ecosystem!
  • Leaped across firewalls.

73
(No Transcript)
74
(No Transcript)
75
Life Just Before Slammer
76
Life Just After Slammer
77
A Lesson in Economy
  • Slammer exploits a connectionless UDP service,
    rather than connection-oriented TCP.
  • Entire worm fits in a single packet!
  • When scanning, worm can fire and forget.
  • Worm infects 75,000 hosts in 10 minutes (despite
    broken random number generator).
  • Progress limited by the Internet's carrying
    capacity!

78
The Usual Logistic Growth
79
Slammer's Bandwidth-Limited Growth
80
(No Transcript)
81
Blaster
  • Released August 11, 2003.
  • Exploits flaw in RPC service ubiquitous across
    Windows.
  • Payload: attack Microsoft Windows Update.
  • Despite flawed scanning and secondary infection
    strategy, rapidly propagates to 100Ks of hosts.
  • Actually, bulk of infections are really Nachia, a
    Blaster counter-worm.
  • Key paradigm shift: firewalls don't help.

82
What if Spreading Were Well-Designed?
  • Observation (Weaver): much of a worm's scanning
    is redundant.
  • Idea: coordinated scanning (sketch after this list)
  • Construct permutation of address space
  • Each new worm starts at a random point
  • Worm instance that encounters another instance
    re-randomizes.
  • Greatly accelerates worm in later stages.
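
A toy sketch of permutation scanning (illustrative only: a simple affine permutation of a 16-bit space, not the actual worm design):

    import random

    SPACE = 2 ** 16               # toy address space
    A, B = 40503, 12345           # odd multiplier -> an invertible map mod 2**16

    def permute(addr):
        """Next address in the shared pseudorandom permutation."""
        return (addr * A + B) % SPACE

    def scan(start, already_covered, budget=1000):
        """Walk the permutation from a random start; on hitting an address
        another instance already covered, re-randomize (here: just stop)."""
        addr, scanned = start, []
        for _ in range(budget):
            addr = permute(addr)
            if addr in already_covered:
                break                  # another instance got here first
            scanned.append(addr)
        return scanned

    first = set(scan(random.randrange(SPACE), set()))
    second = scan(random.randrange(SPACE), first)
    print(len(first), "addresses by instance 1;", len(second), "new by instance 2")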

83
What if Spreading Were Well-Designed?, cont.
  • Observation (Weaver): accelerate the initial phase
    using a precomputed hit-list of, say, 1% of
    vulnerable hosts.
  • At 100 scans/worm/sec, can infect huge
    population in a few minutes.
  • Observation (Staniford): compute hit-list of the
    entire vulnerable population, propagate via
    divide & conquer.
  • At 10 scans/worm/sec, infect in 10s of sec!

84
Defenses
  • Detect via honeyfarms: collections of honeypots
    fed by a network telescope.
  • Any outbound connection from honeyfarm ⇒ worm.
  • Distill signature from inbound/outbound traffic.
  • If telescope covers N addresses, expect detection
    when worm has infected 1/N of population.
  • Thwart via scan suppressors: network elements
    that block traffic from hosts that make failed
    connection attempts to too many other hosts.

85
Defenses?
  • Observation: worms don't need to randomly
    scan
  • Meta-server worm: ask a server for hosts to infect.
    E.g., query Google for "index.html".
  • Topological worm: fuel spread with local
    information from infected hosts (web server logs,
    email address books, config files, SSH known
    hosts)
  • No scanning signature; with rich inter-connection
    topology, potentially very fast.

86
Defenses??
  • Contagion worm: propagates parasitically along
    with normally initiated communication.
  • E.g., using 2 exploits - Web browser & Web server
    - infect any vulnerable servers visited by the
    browser, then any vulnerable browsers that come
    to those servers.
  • E.g., using 1 KaZaA exploit, glide along immense
    peer-to-peer network in days/hours.
  • No unusual connection activity at all! :-(

87
Some Observations
  • Today's worms have significant real-world impact
  • Code Red disrupted routing
  • Slammer disrupted elections, ATMs, airline
    schedules, operations at an off-line nuclear
    power plant
  • Blaster possibly contributed to North American
    Blackout of Aug. 2003
  • But today's worms are amateurish
  • Frequent bugs, algorithm/attack botches
  • Unimaginative payloads

88
Next-Generation Worm Authors
  • Potential for major damage with nastier
    payloads :-(
  • Military (cyberwarfare)
  • Criminals
  • Denial-of-service, spamming for hire
  • "Access for Sale: A New Class of Worm"
    (Schechter/Smith, ACM CCS WORM 2003)
  • Money on the table ⇒ arms race

89
Summary
  • Internet measurement is deeply challenging
  • Immense diversity
  • Internet never ceases to be a moving target
  • Our mental models can betray us: the Internet is
    full of surprises!
  • Seek invariants
  • Many of the last decade's measurement questions
    -- What are the basic characteristics and
    properties of Internet traffic? -- have returned
  • but now regarding Internet attacks
  • What on Earth will the next decade hold??