Les Cottrell - PowerPoint PPT Presentation

1
End-to-end Monitoring in ESnet/HENP & the
relevance of ping
  • Les Cottrell
  • SLAC & Stanford University
  • <cottrell@slac.stanford.edu>
  • Presented at the NLANR organized workshop on
    "Challenges and opportunities for measurements and
    analysis in a high performance environment",
    July 1, 1999, SDSC.
  • Partially funded by DOE/MICS Field Work Proposal
    on Internet End-to-end Performance Monitoring
    (IEPM)

2
Outline of talk
  • Review relevance of tools, deployment
  • Illustrate the type of information that is
    provided and how it relates to applications, e.g.
    TCP & VoIP
  • Long term trends
  • Community comparisons
  • Challenges
  • continued validity of ping, comparison with
    Surveyor
  • other work & coordination
  • Future work

3
Main tool (PingER) currently uses Ping
  • Treats Internet as black box
  • Provides useful real world measures of network
    round trip response time, loss, reachability,
    jitter
  • Low cost/lightweight tool
  • ping universally available, easy to understand
  • no software for clients to install
  • no special privileges needed for monitor sites
  • resources 100bps/link, 600kBytes/month/link
  • Ping mature, well understood, widely available
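
The measures listed above (round trip response time, loss, reachability, jitter) can all be derived from one batch of ping replies. The following is a minimal sketch, not the PingER code: the function name summarize_pings is hypothetical, and using the interquartile range of the RTTs as the jitter figure is an assumption made for illustration.

```python
from statistics import median, quantiles


def summarize_pings(rtts_ms, sent):
    """Summarize one batch of pings to a single remote host.

    rtts_ms: round trip times (ms) of the replies that came back.
    sent:    number of echo requests that were sent.
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    received = len(rtts_ms)
    summary = {
        "loss_fraction": (sent - received) / sent,
        "reachable": received > 0,      # at least one reply came back
        "median_rtt_ms": None,
        "jitter_iqr_ms": None,          # assumed jitter measure: IQR of RTT
    }
    if received >= 1:
        summary["median_rtt_ms"] = median(rtts_ms)
    if received >= 2:
        # quantiles() needs at least two data points
        q1, _, q3 = quantiles(rtts_ms, n=4)
        summary["jitter_iqr_ms"] = q3 - q1
    return summary


if __name__ == "__main__":
    print(summarize_pings([10.0, 12.0, 11.0], sent=4))
```

A host that returns no replies at all is reported as unreachable with a loss fraction of 1.0, which keeps loss and reachability as separate signals.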

4
Examples of relevance to applications
  • Relates to Web performance: small files dominated
    by RTT
  • BW_TCP < (MSS/RTT) * (1/sqrt(loss))
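
The bound above can be evaluated directly from a measured RTT and loss fraction. This is a small sketch under stated assumptions: the function name is hypothetical, MSS is taken in bytes, RTT in seconds, loss as a fraction between 0 and 1, and the result is converted to bits per second (the slide does not state units).

```python
from math import sqrt


def tcp_bandwidth_bound_bps(mss_bytes, rtt_s, loss):
    """Upper bound on TCP throughput: (MSS/RTT) * (1/sqrt(loss)).

    mss_bytes: maximum segment size in bytes (assumed unit).
    rtt_s:     round trip time in seconds (assumed unit).
    loss:      packet loss as a fraction, 0 < loss <= 1.
    Returns the bound in bits per second.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss <= 1:
        raise ValueError("loss must be in (0, 1]; the bound is undefined at 0")
    return (mss_bytes * 8 / rtt_s) / sqrt(loss)


if __name__ == "__main__":
    # Illustrative inputs only: 1460 byte MSS, 100 ms RTT, 1 in 100 loss.
    bound = tcp_bandwidth_bound_bps(1460, 0.100, 0.01)
    print(f"{bound / 1e6:.2f} Mbit/s")
```

Because loss enters as 1/sqrt(loss), halving the RTT doubles the bound, while the loss has to drop by a factor of four to get the same effect.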

5
Scale of Measurements
  • 18 Monitoring sites - 7 in US (5 ESnet, 2 vBNS),
    2 in Canada, 7 in Europe (ch, de, dk, hu, it,
    uk(2)), 2 in Asia (jp, tw)
  • 1261 monitoring-remote-site pairs
  • 379 unique hosts, 272 sites
  • 50 beacon sites, 27 countries
  • Metrics include response, jitter, loss,
    reachability
  • Data goes back > 4 years
  • 1 Million probes of Internet / day

6
XIWT/IPERF
  • 2nd instance of PingER tools deployed by XIWT
  • at 10 monitoring sites (Bellsouth, CNRI,
    Digital/Compaq(2), DirecPC, HP, Intel, NIST,
    SLAC, Westgroup)
  • mainly full mesh pinging
  • 150 pairs
  • different community of interest - more commercial
    (70 .com, 20 .org, 10 .edu)

7
Web Interface
http://www.slac.stanford.edu//xorg/iepm/pinger/table.html

8
Effect of STAR-TAP on KEK.jp <=> SLAC
[Figure: ping RTT in msec and packet loss, September 1 to December 31, 1998]

9
Improvement in RTT

10
Improvement in packet loss

11
Bandwidth improvement from ESnet sites
TCP bandwidth < (1470/RTT) * (1/sqrt(loss))

12
ESnet, I2, XIWT, Euro-Labs

13
Calibration of ping
  • Sanity checks
  • host pings itself, host pings host at same site
  • high statistics between a few sites & inside
    site
  • see www.slac.stanford.edu/comp/net/wan-mon/ping-hi-stat.html
  • look at subtle behaviors, e.g. RTT distribution
    tails
  • check wire time (sniffer) vs. ping reported
    times, at client & server
  • see www.slac.stanford.edu/comp/net/wan-mon/error.html
  • Correlate with Surveyor one-way measures

14
Natural enemies of ping
  • Poor choice of remote host (clustered, variable
    load, ...) or monitoring host
  • Ping program problems and pathologies
  • Some implementations have bugs, or are incomplete
  • Spurious packets confuse ping programs (< 0.2
    effect)
  • e.g. program sends 5 packets, sees 10.
  • Out of order packets (< 0.02 effect)
  • Some sites/hosts block pings
  • Other sites limit pings to a certain size
  • Rate limiting, e.g. some sites filter out ICMP
    traffic during high usage or all the time
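
Several of the pathologies above (spurious packets, duplicates, out of order replies) can be spotted from the sequence numbers of the replies alone. The sketch below is illustrative only: classify_replies is a hypothetical helper, and it assumes the requests were numbered 0 to sent-1 and that the replies are listed in arrival order.

```python
def classify_replies(seqs_received, sent):
    """Count anomalies in the reply sequence numbers of one ping batch.

    seqs_received: sequence numbers of replies, in arrival order.
    sent:          number of requests sent, numbered 0 .. sent-1 (assumed).
    """
    seen = set()
    duplicates = 0      # same sequence number answered more than once
    out_of_order = 0    # reply arrived after a higher-numbered reply
    unexpected = 0      # sequence number that was never sent
    highest = -1
    for seq in seqs_received:
        if not 0 <= seq < sent:
            unexpected += 1
            continue
        if seq in seen:
            duplicates += 1
            continue
        seen.add(seq)
        if seq < highest:
            out_of_order += 1
        else:
            highest = seq
    return {
        "duplicates": duplicates,
        "out_of_order": out_of_order,
        "unexpected": unexpected,
        "lost": sent - len(seen),
    }


if __name__ == "__main__":
    # 5 requests sent; reply 1 is duplicated, reply 2 arrives after reply 3.
    print(classify_replies([0, 1, 1, 3, 2, 4], sent=5))
```

Counting distinct sequence numbers rather than raw replies keeps a "sends 5, sees 10" batch from being reported as negative loss.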

15
Impact of limiting

16
Ping limiting/blocking
  • First noticed in 1996
  • protect against ping o' death (OS) & smurf attacks
    (directed broadcasts)
  • Host requirement to implement ping
  • but not to execute, and probably blocked at
    firewall
  • First step for cracker scanning a site
  • Identified at 2 hosts (i.e. currently a small
    effect)
  • http://www.slac.stanford.edu/comp/net/wan-mon/pathology.html

17
Avoiding
  • careful choice of host => beacon sites
  • working with remote sites & ISPs
  • using TCP echo or UDP echo (security), but
    crackers will find them and often already blocked
  • new protocol designed for measurement (IPMP)
  • special purpose measurement machines & protocols

18
Surveyor / RIPE
  • Dedicated PC running Unix at key sites
  • GPS for clock synchronization
  • One way delay & loss measurements
  • Community is Internet 2 clients
  • HEP sites collaborating with Surveyor
  • deployed in HENP community (CERN (Geneva), FNAL
    (Chicago) & SLAC (Silicon Valley - SF))
  • using PingER analysis tools on Surveyor data

19
Comparing PingER & Surveyor

20
Comparing Surveyor & ping/ER results
  • Took Surveyor data between SLAC, FNAL & CERN,
    Nov-98 thru May-99
  • Reformatted into PingER format, allows viewing
    with PingER tools
  • metrics: loss, delay, unreachability,
    unpredictability
  • hourly, daily, monthly ticks
  • sorting, exporting to Excel
  • Also made some high statistics ping measurements
    & compared with Surveyor
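
Since Surveyor reports one-way delays and ping reports round trip times, one way to compare them is to add the two one-way delays for the same interval and correlate the sum with the ping RTT. The sketch below assumes that approach for illustration; the helper names and the sample numbers are hypothetical and are not taken from the Surveyor or PingER data.

```python
from math import sqrt


def round_trip_from_one_way(forward_ms, reverse_ms):
    """Add matching forward and reverse one-way delays to get a round trip."""
    if len(forward_ms) != len(reverse_ms):
        raise ValueError("series must have the same length")
    return [f + r for f, r in zip(forward_ms, reverse_ms)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


if __name__ == "__main__":
    # Made-up hourly medians in ms, for illustration only.
    forward = [30.0, 32.0, 35.0, 31.0]
    reverse = [31.0, 33.0, 36.0, 30.0]
    ping_rtt = [62.0, 66.0, 72.0, 62.0]
    summed = round_trip_from_one_way(forward, reverse)
    print(pearson(summed, ping_rtt))
```

A constant offset between the two series (for example host processing time included in the ping RTT) does not change the correlation, so the coefficient checks agreement in shape rather than in absolute value.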

21
Surveyor vs. ping RTT

22
Surveyor vs. ping

23
Surveyor vs. Ping Correlation

24
PingER & Surveyor

25
PingER vs. Surveyor

26
PingER - Surveyor Complementarity
  • Agree well
  • Surveyor has one way measurements, PingER only
    round-trip
  • Surveyor: dedicated platforms & strong central
    management
  • experience with PingER shows this has benefits.
  • PingER more parsimonious/lightweight (bandwidth,
    disk space, cpu)
  • better for poor connectivity sites - e.g. Russia,
    China
  • but necessarily less accurate, especially at small
    (hourly) time resolution on low loss links.
  • PingER good for looking at long term trends &
    grouping, where statistics are less of a problem.
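
The accuracy point above is a sampling effect: with few probes per interval, a low loss rate cannot be resolved. The sketch below treats each probe as an independent trial, which is a simplifying assumption, and the probe counts in the example are hypothetical, not PingER's actual schedule.

```python
from math import sqrt


def loss_estimate(sent, lost):
    """Return (loss fraction, binomial standard error) for one interval."""
    if sent <= 0 or not 0 <= lost <= sent:
        raise ValueError("need sent > 0 and 0 <= lost <= sent")
    p = lost / sent
    return p, sqrt(p * (1 - p) / sent)


def smallest_nonzero_loss(sent):
    """Smallest loss fraction that can be observed with `sent` probes."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 1 / sent


if __name__ == "__main__":
    # Hypothetical counts: 20 probes in an hour vs. 14400 over a month.
    for sent in (20, 14400):
        print(sent, "probes: resolution", smallest_nonzero_loss(sent))
    print(loss_estimate(14400, 144))
```

Under these assumptions an hourly interval with 20 probes cannot report any loss between 0 and 5 percent, whereas a monthly aggregate of the same link resolves loss well below 1 percent, which is why long term trends and grouping are less affected.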

27
Work in progress 1/2
  • Random scheduling of pings (in beta at 2 sites)
  • Recording more information (in production at
    XIWT)
  • Flexibility in choice of packet sizes,
    frequencies (tailor to bandwidth between pair)
  • Look for ICMP rate limiting signatures
  • Install RIPE engine at SLAC
  • Correlate AMP data (AMP running at SLAC)
  • Gather historical route information for PingER
  • Calibrate ping jitter against VoIP jitter
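
Random scheduling of pings can be sketched by drawing the gaps between probes from a random distribution instead of using a fixed period. The slides do not say which distribution is used; the sketch below assumes exponentially distributed gaps (Poisson sampling), and the function name and parameters are hypothetical.

```python
import random


def poisson_schedule(mean_interval_s, duration_s, seed=None):
    """Return probe start times (seconds) within [0, duration_s).

    Gaps between probes are exponentially distributed with the given
    mean, so the probes form a Poisson process (assumed scheme).
    """
    if mean_interval_s <= 0 or duration_s <= 0:
        raise ValueError("mean_interval_s and duration_s must be positive")
    rng = random.Random(seed)  # private generator, reproducible with a seed
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(1.0 / mean_interval_s)
        if t >= duration_s:
            return times
        times.append(t)


if __name__ == "__main__":
    # Illustrative values: on average one probe every 30 minutes for a day.
    schedule = poisson_schedule(1800, 86400, seed=1)
    print(len(schedule), "probes; first few:", schedule[:3])
```

Randomized gaps avoid locking the measurement onto periodic network behavior that a fixed schedule could either always hit or always miss.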

28
Work in Progress 2/2
  • Calibrate using ping to measure QoS effects
  • setting up QoS testbed between SLAC & LBNL
  • Other possibilities
  • Thinking about extending framework to other
    apps, e.g. following IPMP work, TCP/UDP echo,
    http (CERN interested, XIWT also interested,
    there are also commercial tools, but expensive)
  • Generate alerts (HEPNRC)

29
Monitoring Conclusions
  • Performance is improving
  • ESnet & vBNS/Internet 2 are well configured & provide
    good service within & between their nets
  • Performance within A&R networks is generally good
  • Minimize ISPs crossed, peering critical
  • Intercontinental performance is poor to bad
  • Today need headroom, or managed bandwidth, QoS in
    future
  • End users need monitoring to know what to expect,
    write SLAs, set baselines, ID problems, plan

30
More Information & extra info follows
  • WAN Monitoring at SLAC has lots of links
  • http://www.slac.stanford.edu/comp/net/wan-mon.html
  • Tutorial on WAN Monitoring (including methods,
    RTT, jitter, loss QoS thresholds etc.)
  • http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html
  • PingER History tables
  • http://www.slac.stanford.edu//xorg/iepm/pinger/table.html
  • Internet Monitoring in the HEP Community,
    SLAC-PUB-7961, presented at CHEP98, Chicago,
    Aug-98
  • http://www.slac.stanford.edu/pubs/slacpubs/7000/slac-pub-7961.html