Title: IEPM/PingER: Internet End-to-end Performance Monitoring and the PingER project
1IEPM/PingER: Internet End-to-end Performance Monitoring and the PingER project
- Warren Matthews and Les Cottrell (SLAC)
- National Collaboratory Middleware and Network Research Project Review, ANL
- August 18-20, 2003
2Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
3History
- Ping End-to-end Reporting
- Began early 1995
- Monitor network performance to sites collaborating with SLAC
- ESnet Network Monitoring Task Force (NMTF)
- Extended to several DoE labs, strong support from FNAL
- 1997 ICFA created Network Task Force
- PingER spreads worldwide
- Funded by DOE/MICS
4Recently
- In 2001, extended PingER to include bandwidth testing
- IEPM-BW
- End-to-end user perception for high performance bulk-transfer
- Iperf, bbftp, GridFTP
- Heavy network impact compared to lightweight PingER
5Current Status
- PingER funding is under Thomas Ndousse
- DoE/MICS funding runs out at end of year
- Continues to be extremely useful
- Most recently began working with ICTP/eJDS to quantify the Digital Divide
- MAGGIE proposal to develop/extend high performance monitoring (with PSC, ICIR, LBNL)
6Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
7PingER
- Mature, Successful
- Widely used in HENP
- Utilization has been extended beyond HENP
- EDG, IAEA, XIWT
- ICTP/eJDS
- Many others
- Continues to be extended to meet new needs
- Better visualization, web services access to data
8PingER Methodology
- Simple ping monitoring
- 1 ping to prime caches
- Send, size
- Default is 10x100 Byte pkts, 10x1000 Byte pkts
- Record ping packet loss and RTT
- Derive unreachability, quiescence, unpredictability, jitter, TCP throughput
- Also out-of-order packets, duplicate packets
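The recording step above (packet loss and RTT per batch of pings) can be sketched in a few lines. This is a minimal illustration, not the PingER code: the function name `summarize_pings`, the dictionary keys, and the sample RTTs are hypothetical, and the jitter shown (mean absolute difference between consecutive RTTs) is one common definition chosen for the example, not necessarily the one PingER uses.

```python
from statistics import mean


def summarize_pings(rtts_ms, sent):
    """Summarize one batch of pings (hypothetical helper).

    rtts_ms: RTTs in milliseconds for the replies received, in arrival order.
    sent:    number of echo requests sent in the batch.
    """
    received = len(rtts_ms)
    summary = {
        "loss_fraction": (sent - received) / sent,
        # Assumption for this sketch: a batch with no replies counts as unreachable.
        "unreachable": received == 0,
    }
    if received:
        summary["min_rtt_ms"] = min(rtts_ms)
        summary["avg_rtt_ms"] = mean(rtts_ms)
        summary["max_rtt_ms"] = max(rtts_ms)
    if received >= 2:
        # Illustrative jitter: mean absolute difference of consecutive RTTs.
        diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
        summary["jitter_ms"] = mean(diffs)
    return summary


# Made-up sample: 10 pings sent, 9 replies received.
batch = summarize_pings(
    [160.0, 162.0, 161.0, 170.0, 160.0, 161.0, 163.0, 160.0, 161.0], sent=10
)
print(batch)
```

Running one such summary per packet size (100 and 1000 bytes by default) would give the per-batch records from which longer-term metrics can be derived.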
9PingER Software
- Monitoring
- Analysis
- Visualization
- Available from SLAC/FNAL websites
- Package
10Using PingER
- Since 1995
- Troubleshooting
- Identify Sites to Upgrade
- Choosing a provider
- Setting expectations for VoIP
- Routing Choices for multihomed sites
- Compare with http, ftp
- Strong correlation
11PingER Deployment
- Currently 36 monitoring sites in 14 countries
- 473 target sites in 79 countries
- 99% of the world's on-line population
- Most extensive end-to-end active R&E network monitoring worldwide
- Special BaBar, PPDG, Digital Divide, etc. groups and pages
12End-to-end Monitoring
- In reality, most projects monitor end-to-end performance
- End host effects are unavoidable
- Internet2 end-to-end Performance Initiative (e2epi) has recognized this
- Most useful to users.
13IEPM-BW
- Throughput Monitoring
- Traceroute
- Iperf (quick iperf), BBftp, BBCP (mem and disk)
- ABWE (available bandwidth)
- GridFTP, UDPMON
- Web100
- Netflow
- Analysis
14IEPM-BW Deployment
- Currently 10 monitoring sites
- SLAC, FNAL, GATech (SOX)
- INFN (Milan), NIKHEF, APAN (Japan)
- UMich, Internet2 (Ann Arbor)
- UManchester, UCL (UK)
- 50 unique target sites
15Using IEPM-BW
- Usual
- Baselines
- Troubleshooting
- Setting expectations
- Also on both testbeds and production nets
- Compare measurement tools (ping vs ABwE vs iperf/quick iperf vs bbcp vs GridFTP vs tsunami)
- Compare advanced TCP stacks
- Eliminate need for multiple streams
- Look at non TCP bulk transfer
16Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
17Examples
- Long term trends
- Short term glitches
- Troubleshooting
- Upgrades
- Vacations
- Peering
18(Figure annotations: 2 Mbps, vacation, multiple OC12s)
Traffic on ESnet has doubled every year
19(No Transcript)
20(Figure panels: To North America, To Western Europe)
- TEN-155 became operational on December 11.
- Smurf filters installed on NORDUnet's US connection.
21(No Transcript)
22(No Transcript)
23Traffic
Typically, Internet traffic is 70% http
24Conclusions
- Establish that layer 3 connectivity exists
- Iperf vs Quick iperf
- BBftp vs BBCP => implementation
- Iperf vs BBftp => CPU, disk
- Packet loss < 0.1%
- TCP/IP must be tuned on high-speed long delay paths
- Web100/Net100
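Why TCP needs tuning on high-speed, long-delay paths can be seen from the bandwidth-delay product: the sender must keep roughly bandwidth x RTT bytes in flight to fill the path. The sketch below is illustrative only; the function names are hypothetical, and the link speed (about 622 Mbit/s, an OC12), the 150 ms RTT, and the 64 KiB window are assumed example values, not measurements from this talk.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to fill a path (bandwidth-delay product)."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s


# Assumed example path: ~622 Mbit/s with a 150 ms round-trip time.
needed_window = bdp_bytes(622e6, 0.150)

# Assumed untuned window of 64 KiB on the same path.
untuned_rate = window_limited_bps(64 * 1024, 0.150)

print(f"window needed: {needed_window / 1e6:.1f} MB")
print(f"rate with a 64 KiB window: {untuned_rate / 1e6:.1f} Mbit/s")
```

Under these assumptions the path needs a window of several megabytes, while a 64 KiB window caps a single stream at a few Mbit/s, which is why window tuning (or multiple streams) matters on such paths.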
25eJDS
- PingER continues to be useful
- Recently joined with electronic journal distribution service (eJDS)
- Distribute physics journals to members around the world
- Particularly concerned with quantifying the Digital Divide
26Limitations
- ICMP
- Do not monitor routers
- Rate limiting
- Blocking is common, especially in developing countries
- However, study indicates low impact from rate limiting
- Scheduling with cron
27Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
28Comparison to Other Projects
- Surveyor
- RIPE
- AMP
- NIMI
- SCNM
- XIWT
- NetPhysics
29Comparisons
- Typically results were closely correlated.
- Often tools complement each other and combined provide insight into network behaviour.
- Derived throughput from equation of Mathis et al (BW = MSS/(RTT*sqrt(loss))) shows good agreement
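As a worked example of the derived-throughput estimate, the sketch below evaluates MSS/(RTT*sqrt(loss)) for one set of inputs. The function name and the input values (1460-byte MSS, 160 ms RTT, 0.1% loss) are assumptions for illustration; the form used here also omits the order-one constant factor that appears in the Mathis et al. derivation, as the slide's version does.

```python
from math import sqrt


def mathis_throughput_bps(mss_bytes, rtt_s, loss_fraction):
    """Estimate TCP throughput in bit/s as MSS / (RTT * sqrt(loss)).

    The estimate is undefined for zero loss or zero RTT, so those are rejected.
    """
    if rtt_s <= 0 or loss_fraction <= 0:
        raise ValueError("rtt_s and loss_fraction must be positive")
    return mss_bytes * 8 / (rtt_s * sqrt(loss_fraction))


# Assumed inputs: 1460-byte MSS, 160 ms RTT, 0.1% packet loss.
bw = mathis_throughput_bps(1460, 0.160, 0.001)
print(f"{bw / 1e6:.2f} Mbit/s")
```

With these inputs the estimate comes out at roughly 2.3 Mbit/s. Because loss enters as a square root, quadrupling the loss rate halves the estimate, while halving the RTT doubles it.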
30Publishing
- Network Performance information is critical to the Grid vision
- Application steering
- Working with GGF/NMWG PPDG
- Monitoring data is available as prototype Web Service
- OGSI Grid service under development
31Internet2 PIPES
- E2e pi
- PIPES infrastructure
- IEPM-BW Job manager
- MAGGIE Analysis Engine
32Available Bandwidth Estimator (ABwE)
- Tool under development by SLAC/Rice
- Part of the DoE/SCIDAC INCITE project
- Light weight
- 60 packets in 1 second
- Iperf 35,000 packets/s for 10-20 seconds
- No need to tune windows/streams
- Replace iperf in test engine
- FreeBSD version created for Abilene Backbone
Measurement Infrastructure
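The "light weight" claim can be made concrete with the packet counts quoted above. This is simple arithmetic on the slide's own figures (60 packets per ABwE measurement; iperf at 35,000 packets/s for 10-20 seconds); the function name is hypothetical.

```python
ABWE_PACKETS = 60          # packets per ABwE measurement (from the slide)
IPERF_RATE_PPS = 35_000    # iperf packets per second (from the slide)


def iperf_to_abwe_packet_ratio(duration_s):
    """How many times more packets an iperf run sends than one ABwE measurement."""
    return IPERF_RATE_PPS * duration_s / ABWE_PACKETS


for duration_s in (10, 20):
    ratio = iperf_to_abwe_packet_ratio(duration_s)
    print(f"{duration_s} s iperf run: about {ratio:,.0f}x the packets of ABwE")
```

On these figures a single iperf run injects several thousand times as many packets as one ABwE measurement.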
33Quick Iperf
- Iperf is the tool of choice for many admins.
- Considered accurate but intrusive.
- Errors due to long slow start
- Use web100 to detect end of slow start. Modify iperf client. Web100 required on client only.
- Measurement within 10%
- Save 94% time, 92% traffic
34PingER-6
- SLAC has native IPv6 service from ESnet
- PingER ported to IPv6
- Monitoring started in November 1999
- 41 Sites in 10 countries
- edu/ac., net/net., com/co.
35PingER -vs- PingER6
(Figure: RTT between SLAC and Purdue in Nov and Dec 1999; series: IPv6, IPv4)
36Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
37MAGGIE
- Need to further develop IEPM-BW
- On-demand measurements, visualization, automated troubleshooting
- Measurement and Analysis for the Global Grid and Internet End-to-end performance
- A secure, scalable measurement infrastructure providing measurement, analysis and access to data.
38MAGGIE
(Diagram of MAGGIE components and partners)
- Components: IEPM-BW Measurement Engine, NIMI Security and scheduling, Fault Finding Analysis Engine, Publishing, Other tools
- Related projects and tools: Akenti, NWS, AMP, NMWG, RIPE, SCIDAC
- Sites: SLAC, PSC, ICIR, FNAL, ANL, LBNL, UCL
39Overview
- A little History
- Evaluate the progress
- Assess the Value
- Interactions with other projects
- Elements that should be added
- Summary
40Meeting the Objectives (1/4): Evaluate the Progress
- The problem: The user cannot assume the network will be there.
- Even if it is, the user cannot assume it will perform to their expectation.
- The vision (realized): PingER has set expectations, provides data for troubleshooting, provides data for research. Continues to be useful.
- A unique contribution: Probably the largest monitoring project in the world. IEPM-BW comparing tools, leveraging other efforts.
41Meeting the Objectives (2/4): Assess the Value
- PingER is widely used and continues to be useful.
- Goals get more ambitious
- Challenges remain
42Meeting the Objectives (3/4): Interactions Across Projects
- Long history of involvement in other projects
- HENP, ESnet, Grid, High Performance, ICFA-SCIC
- Friends, colleagues and contacts throughout the world (Other worlds coming soon)
- Bright future for MAGGIE.
43Meeting the Objectives (4/4): Assess the Integrated Impact
- The contribution to the big picture by IEPM-PingER and IEPM-BW, and especially the need for MAGGIE, has been summarized by Mary-Anne and Thomas
- But they may not have known it
44The Big Picture
All of the National Collaboratory and Network
Research projects have specific goals and
objectives, but all of you involved in those
projects are also part of a much larger, longer
term effort, namely creating an infrastructure
that will enable geographically separated
scientists to effectively work together as a team
and that will facilitate remote access to both
facilities and data. -Mary-Anne and Thomas
45Toward a Monitoring Infrastructure
- Certainly the need
- DOE Science Community, SCIDAC Testbed
- Grid, Large Scale Networking
- Troubleshooting / E2Epi
- Many of the ingredients
- Many monitoring projects
- Many tools
- PIPES, MAGGIE (Cross domain)
46Summary
- Unfortunately, network management research has historically been very under-funded, because it is difficult to get funding bodies to recognize this as legitimate networking research.
- Sally Floyd
- IAB Concerns and Recommendations Regarding Internet Research and Evolution.
- http://www.ietf.org/internet-drafts/draft-iab-research-funding-00.txt
47Links
- Accompanying paper
- IEPM-BW Home
- 7 papers and 35 talks in the last 12 months
- ABwE
- RIPE-TT
- E2E PI
- GGF NMWG
- AMP TroubleShooting
- Quick Iperf
48Credits
- Connie Logg, Jerrod Williams (SLAC), Jiri Navratil (CESnet/SLAC), David Martin, Frank Nagy, Al Thomas, Maxim Grigoriev (FNAL), Fabrizio Coccetti (INFN/SLAC).
- Brian Tierney, Eric Boyd, Jeff Boote, Matt Zekauskas, Matt Mathis, Russ Hobby, Vern Paxson, Andy Adams, kc Claffy, Iosif Legrand, Ajay Tirumala, Tom Dunigan.
- Local admins and other volunteers
- DoE/MICS