Title: An Introduction to ESnet and its Services
1An Introduction to ESnet and its Services
- William E. Johnston, ESnet Manager and Senior Scientist
- Michael S. Collins, Stan Kluz, Joseph Burrescia, and James V. Gagliardi, ESnet Leads
- and the ESnet Team
2Outline
- Foreword
- ESnet science drivers
- 30 second tutorial on networking
- ESnet physical and logical infrastructure
- Not just one network
- Services for science collaboration
- ESnet is fairly unique
- ESnet is complex in several dimensions
- Operating critical science mission infrastructure
- Asset management
- Future directions
- Conclusions
3ESnet
- An infrastructure that is critical to DOE's science mission and that serves all of DOE
- Focused on the Office of Science Labs
- Complex and specialized, both in the network engineering and the network management
- You can't go out and buy this: ESnet integrates commercial products and in-house software into a complex management system for operating the net
- You can't go out and take a class in how to run this sort of network: it is specialized and is learned from experience
- Extremely reliable in several dimensions
- ESnet has functioned flawlessly during the current turmoil
4Stakeholders
- DOE MICS Office, ESnet program
- ESnet Steering Committee (ESSC)
- represents the Science Offices (strategic needs)
- ESnet Coordinating Committee (ESCC)
- site representatives (operational issues)
- Users
- Mostly DOE Office of Science
- NNSA / Defense Programs
- DOE collaborators
- A few others (e.g. the NSF LIGO site)
6Several Workshops Have Solicited Input from the Science Community
August 13-15, 2002
DOE Organizing Committee: Mary Anne Scott (Chair), Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, Vicky White
Workshop Panel Chairs: Ray Bair and Deb Agarwal; Bill Johnston and Mike Wilde; Rick Stevens; Ian Foster and Dennis Gannon; Linda Winkler and Brian Tierney; Sandy Merola and Charlie Catlett
- Focused on science drivers for
- Advanced Infrastructure
- Middleware Research
- Network Research
- Network Provisioning Model
- Network Governance Model
7Eight Major DOE Science Areas Analyzed at the
August 2002 Workshop
9ESnet Architecture and Terminology
Applications
Application-level transport (HTTP, FTP, Telnet, etc.)
TCP: Internet reliable transport protocol
UDP: Internet unreliable transport protocol
RTP, Group Communication, etc.
IP (addressing, routing, and basic packet-based data transport, 64 Kilobyte max packet size)
- ATM: Asynchronous Transfer Mode
- 53 Byte cells, 5B header, 48B data payload
- IP packets fragmented into ATM data payloads
- ATM cells are routed between ATM switches
- Physical layer is mostly local optical fiber or SONET
- Ethernet
- IP encapsulated in Ethernet packets for transport of IP packets
- Physical layer is local area copper wire (twisted pair) or local or wide area optical fabric
- Frame size 1200 Bytes to 9000 Bytes
Packet-Over-SONET: IP encapsulated in SONET frames for transport on optical fabric (frame size KB to 10s of KBs)
Telecomm SONET
ESnet
Dense Wave Division Multiplexed (DWDM / lambda) optical fabric (e.g. the Qwest/ESnet OC48/OC192 ring is 2 lambdas: one receive channel and one transmit channel)
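The ATM cell figures above (53-byte cells, 5-byte header, 48-byte payload) imply a fixed framing overhead whenever an IP packet is split into cells. The following is a minimal sketch of that arithmetic. It assumes the packet is simply cut into 48-byte payloads and ignores any adaptation-layer trailer or padding rules, which the slide does not describe. The function names are illustrative only.

```python
import math

# Cell dimensions as stated on the slide.
ATM_CELL_BYTES = 53
ATM_HEADER_BYTES = 5
ATM_PAYLOAD_BYTES = ATM_CELL_BYTES - ATM_HEADER_BYTES  # 48


def cells_for_packet(packet_bytes: int) -> int:
    """Number of cells needed to carry one IP packet (assumption: payload only,
    no adaptation-layer trailer)."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    return math.ceil(packet_bytes / ATM_PAYLOAD_BYTES)


def wire_bytes(packet_bytes: int) -> int:
    """Total bytes sent on the wire, counting every cell in full."""
    return cells_for_packet(packet_bytes) * ATM_CELL_BYTES


def efficiency(packet_bytes: int) -> float:
    """Fraction of transmitted bytes that are IP packet data."""
    return packet_bytes / wire_bytes(packet_bytes)


if __name__ == "__main__":
    for size in (48, 49, 1500):
        print(size, cells_for_packet(size), wire_bytes(size), round(efficiency(size), 3))
```

Under these assumptions a 1500-byte packet needs 32 cells, and a packet one byte larger than a cell payload (49 bytes) needs two full cells, which is why small packets are carried least efficiently.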
10Terminology
- Network bandwidth is typically given in bits/second
- E.g. 1 Gigabit/sec (Gb/s) is 1000 Megabits/sec
- Data transport rates are typically given in Bytes/month
- E.g. 1 Terabyte/month is 1,000,000 Megabytes/month
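Because bandwidth is quoted in bits/second and transport volume in Bytes/month, it can help to convert between the two. The sketch below assumes decimal units (1 Terabyte = 10^12 Bytes, consistent with the example above) and a 30-day month. Both are assumptions, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
BITS_PER_BYTE = 8


def terabytes_per_month(bits_per_second: float, days: int = 30) -> float:
    """Volume moved in one month by a link running continuously at the given rate."""
    bytes_moved = bits_per_second / BITS_PER_BYTE * SECONDS_PER_DAY * days
    return bytes_moved / 1e12


def average_bits_per_second(bytes_per_month: float, days: int = 30) -> float:
    """Average rate needed to move the given monthly volume."""
    return bytes_per_month * BITS_PER_BYTE / (SECONDS_PER_DAY * days)


if __name__ == "__main__":
    # A fully loaded 1 Gb/s link for 30 days.
    print(terabytes_per_month(1e9))
    # Average rate corresponding to 1 Terabyte/month.
    print(average_bits_per_second(1e12))
```

With these assumptions, a saturated 1 Gb/s link moves about 324 Terabytes in a month, and 1 Terabyte/month corresponds to an average of roughly 3 Mb/s.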
12ESnet IP
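[Network map: ESnet IP backbone and site connections; map labels include Japan and QWEST ATM]
Link legend: International (high speed); OC192 (10 Gb/s optical); OC48 (2.5 Gb/s optical); Gigabit Ethernet (1 Gb/s); OC12 ATM (622 Mb/s); OC12; OC3 (155 Mb/s); T3 (45 Mb/s); T1-T3; T1 (1 Mb/s)
42 end user sites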
13ESnet Site Architecture
[Diagram: backbone ring, hubs, local loops, and site connections]
Hubs (backbone routers and local loop connection points): New York (AOA), Chicago (CHI), Washington, DC (DC), Atlanta (ATL), Sunnyvale (SNV), El Paso (ELP)
The Hubs have lots of connections (42 in all)
Backbone (optical fiber ring)
Local loop (Hub to local site)
Diagram labels: ESnet responsibility; Site responsibility; ESnet border router; DMZ; Site gateway router; Site LAN; Site
14While There is One Backbone Provider, There are Many Local Loop Providers to Get to the Sites
[Map labels: NY-NAP, QWEST ATM, LBNL/CalRen2, GTN, DOE-NNSA, PANTEX]
Local loop legend: Qwest Contracted; Touch America Contracted/Owned; MCI Contracted/Owned; Site Contracted/Owned
15ESnet Logical Infrastructure Connects the DOE Community With its Collaborators
ESnet connects to most universities via high-speed Abilene peering points. There are many commercial peers (logical network connections through a fairly small number of physical connections) because there are lots of commercial nets.
16ESnet Has Experienced Exponential Growth Since
1992
Annual growth in the past five years has
increased from 1.7x annually to just over 2.0x
annually.
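An annual growth factor translates directly into a doubling time, which is often easier to reason about when planning capacity. The sketch below works this out for the two factors quoted above (1.7x and 2.0x per year). It assumes smooth exponential growth at a constant factor, which is a simplification, and the function names are illustrative.

```python
import math


def doubling_time_months(annual_factor: float) -> float:
    """Months for traffic to double at a constant annual growth factor."""
    if annual_factor <= 1.0:
        raise ValueError("annual_factor must be greater than 1")
    return 12.0 * math.log(2.0) / math.log(annual_factor)


def projected_traffic(current: float, annual_factor: float, years: float) -> float:
    """Traffic after the given number of years, in the same units as `current`."""
    return current * annual_factor ** years


if __name__ == "__main__":
    for factor in (1.7, 2.0):
        print(factor, round(doubling_time_months(factor), 1))
    # Relative traffic after five years at 2.0x per year.
    print(projected_traffic(1.0, 2.0, 5))
```

At 1.7x per year traffic doubles roughly every 15.7 months. At 2.0x per year it doubles every 12 months, which compounds to a 32-fold increase over five years.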
17Who Generates Traffic, and Where Does it Go?
ESnet Inter-Sector Traffic Summary, Jan 2003
[Figure: traffic flows between DOE sites, ESnet, and the peering points (Commercial, R&E, International). ESnet ingress traffic is shown in green and ESnet egress traffic in blue; traffic between sectors is given as a percentage of total ingress or egress traffic. Percentage labels in the figure: 72, 21, 14, 17, 25, 10, 53, 9, 4]
DOE is a net supplier of data because DOE facilities are used by university and commercial researchers, as well as by DOE researchers
DOE collaborator traffic, inc. data
ESnet Appropriate Use Policy (AUP)
- All ESnet traffic must originate and/or terminate on an ESnet site (no transit traffic is allowed)
- E.g. a commercial site cannot exchange traffic with an international site across ESnet
- This is effected via routing restrictions
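The no-transit rule in the Appropriate Use Policy can be stated as a simple predicate: a flow is acceptable only if at least one endpoint is an ESnet site. The sketch below illustrates the rule only. ESnet enforces it through routing restrictions, not application code, and the sector names here are hypothetical labels chosen to mirror the traffic summary.

```python
from enum import Enum


class Sector(Enum):
    # Hypothetical labels for the endpoints of a traffic flow.
    ESNET_SITE = "ESnet site"
    COMMERCIAL = "commercial"
    RESEARCH_EDUCATION = "research and education"
    INTERNATIONAL = "international"


def aup_permits(source: Sector, destination: Sector) -> bool:
    """True if the flow originates and/or terminates on an ESnet site."""
    return Sector.ESNET_SITE in (source, destination)


if __name__ == "__main__":
    print(aup_permits(Sector.ESNET_SITE, Sector.INTERNATIONAL))   # allowed
    print(aup_permits(Sector.COMMERCIAL, Sector.INTERNATIONAL))   # transit, not allowed
```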
18ESnet Has a Service Compact With its Users
- Low and relatively constant latency for packet delivery is essential for the smooth functioning of distributed applications
- The network core is engineered for less than 50 ms average latency, and less than 150 ms when the network partitions
[Figure: network latency over time] Re-routes are mostly due to scheduled maintenance. The graph also indicates the latency if the ring partitions in various places (this graph actually shows an unusually high number of Qwest maintenance outages)
20ESnet is Not Just One Network
- Part of the complexity of ESnet is that it is actually four networks
- The IP Internet (IPv4) network that most people see (as described above)
- SecureNet that serves the NNSA / Defense Programs Labs (encrypted, encapsulated ATM)
- IPv6 network backbone (next generation Internet protocol network)
- IP Multicast
- Each of these uses a different routing and/or addressing mechanism
21SecureNet
- SecureNet connects 9 NNSA (Defense Programs) sites (a 10th site at HQ is being added)
- The NNSA sites exchange encrypted ATM traffic
- The data is unclassified when ESnet gets it because it is encrypted before it leaves the NNSA sites with an NSA-certified encrypter
- Runs over the ESnet core backbone as a layer 2 overlay: that is, the SecureNet encrypted ATM is transported over ESnet's Packet-Over-SONET infrastructure by encapsulating the ATM in a special protocol (MPLS)
22SecureNet Mid 2003
[Diagram: primary and backup SecureNet paths]
Hubs: AOA-HUB, CHI-HUB, SNV-HUB, DC-HUB, ATL-HUB, ELP-HUB
Sites: LLNL, SNLL, ORNL, KCP, DOE-AL, Pantex, LANL, SNLA, SRS
SecureNet encapsulates payload-encrypted ATM in MPLS using the Juniper router Circuit Cross Connect (CCC) feature.
23IPv6-ESnet Backbone
[Diagram: IPv6 ESnet backbone. Peering labels: 9 peers, 18 peers, 6 peers, 7 peers. Node labels: BNL, StarLight, StarTap, Distributed 6TAP, PAIX, LBL, Chicago, Sunnyvale, New York, ANL, FNAL, DC, Albuquerque, Atlanta, SLAC, El Paso]
- IPv6 is the next generation Internet protocol, and ESnet is working on addressing deployment issues
- one big improvement is that while IPv4 has 32-bit addresses (about 4x10^9, which we are running short of), IPv6 has 128-bit addresses (about 3.4x10^38, which we are not ever likely to run short of)
- another big improvement is native support for encryption of data
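The size of each address space follows directly from the address width: an n-bit address allows 2^n distinct values. IPv4 addresses are 32 bits wide and IPv6 addresses are 128 bits wide. The short sketch below computes both counts and cross-checks them against Python's standard `ipaddress` module. The helper name `address_count` is illustrative.

```python
import ipaddress

IPV4_BITS = 32
IPV6_BITS = 128


def address_count(bits: int) -> int:
    """Number of distinct addresses representable in the given number of bits."""
    return 2 ** bits


if __name__ == "__main__":
    print(f"IPv4: {address_count(IPV4_BITS):.2e}")
    print(f"IPv6: {address_count(IPV6_BITS):.2e}")

    # Cross-check against the standard library's view of the full address spaces.
    print(ipaddress.ip_network("0.0.0.0/0").num_addresses == address_count(IPV4_BITS))
    print(ipaddress.ip_network("::/0").num_addresses == address_count(IPV6_BITS))
```

These counts are upper bounds. In both protocols some address ranges are reserved, so the usable space is somewhat smaller.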
25Services for Science Collaboration
- Seamless voice, video, and data teleconferencing is important for geographically dispersed collaborators
- ESnet currently provides voice conferencing, videoconferencing (H.320/ISDN scheduled, H.323/IP ad-hoc), and data collaboration services to more than a thousand DOE researchers worldwide
- Heavily used services, averaging around
- 4600 port hours per month for H.320 videoconferences
- 2000 port hours per month for audio conferences
- 1100 port hours per month for H.323
- approximately 200 port hours per month for data conferencing
26Voice, Video, and Data Collaboration
- Web-based registration and scheduling for all of these services
- authorizes users efficiently
- lets them schedule meetings
- Such an automated approach is essential for a scalable service: ESnet staff could never handle all of the reservations manually
27Public Key Infrastructure
- Digital identity certificates issued by the ESnet DOEGrids CA are essential for the trust management needed for cross-site resource sharing (e.g. international HEP collaborations)
- The rapidly expanding customer base of this service will soon make it ESnet's largest collaboration service by customer count
28Services for Science Collaboration
- Public Key Infrastructure to support cross-site, cross-organization, and international trust relationships that permit sharing computing and data resources
- Digital identity certificates for people, hosts, and services: an essential core service for Grid middleware
- provides formal and verified trust management: an essential service for widely distributed heterogeneous collaboration, e.g. in the International High Energy Physics community
- Policy Management Authority negotiates and manages the formal trust instrument (Certificate Policy, CP)
- Certificate Authority (CA) validates users against the CP and issues digital identity certs.
- Certificate Revocation Lists are provided
- This service was the basis of the first routine sharing of HEP computing resources between the US and Europe
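The division of roles described above (a policy that says who may be certified, a CA that issues certificates against that policy, and a revocation list that relying sites consult) can be illustrated with a toy model. The sketch below is not the DOEGrids implementation and performs no cryptography. The class names, the use of a simple set of approved subjects as a stand-in for the Certificate Policy, and the example subject strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Certificate:
    serial: int
    subject: str
    not_after: date  # last day on which the certificate is valid


@dataclass
class ToyCA:
    # Stand-in for the Certificate Policy: subjects the policy allows.
    approved_subjects: set
    next_serial: int = 1
    revoked_serials: set = field(default_factory=set)  # toy revocation list

    def issue(self, subject: str, not_after: date) -> Certificate:
        """Validate the subject against the policy, then issue a certificate."""
        if subject not in self.approved_subjects:
            raise PermissionError(f"{subject!r} is not permitted by the policy")
        cert = Certificate(self.next_serial, subject, not_after)
        self.next_serial += 1
        return cert

    def revoke(self, cert: Certificate) -> None:
        """Add the certificate's serial number to the revocation list."""
        self.revoked_serials.add(cert.serial)

    def is_trusted(self, cert: Certificate, today: date) -> bool:
        """A relying site accepts a cert only if it is unexpired and not revoked."""
        return cert.serial not in self.revoked_serials and today <= cert.not_after


if __name__ == "__main__":
    ca = ToyCA(approved_subjects={"alice@site-a.example", "host1.site-b.example"})
    cert = ca.issue("alice@site-a.example", date(2004, 12, 31))
    print(ca.is_trusted(cert, date(2004, 1, 1)))   # valid
    ca.revoke(cert)
    print(ca.is_trusted(cert, date(2004, 1, 1)))   # revoked
```

A real deployment would use signed X.509 certificates and published revocation lists. The point here is only the order of checks that a relying site performs.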
30ESnet is Different from a Commercial ISP or University Network
- A fairly small number of very high bandwidth sites (commercial ISPs have thousands of low b/w sites)
- Runs SecureNet as an overlay network
- Provides direct support of DOE science through various collaboration services
- ESnet owns all network trouble tickets (even from end users) until they are resolved
- one stop shopping for user network problems
32ESnet is Complex: There are 6 Databases for the State of the Network and Several More for Performance
Topology
Performance
OSPF Metrics
SecureNet
Hub Configuration
IBGP Mesh
Engineering Web Site Maps and Diagrams: all are clickable, allowing drill-down to the finest levels of detail of the underlying databases
33Drill Down into the Performance DB to Every
Physical and Logical Interface level for Every
Router
Real-time monitoring of traffic levels of some
4400 network entities is one of the primary
network diagnosis tools
- 1 and 2 min, 2 hr, and daily averages
- hours to months of historical data kept
on-line
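Keeping averages at several time scales (minutes, hours, days) amounts to grouping the same raw samples into buckets of different widths. The sketch below shows one simple way to do that. It assumes samples arrive as (timestamp in seconds, value) pairs. The slide does not describe how the real monitoring system collects or stores its data, so this is an illustration of the idea and not of that system.

```python
from collections import defaultdict


def interval_averages(samples, interval_seconds):
    """Average samples within fixed-width time buckets.

    samples: iterable of (timestamp_seconds, value) pairs.
    Returns a dict mapping each bucket's start time to the mean value in it.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in samples:
        bucket_start = (timestamp // interval_seconds) * interval_seconds
        sums[bucket_start] += value
        counts[bucket_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


if __name__ == "__main__":
    # Hypothetical traffic readings (Mb/s) taken every 30 seconds.
    readings = [(0, 10), (30, 20), (60, 30), (90, 50)]
    print(interval_averages(readings, 60))    # 1-minute averages
    print(interval_averages(readings, 120))   # 2-minute averages
```

The same function applied with 7200-second or 86400-second buckets would give the 2-hour and daily views.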
34Drill Down into the Topology DB to Operating
Characteristics of Every Device
e.g. inlet, hot-point, and exhaust cooling air
temperature
35Drill Down into the Hub Configuration DB for Every Wire Connection
Equipment rack detail at AOA, NYC Hub (one of the
core optical ring sites)
36The Hub Configuration Database
Equipment wiring detail for one module at the
AOA, NYC Hub (this particular module allows
remote power cycling of all of the equipment)
37ESnet Equipment at Qwest 32 AofA HUB, NYC, NY ($1.8M, list)
- Qwest DS3 DCX
- Sentry power 48v 30/60 amp panel ($3900 list)
- AOA Performance Tester ($4800 list)
- Sentry power 48v 10/25 amp panel ($3350 list)
- DC / AC Converter ($2200 list)
- Cisco 7206 AOA-AR1 (links to MIT and PPPL), low speed links ($38,150 list)
- Lightwave Secure Terminal Server ($4800 list)
- Juniper T320 AOA-CR1 (Core RTR) ($1,133,000 list)
- Juniper OC48 Optical Ring Interface (the AOA end of the OC48 to DC-HUB) ($65,000 list)
- Juniper OC192 Optical Ring Interface (the AOA end of the OC192 to CHI) ($195,000 list)
- Juniper M20 AOA-PR1 (peering RTR) ($353,000 list)
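The itemized figures on this slide can be checked against the quoted hub total of about 1.8M. The sketch below sums them, assuming the numbers in parentheses are list prices in US dollars. The Qwest DS3 DCX has no figure on the slide and is left out.

```python
# (item, list price) pairs as given on the slide; prices assumed to be US dollars.
AOA_HUB_EQUIPMENT = [
    ("Sentry power 48v 30/60 amp panel", 3_900),
    ("AOA Performance Tester", 4_800),
    ("Sentry power 48v 10/25 amp panel", 3_350),
    ("DC / AC Converter", 2_200),
    ("Cisco 7206 AOA-AR1", 38_150),
    ("Lightwave Secure Terminal Server", 4_800),
    ("Juniper T320 AOA-CR1 (core router)", 1_133_000),
    ("Juniper OC48 Optical Ring Interface", 65_000),
    ("Juniper OC192 Optical Ring Interface", 195_000),
    ("Juniper M20 AOA-PR1 (peering router)", 353_000),
]


def total_list_price(equipment) -> int:
    """Sum of the list prices of all itemized equipment."""
    return sum(price for _, price in equipment)


if __name__ == "__main__":
    total = total_list_price(AOA_HUB_EQUIPMENT)
    print(f"Total: {total:,}")
    # Share of the total accounted for by the core router alone.
    print(f"Core router share: {1_133_000 / total:.0%}")
```

The itemized prices add up to 1,803,200, consistent with the rounded total on the slide. The core router accounts for well over half of it.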
39Operating Science Mission Critical Infrastructure
- ESnet is a visible and critical piece of DOE science infrastructure
- if ESnet fails, 10s of thousands of DOE and University users know it within minutes if not seconds
- Requires high reliability and high operational security in the systems that are integral to the operation and management of the network
- Secure and redundant mail and Web systems are central to the operation and security of ESnet
- trouble tickets are by email
- engineering communication by email
- engineering database interface is via Web
- Secure network access to Hub equipment
- Backup secure telephony access to Hub equipment
- 24x7 help desk (joint with NERSC)
- 24x7 on-call network engineer
40Disaster Recovery and Stability
- The network operational services must be kept available even if, e.g., the West coast is disabled by a massive earthquake, etc.
- Network engineers in four locations across the country
- Full and partial engineering databases and network operational service replicas in three locations
- Telephone modem backup access to all hub equipment
41Disaster Recovery and Stability
[Diagram: locations of engineers and replicated operational services. Labels: Remote Engineer; Engineers; Spectrum (net mgmt); Eng Srvr; Load Srvr; Config Srvr; Public Web; E-mail; DNS]
- All core network hubs are co-located in commercial telecommunication facilities with backup power
- ESnet backbone operated without interruption through the N. Calif. blackout, the 9/11 attacks, and the 9/03 NE States blackout
42Maintaining Science Mission Critical Infrastructure in the Face of Attack
- A Phased Security Architecture is being implemented to protect the network and the sites
- The phased response ranges from blocking certain site traffic to a complete isolation of the network, which allows the sites to continue communicating among themselves in the face of the most virulent attacks
- Separate ESnet core routing functionality from our external Internet connections by means of a peering router that can have a policy different from the core routers
- Provide a rate-limited path to the external Internet that will ensure site-to-site communication during an external denial of service attack
- Allow for Lifeline connectivity that allows downloading of patches, exchange of e-mail, and viewing web pages (i.e. e-mail, dns, http, https, ssh, etc.) with the external Internet prior to full isolation of the network
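The phased response described above can be pictured as a small policy function: as the phase escalates, less external traffic is admitted, while site-to-site traffic keeps flowing. The sketch below is an illustration only. The phase names and their ordering are hypothetical, the lifeline service list copies only the services named on the slide (which ends in "etc."), and the real architecture is implemented in router policy, not in application code.

```python
from enum import IntEnum


class Phase(IntEnum):
    # Hypothetical phase names, ordered from least to most restrictive.
    NORMAL = 0
    BLOCK_SITE_TRAFFIC = 1
    LIFELINE_ONLY = 2
    FULL_ISOLATION = 3


# Services named on the slide as lifeline connectivity.
LIFELINE_SERVICES = frozenset({"e-mail", "dns", "http", "https", "ssh"})


def external_traffic_allowed(phase, service, site, blocked_sites=frozenset()):
    """Decide whether traffic between a site and the external Internet passes."""
    if phase == Phase.FULL_ISOLATION:
        return False
    if site in blocked_sites and phase >= Phase.BLOCK_SITE_TRAFFIC:
        return False
    if phase == Phase.LIFELINE_ONLY:
        return service in LIFELINE_SERVICES
    return True


def site_to_site_allowed(phase):
    """Sites keep communicating among themselves in every phase, including isolation."""
    return True


if __name__ == "__main__":
    print(external_traffic_allowed(Phase.LIFELINE_ONLY, "https", "site-a"))
    print(external_traffic_allowed(Phase.LIFELINE_ONLY, "ftp", "site-a"))
    print(external_traffic_allowed(Phase.FULL_ISOLATION, "https", "site-a"))
    print(site_to_site_allowed(Phase.FULL_ISOLATION))
```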
43ESnet WAN Security and Cybersecurity
- ESnet security for its own network equipment is provided by
- secure access to devices
- patching router operating systems
- confidentiality of configuration data, etc.
44ESnet WAN Security and Cybersecurity
- Cybersecurity is a new dimension of ESnet security
- Security is now inherently a global problem
- As the entity with a global view of the network, ESnet has an important role in overall security
30 minutes after the Sapphire/Slammer worm was released, 75,000 hosts running Microsoft's SQL Server were infected. ("The Spread of the Sapphire/Slammer Worm," David Moore (CAIDA & UCSD CSE), Vern Paxson (ICIR & LBNL), Stefan Savage (UCSD CSE), Colleen Shannon (CAIDA), Stuart Staniford (Silicon Defense), Nicholas Weaver (Silicon Defense & UC Berkeley EECS), http://www.cs.berkeley.edu/nweaver/sapphire)
45ESnet and Cybersecurity
Sapphire/Slammer worm infection hits creating
almost a Gb/s traffic spike on the ESnet backbone
46ESnet and Cybersecurity
- ESnet protects itself and other sites: infected ESnet sites can be blocked, partially or completely
- ESnet can also come to the aid of an ESnet site with temporary filters on incoming traffic, etc., if necessary
- This is one of the very few areas where ESnet might participate directly in site security
- Request must come from Site Coordinator
- Not a substitute for good site security
47ESnet and Cybersecurity
Sapphire/Slammer worm infection hits at approximately 9:30 PM PST, Friday night, 25 Jan 03
ESnet applies filters at both the hub and the site to block attack
[Figure: ESnet-site border router traffic. Annotations: Slammer traffic to site; attacks coming from the site (blocked at hub); site responds]
49Asset Management
- ESnet Asset Management System tracks all ESnet network and computing equipment throughout the country
- Approximately 270 assets at 50 locations in the US are tracked in a Remedy database
- Cradle-to-grave asset movement tracking
- Received equipment is documented in Sunflower (LBL property database) and Remedy
- LBL shipping documents created electronically
- All assets tracked through carriers' tracking systems
- Set up and monitor Return Merchandise Authorizations with vendors
- Surplusing
50Asset Management
E.g. first 4 locations of 50 (from Remedy database)
E.g. AOA Hub
52Potential Future Capabilities are Continually
Investigated
- One of the biggest current problems is upgrading
the site local loops so that sites are not
starved for bandwidth into the backbone
circuit capacity
53Potential Future Capabilities are Continually
Investigated
- Optical Metropolitan Area Networks (MANs) are being investigated in the SF Bay Area and Chicago areas as an alternative to expensive local carriers for site connections
- An optical fiber ring is purchased or leased from a fiber provider that can reach major sites (e.g. SLAC, LLNL, SNLL, LBNL, and NERSC in the SF Bay Area; FNAL, ANL, and Starlight in the Chicago area)
- A single connection is made from the ESnet core ring to the local ring, which avoids local telecomm carriers
- Probably only feasible in major metropolitan areas
54Conclusions
- ESnet is an infrastructure that is critical to DOE's science mission and that serves all of DOE
- Focused on the Office of Science Labs
- Complex and specialized, both in the network engineering and the network management
- You can't go out and buy this: ESnet integrates commercial products and in-house software into a complex management system for operating the net
- You can't go out and take a class in how to run this sort of network: it is specialized and learned from experience
- Extremely reliable in several dimensions
55The ESnet Team
William E. Johnston: ESnet Manager
Mike Collins: Network Engineering Services Group Lead
Stan Kluz: Network Technical Services Group Leader
Gizella Kapus: Acting Project Administrator
KaRynn Kelly: ESnet Administrator
56The ESnet Team: Network Engineering Services Group (NESG)
Joe Burrescia: Deputy NESG lead Multicast Spectrum Performance Centers
Yvonne Hines: Peering Coordinator DNS Assignments V4,V6 Addressing Publishing Stats
Kevin Oberman: DNS Management Config Management Eng Tools ESnet LAN support
Chin Guok: Statistics Metrics Performance Centers MRTG Web Servers Eng Tools
Joe Metzger: Eng Web Servers Eng Data Base Dashboard Eng Tools
Mike O'Connor: Multicast Spectrum Eng Tools ESnet LAN support
57The ESnet Team: Network Technical Services Group (NTSG)
John Paul Jones: Network On call ESnet LAN
Mark Redman: Network On call Config. Lab
Chris Cavallo: Network On call Assets mgt
Jim Gagliardi: Network Support Team lead Network On call
Clint Wadsworth: Network On call collaboration
Dan Peterson: Network On call, MSWin security
Scott Mason: Assets mgt, Windows
58The ESnet Team: Unix, Database, and Collaboration Support
John Webster: UNIX team leader
Don Varner: UNIX, AFS, security
Ken Pon: UNIX, security
Mike Pihlman: Collaboration
Roberto Morelli: Systems Design
Marcy Kamps: Web, Oracle, Remedy
59The ESnet Team: PKI Project
Tony Genovese: Project Lead
Mike Helm: Security Architect
Dhivakaran Muruganantham (Dhiva): Software Engineer