Title: Henry Starzynski
1 Henry Starzynski, Network Operations Support, Global Network Mgmt Centre, Bell Canada
January 2015
2 - Henry Starzynski, Manager, Global Network Management Centre
- Graduated from the University of Waterloo in 1982 with a Bachelor of Mathematics (Computer Science)
- After graduation, worked for a computer time-sharing company called Datacrown, which became Canada Systems Group, then SHL Systemhouse, now CGI
- I've been with Bell 30.5 years! (Yes, there have been LOTS of changes since I started!)
- Started out working on network design tools for services called Datapac and Megastream
- Moved to our network management centre taking care of Datapac, managing the 7/24 console, then Frame Relay (Hyperstream) support
- Today, I continue with legacy network support, PLUS bring in new business for our centre, support our computers (PCs) and handle international escalations
- I have a life outside of Bell too! I'm involved in the local community with Scouts Canada, so when you are free of university life, don't forget to be involved in your community as well. You have lots of energy and knowledge that can help make local communities, wherever you end up, much better!
- Don't forget, when you leave Carleton, learning never ever stops! Keep your brains active: technology is continually changing
3 Bell Canada's GNMC
- GNMC: Global Network Management Centre
- One of the world's first Data Network Management Centres
- Operating locally in Ottawa, serving Bell Canada customers globally
4 Bell Canada GNMC
- A bit about who we are
- Involved in managing data networks in Canada since 1974, globally since 1992
- Originally the National Data Network Control (NDNC) for domestic (Canada only) core data networks: Dataroute, Datapac (packet switching), Megastream (Pt-Pt T1), Hyperstream (frame relay), Canadian ATM Gateway networks
- Expanded to include private networks (Lotto Quebec) and VPN clouds
- Started internationally with the Financial Networks Associates (FNA, a consortium of 8 countries) network in 1991 (Alcatel-based network)
- Evolved into Global Network Management (GNMC) at the individual customer circuit level
- Today, we serve as International Help Desk/SPOC (single point of contact) for international data circuit troubles going OUT of Canada (with the exception of Canadian government circuits, which are handled by a separate group)
5 Bell Canada GNMC
- Main Focus Areas
- Single Point of Contact (SPOC) for international customer data circuits
- VPN Managed Services (MPLS) and support of private or virtual private network clouds and routers (LAN)
- Core Network Management (WAN) of legacy data networks (Datapac Packet Switching, Frame Relay)
- Technical Support on existing legacy networks
- Surveillance of 2 major customers' networks internationally
- GNMC is involved in major processes of Network Management
- Fault Management
- Configuration Management (Provisioning)
- Performance Management and Change Management
- Security Management
6 Network Management
- Like any industry, we toss around lots of BUZZ WORDS
- What do all those terms mean??
- WANs
- Clouds
- OSSs
- Network Management
- SPOC
- Why do we do network management and customer management?
- Why is it important?
WELL, let's start: WHAT IS A NETWORK?
7 What is a Network ??
A network means something different to everyone. For example, a network can be:
- LAN (Local), WAN (Wide), MAN (Metropolitan), CAN (Campus) Area networks
- Point-to-point network: connecting two sites regardless of distance
- The CLOUD: the service provider's network, the infrastructure, sometimes termed the Public Network
- The NET: the ubiquitous network
- The PSTN: Public Switched Telephone Network
- Wireless network
- Home network
- A VPN: a Virtual Private Network
- A social network!
- A NETWORK MEANS DIFFERENT THINGS TO DIFFERENT PEOPLE
- BUT whatever your definition, all networks do the same thing!
8 What is a Network ?
- A standard definition of a network we will use is the following
- A set of elements or NODES linked together to provide paths to transmit information (data, voice, video) from one location to another.
- A critical tool which allows businesses and people to operate and communicate
- When it is all boiled down, all information is data, and it travels over a network.
- Successful networks are managed
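The definition above (a set of NODES linked together to provide paths) maps naturally onto a graph. A minimal Python sketch, assuming a made-up five-site topology; the site links and the `find_path` helper are purely illustrative and are not taken from any Bell system:

```python
from collections import deque

# Hypothetical topology: each node lists the nodes it has a direct link to.
LINKS = {
    "Ottawa": ["Toronto", "Montreal"],
    "Toronto": ["Ottawa", "Vancouver"],
    "Montreal": ["Ottawa", "Halifax"],
    "Vancouver": ["Toronto"],
    "Halifax": ["Montreal"],
}

def find_path(links, source, destination):
    """Breadth-first search: return a fewest-hops path as a list of nodes, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbour in links.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no path: the two nodes are not connected

print(find_path(LINKS, "Halifax", "Vancouver"))
```

Removing a link from `LINKS` and re-running the search is a quick way to see whether traffic still has a path between two sites.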
9 Examples of Data Networks
- Transport Networks (SONET, DS3, DS1, Fibre, IP core): the BIG infrastructures
- Circuit Switched (Public Switched Telephone Network)
- Dedicated (Point to point)
- Packet/Frame/Cell (legacy services)
- IP (Internet/Intranet)
- Local Area Networks, in the home, office, or around the campus.
- Private (TV, Radio, Financial, Lottery) or Virtual Private Networks (VPNs)
- Wireless
10 Network Characteristics
- Common characteristic of all networks is the transmission of DATA (information, etc.)
- Some type of information (i.e. data) is being transmitted from one person/computer/location to another, for business, pleasure, research, etc.
- In today's world, we take data communications over networks for granted: it is there, reliable, fault tolerant, and it NEVER fails.
- We use it every day, it is part of our daily routines, part of our life!
- We expect connectivity!
11 What then is Network Management and why is it important?
- Network management has 5 main processes
- Fault Management
- Configuration Management (Provisioning)
- Accounting Management
- Performance Management (including Change Management)
- Security Management
- Bruce Deachman, The Ottawa Citizen, Sunday, March 20, 2005
- In 1994, Nicholas Negroponte, founder of MIT's Media Lab, predicted one billion people would be using the Internet by the year 2000. What he failed to point out was that most of them would be trying to get U2 tickets.
- At least that's how it must have felt for countless fans who were unable to snag tickets to the Bono-led Irish rock band's Nov. 25 Corel Centre show yesterday morning, as technology failed to keep pace with overwhelming demand, leaving old-fashioned overnight campers the happiest of all
12 Question! What is the latest estimate of the number of internet users in the world?
13 http://www.internetworldstats.com/stats.htm
15 Blasts from the Past!!
- ROOT CAUSES OF BLACKOUTS AND THEIR REMEDY
- The electric power transmission system of the United States is seriously deficient.
- Experts generally agree that fixing this system to an adequate level would take many years and cost tens of billions of dollars. But the root causes of the recent Blackout of 2003 can be solved in a relatively short time and at a much more reasonable cost.
- The root causes of the present problems are
- A totally outdated reliability philosophy and
- Inadequate real-time monitoring of the transmission grid.
- Isn't the power grid a network too? Of course! Electricity is just a form of data!
16 http://www.speedmatters.org/blog/archive/fcc-verizon-at-fault-for-network-failures-of-2012-derecho/#.UPgdWh1lGQG
In June 2012, large parts of the Midwest and
Middle Atlantic were, without warning, hit by a
destructive rain and windstorm called a derecho.
It left in its wake 22 dead, hundreds of injuries
and millions of people without power or
communications. Today, the FCC released a lengthy
report prepared by its Public Safety and Homeland
Security Bureau that looks at the communications
outages that followed from the derecho, and made
recommendations to avoid or reduce future
failures.
FCC Commissioner issued a statement reinforcing
the findings and recommendations, and commenting
on the service breakdowns: "Tragically, many of
these were avoidable interruptions involving a
lack of back-up power to central offices or
failures of the service providers' monitoring
systems... Carriers should test their networks
and ensure that plans are in place in case of an
emergency. It is time for an honest accounting of
the resiliency of our nation's network
infrastructure in the wireless and digital age."
In computer networking, resiliency is the
ability to provide and maintain an acceptable
level of service in the face of faults and
challenges to normal operation. Network
resilience touches a very wide range of topics.
In order to increase the resilience of a given
communication network, the probable challenges
and risks have to be identified and appropriate
resilience metrics have to be defined for the
service to be protected
17 Why Network Management?
- From a network provider's viewpoint
- Manage network resources equitably to ensure users can establish communications quickly and reliably
- Ensure information is transferred with original quality, integrity, and securely
- Operate a high performance, reliable, cost effective network that meets customer/business/organizational needs and requirements
- Plan and implement measures to prevent or mitigate interruptions of service or degradation
- Make money for the network provider and its shareholders
- Gain market share for the network provider
- At Bell Canada, networks are the building blocks of our own business: they are why we exist!
18 Why Network Management?
- From the customer's viewpoint
- Ensure information is transferred with original quality, integrity, and securely
- Obtain service at best cost/service/value combination
- To ensure a customer's business operates with minimum downtime, in order to meet the requirements of its customers
- Meet regulatory, legal, safety requirements
- For a customer, networks are critical
- For businesses, for their operations.
- For the general public, so we can communicate, get money, do our assignments, talk .. BE CONNECTED
20 Network Management Poses Endless Challenges, by Willie Schatz
If network managers are in accord
about anything, it's that they have a lot more
tasks to do than resources to handle them. The
fundamental roles of a network administrator are
to provide network connections for computer
equipment and to ensure availability and
performance of network communications. But
that's only the beginning. The administrator must
set up and manage hardware and software
solutions, enabling servers, clients, printers
and other peripherals to communicate. He or she
also is responsible for providing users the
highest quality server functionality, which means
uninterrupted, optimum network availability and
performance. This same individual also must plan
so any changes required in the network conform
with changes in the larger enterprise system.
People really think network management is
easier than it really is.
21 Network Management Processes
- There are five processes involved in network management
- Configuration Management / Provisioning
- Programming network elements to communicate with each other and user equipment
- User datafill to make their service functional
- Copying critical (non-default) network provisioning parameters to storage in offline databases
- Ensuring billable parameters/features are updated in related billing systems
- Providing dumps, downloads, or application program interfaces (APIs) to other downstream systems
- Why is Configuration/Provisioning management important?
- Users want their service when it is ordered (on due date)
- Users want to get the options they pay for
- The network provider needs to ensure their service is billed
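The step of copying critical (non-default) provisioning parameters to offline storage can be pictured as a diff against factory defaults. A minimal sketch, assuming a made-up parameter set; `DEFAULTS`, the parameter names, and `non_default_parameters` are hypothetical and do not describe any real provisioning system:

```python
# Hypothetical factory defaults for one network element port.
DEFAULTS = {
    "speed_kbps": 64,
    "window_size": 2,
    "packet_size": 128,
    "billing_class": "standard",
}

def non_default_parameters(provisioned, defaults=DEFAULTS):
    """Return only the settings that differ from the defaults.

    These are the values that could not be recreated from defaults alone,
    so they are the ones worth archiving offline.
    """
    return {key: value for key, value in provisioned.items() if defaults.get(key) != value}

port = {"speed_kbps": 256, "window_size": 2, "packet_size": 128, "billing_class": "premium"}
print(non_default_parameters(port))
```

In this made-up case the archive would hold the speed and the billing class, which is also the information a billing system would need to be told about.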
22 Network Management Processes
- Fault Management / Service Assurance
- Surveillance (proactive): alarms/traps from the network that indicate major problems
- Isolating problems (reactive): when users have troubles
- Having clearly defined escalation procedures: how to prioritize troubles
- Providing customers with timely and honest status on problems: when will it be fixed?
- Performing analysis on failures for trends, root cause
- Service Assurance is .. REAL-TIME surveillance, control, and analysis of a network, with the objective of ensuring maximum use of network resources, particularly when it is under stress due to traffic overload or failure conditions.
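One way to picture "how to prioritize troubles" is a priority queue that always hands the operator the most severe outstanding trouble first. A minimal sketch, assuming three made-up severity levels; the `TroubleQueue` class and its ranking are illustrative only and are not an actual escalation procedure:

```python
import heapq
import itertools

# Hypothetical severity ranking: lower number = handled first.
SEVERITY = {"critical": 0, "major": 1, "minor": 2}

class TroubleQueue:
    """Orders reported troubles by severity, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker for equal severities

    def report(self, severity, description):
        heapq.heappush(self._heap, (SEVERITY[severity], next(self._arrival), description))

    def next_trouble(self):
        """Pop the most urgent outstanding trouble, or None if nothing is waiting."""
        if not self._heap:
            return None
        _, _, description = heapq.heappop(self._heap)
        return description

troubles = TroubleQueue()
troubles.report("minor", "customer reports slow response")
troubles.report("critical", "trunk down alarm")
troubles.report("major", "high error rate on circuit")
print(troubles.next_trouble())
```

The arrival counter keeps two troubles of the same severity in first-come, first-served order.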
24 Network Management Processes
- Performance Management
- Performance measures can be internal (for the provider), regulated (CRTC), or to assist the customer (how is my network performing?)
- Network performance measures (mean time to repair, network availability) are standard metrics used in the industry, and are often the basis for service level agreements
- Customers may require information on their traffic patterns: are they paying for bandwidth they don't require, or is their network overloaded?
- Many customers want guarantees of performance, a Service Level Agreement (SLA), in order to ensure they are getting the performance they pay for.
- An SLA may include the following
- Network Availability
- Frame/Cell/Packet delivery
- Mean time to Repair
- Penalty clauses for non-performance
- Delay metrics
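The SLA items listed above can be checked mechanically once measurements are available. A minimal sketch, assuming invented thresholds and field names; real contracts define their own terms, and `sla_breaches` is a hypothetical helper:

```python
# Hypothetical SLA terms; the numbers are for illustration only.
SLA = {
    "min_availability_pct": 99.9,
    "max_mttr_hours": 4.0,
    "min_delivery_pct": 99.5,
    "max_round_trip_ms": 150.0,
}

def sla_breaches(measured, sla=SLA):
    """Return the names of the SLA terms that the measured values fail to meet."""
    breaches = []
    if measured["availability_pct"] < sla["min_availability_pct"]:
        breaches.append("availability")
    if measured["mttr_hours"] > sla["max_mttr_hours"]:
        breaches.append("mean time to repair")
    if measured["delivery_pct"] < sla["min_delivery_pct"]:
        breaches.append("delivery")
    if measured["round_trip_ms"] > sla["max_round_trip_ms"]:
        breaches.append("delay")
    return breaches

month = {"availability_pct": 99.95, "mttr_hours": 5.5, "delivery_pct": 99.8, "round_trip_ms": 120.0}
print(sla_breaches(month))
```

A non-empty result is what would trigger a penalty clause for non-performance in this simplified picture.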
26 Network Management Processes
- Change Management
- Scheduling downtime / maintenance activities (new software, network upgrades) with users (notification, release or emergency)
- Ensuring software levels are compatible with all network components
- Keeping the customer informed of planned service interruptions is critical
- Networks are in need of periodic maintenance for software or hardware upgrades, etc. In a 7x24 world, unscheduled downtime can mean
- loss of revenue
- legal liability
- threats to public safety.
27 FROM CHANGE MANAGEMENT: PLANNED OUTAGE
Foreign-Tel COMMUNICATIONS Dept. GNMC
Phone: 1-555-868-7883 Fax: 1-555-868-7822
Please respond to the following Email: tcsccip_at_foreigntelcommunications.com
ForeignTel Communications would like to inform you that the Change Management activity will be performed as indicated below
____________________
Outage: POM041793 / POT356369
Your ref.:
Description: DISREGARD OUTAGE NOTICE // THIS IS NOT SERVICE AFFECTING // WE ARE ADDING BACKBONE CAPACITY PORTLAND-SANTA CLARA. DURING THIS PERIOD, NETWORK WILL BE IN HAZARDOUS CONDITION. WALL NOC WILL CLOSELY MONITOR THE NETWORK AND ANY ALARMS ON IT
Scheduled Planned Start Date (UTC): february 16, 2014 15:00:00
Scheduled Planned End Date (UTC): february 24, 2014 03:00:00
28 Related Network Management Activities
- Co-ordination with other Carriers and Agencies.
- No one carrier can route traffic everywhere on the planet. Strategic alliances and co-operation amongst carriers are essential.
- Dynamic Controls.
- Can traffic be rerouted around failures or congestion? Is this automatic or manual?
- Disaster recovery planning.
- Could it happen to you? What would you do in the event of a disaster?
- Security
- Who has access to the network infrastructure? Can it be hacked? Ensuring one customer's data does not go to another customer.
29 Security Management
- The goal of security management is to control access to network resources according to local guidelines so that the network cannot be sabotaged (intentionally or unintentionally) and sensitive information cannot be accessed by those without appropriate authorization.
- Security management subsystems work by partitioning network resources into authorized and unauthorized areas.
- They identify sensitive network resources (including systems, files, and other entities) and determine mappings between sensitive network resources and user sets.
- They also monitor access points to sensitive network resources and log inappropriate access to sensitive network resources.
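The mapping between sensitive resources and user sets, plus the logging of inappropriate access, can be sketched in a few lines. This is a toy model under stated assumptions: the resource names, roles, and the `check_access` helper are all hypothetical, and a real security subsystem would also authenticate users and protect the log itself:

```python
# Hypothetical mapping of sensitive resources to the user sets allowed to reach them.
AUTHORIZED = {
    "router-config": {"noc_operator", "network_engineer"},
    "billing-db": {"billing_admin"},
}

access_log = []  # records of inappropriate access attempts

def check_access(user_role, resource):
    """Allow access only if the role is in the resource's user set; log every refusal."""
    allowed = user_role in AUTHORIZED.get(resource, set())
    if not allowed:
        access_log.append((user_role, resource))
    return allowed

print(check_access("noc_operator", "router-config"))
print(check_access("noc_operator", "billing-db"))
print(access_log)
```

Resources that are not listed at all are treated as unauthorized for everyone, so an unknown resource name is refused and logged rather than silently allowed.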
30 AT&T Customer Info Hacked (by TSC Staff, 8/29/2006 9:05 PM EDT)
AT&T late Tuesday said that hackers broke into a computer system and accessed personal data, including credit card information, from thousands of customers who had purchased DSL equipment from the company's Web store.
Kaspersky says Web hack 'should not have happened' (02/09/2009)
It's the worst thing that can happen to a computer security vendor: This weekend, Moscow's Kaspersky Lab was hacked. A hacker, who identified himself only as Unu, said that he was able to break into a section of the company's brand-new U.S. support Web site by taking advantage of a flaw in the site's programming.
http://www.csoonline.com/article/706400/10-hacks-that-made-headlines
31 Network Management Centre Functions
- 7 x 24 operation: it's more than a buzzword.
- Operations Support Systems for provisioning, change management, surveillance, trouble tracking, customer records
- Subject experts/access to engineering support personnel or labs
- Multiple diverse communications channels
- Situation (War) room
- Secure and Independent Power Supply
- Access to Information Databases
- Contact information for support resources (level 1, 2, 3 support, vendor support)
- Secure location
- Fully redundant backup location
32 When Disaster strikes!
- If something can go wrong .. it will ..
- Ice Storms (1998, 2013)/Hurricane Katrina/Sandy, other natural disasters
- Toronto Simcoe Central Office fire, July 1999
- Power plant failures
- Hackers and viruses (SQL Worm)
- September 11/terrorist attacks
- All of these test the plans of a network provider.
- Are contingency plans in place? Have they been tested or gathered dust for 5 years?
- Is there an escalation chain of command?
- Are there agreements with other suppliers/vendors/competitors?
- What contingencies are in place to get critical services restored as quickly as possible?
- When service is lost, the prime objective, after immediate human safety, is the restoration of service
33 From July 1999
TORONTO - Phones stopped ringing in several major cities in Canada on Friday after an explosion caused a major system failure at a Bell Canada building in Toronto. The failure knocked out phone lines, most cell phones, internet services and bank machines in downtown Toronto. Cantel and digital cell phones appear to be working. Police report 911 emergency systems are working, but the police are urging people to use these systems only for real emergencies. The failure was caused by an explosion on the fourth floor at the downtown Bell centre at around 8:00 am. One person was reportedly injured. Immediately after the explosion, battery powered backup systems kicked in. But they ran out of power a few hours later. The Toronto Stock Exchange is back up and running after it suspended trading briefly but brokerages are having trouble communicating. Phone systems in Ottawa and Montreal and as far away as Halifax and Vancouver have also been affected as calls that are normally routed through Toronto are rerouted through other cities. Bell Canada says it hopes to have services restored by midafternoon.
The Globe and Mail, Published Thursday, Oct. 10 2013, 11:18 AM EDT
Rogers Communications Inc. said a software glitch created a big spike in signalling traffic that caused one of the worst wireless network outages in the company's history. Canada's largest wireless carrier determined that root cause on Thursday, roughly 18 hours after implementing a fix that restored voice and text services for customers across the country.
DISASTERS CAN HAPPEN? How will your network provider handle the trouble?
34 Another aspect of Network Management is Planning
- A carrier will have a plan for a disaster situation, as well as anticipating potential issues
- Examples of planning for potential issues include
- Y2K
- more recently, the change in dates for Daylight Savings Time
- Other various clock rollover issues
- A carrier may also do periodic disaster simulations to test the response of various groups as well as procedures
35 SPOC Function
- What is a SPOC?
- In Bell Canada, the GNMC is the Single Point of Contact (SPOC) for all Fault Management and Change Management between Canadian Help Desks and Test Centres and all the global carriers that Bell uses to provide international reach for our customer circuits
- SPOC for all other carriers to get their issues fixed within Canada
- One door for all trouble management into or out of Canada
- Avoids having many different groups learn the processes for dealing with each of the carriers, or the carriers having to learn about all the various ops centers within Canada
- Provides flexibility to move quickly and customize for customer reasons, with centralized expertise
- As a SPOC, we get to compare service levels provided by different global carriers and use this info to get better performance
36 Operational Support Systems
- Successful network management uses standardized protocols or vendor-specific mechanisms to transmit alarms and commands (e.g. Simple Network Management Protocol)
- Operational control data can be transmitted over conventional data networks, over the same network (in-band), or over another network (out of band).
- The systems which receive alarms and allow for network configuration, troubleshooting, and control are commonly called Operational Support Systems (OSS).
- OSS may be more than 10 times the cost of the network infrastructure!
- OSSs may consist of workstations, databases, network elements, scripts, provisioning systems, security systems, offline databases and billing systems.
- Without a good OSS structure, a great network infrastructure will fail. The network objectives cannot be met without this.
37 Operational Support Systems
- No one OSS does it all; in fact, many OSSs are required, and these must interact with each other. This is typically via Application Program Interfaces (API) or some standard format for information exchange.
- The interaction can be simple or complex. Often, simple format changes in one OSS will impact many other downstream OSSs.
- Remember where the money is spent: not on the network infrastructure, but on the systems that make the network run.
- The following diagram shows a SAMPLE interaction between various systems.
38 Sample Operational Support Systems
[Diagram: sample interaction between OSSs. Systems shown: Order entry/assignment system (customer service orders); Telco local assignment system; Network provisioning system (customer gets service); Network elements; Fault Mgmt OSS; Fault Mgmt/troubleshooting OSSs (Test Centres, NDNC); Trouble ticket system; Alert display (Surveillance Centres); Stats data collection system; Call detail/usage OSS; Billing OSS; Billing system (customer receives bill for service/usage); Change Mgmt. Flows labelled: order info, prov recs, billing recs, billing files, SNMP traps, alerts, customer and assignment dumps (feed other OSSs).]
39 Metrics and Key Performance Indicators
- Each network needs some means of measuring its success, and of seeing where improvement can be made. Public networks may be regulated. Metrics may be stipulated in service level agreements (SLAs) between provider and customer
- To the end user/customer, the most critical metrics are the following
- Mean time to repair (MTTR)
- Network Availability ((Total available time - total downtime) / (Total available time))
- Quality of Service (QOS)
- round trip delay
- Network congestion/blocking
- frame/packet/cell loss
- repeat failures
- To the network provider, the following are important metrics
- Network Availability
- EBITDA (Earnings Before Interest, Taxes, Depreciation and Amortization)
- Cost / Revenue (return on investment)
- Market Share
- Network capacity
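The availability formula above, and MTTR as an average repair time, are simple enough to compute directly. A minimal sketch, assuming a made-up 30-day month with three outages; the function names and outage figures are illustrative only:

```python
def availability(total_hours, downtime_hours):
    """Network availability = (total available time - total downtime) / total available time."""
    return (total_hours - downtime_hours) / total_hours

def mean_time_to_repair(outage_hours):
    """Average repair time over a non-empty list of outage durations, in hours."""
    return sum(outage_hours) / len(outage_hours)

# Hypothetical 30-day month with three outages (durations in hours).
outages = [1.5, 0.5, 4.0]
total = 30 * 24

print(f"availability: {availability(total, sum(outages)):.4%}")
print(f"MTTR: {mean_time_to_repair(outages):.1f} hours")
```

Six hours of downtime in a 720-hour month already pulls availability down to roughly 99.17%, which shows how little downtime a "three nines" (99.9%) target actually allows.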
40 Metrics
- To the shareholder the following are important
- Dividend
- share price
- Return on Investment
42 Summary
- Networks can be simple, or extremely complex and mission critical
- Network quality, reliability, diversity, and low cost are essential
- The operation of a high quality, reliable, cost effective network requires effective Network Management Centre(s), along with skilled people and good support tools (operational support systems)
- As networks continue to evolve, customers will manage more and more of their own networks.
- Challenges for the future include global coverage, scaling for growth, new technologies, telco mergers, acquisitions, failures: an industry always in flux.