Reliability in SuperJANET 5 - PowerPoint PPT Presentation

1
Reliability in SuperJANET 5
  • Roland Trice
  • SuperJANET 5 Project
  • r.trice_at_ulcc.ac.uk

2
Introduction
  • Background to SJ5 project
  • Reliability issues
  • How might reliability be improved?
  • Funding
  • Reliability consultation
  • Homework

3
Background to SJ5 project
  • Requirements analysis: Jeremy Sharp
  • Views from the community on
  • Reliability
  • Services
  • Bandwidth
  • Applications
  • Etc etc etc
  • Investigations
  • Carrier class routers
  • Reliability: Roland Trice
  • Transmission technology: Henry Hughes
  • Architecture: Duncan Rogerson

4
Background to SJ5 project
  • Reporting in December
  • Network Strategy Workshop
  • JCN, JISC
  • Basis for funding provision
  • No CSR this time around
  • Can't assume how much or over what period

5
Background to SJ5 project
  • If funding is released
  • Procure new bandwidth in 2004
  • Procure routers in 2005
  • Deploy in 2005
  • Hopefully before clearing
  • Must be by end of 2005, when SJ4 contract ends

6
Background to SJ5 project
  • Key features
  • Reliability and availability
  • Multiple service streams
  • End-end delivery
  • http://www.ja.net/SJ5

7
Work in progress
  • Meeting with UK MANs
  • Investigate ways in which UKERNA and RNOs can
    collaborate to increase reliability
  • Reliability questionnaire
  • Published on SJ5 web page
  • Responses coming in
  • The more the better
  • Informal investigations
  • Telco market
  • Carrier class routers

8
Reliability
  • Seen as key feature of new network
  • JANET necessary for core business activities
  • End-end reliability needed for emerging services

9
What do we mean by reliability?
  • Maximum availability
  • Works first time, every time
  • Minimum disruption
  • Everyone hates down time
  • Predictable performance
  • Any time of the day
  • Any application

10
What do we mean by reliability?
  • Everything works for everyone
  • From e-mail and FTP to voice and video
  • Big institutions down to small colleges
  • Web browsing to petabyte file transfers
  • Difficult to provide with one-size-fits-all
    network

11
What has SJ4 achieved?
  • A huge improvement on SJ III
  • No user-affecting faults on core
  • Few RN access link failures
  • Improvement plans to rectify problems
  • No congestion anywhere on the backbone
  • Into regional networks
  • To UK ISPs
  • To Europe
  • To the Internet

12
What has SJ4 achieved?
  • Freedom to develop new services
  • Not continually fire fighting
  • New services bring new problems
  • Market test
  • Needed to prove someone would sell us
  • 16 times more bandwidth
  • Flexible contract
  • Upgrade path
  • Improvement plans
  • Development infrastructure

13
You're never alone with schizophrenia
  • Network must be reliable
  • SLA failure will displease funding bodies
  • Failure of credibility with user community
  • We need to develop new services
  • Need to add value where ISPs can't
  • Failure in credibility if we don't

14
Maximise Reliability
  • Choose stable OS and keep to it
  • Deploy ultra-reliable hardware
  • Fault-tolerant leased lines
  • Eliminate single points of failure
  • Strict change control
  • Avoid complexity
  • Keep your sticky mitts off it!
  • At the Core, RN and institution

15
Issues-1
  • Choose stable OS and keep to it
  • Need to deploy new features
  • Mix of features causes instability
  • Deploy ultra-reliable equipment
  • Cost
  • Current generation of routers inadequate
  • Not carrier class

16
Issues-2
  • Fault-tolerant leased lines
  • Cost
  • Diverse routing vs diversity of supplier
  • Sub-contracting
  • Duct swapping
  • Eliminate single points of failure
  • Costly
  • Adds complexity

17
Issues-3
  • Avoid complexity
  • Resilience and/or more adventurous services all
    increase complexity
  • Keep your sticky mitts off it!
  • Need to develop new services
  • At-risk periods are unavoidable
  • Strict change control
  • Does slow down rate of change
  • Temptation to cut corners

18
Issues-4
  • Change is difficult to implement on an
    operational network
  • Multicast, QoS, etc. taking months or years to
    roll out
  • Problems during deployment will affect
    reliability

19
Squaring the circle
  • High reliability and leading edge facilities are
    mutually exclusive on a one-size-fits-all network
    infrastructure
  • Possibility that JANET fails to deliver the
    required reliability and the desired innovation
    unless the infrastructure is radically different

20
Multiple independent services
  • Commodity best-effort IP service
  • Stable
  • Always there
  • Minimal at-risk time
  • Stability, stability, stability

21
Multiple independent services
  • Offer greater range of leading edge facilities
  • Early adoption of emerging technology
  • Special application environment
  • Guaranteed latency and jitter
  • Very high bandwidth
  • Managed bandwidth

22
Multiple independent services
  • Will almost certainly need more attention
  • More frequent at-risk work
  • Less stability
  • Trickle down
  • New services can be brought into production
  • Once all the bugs have been ironed out

23
Multiple independent services
  • Not just about reliability
  • Allows greater exploitation of the network
  • Allows support of a diverse community
  • Needs of the many are not out-weighed by the
    needs of the few or the one
  • Needs of the few or the one are not stifled by
    the needs of the many

24
Provision
  • MPLS
  • Tag switching of different services
  • Still relies on one-size-fits-all network
  • Complex
  • Untried in our community
  • Not flexible enough
  • May have a place if no other way can be found
  • Potential use in legacy networks

25
Provision
  • Multiple virtual links from a Telco
  • Lambda or SDH services
  • Use optical muxes to apportion bandwidth
  • Could provide raw bandwidth without IP
  • Very high data rates available
  • 10 Gbit/s now
  • 40 Gbit/s shortly
  • Possibly by the time we procure
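
To put these line rates in context against the petabyte file transfers mentioned earlier, here is a rough sketch of how long such a transfer would take. It assumes a decimal petabyte (10^15 bytes), a fully dedicated link and no protocol overhead; the function name and those assumptions are illustrative and do not come from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(size_bytes: float, rate_bits_per_second: float) -> float:
    """Idealised transfer time in days: the whole link is used, with no overhead."""
    if rate_bits_per_second <= 0:
        raise ValueError("rate must be positive")
    size_bits = size_bytes * 8
    return size_bits / rate_bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    one_petabyte = 1e15  # assumption: decimal petabyte
    for gbits in (10, 40):
        days = transfer_days(one_petabyte, gbits * 1e9)
        print(f"{gbits} Gbit/s: about {days:.1f} days per petabyte")
```

Even at 10 Gbit/s a single petabyte occupies a link for more than a week, which suggests why such traffic sits awkwardly on a shared commodity service.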

26
Provision
  • Dark or managed fibre
  • As above, but we own and operate the fibre and
    optical kit
  • Long term cost benefit
  • If we can get away from 5 year procurement cycle
  • Absolute control
  • Maintenance issues

27
Routers, The Next Generation
  • NG routers: 99.999% availability
  • Better hardware
  • More reliable
  • Very high capacity
  • Cunning features
  • Caching of routing and forwarding
  • ATM-like QoS control
  • Virtual Routers
  • Modular software
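
The five-nines availability figure quoted above is easier to judge when converted into permitted downtime. The sketch below does that conversion. It assumes a 365-day year and treats availability as a simple fraction of total time; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def annual_downtime_minutes(availability_percent: float) -> float:
    """Downtime per year, in minutes, implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for percent in (99.9, 99.99, 99.999):
        minutes = annual_downtime_minutes(percent)
        print(f"{percent}% available: {minutes:.1f} minutes down per year")
```

On these assumptions, five nines allows a little over five minutes of downtime a year, against nearly nine hours at three nines.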

28
In the real world
  • Core network very reliable, access links are the
    weakest link
  • Telco fault handling remains poor
  • Lies, damn lies and the art of parking
  • Duplication of links
  • RN to JANET
  • UKERNA and RNOs will look at resilience
  • Institution to RN
  • Institutions need to talk to their RNO

29
In the real world
  • Institutions will have to fund duplicate links
  • More effective than shouting at a Telco
  • Institutional power problems still a significant
    issue
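
The case for duplicate links can be illustrated with the standard parallel-availability calculation, sketched below. It assumes the links fail independently. That assumption is exactly what shared ducts or a common sub-contractor would undermine, so the result is an upper bound. The 99% per-link figure is hypothetical.

```python
def parallel_availability(link_availabilities):
    """Availability of a set of redundant links, assuming independent failures.

    The service is down only when every link is down at the same time.
    """
    all_down = 1.0
    for availability in link_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - availability
    return 1.0 - all_down


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one access link
    combined = parallel_availability([single, single])
    print(f"one link: {single:.4f}, two independent links: {combined:.4f}")
```

Two independent links at 99% each give roughly 99.99% combined. If both links share a duct, a single cable cut takes out both and the benefit largely disappears.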

30
Funding, funding, funding
  • Everyone says they want more reliability
  • Not a free good
  • Who wants to pay for more reliability?
  • Ah well, err maybe in a few months.
  • Need to put a price on reliability
  • Cost to institution of a loss of JANET service
  • Indicate spending that mitigates risk
  • LAN or WAN or both or none? 
  • Results could influence the spend if not the
    purse
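
One way to put a price on reliability is to compare the expected annual cost of outages with the cost of the spending that reduces them. The sketch below shows the arithmetic. Every figure in it is hypothetical and the function names are illustrative; real numbers would have to come from each institution's own risk analysis.

```python
def expected_annual_outage_cost(outage_hours_per_year: float,
                                cost_per_hour: float) -> float:
    """Expected yearly cost to an institution of losing network service."""
    return outage_hours_per_year * cost_per_hour


def net_annual_benefit(hours_before: float, hours_after: float,
                       cost_per_hour: float,
                       annual_mitigation_cost: float) -> float:
    """Outage cost avoided by a mitigation, minus what the mitigation costs."""
    avoided = (expected_annual_outage_cost(hours_before, cost_per_hour)
               - expected_annual_outage_cost(hours_after, cost_per_hour))
    return avoided - annual_mitigation_cost


if __name__ == "__main__":
    # Hypothetical: a duplicate link cuts outages from 8 hours to 1 hour a year,
    # an hour of downtime costs 5,000, and the link costs 20,000 a year.
    benefit = net_annual_benefit(8, 1, 5_000, 20_000)
    print(f"net annual benefit: {benefit:,.0f}")
```

A positive result suggests the spend pays for itself; a negative one suggests the money may be better spent elsewhere, for example on the LAN rather than the WAN.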

31
Reliability consultation
  • Available now on SJ5 web page
  • To determine how important JANET actually is to
    institutions and to users
  • To recommend ways in which reliability may be
    improved in a scalable and affordable way
  • Cost benefit analysis to maximize value for money
  • Responses by mid November please

32
Key questions
  • Dependence on the network
  • How tolerant is your institution to outages?
  • Have you identified a cost of network down time?
  • Most responses don't give a figure
  • Investment priority
  • LAN - WAN
  • Risk Analysis
  • Resilience
  • Spending

33
Key questions
  • Balance between reliability and development
  • Regional Network Reliability
  • At-Risk sessions
  • Institutions' risk analysis

34
Homework
  • Talk to your institution, UKERNA and your RNO
  • Respond to the questionnaire
  • If possible, with an institutional remit
  • Highly desirable
  • Risk analysis
  • Cost implications
  • Better reliability comes at a price
  • Cost benefit analysis
  • Smart spending

35
Questions