Master's Thesis, Mikko Nieminen, Espoo, February 14th, 2006: TROUBLESHOOTING IN LIVE WCDMA NETWORKS. Supervisor: Professor Heikki Hämmäinen

1
Master's Thesis, Mikko Nieminen
Espoo, February 14th, 2006
  • TROUBLESHOOTING IN LIVE WCDMA NETWORKS

Supervisor: Professor Heikki Hämmäinen
2
Background to the Study
  • The number of live WCDMA networks is growing
    quickly.
  • The first commercial Third Generation Partnership
    Project (3GPP) compliant network, J-phone, was
    opened in December 2002.
  • By October 2005, there were 80 live commercial
    WCDMA networks with nearly 40 million
    subscribers. By that time, around 140 licenses
    had been awarded for WCDMA, and the WCDMA
    license holders had more than 500 million
    subscribers in their Second Generation (2G)
    networks.
  • Especially in Europe and Asia, WCDMA network
    deployment, after successful field trials and
    service launches, has entered a new critical
    stage: the phase of network optimisation and
    network troubleshooting.

3
Research Problem
  • As the number of WCDMA subscribers quickly
    increases, operators and equipment vendors are
    facing big challenges in maintaining and
    troubleshooting their networks.
  • How can one efficiently narrow down the root
    causes of problems when there is a huge number
    of subscribers and a large volume of traffic in
    a live WCDMA network?
  • What are the principles for examining fault
    scenarios and breaking the problem investigation
    down into logical, manageable pieces?
  • Which tools and methods are used in practice in
    WCDMA network troubleshooting today?
  • In order to tackle these questions and
    challenges, this Thesis presents a Framework for
    KPI-triggered troubleshooting in live WCDMA
    networks.
  • The applicability of the Framework is
    demonstrated by applying it to a selection of
    real troubleshooting cases that have occurred in
    commercial WCDMA networks.

4
Scope of the Study
  • This study concentrates on KPI-triggered
    problems in live WCDMA networks.
  • In general, faults can be classified into three
    categories:
      • Critical, which are emergency problems that
        require immediate action,
      • Major, which we refer to in this study as
        KPI-triggered problems,
      • Minor, which do not affect the services of
        the network.
  • The viewpoint is that of the equipment vendor.
    The main objective is to create guidelines that
    allow troubleshooting experts and technical
    support personnel of WCDMA network manufacturers
    to troubleshoot and narrow problems down
    following a defined logic.
  • This Thesis mainly concentrates on WCDMA network
    troubleshooting from a Radio Access Network
    perspective. The reasoning behind this approach
    is that the UTRAN covers most of the
    WCDMA-specific functionality and intelligence,
    and therefore also accounts for the majority of
    the troubleshooting challenges.

5
Research Methods
  • This Thesis is mainly based on the study of
    various technical specifications and on
    interviews with WCDMA network troubleshooting
    experts.
  • The main literature sources are the 3GPP
    specifications of release 99, since the majority
    of the live WCDMA networks were based on 3GPP
    release 99 during the writing of this Thesis.
  • 3GPP release 4 networks are currently gaining a
    foothold among live WCDMA networks. However,
    there are only minor differences in the Radio
    Access functionality of these two 3GPP
    specification releases.

6
Structure of the Thesis
  • Introduction to WCDMA Networks
  • UTRAN Protocols
  • Call Trace Analysis
  • Key Performance Indicators
  • Framework for KPI-Triggered Troubleshooting
  • Cases from Live WCDMA Networks

7
WCDMA network architecture
[Figure: WCDMA network architecture. External networks: PSTN, Internet. Core network: GMSC, MSC/VLR, GGSN, SGSN, HLR, AuC, EIR. UTRAN: two RNCs, four Node Bs and eight cells. UE: ME and USIM.]
8
UTRAN architecture
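[Figure: UTRAN architecture. The User Equipment (UE) connects to a Node B over the Uu interface; Node Bs connect to an RNC over Iub; RNCs connect to each other over Iur; the RNC connects to the Core Network (CN) over Iu-CS (3G MSC) and Iu-PS (SGSN).]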
9
UMTS Bearer Services
10
Summary of Protocols (CS user plane)
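Interfaces: Uu (UE to Node B), Iub (Node B to RNC), Iu (RNC to MSC)
Protocol stacks per network element, listed from top to bottom:
  • UE (Uu): CS application and coding, RLC, MAC, WCDMA L1
  • Node B (Uu side): WCDMA L1
  • Node B (Iub side): FP, AAL2, ATM, PDH/SDH
  • RNC (Iub side): RLC, MAC, FP, AAL2, ATM, PDH/SDH
  • RNC (Iu side): Iu-UP protocol, AAL2, ATM, PDH/SDH
  • MSC (Iu): CS application and coding, Iu-UP protocol, AAL2, ATM, PDH/SDH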
11
Summary of Protocols (UE control plane)
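Interfaces: Uu (UE to Node B), Iub (Node B to RNC), Iu (RNC to CN)
Protocol stacks per network element, listed from top to bottom:
  • UE (Uu): NAS, RRC, RLC, MAC, WCDMA L1
  • Node B (Uu side): WCDMA L1
  • Node B (Iub side): FP, AAL2, ATM, PDH/SDH
  • RNC (Iub side): RRC, RLC, MAC, FP, AAL2, ATM, PDH/SDH
  • RNC (Iu side): RANAP, SCCP, MTP3b, SSCF-NNI, SSCOP, AAL5, ATM, PDH/SDH
  • CN (Iu): NAS, RANAP, SCCP, MTP3b, SSCF-NNI, SSCOP, AAL5, ATM, PDH/SDH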
12
Overview of WCDMA Call Setup
  • MO Call: RRC Connection Establishment, Radio Access Bearer Establishment, User Plane Data Flow
  • MT Call: Paging, RRC Connection Establishment, Radio Access Bearer Establishment, User Plane Data Flow
13
RRC connection establishment (DCH)
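Participants: UE, Node B, RNC
1. RRC CONNECTION REQUEST (UE to RNC)
2. Admission Control (RNC)
3. RADIO LINK SETUP REQUEST (RNC to Node B)
4. Start RX (Node B)
5. RADIO LINK SETUP RESPONSE (Node B to RNC)
6. ESTABLISH REQUEST (RNC to Node B)
7. ESTABLISH CONFIRM (Node B to RNC)
8. UPLINK AND DOWNLINK SYNC (FP, between RNC and Node B)
9. Start TX (Node B)
10. RRC CONNECTION SETUP (RNC to UE)
11. L1 SYNCH (UE and Node B)
12. RL RESTORE INDICATION (Node B to RNC)
13. RRC CONNECTION SETUP COMPLETE (UE to RNC)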
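When analysing a call trace, the sequence above can serve as a checklist. The following is a minimal Python sketch, not taken from the thesis: it compares the messages seen in a trace with the expected order and reports the first one that is missing. The function name `first_missing_step` and the sample trace are hypothetical. Steps 2, 4, 8, 9 and 11 are left out on the assumption that they do not appear as named signalling messages in a protocol analyser trace.

```python
# Signalling messages of the RRC connection establishment (DCH), in order.
# Steps 2, 4, 8, 9 and 11 of the slide are omitted (assumption: they are
# not visible as named messages in the trace).
EXPECTED_MESSAGES = [
    "RRC CONNECTION REQUEST",
    "RADIO LINK SETUP REQUEST",
    "RADIO LINK SETUP RESPONSE",
    "ESTABLISH REQUEST",
    "ESTABLISH CONFIRM",
    "RRC CONNECTION SETUP",
    "RL RESTORE INDICATION",
    "RRC CONNECTION SETUP COMPLETE",
]


def first_missing_step(trace):
    """Return the first expected message not found, in order, in the trace.

    Returns None when every expected message appears in the expected order.
    Unrelated messages in the trace are ignored.
    """
    position = 0
    for expected in EXPECTED_MESSAGES:
        try:
            position = trace.index(expected, position) + 1
        except ValueError:
            return expected
    return None


# Hypothetical trace that stops after RRC CONNECTION SETUP has been sent.
sample_trace = [
    "RRC CONNECTION REQUEST",
    "RADIO LINK SETUP REQUEST",
    "RADIO LINK SETUP RESPONSE",
    "ESTABLISH REQUEST",
    "ESTABLISH CONFIRM",
    "RRC CONNECTION SETUP",
]
print(first_missing_step(sample_trace))
```

Knowing the first missing message narrows the investigation to the interface and network element that should have produced it.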
14
Protocol Analysers
Company          Product                      Home Country
Nethawk [47]     3G Analyser                  Finland
Agilent [48]     Signaling Analyzer           United States
Tektronix [49]   K15                          United States
Radcom [50]      Performer Analyser           Israel
Acterna [51]     Telecom Protocol Analyzer    United States
15
RRC Connection Events and KPIs
Participants: UE, RNC, CN
  • Event 1: RRC CONNECTION REQUEST (UE to RNC).
    RRC_CONN_ATT_EST incremented (setup phase).
  • Event 2: RRC CONNECTION SETUP (RNC to UE).
    RRC_CONN_ATT_COMP incremented.
  • Event 3: RRC CONNECTION SETUP COMPLETE (UE to
    RNC). RRC_CONN_ACC_COMP incremented.
  • Event 4: IU RELEASE COMMAND (CN to RNC).
    RRC_CONN_ACT_COMP incremented.

RRC Setup Complete Rate = (Sum of RRC_CONN_STP_COMP / Sum of RRC_CONN_STP_ATT) x 100
RRC Retainability Rate = (Sum of RRC_CONN_ACT_COMP / Sum of RRC_CONN_ACC_COMP) x 100
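The two RRC KPIs can be evaluated directly from counter sums. The sketch below is an illustration only: the counter values are invented rather than taken from a real network, and the helper name `rate` is hypothetical.

```python
def rate(numerator_sum, denominator_sum):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if denominator_sum == 0:
        return None  # no events counted in the measurement period
    return 100.0 * numerator_sum / denominator_sum


# Invented counter sums for one measurement period (not real network data).
counters = {
    "RRC_CONN_STP_ATT": 10000,
    "RRC_CONN_STP_COMP": 9900,
    "RRC_CONN_ACC_COMP": 9800,
    "RRC_CONN_ACT_COMP": 9702,
}

rrc_setup_complete_rate = rate(
    counters["RRC_CONN_STP_COMP"], counters["RRC_CONN_STP_ATT"]
)
rrc_retainability_rate = rate(
    counters["RRC_CONN_ACT_COMP"], counters["RRC_CONN_ACC_COMP"]
)

print(f"RRC Setup Complete Rate: {rrc_setup_complete_rate:.1f} %")
print(f"RRC Retainability Rate: {rrc_retainability_rate:.1f} %")
```

Returning `None` for an empty denominator is a design choice of this sketch: it keeps a period without any attempts from being reported as a 0 % success rate.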
16
RRC connection Phases
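[Figure: RRC connection phases.
  • Setup phase: starts with Attempts and ends with Setup Complete; failures are Setup Failures, Blocking.
  • Access phase: ends with Access Complete; failures are Access Failures.
  • Active phase: ends with Active Complete (success, Active Release); failures are Active Failures (RRC Drop).]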
17
Other WCDMA network KPIs
RAB Setup Complete Rate = (Sum of RAB_STP_COMP / Sum of RAB_STP_ATT) x 100
RAB Establishment Complete Rate = (Sum of RAB_ACC_COMP / Sum of RAB_STP_ATT) x 100
RAB Retainability Rate = (Sum of RAB_ACT_COMP / Sum of RAB_ACC_COMP) x 100
CSSR = (Sum of RAB_ACC_COMP / Sum of RRC_CONN_STP_ATT) x 100
CCSR = (Sum of RAB_ACT_COMP / Sum of RRC_CONN_STP_ATT) x 100
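Because CSSR and CCSR share the same denominator, the formulas on this slide imply that CCSR equals CSSR multiplied by the RAB Retainability Rate and divided by 100. The sketch below checks this relationship with invented counter sums. The function name `kpis` is hypothetical, non-zero denominators are assumed, and the identity follows only from the formulas as listed here.

```python
import math


def kpis(counters):
    """Compute the KPIs of this slide (in per cent) from counter sums."""

    def pct(numerator, denominator):
        return 100.0 * counters[numerator] / counters[denominator]

    return {
        "RAB Setup Complete Rate": pct("RAB_STP_COMP", "RAB_STP_ATT"),
        "RAB Establishment Complete Rate": pct("RAB_ACC_COMP", "RAB_STP_ATT"),
        "RAB Retainability Rate": pct("RAB_ACT_COMP", "RAB_ACC_COMP"),
        "CSSR": pct("RAB_ACC_COMP", "RRC_CONN_STP_ATT"),
        "CCSR": pct("RAB_ACT_COMP", "RRC_CONN_STP_ATT"),
    }


# Invented counter sums (not real network data).
sample = {
    "RRC_CONN_STP_ATT": 10000,
    "RAB_STP_ATT": 9500,
    "RAB_STP_COMP": 9400,
    "RAB_ACC_COMP": 9300,
    "RAB_ACT_COMP": 9114,
}

result = kpis(sample)
for name, value in result.items():
    print(f"{name}: {value:.2f} %")

# CCSR = CSSR x RAB Retainability Rate / 100
derived_ccsr = result["CSSR"] * result["RAB Retainability Rate"] / 100.0
print(math.isclose(result["CCSR"], derived_ccsr))
```

One practical reading of the identity: when CCSR drops while CSSR stays flat, the degradation sits in the active phase of the RAB rather than in call setup.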
18
Fault Classification
Fault Class: A-CRITICAL
  Description: Total or major outages that are not avoidable with a workaround solution. Critical (emergency duty contacted) problems severely affect service, capacity/traffic, billing, and maintenance capabilities and require immediate corrective action, regardless of time of day or day of the week, as viewed by the operator.
  Examples:
  • System restart, all links down
  • Simultaneous restarts of active computer units
  • More than 50 per cent of traffic handling capacity out of use
  • Subscriber related network element functionality is not working

Fault Class: B-MAJOR
  Description: The problem leads to degradation of network performance or the fault affects traffic randomly. Major problems cause conditions that seriously affect system performance, operation, maintenance, and administration and require immediate attention, as viewed by the operator. The urgency is less than in critical situations because of a lesser immediate or impending effect on system performance, customers, and the customer's operation and revenue.
  Examples:
  • Capacity/quality related functionality is not working as it is supposed to
  • Problems seriously affecting end user service, but avoidable with a workaround solution
  • Configuration changes (network, HW, and SW) are not working as they are supposed to
  • Subscriber related functions are not working completely
  • Performance measurement, alarm management or activation of a new feature fails
  • Single restart of computer units

Fault Class: C-MINOR
  Description: Minor fault not affecting operation or service quality. Other problems that the operator does not view as critical or major are considered minor. Minor problems do not significantly impair the functioning of the system or affect the service to customers. These problems are tolerable during system use.
  Examples:
  • Failures not seriously affecting traffic
  • Errors in operating command syntax
  • Cosmetic errors in operational commands or statistics output
  • Minor errors in documentation
19
Framework for KPI-Triggered Troubleshooting
  • The Framework is designed for investigating and
    solving B-MAJOR level, i.e. KPI-triggered,
    faults.
  • Before applying the Framework:
      • The general alarm status of the network has
        been checked. No clear network alarms
        pointing to the root cause of the fault can
        be detected.
      • Traces from the external interfaces of the
        RNC have been taken with a protocol analyser
        in order to record the fault scenario. An
        RNC internal trace has also been taken when
        the fault took place.
      • The basic fault scenario has been analysed
        and clarified.
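The entry conditions above can be read as a simple checklist. A minimal sketch follows, assuming a hypothetical `FaultReport` record and field names; it encodes only the conditions listed on this slide and is not part of the thesis.

```python
from dataclasses import dataclass


@dataclass
class FaultReport:
    """Hypothetical record of what is known before the Framework is applied."""
    fault_class: str                  # "A-CRITICAL", "B-MAJOR" or "C-MINOR"
    alarm_status_checked: bool        # general alarm status has been checked
    alarm_points_to_root_cause: bool  # a clear alarm already explains the fault
    external_traces_taken: bool       # protocol analyser traces from RNC interfaces
    internal_trace_taken: bool        # RNC internal trace from the time of the fault
    scenario_clarified: bool          # basic fault scenario analysed and clarified


def unmet_preconditions(report):
    """Return the conditions that still block use of the Framework."""
    checks = [
        (report.fault_class == "B-MAJOR", "fault is not B-MAJOR (KPI-triggered)"),
        (report.alarm_status_checked, "general alarm status not checked"),
        (not report.alarm_points_to_root_cause, "an alarm already points to the root cause"),
        (report.external_traces_taken, "no protocol analyser traces from RNC external interfaces"),
        (report.internal_trace_taken, "no RNC internal trace from the time of the fault"),
        (report.scenario_clarified, "basic fault scenario not analysed and clarified"),
    ]
    return [message for ok, message in checks if not ok]


# Hypothetical report in which only the RNC internal trace is missing.
report = FaultReport(
    fault_class="B-MAJOR",
    alarm_status_checked=True,
    alarm_points_to_root_cause=False,
    external_traces_taken=True,
    internal_trace_taken=False,
    scenario_clarified=True,
)
print(unmet_preconditions(report))
```

An empty list means all listed preconditions hold and the Framework can be applied.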

20
[Flowchart: Framework for KPI-triggered troubleshooting. Decision points and actions are listed below without their Yes/No branch connections.]
Decision points:
  • Is the problem new in the operator network?
  • New SW, HW, parameters, UE model or feature
    introduced?
  • Perform simulation of the fault in test bed.
    Does the fault still occur?
  • Perform simulation of the fault with reference
    conditions. Does the fault still occur?
  • Is the fault operator specific?
  • Has average network load increased significantly
    and/or does the problem occur at a specific time
    of day?
Actions:
  • Analyse and investigate the differences between
    the working and faulty conditions.
  • Use RNC Performance Tester to generate load in
    test bed and perform analysis.
  • Analyse the traces. Investigate fault scope:
    transmission specific, Node B specific, service
    specific, RNC specific, CN specific, country
    specific or UE specific.
  • Analyse network element and interface specific
    alarms, parameters, capacity, logs and traces.
    Take specific actions depending on problem
    scope (refer to detailed Framework notes).
  • In case of MVI environment, check IOT results
    and contact foreign vendor. Investigate own
    vendor's default parameters and compare the
    implementation against 3GPP specifications.
    Compare own default parameters with the default
    parameters of other vendors. Execute air
    interface protocol analysis and drive tests.
21
Case: Increased AMR call drop rate
  • A decrease in the RAB Retainability Rate KPI for
    the AMR telephony service had been observed over
    the last three months in an operator network.
  • The decrease was around 2% on each RNC compared
    to the time when the network was performing well.
  • Actions that had already been taken with no
    positive effect:
      • Soft reset of all Node Bs and all RNCs
      • Hard reset and re-commissioning of Node Bs
      • Alarms checked and no major alarms found
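A degradation like this one can be spotted by comparing a KPI per RNC with its earlier level. The sketch below flags RNCs whose RAB Retainability Rate has dropped by more than a chosen margin. The RNC names, the KPI values and the threshold of 1.0 percentage points are invented for illustration; the thesis does not specify them.

```python
def degraded_rncs(baseline, current, threshold_pp=1.0):
    """Return {rnc: drop} for RNCs whose KPI fell by more than threshold_pp.

    baseline and current map RNC names to a KPI value in per cent;
    the drop is expressed in percentage points.
    """
    drops = {}
    for rnc, reference in baseline.items():
        if rnc not in current:
            continue  # no current measurement for this RNC
        drop = reference - current[rnc]
        if drop > threshold_pp:
            drops[rnc] = round(drop, 2)
    return drops


# Invented RAB Retainability Rate values (per cent) for three RNCs.
baseline = {"RNC-1": 99.1, "RNC-2": 98.9, "RNC-3": 99.0}
current = {"RNC-1": 97.0, "RNC-2": 96.8, "RNC-3": 98.6}

print(degraded_rncs(baseline, current))
```

A drop of similar size on every RNC, as described in this case, points away from a single faulty network element and towards something shared across the network, such as a software release or a parameter setting.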

22
Case: Increased AMR call drop rate
I. Is the problem new in the operator network? Yes
II. New SW, HW, parameters, UE model or feature introduced? Yes
III. Perform simulation of the fault in reference conditions. Does the fault still occur? No
IV. Analyse and investigate the differences between the working and faulty conditions.
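The path taken in this case can be written down as a small decision table. The sketch below encodes only the branches confirmed by this case (steps I to IV); every other branch of the Framework is deliberately left undefined, and the names `CASE_BRANCHES` and `walk` are hypothetical.

```python
# Each entry maps a question to {answer: next question or final action}.
# Only the branches followed in this case are included.
Q1 = "Is the problem new in the operator network?"
Q2 = "New SW, HW, parameters, UE model or feature introduced?"
Q3 = "Simulate the fault in reference conditions. Does the fault still occur?"
ACTION = (
    "Analyse and investigate the differences between "
    "the working and faulty conditions."
)

CASE_BRANCHES = {
    Q1: {"Yes": Q2},
    Q2: {"Yes": Q3},
    Q3: {"No": ACTION},
}


def walk(branches, start, answers):
    """Follow the answers from the start question and return the visited steps."""
    steps = [start]
    node = start
    for answer in answers:
        if node not in branches or answer not in branches[node]:
            raise KeyError(f"branch not covered by this sketch: {node!r} -> {answer!r}")
        node = branches[node][answer]
        steps.append(node)
    return steps


path = walk(CASE_BRANCHES, Q1, ["Yes", "Yes", "No"])
for number, step in zip(["I", "II", "III", "IV"], path):
    print(f"{number}. {step}")
```

Raising an error for an uncovered branch, rather than guessing a default, keeps the sketch honest about which parts of the Framework it actually represents.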
23
Case: Increased AMR call drop rate
  • Solution:
      • The short term solution was to change the
        parameter for planned maximum downlink
        transmission power of all the Node Bs in the
        operator network to the default value of
        34 dBm. With this change, the problem
        disappeared in the operator network.
      • The long term solution was to include a fix
        for the bug in the next software release of
        the Node B.
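For orientation, a power level in dBm converts to milliwatts as P(mW) = 10^(P(dBm)/10). The short sketch below applies this standard conversion to the 34 dBm default mentioned in the solution; the function name `dbm_to_watts` is hypothetical.

```python
def dbm_to_watts(power_dbm):
    """Convert a power level from dBm to watts (0 dBm = 1 mW)."""
    return 10 ** (power_dbm / 10) / 1000


print(f"34 dBm = {dbm_to_watts(34):.2f} W")
```

The default value therefore corresponds to roughly 2.5 W of planned maximum downlink transmission power.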

24
Results
  • As a result of the thorough research conducted
    for this Thesis, a Framework for KPI-triggered
    troubleshooting in live WCDMA networks was
    developed.
  • The Framework is mainly targeted at WCDMA
    network equipment vendors, to help them solve
    major service-affecting faults occurring in
    today's live WCDMA networks.
  • Troubleshooting cases from live WCDMA networks
    were solved using the Framework, in order to
    verify the results and test the applicability
    and practicality of the Framework.

25
Assessment of the results
  • The applicability and relevance of the
    troubleshooting Framework were tested against
    three different fault cases from live WCDMA
    networks.
  • The results were fairly promising: all the
    cases were successfully solved by utilising the
    Framework, which was found to be practical and
    suitable for solving KPI-triggered problems in
    live WCDMA networks.
  • However, the Framework was tested with a limited
    number of cases because of time and resource
    limitations. More extensive testing and
    verification with a larger number of cases could
    reveal optimisations and improvements to the
    Framework.
  • Still, the basic logic of the Framework held up
    in the cases tested. The results presented in
    this study can be tested in the future against a
    larger number of cases in order to verify them
    with greater statistical reliability.

26
Exploitation of the results
  • The results of this study will be used as source
    material for UTRAN troubleshooting competence
    development and for the creation of advanced
    learning solutions, targeted at troubleshooting
    experts and customer support engineers of one of
    the leading WCDMA network equipment vendors.
  • The results of the Thesis will also be used as
    input for the creation of customer documentation
    for UTRAN troubleshooting.
  • There is also an intention to further test the
    relevance and reliability of the results of this
    Thesis by applying them in the equipment
    vendor's 24/7 RAN technical support service for
    operators.

27
Future Research
  • The significance of Performance Indicator (PI)
    based troubleshooting is increasing continuously
    in live WCDMA networks.
  • Once the PI and KPI specifications become more
    mature, a more extensive study of the most
    relevant Performance Indicators used in WCDMA
    network troubleshooting will be essential.
  • There is also a need to develop a Framework and
    logic for solving emergency problems in WCDMA
    networks.
  • As the complexity of telecommunication networks
    grows, effective and efficient troubleshooting
    procedures are essential in order to manage the
    diversity of network technologies and the
    increasing quality requirements of the
    operators.