Title: Application of Fault Injection to Globus Grid Middleware
School of Computing, Faculty of Engineering
Nik Looker, Jie Xu: University of Leeds, Leeds LS2 9JT, UK
Tianyu Wo, Jinpeng Huai: Beihang University, Beijing 100083, PRC
2 A Historical Perspective
3 Dependability and Security
- To understand dependability it is important to understand the three main concepts that it utilises
- Attributes
  - Measurements of how dependable and secure a system is
- Threats
  - Things that may affect the dependability and security of a system
- Means
  - Ways of increasing the dependability and security of a system
4 Attributes
- Availability
  - The probability that a service is present and ready for use
- Reliability
  - The capability of maintaining the service and service quality
- Safety
  - The absence of catastrophic consequences
- Confidentiality
  - Information is accessible only to those authorised to use it
- Integrity
  - The absence of improper system alterations
- Maintainability
  - The ability to undergo modifications and repairs
5 Threats
- Fault
  - A fault is a defect in a system
- Error
  - An error is a discrepancy between the behaviour of a system and its specified behaviour within the system boundary, i.e. it enters an unspecified state
- Failure
  - A failure is an instance in time when a system displays behaviour that is contrary to its specification at the system boundary
6 Fault-Error-Failure Chains
- As a general rule
  - A fault, when activated, can lead to an error
  - An error is an invalid state
  - An invalid state generated by an error may lead to either another error or a failure
  - A generated error can be treated as another fault
  - A failure is an observable deviation from the specified behaviour at the system boundary
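The chain above can be illustrated with a toy example. This is a minimal sketch only: the `Counter` class, its method names, and its "never negative" specification are all hypothetical and are not taken from the talk.

```python
class Counter:
    """Hypothetical service whose specification says value() is never negative."""

    def __init__(self):
        self._value = 0

    def decrement(self, amount):
        # Fault: a missing bounds check. It stays dormant until a caller
        # passes an amount larger than the current value.
        self._value -= amount

    def in_error(self):
        # Error: the internal state is invalid, but this is only visible
        # inside the system boundary.
        return self._value < 0

    def value(self):
        # Failure: the invalid state becomes observable at the boundary.
        return self._value


counter = Counter()
counter.decrement(0)            # fault present but not activated
assert not counter.in_error()
counter.decrement(1)            # fault activated, state is now invalid (error)
assert counter.in_error()
observed = counter.value()      # error reaches the boundary (failure)
```

A bounds check in `decrement` would break the chain at the fault; a check in `value` that rejected the invalid state would break it between error and failure.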
7 Means
- Dependability means are ways of breaking fault-error-failure chains.
- Four main classifications
  - Fault Prevention
  - Fault Removal
  - Fault Forecasting
  - Fault Tolerance
8 Fault Injection
- Fault Injection
  - Mean time between failures (MTBF) may be very large
  - Attempt to speed up this process by injecting faults
  - Cause the execution of seldom used control pathways within a system
  - Either
    - A failure may occur
    - The system's fault tolerance mechanism will handle the fault
    - Or the failure will go undetected and uncorrected
- Network Level Fault Injection
  - Corrupt
  - Drop
  - Reorder
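The three network level fault types listed above can be sketched on an in-memory list of packets. This is an illustration under stated assumptions, not the injector used in the talk: the function names, the packet contents, and the choice of a single bit flip for corruption are all hypothetical.

```python
import random


def corrupt(packets, rng):
    """Flip one bit in one randomly chosen packet (packets must be non-empty)."""
    out = list(packets)
    i = rng.randrange(len(out))
    data = bytearray(out[i])
    pos = rng.randrange(len(data))
    data[pos] ^= 1 << rng.randrange(8)
    out[i] = bytes(data)
    return out


def drop(packets, rng):
    """Remove one randomly chosen packet."""
    out = list(packets)
    del out[rng.randrange(len(out))]
    return out


def reorder(packets, rng):
    """Swap one randomly chosen pair of adjacent packets (needs at least two)."""
    out = list(packets)
    i = rng.randrange(len(out) - 1)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


rng = random.Random(0)  # fixed seed so an experiment can be repeated
stream = [b"pkt-0", b"pkt-1", b"pkt-2", b"pkt-3"]
corrupted = corrupt(stream, rng)
dropped = drop(stream, rng)
reordered = reorder(stream, rng)
```

Each function returns a new list and leaves the input untouched, so the same baseline stream can be reused across fault types.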
9 Network Level Fault Injection
10 Modified Network Level Fault Injection
This allows a fault injector to intercept an
entire middleware message, and thus we can decode
it and modify specific parts of it.
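As a rough illustration of message level injection, the sketch below decodes a complete SOAP message, overwrites one element, and re-encodes it. It is not Grid-FIT's implementation: the sample message, the `getQuoteResponse` and `price` element names, and the `inject` helper are hypothetical, and only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical middleware message; real Globus messages are more complex.
MESSAGE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><getQuoteResponse><price>42.5</price></getQuoteResponse>"
    "</soap:Body></soap:Envelope>"
)


def inject(message, tag, new_text):
    """Decode a whole message, overwrite every `tag` element in the body,
    and return the re-encoded message as a string."""
    root = ET.fromstring(message)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("message has no SOAP body")
    for element in body.iter(tag):
        element.text = new_text
    return ET.tostring(root, encoding="unicode")


faulty = inject(MESSAGE, "price", "-1.0")
```

Because the whole message is available, the fault targets one field and leaves the envelope well-formed, which a packet-level bit flip cannot guarantee.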
11 Grid-FIT
12 Injecting Faults in a Production Environment
13 System Model
14 Extended Fault Model
15 Extended Failure Model
16 Failure Detection
17 Application to Globus
- Initial experiments were based around Web Services
  - This resulted in the WS-FIT tool (Web Service - Fault Injection Technology)
- Ultimate aim was to apply this method to Grids
  - This has resulted in the Grid-FIT tool
- Modifications and initial experiments have been conducted
  - Modified hooks to work with Globus
  - Adapted FIT decoding to Globus message structure
  - Repeated an earlier set of experiments rewritten for Globus 4
18 Test Case
19 Results
20 Future Work
- Apply Grid-FIT to complex systems
- CoLaB
  - Short for Collaboration of Leeds and Beihang, a joint laboratory founded by Beihang University, PRC, and the University of Leeds, UK, in 2005.
  - The primary mission of CoLaB is research in Software and Security, each linked through a common objective
    - To support the needs of the next generation of Internet computing.
- CROWN
  - Short for China Research and Development environment Over Wide-area Network, a grid test bed to facilitate scientific activities in different disciplines.
- We are currently working on integrating Grid-FIT with CROWN
  - This will give Grid-FIT a large test bed to refine its method and models
  - This will give CROWN a native Dependability Assessment method
  - Part of the integration will be to integrate Grid-FIT as an Eclipse plug-in
21 Demonstrations and Workshop
- Demonstrations
  - Venue: White Rose Grid Stall
  - Wednesday 20th September 1345 - 1430
  - Thursday 21st September 1000 - 1045
  - CROWN: Tianyu Wo, woty_at_act.buaa.edu.cn
  - FT-Grid: Paul Townend, pt_at_comp.leeds.ac.uk
  - Grid-FIT: Nik Looker, nlooker_at_comp.leeds.ac.uk
- Mini-Workshop on UK-China e-Science Collaborations
  - Venue: Conference Room 1
  - Wednesday 20th September 1700 - 1900