Title: http://www.qub.ac.uk/escience
1Health Technologies and Grid Computing
http://www.qub.ac.uk/escience
2The UK e-Science Programme
- Ron Perrott
- Director
- Belfast e-Science Centre
- Queen's University
- Belfast
- r.perrott_at_qub.ac.uk
- http://www.qub.ac.uk/escience
3Grid: The Web on Steroids
Grid: Flexible access to significant resources
4Why now?
- The Internet as infrastructure
- Increasing bandwidth, advanced services
- Advances in storage capacity
- Price reduction
- Increased availability of compute resources
- Clusters, supercomputers, etc.
- Advances in application concepts
- Simulation-based design, collaborative engineering, ...
5UK e-Science Grid
- Provide a national grid resource
- Industrial and pilot projects advance grid middleware
- Act as information centres
Edinburgh
Glasgow
Newcastle
Belfast
Manchester
Cambridge
Oxford
Cardiff
London
Southampton
6e-Science Phase 2
- Key e-Science Infrastructure components
- Persistent National e-Science Research Grid
- Grid Operations Centre
- Open Middleware Infrastructure Institute
- National e-Science Institute
- National Digital Curation Centre
- e-Science/Grid Legal Service
- International Standards Activity
7Where is this all leading?
e-Science and the Grid
e-Business - Grid Services
Industrial Strength Infrastructure and Middleware
e-Science
8Belfast e-Science Centre Projects
- GridCast: Television/Radio broadcasting
- Business change, resilience, reliability, cost, customisation, interoperability
- RiskGrid: Financial Services
- Business change, performance, cost, resilience, reliability, interoperability
- Geddm: High-performance data mining
- Performance, cost, business change, resilience, interoperability
- GeneGrid: Bioinformatics
- Performance, cost, business change, interoperability
- GridMil: Military infrastructures
- Resilience, reliability, performance, interoperability, agility, cost
9The GeneGrid Project
- Dr Paul Donachy, Commercial Director
10GeneGrid
- GeneGrid Business Drivers
- Architecture
- GeneGrid Benefits
11GeneGrid
- Collaborative Industrial R&D project
- Stakeholders
- Fusion Antibodies
- Amtec Medical
- Support from BT plc
- £820,000 (DTI Link funding)
12Business Drivers
- Multi-site organisation with little collaboration
- Little dedicated HPC resource
- Economic advantage (peak demand/supply min)
- Composite analysis: manual cut & paste
- No single application
- Iterative analysis
- Technical Issues
- Data sources & formats
- Federated view
- Lack of a common model
- Skill set
13Aim
- Create a "virtual bioinformatics laboratory"
- Platform for biologists and scientists to access collective skills, experiences and results in a secure, reliable and scalable manner
- Integrate biological software applications
- Data sources: public / private
- High Performance Computing facilities
- Tools
- Aid multi-site federated database integration
- Workflow creation, submission, monitoring
- Intuitive portal, usable without programming skills
14Why use Grid Technology?
- Access to vast computing and data resources
- Reduce TCO of IT infrastructure
- Increased business agility to tackle large-scale problems on demand
- Empowers organisations towards better collaboration within and between organisations
15Architecture
[Architecture diagram. Component labels: GeneGrid Environment (1, 2, ... n); GeneGrid Portal Manager; GeneGrid Security Manager; GeneGrid Workflow Manager; GeneGrid Admin Manager; GDM Service; GAM Service; RP. Application labels: BLAST, bl2seq, TMHMM, SignalP, EMBOSS, ClustalW, HMM, Primer3, GeneWise, Eliminator, DB query. Site labels: QUB, Belfast e-Science Centre, BT Data Centre, University Melbourne, SDSC. Hardware labels: 6p SMP SPARC (Solaris 7); i686 Linux; SPARC (Solaris 8); 4p SMP Linux; 32 x Sun Blade Linux.]
16SOA & OGSA
- Service Oriented Architecture (SOA)
- reuse
- interoperability
- dynamic services
- Open Grid Services Architecture (OGSA)
- GT4 based on WS-RF
- IT vendors
- GGF, OASIS
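As a rough illustration of the reuse, interoperability and dynamic-services points above, the following Python sketch shows a minimal in-process service registry. It is not GeneGrid, GT4 or WS-RF code: the names `AnalysisService`, `SequenceLength`, `GcContent` and `ServiceRegistry` are hypothetical and exist only for this example.

```python
from typing import Dict, List, Protocol


class AnalysisService(Protocol):
    """Shared contract (interoperability): every service takes a sequence, returns text."""

    name: str

    def run(self, sequence: str) -> str: ...


class SequenceLength:
    """Hypothetical service that reports the length of a sequence."""

    name = "length"

    def run(self, sequence: str) -> str:
        return f"{len(sequence)} residues"


class GcContent:
    """Hypothetical service that reports the GC percentage of a nucleotide sequence."""

    name = "gc"

    def run(self, sequence: str) -> str:
        seq = sequence.upper()
        if not seq:
            return "0.0% GC"
        gc = sum(1 for base in seq if base in "GC")
        return f"{100 * gc / len(seq):.1f}% GC"


class ServiceRegistry:
    """Services can join or leave at runtime (dynamic services)."""

    def __init__(self) -> None:
        self._services: Dict[str, AnalysisService] = {}

    def register(self, service: AnalysisService) -> None:
        self._services[service.name] = service

    def unregister(self, name: str) -> None:
        self._services.pop(name, None)

    def available(self) -> List[str]:
        return sorted(self._services)

    def call(self, name: str, sequence: str) -> str:
        # Callers depend only on the contract, so any registered service is reusable.
        return self._services[name].run(sequence)
```

A caller would register whichever services are currently reachable and then invoke them by name, for example `registry.call("gc", "ATGC")`. In a real grid deployment the registry and the services would sit behind network endpoints rather than in one process.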
17GeneGrid Benefits
- A: Identification of Novel Protein Family Members
- B: Automated Antigenic Region Detection
18A - Results
- 6 uncharacterised and potentially new siglecs
- Current business process: 1 day
- GeneGrid: 20 mins
- Different applications were accessed from different resources
- BLAST: Linux cluster at BeSC
- TMHMM: Linux cluster at SDSC
- SignalP: Sun SMP machine at QUB
19B - Results
- Current business process: 30/60 mins per gene
- GeneGrid: 90 mins for 100 genes
- Resources used
- BeSC, BT Datacentre, Uni Melbourne, SDSC
- Individual task execution and overall experiment execution times reduced
- High-throughput analysis of genes for potential antigenic regions
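The timings above imply a substantial throughput gain. The sketch below works through the arithmetic under two assumptions that the slide does not state: that "30/60 mins per gene" means a range of 30 to 60 minutes, and that the current process handles genes one after another.

```python
def speedup(manual_minutes_per_gene: float, genes: int, grid_total_minutes: float) -> float:
    """Ratio of total manual time to total GeneGrid time for the same batch."""
    manual_total_minutes = manual_minutes_per_gene * genes
    return manual_total_minutes / grid_total_minutes


GENES = 100               # batch size quoted on the slide
GRID_TOTAL_MINUTES = 90   # GeneGrid time for the whole batch, from the slide
MANUAL_LOW, MANUAL_HIGH = 30, 60  # assumed reading of "30/60 mins per gene"

low = speedup(MANUAL_LOW, GENES, GRID_TOTAL_MINUTES)
high = speedup(MANUAL_HIGH, GENES, GRID_TOTAL_MINUTES)

print(f"Manual total: {MANUAL_LOW * GENES / 60:.0f} to {MANUAL_HIGH * GENES / 60:.0f} hours")
print(f"GeneGrid total: {GRID_TOTAL_MINUTES / 60:.1f} hours")
print(f"Speedup: roughly {low:.0f}x to {high:.0f}x")
```

Under those assumptions, 100 genes take 50 to 100 hours by the current process against 1.5 hours on GeneGrid, a speedup of roughly 33x to 67x.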
20GeneGrid Benefits
- SOA: recycling software
- Open & extensible (EAI framework)
- Security
- Data Management
- Lower CAPEX, variable OPEX
- Biologically meaningful experiments as workflows
21Commercial Use Cases
Dr Shane McKee, Amtec Medical
22Target discovery
- Functional characterisation
- Sequence analysis
- Pattern recognition
- GeneOntology
- Gene annotation
- Cross-species comparison
23GeneGrid in action
- Ion channel identification
- Cancer gene characterisation
- Infectious disease investigation
- Novel SIGLEC family members (Fusion Antibodies)
24Ion channels
- Transmembrane proteins involved in ion flux into & out of cells
- Regulate electrical activity in brain
- Mutated in several forms of epilepsy
- Prime targets for novel anti-epilepsy drugs
25Example Procedure
- Identify all ion channels (GeneOntology)
- Identify transmembrane pore domains (TMHMM)
- Filter list (LOCAL)
- Compare list against whole genome sequences, intra- & trans-species (BLAST)
- Identify novel genes in chromosomal regions of interest (LOCAL)
- Screen genes for mutations in families
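The procedure above is essentially a chain of filtering steps, which can be sketched as a small Python pipeline. This is only an illustration of the workflow shape: the `Gene` record, the toy data and every function name are invented here, and no real GeneOntology, TMHMM or BLAST call is made. The BLAST comparison and the mutation screening steps are left out because they need external tools and data.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Gene:
    """Toy record standing in for real annotation data (hypothetical fields)."""
    gene_id: str
    go_terms: FrozenSet[str]
    tm_helices: int   # stand-in for a transmembrane prediction result
    region: str       # stand-in for a chromosomal location


Step = Callable[[List[Gene]], List[Gene]]

TOY_GENES = [
    Gene("geneA", frozenset({"ion channel activity"}), 6, "chr1"),
    Gene("geneB", frozenset({"kinase activity"}), 0, "chr1"),
    Gene("geneC", frozenset({"ion channel activity"}), 0, "chr2"),
    Gene("geneD", frozenset({"ion channel activity"}), 4, "chr2"),
]


def select_ion_channels(genes: List[Gene]) -> List[Gene]:
    """Step 1 stand-in: keep genes annotated as ion channels."""
    return [g for g in genes if "ion channel activity" in g.go_terms]


def require_transmembrane(genes: List[Gene]) -> List[Gene]:
    """Step 2 stand-in: keep genes with at least one predicted helix."""
    return [g for g in genes if g.tm_helices > 0]


def in_region(region: str) -> Step:
    """Step 5 stand-in: build a filter for one chromosomal region of interest."""
    def step(genes: List[Gene]) -> List[Gene]:
        return [g for g in genes if g.region == region]
    return step


def run_workflow(steps: List[Step], genes: List[Gene]) -> List[Gene]:
    """Apply each step to the output of the previous one."""
    for step in steps:
        genes = step(genes)
    return genes


candidates = run_workflow(
    [select_ion_channels, require_transmembrane, in_region("chr2")],
    TOY_GENES,
)
print([g.gene_id for g in candidates])
```

Keeping each step as an independent function with the same input and output type means steps can be reordered, swapped for remote services, or reused in other workflows without changing `run_workflow`.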
27Analysis of ANKH
- BLAST against other sequences
- Human
- Other organisms
- CLUSTAL view of sequence alignments
- Confirmation of transmembrane elements
- Detection of protein domains
28Using the Virtual Bioinformatics Laboratory
- Practice makes perfect
- Open source means transparency
- Legacy and vanilla file formats allow integration of Grid technologies with local applications
- Streamlining of hypothesis-testing
- Still needs smart hypotheses!
29GeneGrid Demo
P.V. Jithesh, Belfast e-Science Centre
30Discussion Forum
Dr Peter Donnelly, BioBusinessNI