Integrative Biology exploiting e-Science to combat fatal diseases


1
Integrative Biology exploiting e-Science to
combat fatal diseases
  • Damian Mac Randal, CCLRC

2
Overview of Talk
  • Project background
  • The scientific challenge
  • The e-Scientific challenge
  • Proposed system

3
Scientific background
  • Breakthroughs in biotechnology and IT have
    provided a wealth (mountain) of biological data
  • Key post-genomic challenge is to transform this
    data into information that can be used to
    determine biological function
  • Biological function arises from complex
    non-linear interactions between biological
    processes occurring over multiple spatial and
    temporal scales
  • Gaining an understanding of these processes is
    only possible via an iterative interplay between
    experimental data (in vivo and in vitro),
    mathematical modelling, and HPC-enabled simulation

4
e-Scientific background
  • Majority of the first round of UK e-Science
    Projects focused primarily on data intensive
    applications (Data storage, aggregation, and
    synthesis)
  • Life Sciences projects focused on supporting the
    data generation work of laboratory-based
    scientists
  • In other scientific domains, projects such as
    RealityGrid, GEODISE, and gViz began to consider
    compute-intensive applications.

5
The Science and e-Science Challenge
  • To build an Integrative Biology Grid to support
    applications scientists addressing the key
    post-genomic aim of determining biological
    function
  • To use this Grid to begin to tackle the two
    chosen Grand Challenge problems: the in-silico
    modelling of heart failure and of cancer.
  • Why these two? Together they cause 61% of all
    deaths in the UK

6
Courtesy of Peter Kohl (Physiology, Oxford)
Normal beating
Fibrillation
7
Multiscale modelling of the heart
MRI image of a beating heart
Fibre orientation ensures correct spread of
excitation
Contraction of individual cells
Current flow through ion channels
8
Heart modelling
  • Typically solving coupled systems of PDEs (tissue
    level) and non-linear ODEs (cellular level) for
    the electrical potential
  • Complex three-dimensional geometries
  • Anisotropic
  • Up to 60 variables
  • FEM and FD approaches
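
The tissue-level PDE and cell-level ODE coupling listed on this slide can be illustrated with a toy model. The sketch below is not the project's code. It assumes a 1D cable, an explicit finite-difference (FD) scheme, and the two-variable FitzHugh-Nagumo model as a stand-in for a detailed cell model with many more variables. The function name and all parameter values are illustrative assumptions.

```python
def simulate_cable(n_cells=100, dx=0.5, dt=0.05, t_end=100.0,
                   diffusion=1.0, a=0.1, eps=0.01, gamma=2.0, n_stim=10):
    """Toy excitation wave on a 1D cable (dimensionless units).

    PDE (tissue level):  dv/dt = D * d2v/dx2 + v(1 - v)(v - a) - w
    ODE (cell level):    dw/dt = eps * (v - gamma * w)
    FitzHugh-Nagumo is an assumed stand-in for a detailed cell model.
    """
    coeff = diffusion * dt / (dx * dx)
    if coeff > 0.5:
        raise ValueError("explicit scheme unstable: need D*dt/dx^2 <= 0.5")

    # Stimulate the left end of the cable; everything else starts at rest.
    v = [1.0 if i < n_stim else 0.0 for i in range(n_cells)]
    w = [0.0] * n_cells
    probe = n_cells // 2          # record the potential at mid-cable
    peak_at_probe = v[probe]

    for _ in range(int(round(t_end / dt))):
        new_v = [0.0] * n_cells
        for i in range(n_cells):
            # No-flux boundaries via mirrored ghost nodes.
            left = v[i - 1] if i > 0 else v[1]
            right = v[i + 1] if i < n_cells - 1 else v[n_cells - 2]
            laplacian = left - 2.0 * v[i] + right
            reaction = v[i] * (1.0 - v[i]) * (v[i] - a) - w[i]
            new_v[i] = v[i] + coeff * laplacian + dt * reaction
            # Recovery variable uses the potential from the old time level.
            w[i] += dt * eps * (v[i] - gamma * w[i])
        v = new_v
        peak_at_probe = max(peak_at_probe, v[probe])

    return v, w, peak_at_probe


if __name__ == "__main__":
    _, _, peak = simulate_cable()
    print("peak potential at mid-cable:", peak)
```

Realistic heart models replace the two-variable cell model with tens of variables per node and the 1D cable with an anisotropic 3D mesh, which is where the HPC requirement comes from.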

9
Details of test-run of Auckland heart simulation
code on HPCx
  • Modelled 2 ms of electrophysiological excitation
    of a 5700 mm^3 volume of tissue from the left
    ventricular free wall
  • Noble 98 cell model used
  • Mesh contained 20,886 bilinear elements (spatial
    resolution 0.6 mm)
  • 0.05 ms timestep (40 timesteps in total)
  • Required 978 s CPU on 8 processors and 2.5 Gbytes
    of memory
  • A complete simulation of the ventricular
    myocardium would require up to 30 times the
    volume and at least 100 times the duration
  • Estimated max compute time to investigate
    arrhythmia 10^7 s (100 days) requiring 100 Gbytes
    of memory (compute time scales to the power 5/3)
  • At high efficiency this scales to approximately 1
    day on HPCx
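
The arithmetic behind these figures can be checked with a short script. The slide does not say how its estimate was derived, so the sketch below makes one assumption: compute time grows as volume to the power 5/3 and linearly with simulated duration, and memory grows linearly with volume. Under that reading the test run extrapolates to roughly 3 x 10^7 s and 75 Gbytes, the same order of magnitude as the quoted 10^7 s and 100 Gbytes. All function names are illustrative.

```python
# Figures quoted for the HPCx test run.
BASE_CPU_S = 978.0        # CPU seconds on 8 processors
BASE_MEM_GBYTES = 2.5
BASE_DURATION_MS = 2.0
TIMESTEP_MS = 0.05
SECONDS_PER_DAY = 86400.0


def n_timesteps(duration_ms, timestep_ms):
    """Number of timesteps needed to cover the simulated duration."""
    return round(duration_ms / timestep_ms)


def scaled_cpu_seconds(volume_factor, duration_factor, exponent=5.0 / 3.0):
    """ASSUMED scaling law: time ~ volume^(5/3) * duration."""
    return BASE_CPU_S * volume_factor ** exponent * duration_factor


def scaled_memory_gbytes(volume_factor):
    """ASSUMED scaling law: memory grows linearly with tissue volume."""
    return BASE_MEM_GBYTES * volume_factor


if __name__ == "__main__":
    print("timesteps in test run:", n_timesteps(BASE_DURATION_MS, TIMESTEP_MS))
    cpu = scaled_cpu_seconds(volume_factor=30, duration_factor=100)
    print("extrapolated CPU seconds: %.2e" % cpu)
    print("extrapolated memory (Gbytes):", scaled_memory_gbytes(30))
    print("10^7 s in days: %.1f" % (1e7 / SECONDS_PER_DAY))
```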

10
Multiscale modelling of cancer
11
Cancer modelling
  • Focusing on avascular tumours
  • Current models range from discrete
    population-based models and cellular automata, to
    non-linear ODE systems and complex systems of
    non-linear PDEs
  • Key goal is the coupling (where necessary) of
    these models into an integrated system which can
    be used to gain insight into experimental
    findings, to help design new experiments, and
    ultimately to test novel approaches to cancer
    detection, and new drugs and treatment regimes

12
Summary of the scientific challenge
  • Modelling and coupling phenomena which occur on
    many different length and time scales
  • 1 m: person
  • 1 mm: tissue morphology
  • 1 µm: cell function
  • 1 nm: pore diameter of a membrane protein
  • Range: 10^9
  • 10^9 s (years): human lifetime
  • 10^7 s (months): cancer development
  • 10^6 s (days): protein turnover
  • 10^3 s (hours): digest food
  • 1 s: heart beat
  • 1 ms: ion channel gating
  • 1 µs: Brownian motion
  • Range: 10^15

13
The e-Science Challenge
  • To leverage the first round of e-Science projects
    and the global Grid infrastructure to build an
    international collaboratory which places the
    applications scientist within the Grid allowing
    fully integrated and collaborative use of
  • HPC resources (capacity and capability)
  • Computational steering, performance control and
    visualisation
  • Storage and data-mining of very large data sets
  • Easy incorporation of experimental data
  • User- and science-friendly access
  • => Predictive in-silico models to guide
    experiment and, ultimately, design of novel
    drugs and treatment regimes

14
Key e-Science Deliverables
  • A robust and fault-tolerant infrastructure to
    support post-genomic research in integrative
    biology that is user and application driven
  • 2nd Generation Grid bringing together components
    across range of current EPSRC pilot projects

15
e-Science/Grid Research Issues
  • Ability to carry out reliably and resiliently
    large scale distributed coupled HPC simulations
  • Ability to co-schedule Grid resources based on a
    GGF-agreed standard
  • Secure data management and access-control in a
    Grid environment
  • Grid services for computational steering
    conforming to an agreed GGF standard
  • Development of powerful visualisation and
    computational steering capabilities for complex
    models
  • Contributing projects: RealityGrid, gViz,
    Geodise, myGrid, BioSimGrid, eDiaMoND, GOLD,
    various CCLRC projects, ...

16
Service oriented Architecture
  • The user-accessible services will initially be
    grouped into four main categories
  • Job management: including deployment,
    co-scheduling and workflow management across
    heterogeneous resources
  • Computational steering: both interactive for
    simulation monitoring/control and pre-defined
    for parameter space searching
  • Data management: from straightforward data
    handling and storage of results to location and
    assimilation of experimental data for model
    development and validation
  • Analysis and visualization: final results,
    interim state, parameter spaces, etc., for
    steering purposes
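
As an illustration of the pre-defined steering mode, a parameter space search can be expressed as a list of job descriptions generated from a grid of values. This is a minimal sketch only: the function name and the two parameter names are hypothetical, and nothing here reflects the project's actual job management interface.

```python
from itertools import product


def build_sweep(parameter_grid):
    """Expand a {name: [values]} grid into one job description per combination."""
    names = sorted(parameter_grid)
    jobs = []
    for values in product(*(parameter_grid[name] for name in names)):
        jobs.append(dict(zip(names, values)))
    return jobs


# Hypothetical parameters for a heart-model sweep.
grid = {
    "conductance_scale": [0.5, 1.0, 1.5],
    "stimulus_period_ms": [300, 600],
}
jobs = build_sweep(grid)

if __name__ == "__main__":
    for job_id, job in enumerate(jobs):
        print(job_id, job)
```

Each dictionary would then be handed to whatever job submission service is in use; an interactive steering session would instead change such parameters while a single simulation is running.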

17
Strawman Architecture
External Resources
Simulation Engine
IB Server
Data Management
Visualization
18
Software architecture
  • Underpinning development of the architecture are
    three fundamental considerations:
    standardization, scalability and security
  • Initially, Web service technology is being used
    for interactions between the system components
  • Many of the underlying components are being
    adopted from previous projects, and adapted if
    necessary, in collaboration with their original
    developers
  • Portal/portlet technology, integrated with the
    user's desktop environment, will provide users
    with a lightweight interface to the operational
    services
  • The data management facilities are being built
    using Storage Resource Broker technology to
    provide a robust and scalable data infrastructure
  • Security is being organized around Virtual
    Organizations to mirror existing collaborations
  • A rapid prototyping development methodology is
    being adopted

19
Demonstrators
  • Objectives
  • Immediate boost in size/complexity of problems
    scientists can tackle
  • Validation of the Architecture
  • Learning exercise, exploring new technology
  • Introduce scientists to potential of advanced IT,
    so they can better specify requirements
  • 4 Demonstrators, chosen for diversity

20
Demonstrators
  • Implementation of GEODISE job submission
    middleware (via MATLAB) using the Oxford JISC
    cluster on the NGS. (A simple cellular model of
    nerve excitation)
  • MPI implementation of Jon Whiteley and Prasanna
    Pathmanathan's soft tissue deformation code (for
    use in image analysis for breast disease). (FEM
    code, non-linear elasticity)
  • MPI implementation of Alan Garny's 3D model of
    the SAN incorporating the ReG Steering Library.
    (FD code for non-linear reaction-diffusion
    (anisotropic) plus an XML-based parser for
    cellular model definition)
  • CMISS modelling environment for complex
    bioengineering problems (Peter Hunter, Auckland,
    NZ). (Production-quality FE/BE library plus
    front/back ends)

21
Resources
  • Project manager, project architect, 7.5 post-docs
    and 6 PhD students, broken down into three main
    teams
  • Heart modelling and HPC: 1.5 post-docs, 2 PhD
    students in Oxford, 0.5 post-doc at UCL. Led by
    Denis Noble and myself.
  • Cancer modelling: 1 senior post-doc and 2 PhD
    students in Nottingham, 1 post-doc and 1 PhD
    student in Oxford, 1 PhD student in Birmingham.
    Led by Helen Byrne in Nottingham. (Several
    further PhD students have also been funded from
    other sources)
  • Interactive services and Grid team: project
    architect plus 2 post-docs at CCLRC, 1 post-doc
    in Leeds, 0.5 post-doc at UCL
  • Note: well over half of the effort is dedicated
    to the science

22
Current Status
  • Official project start date: 1/2/04; recruitment
    of staff now complete
  • Initial project structure defined and agreed,
    initial requirements gathering and security
    policy exercises completed, initial architecture
    agreed
  • Heart-modelling and cancer modelling workshops
    held in Oxford in June, with talks by user
    communities
  • Cancer modelling meeting with all users in Oxford
    in July
  • Full IB workshop with all stakeholders in Oxford,
    29th September
  • Survey of capabilities of existing middleware
    under way (thanks to everyone who has given us
    lots of their time)
  • Four demonstrators identified and development
    commenced

23
Summary
  • Science-driven project that aims to build on
    existing middleware to (begin to) prove the
    benefits of Grid computing for complex systems
    biology, i.e. to do some novel science
  • Huge and increasing (initial) buy-in from the
    user community
  • Challenge is to develop sufficiently robust and
    usable tools to maintain that interest.
