In silico docking on grid infrastructures

1
In silico docking on grid infrastructures
  • Jean Salzemann
  • LPC of Clermont-Ferrand, France (CNRS/IN2P3)
  • Embrace Workshop, Helsinki, 2006/06/17
  • Credit Nicolas Jacq, Vincent Breton

2
Content
  • WISDOM initiative
  • Challenges of high throughput virtual docking
  • Development of a grid environment for
    large-scale deployment
  • Achieved deployment on EGEE infrastructure
  • Wide In Silico Docking On Malaria
  • Accelerate drug design against H5N1 neuraminidase
  • Perspectives

3
WISDOM initiative
  • WISDOM initiative aims to demonstrate the
    relevance and the impact of the grid approach to
    address drug discovery for neglected and emerging
    diseases.
  • First achieved experiences:
  • Summer 2005: Wide In Silico Docking On Malaria
    (WISDOM)
  • Spring 2006: Accelerate drug design against H5N1
    neuraminidase
  • Partners:
  • Grid infrastructures: EGEE, Auvergrid, TWGrid
  • European projects: Embrace, BioinfoGrid, Share,
    Simdat
  • Institutes and associations: Fraunhofer SCAI,
    Academia Sinica of Taiwan, ITB, Unimo University,
    LPC, CMBA, CERN-ARDA, HealthGrid

4
There is a need
  • to develop new drugs for the diseases of the
    developing world
  • HIV/AIDS, malaria and tuberculosis account for
    5.6 million deaths
  • Permanent necessity to develop new drugs to fight
    emerging resistance to drugs (malaria)
  • Unchanged pharmacopeia for decades against
    trypanosomiasis, leishmaniasis, Chagas disease,
    ...
  • to be able to quickly develop new drugs against
    emerging diseases
  • H5N1, SARS and dengue are recent examples of
    emerging diseases
  • Many factors like world-wide exchanges can help
    propagation of such diseases at a large scale
  • Necessity to quickly adapt to emerging resistances

5
Phases of a pharmaceutical development
  • Target discovery: target identification, target
    validation
  • Lead discovery: lead identification, lead
    optimization
  • Clinical phases (I-III)
  • Duration: 12-15 years; costs: 500-800 million US
    dollars
  • Molecular docking: predict how small molecules,
    such as substrates or drug candidates, bind to a
    receptor of known 3D structure

6
Grid-enabled virtual screening workflow
  • Grid service customers: biology teams,
    chemist/biologist teams
  • Grid service providers: chemoinformatics teams,
    bioinformatics teams
  • Services on the grid infrastructure: annotation
    services, docking services, MD services
  • Workflow items: target, hits, selected hits,
    with three check points
  • Data access for expert teams in the world

7
Challenges for high throughput virtual docking
Example: data challenge against H5N1 NA
  • Millions of chemical compounds available in
    laboratories
  • In vitro high throughput screening: 1/compound,
    nearly impossible
  • 300,000 chemical compounds: ZINC, chemical
    combinatorial library
  • Target (PDB): neuraminidase (8 structures)
  • Molecular docking (Autodock): 100 CPU years, 600
    GB data
  • Data challenge on EGEE, Auvergrid, TWGrid: 6
    weeks on 2000 computers
  • Hits sorting and refining
  • In vitro screening of 100 hits
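
A rough sizing of this data challenge can be worked out from the figures on the slide. The sketch below assumes that every compound is docked against every target structure and that a CPU year is 365.25 days; neither assumption is stated on the slide, and the helper names are illustrative only.

```python
# Back-of-envelope sizing of the H5N1 data challenge (figures from the slide).
COMPOUNDS = 300_000        # chemical compounds
STRUCTURES = 8             # neuraminidase target structures
CPU_YEARS = 100            # estimated total docking cost

MINUTES_PER_CPU_YEAR = 365.25 * 24 * 60  # assumption: 365.25-day year


def total_dockings(compounds: int, structures: int) -> int:
    """Assumes every compound is docked against every structure."""
    return compounds * structures


def minutes_per_docking(cpu_years: float, dockings: int) -> float:
    """Average CPU minutes spent on one docking."""
    return cpu_years * MINUTES_PER_CPU_YEAR / dockings


dockings = total_dockings(COMPOUNDS, STRUCTURES)
print(dockings)                                         # 2400000
print(round(minutes_per_docking(CPU_YEARS, dockings)))  # 22
```

Under these assumptions the workload is about 2.4 million dockings at roughly 22 CPU minutes each.
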
8
Issues for the grid-enabled high throughput
virtual docking
  • Computer-based in-silico screening can help to
    identify the most promising leads for biological
    tests
  • systematic and productive
  • reduces the cost of the trial-and-error approach
  • In silico docking is well-fitted for grid
    deployment
  • CPU intensive application
  • Huge amount of output
  • No communication between tasks
  • Issues of a large scale grid deployment
  • The rate of submitted jobs must be carefully
    monitored
  • The amount of transferred data impacts grid
    performance
  • Grid process introduces significant delays
  • Licensed software requires a license distribution
    strategy on the grid

9
Grid tools of the data challenges
  • WISDOM
  • a workflow of grid job handling: automated job
    submission, status check and report, error
    recovery
  • push model job scheduling
  • batch mode job handling
  • http://wisdom.eu-egee.fr
  • DIANE
  • a framework for applications with master-worker
    model
  • pull mode job scheduling
  • interactive mode job handling with flexible
    failure recovery feature
  • http://cern.ch/diane
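
The pull mode used by master-worker frameworks can be illustrated with a minimal sketch: idle workers fetch the next task from a shared queue instead of having tasks assigned to them up front, and a failed task is put back on the queue. This is not DIANE's API; `run_master_worker`, `dock` and the retry limit are hypothetical names chosen for illustration.

```python
import queue
import threading


def run_master_worker(tasks, work, n_workers=4, max_retries=2):
    """Pull model: idle workers fetch the next task from a shared queue."""
    todo = queue.Queue()
    for task in tasks:
        todo.put((task, 0))          # (task, number of retries so far)
    results, failed = {}, []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task, retries = todo.get_nowait()
            except queue.Empty:
                return               # nothing left to pull
            try:
                value = work(task)
                with lock:
                    results[task] = value
            except Exception:
                if retries < max_retries:
                    todo.put((task, retries + 1))   # requeue the task
                else:
                    with lock:
                        failed.append(task)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, failed


def dock(ligand):
    """Hypothetical stand-in for one docking run."""
    return len(ligand)


results, failed = run_master_worker(["lig1", "lig22"], dock)
```

A push-model system would instead decide at submission time which job goes where, which is simpler but reacts more slowly to slow or failing resources.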

10
WISDOM components
  • Roles: installer, tester, user, supervisor
  • Components: wisdom_install, wisdom_test,
    wisdom_execution, wisdom_collect, wisdom_site,
    wisdom_db
  • wisdom_execution (on a set of jobs): workload
    definition, job submission, job monitoring, job
    bookkeeping, fault tracking, fault fixing, job
    resubmission
  • GRID: grid services (RB, RLS), grid resources (CE,
    SE), application components (software, database)
  • License server
  • Accounting data
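
The steps named for wisdom_execution (job submission, monitoring, bookkeeping, fault tracking, resubmission) can be sketched as a simple loop. This is an illustrative sketch only, not the WISDOM scripts: `run_batch`, `submit` and `poll` are hypothetical names, with the two callables standing in for whatever grid middleware commands are actually used.

```python
def run_batch(job_ids, submit, poll, max_attempts=3):
    """Submit all jobs, poll their status, resubmit failures up to a limit.

    submit(job_id) starts a job; poll(job_id) returns "done", "failed"
    or "running". Both are hypothetical stand-ins for grid commands.
    """
    attempts = {job: 1 for job in job_ids}      # bookkeeping
    for job in job_ids:
        submit(job)
    pending = set(job_ids)
    done, abandoned = [], []
    while pending:
        for job in sorted(pending):             # sorted() copies the set
            status = poll(job)
            if status == "done":
                pending.discard(job)
                done.append(job)
            elif status == "failed":
                if attempts[job] < max_attempts:
                    attempts[job] += 1
                    submit(job)                 # resubmission
                else:
                    pending.discard(job)
                    abandoned.append(job)
            # "running": a real loop would sleep between polling rounds
    return done, abandoned, attempts
```

The `attempts` dictionary plays the role of the bookkeeping record: it shows how often each job had to be resubmitted and which jobs were given up on.
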
11
Simplified grid workflow for WISDOM
  • Diagram elements: user interface, WISDOM
    production system, Resource Broker, Site1, Site2,
    Storage Element, compounds database, software
  • Diagram flows: parameter settings, target
    structures, subsets, jobs, results, statistics
  • FlexX license server
  • 3000 floating licenses given by BioSolveIT to
    SCAI
  • Maximum number of used licenses was 1008
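
The workflow on this slide splits the compounds database into subsets that are sent to the sites as jobs. A minimal sketch of that job enumeration follows, assuming that each job docks one subset against one target structure; the subset size, the identifiers and the `make_jobs` helper are assumptions for illustration, not values from the WISDOM production system.

```python
from itertools import product


def split_into_subsets(compounds, subset_size):
    """Cut the compound list into consecutive chunks of at most subset_size."""
    return [compounds[i:i + subset_size]
            for i in range(0, len(compounds), subset_size)]


def make_jobs(compounds, targets, subset_size):
    """One job per (target, subset) pair (assumed job granularity)."""
    subsets = split_into_subsets(compounds, subset_size)
    return [{"target": t, "subset_index": i, "compounds": s}
            for t, (i, s) in product(targets, enumerate(subsets))]


compounds = [f"cmpd{n}" for n in range(10)]   # toy stand-in for the database
jobs = make_jobs(compounds, ["targetA", "targetB"], subset_size=4)
print(len(jobs))   # 6: 2 targets x 3 subsets (4 + 4 + 2 compounds)
```

Smaller subsets give more, shorter jobs, which spreads better over many sites but adds submission and data transfer overhead per job.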

12
Grid resources of the data challenges
  • EGEE-II
  • AuverGrid
  • TWGrid
  • a world-wide infrastructure freely providing more
    than 5,000 CPUs and 21 TB for biomedical
    applications

13
First biomedical data challenge: Wide In
Silico Docking On Malaria (WISDOM)
  • Significant biological parameters
  • 2 different docking applications (Autodock and
    FlexX)
  • About 1 million virtual compounds selected
  • Target proteins from the parasite responsible for
    malaria
  • Significant numbers
  • Total of about 46 million ligands docked in 6
    weeks
  • 1TB of data produced
  • Up to 1700 computers in 15 countries used
    simultaneously
  • About 80 CPU years
  • Average crunching factor 600

Figures: number of docked compounds vs time; number
of running and waiting jobs vs time

14
Second biomedical data challenge: Accelerate drug
design against H5N1 neuraminidase
  • Significant biological parameters
  • 1 docking application (Autodock)
  • About 300,000 virtual compounds selected
  • Target proteins with predicted mutations involved
    in the virus multiplication
  • Significant numbers
  • Total of about 2.5 million ligands docked in 6
    weeks
  • 600 GB of data produced
  • Up to 2000 computers in 17 countries used
    simultaneously, corresponding to about 105 CPU
    years
  • Average crunching factor 900

Figure: rate of jobs by EGEE federation
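
The crunching factor can be read as the speed-up over a single CPU: total CPU time divided by elapsed wall-clock time. That definition is an assumption here, since the slides do not spell it out, as are the 365.25-day year and the exact 6-week duration.

```python
def crunching_factor(cpu_years: float, elapsed_weeks: float) -> float:
    """Total CPU time divided by wall-clock time (assumed definition)."""
    cpu_days = cpu_years * 365.25     # assumption: 365.25-day year
    elapsed_days = elapsed_weeks * 7
    return cpu_days / elapsed_days


# H5N1 challenge: about 105 CPU years in 6 weeks
print(round(crunching_factor(105, 6)))   # 913, close to the reported 900
# Malaria challenge: about 80 CPU years in 6 weeks
print(round(crunching_factor(80, 6)))    # 696, versus the reported 600
```

The slide figures are rounded and the exact durations are not given, so the results only roughly match the reported factors, most visibly for the malaria challenge.
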
15
Selecting the promising compounds
  • The in-silico screening provides not only the
    docking poses of a compound against the target
    but also the docking energy
  • By ranking this information, chemists can select
    the promising compounds to carry forward into
    structure-based drug design of potential drugs
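
The ranking step can be sketched as sorting docking results by energy and keeping the best-scoring compounds. The sketch assumes that a lower (more negative) docking energy indicates a better predicted binding and keeps only the best pose per compound; the record layout, the toy values and the `top_hits` helper are hypothetical.

```python
def top_hits(results, n):
    """Return the n compounds with the lowest best-pose docking energy.

    results: iterable of (compound_id, pose_id, energy) tuples.
    """
    best = {}
    for compound, pose, energy in results:
        if compound not in best or energy < best[compound][1]:
            best[compound] = (pose, energy)   # keep the best pose only
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(compound, pose, energy)
            for compound, (pose, energy) in ranked[:n]]


# Toy data, not real docking output
results = [
    ("cmpdA", 1, -7.2), ("cmpdA", 2, -8.1),
    ("cmpdB", 1, -6.5),
    ("cmpdC", 1, -9.0), ("cmpdC", 2, -4.3),
]
print(top_hits(results, 2))   # [('cmpdC', 1, -9.0), ('cmpdA', 2, -8.1)]
```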

16
Perspectives
  • Second large scale docking on EGEE in fall 2006
  • Several new foreseen targets on malaria, dengue
    and other neglected diseases
  • Resources needed: 80 CPU years per target
  • Supported by the EGEE-II and EELA European
    projects and the Swiss BioGrid initiative
  • Collaboration is open for new targets, software
    infrastructures
  • Reranking of WISDOM hits by Molecular Dynamics
    simulations
  • Supported by the BioinfoGrid and EGEE-II European
    projects
  • Interest in resources on supercomputers
    (contact with DEISA)
  • Best hits further processed through in vitro
    testing and structure-activity relationships