Cheminformatics - PowerPoint PPT Presentation

About This Presentation
Title:

Cheminformatics

Description:

The less favourable conformation (b) has atoms in eclipsed configuration. ... of a tetrahedrally coordinated saturated carbon atom in an organic molecule ... – PowerPoint PPT presentation

Slides: 56
Provided by: tvisw

Transcript and Presenter's Notes

Title: Cheminformatics


1
  • Cheminformatics and Pharmainformatics

2
In this presentation
  • Part 1: Molecular Conventions
  • Part 2: Resources
  • Part 3: Drug Design
  • Part 4: Drug Development

3
Part 1
  • Molecular Conventions

4
Cheminformatics
  • Cheminformatics, a combination of chemistry and
    information technology, is required for the
    processing and analysis of chemical data
  • Cheminformatics is relevant to biologists because
    chemistry data are important in many areas of
    molecular biology, e.g., in the study of protein
    interactions and metabolism

5
Molecular formulae
  • Molecules can be represented by simple formulae,
    which give the number and type of atoms
  • However, this does not show how they are
    connected
  • Structural formulae provide some information
    about the arrangement of atoms in a molecule and
    thus allow isomers to be distinguished
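
A minimal Python sketch of why a simple formula cannot distinguish isomers. The helper `atom_counts` is hypothetical and assumes formulae without brackets or charges. Ethanol and dimethyl ether are written here as condensed formulae, and both reduce to the same atom counts.

```python
import re
from collections import Counter

def atom_counts(formula):
    """Count atoms in a simple formula such as 'C2H5OH' (no brackets or charges)."""
    counts = Counter()
    # Each token is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return dict(counts)

ethanol = atom_counts("C2H5OH")
dimethyl_ether = atom_counts("CH3OCH3")

# Same atoms, different connectivity: the counts alone cannot tell them apart.
print(ethanol)
print(ethanol == dimethyl_ether)
```

Both compounds contain two carbons, six hydrogens and one oxygen, so only a structural representation separates them.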

6
Structural representation of ethane showing the
tetrahedral distribution of coordinated groups
about saturated carbon atoms. Panels (a) and (b)
show two extreme conformations. The energetically
favourable conformation (a), which predominates
in nature, has the H atoms on opposite sides of
the C-C bond, as far as possible from each other
(the staggered conformation). The less favourable
conformation (b) has the atoms in the eclipsed
conformation. Panels (c) and (d) show the
conformations viewed from the end of the molecule
7
Structural formulae and full and simplified
structural diagrams for some common organic
compounds
Name and formula (full and simplified structure diagrams not reproduced)
  • Methane: CH4
  • Ethane: C2H6
  • Ethene (ethylene): C2H4
8
Structural formulae and full and simplified
structural diagrams for some common organic
compounds
Name and formula (full and simplified structure diagrams not reproduced)
  • Cyclohexane: C6H12
  • Ethanol: C2H5OH
  • Ethanal (acetaldehyde): CH3CHO
9
Structural diagrams
  • Molecules can be represented using simple graphs,
    which show atoms as nodes and bonds as links
  • For organic molecules, further simplification is
    achieved by assuming that carbon atoms make up
    the molecular backbone and that the valency of
    four is satisfied by hydrogen atoms unless
    otherwise shown
  • Such diagrams present all molecules as planar
    shapes and do not indicate the spatial
    distribution of atoms in 3D
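
The hydrogen-suppressed graph described above can be sketched in Python as follows. This is an illustration only: the `VALENCE` table and the function `implicit_hydrogens` are assumptions made for the example, and only a few neutral elements are covered.

```python
# Assumed typical valences for a few common neutral elements (illustrative only).
VALENCE = {"C": 4, "N": 3, "O": 2}

def implicit_hydrogens(atoms, bonds):
    """Return the number of implied H atoms on each node of the graph.

    atoms: list of element symbols (the nodes)
    bonds: list of (i, j, order) tuples (the links between node indices)
    """
    used = [0] * len(atoms)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    # Any valence not used by drawn bonds is assumed to be filled by hydrogen.
    return [VALENCE[element] - u for element, u in zip(atoms, used)]

# Ethanol as a hydrogen-suppressed graph: C-C-O
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1, 1), (1, 2, 1)]
print(implicit_hydrogens(ethanol_atoms, ethanol_bonds))  # [3, 2, 1]
```

The result corresponds to CH3, CH2 and OH, which is how the simplified diagram of ethanol is read.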

10
Chirality
  • If four different groups are coordinated around a
    central carbon atom, the molecule is described as
    chiral
  • Chiral molecules exist in two configurations,
    enantiomers, which are mirror images of each
    other
  • Although enantiomers have the same chemical
    properties in an achiral environment, many
    enzymes and other proteins show chiral
    sensitivity, which is important in drug
    development and related fields

11
Multi-chiral configuration
  • Molecules may contain any number of chiral
    centers and a series of forms, called
    diastereoisomers, may exist
  • These may have different chemical properties
    because of the way different groups interact
    within the molecule

12
DL and RS conventions
  • The absolute configuration of groups around a
    chiral carbon atom can be described using a
    number of conventions
  • In the DL system, molecules are named D or L
    according to whether the coordinated groups are
    arranged in a similar fashion to those in
    D-glyceraldehyde or L-glyceraldehyde
  • In the RS system, molecules are named R (rectus)
    or S (sinister) according to the priority of the
    chemical groups surrounding the carbon atom,
    ranked mainly by atomic number
    (Cahn-Ingold-Prelog rules)
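
A rough Python sketch of an R/S assignment from 3D coordinates. It assumes the four substituents have already been ranked from highest to lowest priority (the ranking itself, done with the Cahn-Ingold-Prelog rules, is not implemented here). The function name `chirality_label` and the idealized coordinates are hypothetical. The sign of the signed volume spanned by the substituents tells whether the three highest-priority groups run clockwise (R) or counterclockwise (S) when the lowest-priority group points away from the viewer.

```python
def chirality_label(a, b, c, d):
    """Return 'R' or 'S' for four substituent positions.

    a, b, c, d are (x, y, z) tuples ordered from highest to lowest priority.
    """
    def sub(p, q):
        return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

    u, v, w = sub(a, d), sub(b, d), sub(c, d)
    # Cross product v x w
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    # Signed volume u . (v x w): negative means a -> b -> c is clockwise
    # when viewed with d pointing away from the viewer.
    volume = u[0] * cross[0] + u[1] * cross[1] + u[2] * cross[2]
    if volume == 0:
        raise ValueError("substituents are coplanar; no chirality defined")
    return "R" if volume < 0 else "S"

# Idealized, made-up geometry: lowest priority group d points down the -z axis.
a = (0.0, 1.0, 0.0)
b = (0.866, -0.5, 0.0)
c = (-0.866, -0.5, 0.0)
d = (0.0, 0.0, -1.0)
print(chirality_label(a, b, c, d))
print(chirality_label(a, c, b, d))
```

Swapping any two substituents flips the label, which mirrors the fact that enantiomers differ only in the arrangement of the same four groups.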

13
Representation of a tetrahedrally coordinated
saturated carbon atom in an organic molecule:
(a) the carbon atom is at the centre of a
tetrahedron with four coordinated groups;
(b) simplified representation with the central
carbon removed; (c) representation of the
tetrahedron as a flat image
14
Chirality representation
(a) The structural formula of glyceraldehyde,
CH2OHCHOHCHO, gives no indication of its chirality
(b) If the molecule is represented as a
tetrahedron, the D and L enantiomers can be
distinguished
(c) These can be shown as 2D graphs using the
Fischer convention
15
Part 2
  • Resources

16
SMILES
  • SMILES is a system for representing chemical
    formulae as strings, based on a valence model in
    which all valencies are considered to be
    satisfied by hydrogen atoms unless otherwise
    shown
  • The system has conventions for representing
    different bond types, cyclic molecules, branches,
    cis/trans isomers and chirality
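
To make the string conventions concrete, here is a small Python sketch that reads a restricted SMILES subset into atoms and bonds. It is not a full SMILES parser: the function `parse_smiles` is hypothetical and handles only uppercase C, N and O, the bond symbols - = #, branches in parentheses, and single-digit ring closures. Aromatic atoms, charges and stereo markers are deliberately left out.

```python
BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def parse_smiles(smiles):
    """Parse a restricted SMILES subset into (atoms, bonds).

    atoms: list of element symbols
    bonds: list of (i, j, order) tuples between atom indices
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when a branch closes
    ring_open = {}      # ring digit -> index of the atom that opened it
    prev, order = None, 1
    for ch in smiles:
        if ch in ("C", "N", "O"):
            atoms.append(ch)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev, order = idx, 1
        elif ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in ring_open:
                # Second occurrence of the digit closes the ring.
                bonds.append((ring_open.pop(ch), prev, order))
            else:
                ring_open[ch] = prev
            order = 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds

print(parse_smiles("CC(=O)O"))    # acetic acid: a branch and a double bond
print(parse_smiles("C1CCCCC1"))   # cyclohexane: a ring closure
```

Hydrogens never appear in these strings; as in the valence model described above, they are implied by whatever valence the listed bonds leave unused.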

17
RasMol and Chime
  • There are several specialized data formats for
    chemical structures based on the principle of a
    molecular formula and associated table of
    connections
  • Viewing utilities such as RasMol and Chime can
    interpret these file formats and display
    interactive molecular structures in a variety of
    user-defined schemes and colors

18
Chemical structure and databases
  • Structural information about different molecules
    can be obtained from a number of comprehensive
    WWW resources, including Chemical Abstracts
    On-Line, Chemfinder and MedChem
  • Each of these resources provides a chemical
    database that can be searched using a variety of
    query formats, e.g., systematic name,
    non-systematic name, formula, molecular weight or
    CAS registry number
  • Search results provide physical, chemical and
    biomedical information with links to other
    databases and resources
  • MedChem also provides the SMILES string

19
QSAR
  • A quantitative structure-activity relationship
    (QSAR) is a statistical method used to determine
    how the structural features of a molecule are
    related to biological activity
  • The QSAR approach is particularly useful for
    categorizing the activities of related molecules
    with multiple functional groups
  • Each molecule is broken down into a series of
    descriptors (molecular properties) and the QSAR
    determines which descriptors are most likely to
    promote biological activity
  • This gives rise to a set of rules that can be
    used to evaluate the potential activity of new
    molecules
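
As a minimal sketch of the idea, the following Python fits a straight line relating one descriptor to activity and uses it to estimate the activity of a new molecule. Real QSAR models use many descriptors and more careful statistics. The function names, descriptor values and activity values are all invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training set: one descriptor value and one measured
# activity per molecule (made-up numbers).
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(descriptor, activity)

def predict(x):
    """Estimate activity for a new molecule from its descriptor value."""
    return slope * x + intercept

print(slope, intercept, predict(5.0))
```

The fitted slope plays the role of a rule: a positive value suggests that increasing the descriptor tends to increase activity within this series of related molecules.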

20
Part 3
  • Drug Design

21
Pharmainformatics
  • Pharmainformatics is the combination of biology,
    chemistry, mathematics and information technology
    that is essential for efficient data management,
    processing and analysis in the pharmaceutical
    industry

22
Drugs
  • Drugs interact with targets, usually proteins, in
    the body and through these interactions cause
    physiological responses
  • The pharmaceutical industry aims to discover
    drugs with specific beneficial effects to treat
    human diseases

23
Gene drug life
  • To know a gene's chemical structure and
    composition is one thing, but understanding its
    actual function is another
  • Though sequencing and analysis would help in
    answering questions on aging, diseases,
    disorders, and many more, a new discipline of
    designer drugs is around the corner waiting for
    someone to tap
  • Even a single nucleotide polymorphism (SNP,
    pronounced "snip"), a T, for instance, in one
    person's gene sequence where the neighbour has
    a C, can spell trouble
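
A short Python sketch of what a single-base difference looks like in data. It compares two already aligned sequences of equal length and lists the positions where they differ. The function name and the sequences are made up for illustration, and positions are counted from zero.

```python
def single_base_differences(seq_a, seq_b):
    """List (position, base_a, base_b) where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]

# Hypothetical fragments of the same gene from two people.
person_1 = "GATTACA"
person_2 = "GATCACA"
print(single_base_differences(person_1, person_2))  # [(3, 'T', 'C')]
```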

24
Gene drug life
  • Many drugs work on only 30 percent of the human
    population
  • In extreme cases, a drug that saves one person
    may poison another. For instance, the type II
    diabetes drug Rezulin has been linked to more
    than 60 deaths from liver toxicity worldwide
  • This is where in silico drug design would help
    not only in reducing the designing, modeling and
    testing time but also reducing the expenditure in
    manpower, resources and on various phases of drug
    design and development

25
Areas of drug design
  • For drug design, the process must be viewed from
    three different dimensions, viz., drug design for:
  • Diseases such as HIV, cancer, etc. that have been
    afflicting people
  • Lifestyle drugs
  • Drugs for repairing genetic disorders
  • There is an urgent need to develop drugs for
    diseases such as hepatitis C, leprosy and malaria,
    since these diseases are widespread and affect
    the population at large
  • Other infectious diseases such as tuberculosis,
    HIV, etc. are also highly troublesome

26
In silico drug design
  • Earlier, the drug design process used to take
    many decades and was carried out haphazardly,
    without any direction, whereas presently there is
    a systems approach. Added to this is a tremendous
    reduction in research and production costs
  • Already the surge in bioinformatics solutions has
    redefined the way drug trials are done, making a
    shift from in vitro to in silico
  • In silico drug design could be used to shorten
    the time of drug design, and this issue will
    remain the biggest challenge for years to come

27
Drugs are insoluble in water
  • Proteins in the body are surrounded by water
    (about two-thirds of the human body is water) and
    hence do not behave like rigid bodies;
    consequently, the behavioural pattern differs
    from protein to protein
  • Many drugs do not dissolve readily in water.
    Drug design in silico (on chips, without
    water) should take this point into account

28
Important areas for drug design
  • The four most important areas of consideration
    for successful drug design are:
  • binding sites
  • molecular shape
  • molecular size
  • inhibitory properties of the proteins

29
Important areas for drug design
  • The study related to crystallization of membrane
    protein structure also plays a vital role in drug
    design. This area of research would be highly
    challenging and would prove to be an excellent
    foundation for further research
  • Since the genome of the dengue virus is only
    about 11 kb (kilobases) long, it would be highly
    useful for carrying out a lot of work quickly and
    conveniently

30
Medical applications
  • Bioinformatics and drug design can be highly
    useful for the diagnosis and treatment of various
    neurological disorders. It has been found that
    many neurological disorders are due to unusual
    gene structures such as the triple-A formation
    AAA (the A of the ATGC nucleotides) in the genes.
    The problem becomes more complex with multiple
    repeats or occurrences of triple A. More than
    eight such repeats are known, and in such cases
    children are permanently bedridden or have to use
    wheelchairs

31
Part 4
  • Drug Development

32
Bioinformatics in drug development
  • Genomics, proteomics, combinatorial chemistry and
    high-throughput screening (HTS) have all
    contributed to a massive increase in the amount
    of data generated by the pharmaceutical industry
  • The role of bioinformatics is to store, track and
    provide tools for the analysis of these data,
    something like an automated environment

33
Bioinformatics in drug development
  • Specific applications include the modeling of
    protein interactions with small molecules
    allowing rational drug design, the association of
    genotype and drug response patterns
    (pharmacogenomics), the design and assessment of
    chemical diversity in combinatorial libraries,
    and the processing and storage of data from
    high-throughput screens of lead compounds

34
Areas of biology
Genomics/proteomics (human genome project)
  • Application: characterization of human genes and
    proteins
  • Role of bioinformatics: target identification/
    validation in the human genome; cataloging SNPs
    and association with drug response patterns
    (pharmacogenomics)
Genomics/proteomics (human pathogen genome project)
  • Application: characterization of genes and
    proteins of organisms that are pathogenic to
    humans
  • Role of bioinformatics: target identification/
    validation in pathogens
Functional genomics (protein structures)
  • Application: analysis of protein structures
    (humans and their pathogens)
  • Role of bioinformatics: prediction of drug/target
    interactions; rational drug design
35
Areas of biology
Functional genomics (expression profiling)
  • Application: determining gene expression patterns
    in disease and health
  • Role of bioinformatics: gene classification based
    on drug responses; pathway reconstruction
Functional genomics (genome-wide mutagenesis)
  • Application: determining the mutant phenotypes
    for all genes in the genome
  • Role of bioinformatics: databases of animal
    models; target identification/validation
Functional genomics (protein interactions)
  • Application: determining interactions among all
    proteins
  • Role of bioinformatics: characterization of
    protein interactions; reconstruction of pathways;
    prediction of binding sites
36
Areas of chemistry
HTS
  • Application: highly parallel assay formats for
    lead identification
  • Role of bioinformatics: storing, tracking and
    analyzing data
Combinatorial chemistry
  • Application: synthesis of large numbers of
    chemical compounds
  • Role of bioinformatics: cataloging chemical
    libraries; assessing library quality/diversity;
    predicting drug/target interactions
37
Principles of drug development
  • Drug development begins with the identification
    of a suitable target, which must contribute
    significantly to a human disease
  • Ideally, altering the activity of this target
    should have a beneficial effect thus showing its
    potential for therapeutic intervention
  • The next stage of the process is lead discovery,
    where compounds showing some of the desired
    activity of an ideal drug are sought

38
Principles of drug development
  • Optimization of lead compounds results in drug
    candidates that may be registered and submitted
    for clinical trials, which establish their safety
    and metabolic behaviour in human subjects

39
Genetic link to drugs
  • An early example of the utility of bioinformatics
    in drug design is cathepsin K, an enzyme that
    might turn out to be an important target for
    treating osteoporosis, a crippling disease caused
    by the breakdown of bone
  • While analyzing osteoclasts (cells that break
    down bone in the normal course of bone
    replenishment) taken from people with bone
    tumors, it was found that a particular gene
    sequence was overexpressed in these cells and
    could be overactive in individuals with
    osteoporosis
  • The sequence matched a previously identified
    class of molecules called cathepsins. Efforts are
    under way to find a potential drug to block the
    cathepsin K target

40
Genetic link to drugs
  • Scientists believe that 99.9 percent of your
    genes perfectly match those of the person sitting
    beside you. But the remaining 0.1 percent of the
    genes vary, and it is these variations that
    interest the drug companies
  • Several years after the debut of tests for BRCA1
    and BRCA2, scientists are still trying to
    determine exactly to what degree those genes
    contribute to a woman's cancer risk

41
Chemical diversity
  • Diverse chemical libraries are required for
    efficient lead discovery if little is known about
    the binding properties of the drug target
  • Conversely, focused libraries are required if the
    structure of the target is known, since this
    defines a particular set of ligands
  • Chemical diversity can be defined by comparing
    molecules on the basis of descriptors (functional
    groups) and how these fill chemical space
  • A number of software tools are available for the
    design and assessment of diverse or focused
    chemical libraries and for virtual screening
    against drug targets
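
One common way to compare molecules by their descriptors is the Tanimoto coefficient, the number of shared descriptors divided by the number of descriptors present in either molecule. The slides do not name a specific measure, so the choice here is an assumption. The Python sketch below uses it to score how diverse a small library is. The descriptor sets and function names are invented for the example.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity of two descriptor sets (1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def library_diversity(library):
    """Mean pairwise dissimilarity (1 - Tanimoto) over a library."""
    pairs = list(combinations(library, 2))
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)

# Hypothetical molecules described by the functional groups they contain.
library = [
    {"hydroxyl", "carbonyl"},
    {"hydroxyl", "amine"},
    {"amine", "halide", "aromatic ring"},
]
print(tanimoto(library[0], library[1]))
print(library_diversity(library))
```

A value of `library_diversity` near 1 indicates a diverse library, while a value near 0 indicates a focused one in which most molecules share the same descriptors.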

42
Computational screening
  • Software applications like DOCK and AutoDock
    match potential ligands to binding sites by
    calculating steric constraints and bond energies
  • These can be used to search chemical databases
    and find potential drug leads
  • Some applications consider the ligand and binding
    site as inflexible structures, rather like pieces
    of a jigsaw, while others can incorporate
    flexibility into the molecules by calculating
    allowable and compatible bond torsions
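
The rigid "jigsaw" view can be illustrated with a toy Python sketch that checks candidate ligand poses for steric clashes with the binding site. This is not how DOCK or AutoDock score poses; it only counts atom pairs that come too close. The function names, the 3.0 angstrom threshold and the coordinates are all assumptions made for the example.

```python
import math

def count_clashes(ligand_atoms, site_atoms, min_distance=3.0):
    """Count ligand/site atom pairs closer than min_distance (angstroms)."""
    return sum(1
               for ligand_xyz in ligand_atoms
               for site_xyz in site_atoms
               if math.dist(ligand_xyz, site_xyz) < min_distance)

def best_pose(poses, site_atoms):
    """Pick the rigid pose with the fewest steric clashes."""
    return min(poses, key=lambda pose: count_clashes(pose, site_atoms))

# Made-up coordinates: one binding-site atom and two rigid ligand poses.
site = [(0.0, 0.0, 0.0)]
pose_close = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pose_far = [(4.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(count_clashes(pose_close, site), count_clashes(pose_far, site))
print(best_pose([pose_close, pose_far], site) == pose_far)
```

A flexible approach would additionally vary bond torsions within each pose before scoring, which this sketch does not attempt.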

43
Functional genomics
  • The large-scale functional annotation of genes is
    known as functional genomics and incorporates
    areas such as homology searching, structural
    analysis, expression analysis, large scale
    mutagenesis and the analysis of protein
    interactions
  • All of these areas are important in drug
    development

44
Genome-scale mutagenesis
  • Genome-scale mutagenesis is a rich source of
    animal disease models for target identification
    and validation, and large mutant collections in
    simple organisms can be used for the rapid
    high-throughput screening of potential lead
    compounds

45
Approaches in functional genomics
Approach: functional annotation method
  • Homology searching: comparison to related
    sequences with known function
  • Protein structure determination (structural
    genomics): comparison to molecules with related
    structure and known function
  • Comparative genomics: functional annotation by
    domain conservation, conserved phylogeny or
    conserved genomic organization
  • Expression analysis: similar expression profiles
    indicate conserved function
  • Mutagenesis: function based on mutant phenotype,
    e.g. knockout mice
  • Protein interaction screening: function based on
    presence in multi-subunit complex or on
    interaction with proteins of known function
  • Small molecule informatics: interaction with
    small molecules
46
Pharmacogenomics
  • Pharmacogenomics is the study of how variation in
    the human population correlates with drug
    response patterns
  • The analysis of genomic data and its comparison
    with drug response data allows patients to be
    clustered into drug response groups, so that
    appropriate drugs and dose regimens can be
    administered
  • Variation is catalogued by analyzing data on
    mutation (particularly SNPs) and gene expression
    profiles
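
A minimal Python sketch of grouping patients by genotype and comparing drug response across the groups. The genotypes, response values and function name are invented. A real analysis would involve many markers and proper statistical testing.

```python
from collections import defaultdict

def mean_response_by_genotype(patients):
    """Average drug response for each genotype group.

    patients: iterable of (genotype, response) pairs
    """
    groups = defaultdict(list)
    for genotype, response in patients:
        groups[genotype].append(response)
    return {g: sum(values) / len(values) for g, values in groups.items()}

# Hypothetical data: genotype at one SNP and percent symptom reduction.
patients = [("CC", 80), ("CC", 60), ("CT", 40), ("TT", 10), ("TT", 30)]
print(mean_response_by_genotype(patients))
```

Groups with clearly different mean responses are candidates for different drugs or dose regimens.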

47
In lab vs. out of lab effort
  • Companies and individuals plug into the
    effort of drug design at various points:
    collecting and storing data, searching databases,
    and interpreting the data
  • The race and competition is all about who can
    mine the massive information best
  • Just modeling or computing the drug design or
    protein structure would not be sufficient; a
    lot of information on test results and clinical
    trials from outside is also very important
  • Most of the time should be spent on this aspect
    to ensure success in drug design and
    development

48
Issues of drug design
  • Even though the human genome has been sequenced,
    there are a number of problems awaiting
    solutions: technical, legal, and social
  • It is not at all clear how much one must
    know about a gene in order to patent it
  • There is also a need to review all failed
    drugs, i.e., drugs that failed during clinical
    trials, since their molecular composition and
    experimentation process could give a lot of
    valuable information

49
  • Various aspects connected to successful drug
    design include supercomputing, modeling of
    proteins through software, biotechnology,
    computational methods and analysis, biochemistry,
    in silico drug design, etc.
  • It is notable that a drug that works for protein
    A may not work for protein B, or may behave
    differently, due to various factors. That is why
    many drugs could fail, and hence an integrated
    (teamwork) effort is required, with a tremendous
    amount of information and interaction

50
  • At the moment, many patent applications rely on
    computerized prediction techniques that are often
    referred to as in silico biology
  • With full or partial gene sequence, scientists
    enter the data into a computer program that
    predicts the amino acid sequence of the resulting
    protein
  • By comparing this hypothetical protein with known
    proteins, the researchers take a guess at what
    the underlying gene sequence does and how it
    might be useful in developing a drug, say, or a
    diagnostic test
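
The prediction step described above, going from a gene sequence to an amino acid sequence, can be sketched in Python using the standard genetic code. The function name `translate` is hypothetical, the reading frame is assumed to start at the first base, and the example sequence is made up.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reads codons from the first base and stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":   # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCTAA"))  # MA
```

The resulting string is the hypothetical protein that would then be compared against databases of known proteins.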

51
  • Searches for compounds that bind to and have the
    desired effect on drug targets still take place
    mainly in a biochemist's traditional wet lab,
    where evaluations for activity, toxicity and
    absorption can take years
  • But now, with the bioinformatics initiatives,
    tools and growing databases of protein structures
    and biomolecular pathways, this aspect of drug
    development is shifting to computers
  • As the saying goes, "genomics without
    bioinformatics will not have much of a payoff"

52
Ayurveda and tribal medicine
  • To date, not much consideration has been given to
    biodiversity, especially the research and
    knowledge base on alternative medicine, Ayurveda,
    herb/shrub applications from remote villages,
    etc.
  • This area of medicine, and the study of its
    effect on genes and proteins, could be another
    challenging and interesting area

53
Future of pharmainformatics
  • Drug companies collect the genetic know-how to
    make medicines tailored to specific genes, an
    effort called pharmacogenomics
  • In the years to come, pharmacists may hand over
    one version of a blood pressure drug based on
    your unique genetic profile, while the person
    behind you in line would get a different version
    of the same medicine
  • There is going to be a day when somebody comes in
    with cancer, and diagnosis can be done not on the
    basis of morphology of the cancer but by looking
    at the detailed patterns of gene expression and
    protein-binding activities in that cell

54
Target for the industry
  • It is expected that in this decade, the
    pharmaceutical industry will be faced with
    evaluating up to 10,000 human proteins against
    which new therapeutics might be directed
  • That is 25 times the number of drug targets that
    have been evaluated by all the companies since
    the dawn of the industry

55
Resources
  • For a primer on genetic testing and a directory
    of genetic tests, visit GeneTests at
    www.genetests.org
  • For more on the ethical, legal and social
    implications of human genome research, visit the
    National Human Genome Research Institute's web
    site at www.nhgri.nih.gov/ELSI