AFRICAN INSTITUTE FOR MATHEMATICAL SCIENCES (AIMS)

1
AFRICAN INSTITUTE FOR MATHEMATICAL SCIENCES (AIMS)
  • REVIEW COURSE ON BIOINFORMATICS
  • 17 MARCH TO 4 APRIL, 2008
  • LECTURER: DELE OLUWADE

2
EPILOGUE/CONCLUSION
  • In the past 2 weeks or so, we have been
    discussing some of the major topics in the
    emergent field of Bioinformatics.
  • Some of the basic definitions in the field have
    been included in the various lecture modules,
    exercises, tutorial questions and assignments.

3
  • This week, a synthetic summary of the discussions
    in the past 2 weeks shall be presented.
  • It is hoped that some of the general and
    particular open problems in the field will
    attract the interest of the class, with a view to
    making a substantial contribution to this field,
    which touches on human health.

4
OUTLINE OF THIS PRESENTATION
  • In this lecture module, the following, among
    others, will be discussed:
  • (A) Overview of the Chemical Composition of
    Humans
  • (B) Practical illustration of the discussion in
    this course
  • (C) Some Fundamental Problems of Bioinformatics
  • (D) Relevant Mathematical and Computational Tools

5
(A) CHEMICAL COMPOSITION OF HUMANS
  • To say that Bioinformatics is a wide field is an
    understatement.
  • This is not unexpected, considering that
    Bioinformatics, as an interdisciplinary field,
    runs across at least 4 fields of study,
    including Biology, Computer Science, Mathematics
    and Statistics.

7
  • The human genome contains about 3 billion base
    pairs.
  • These pairs are arranged into 23 pairs of
    chromosomes.
  • A small subset of the long sequence creates the
    c. 20,000 human genes.
  • Most of these genes code for the proteins which
    determine a person's biochemical makeup and
    physical characteristics.

8
  • The remainder of the genome, about 98%, is
    noncoding DNA.
  • In certain sections of the human genome, this
    noncoding DNA contains repeated patterns of 2-5
    nucleotides.
  • The number of repeats in each sequence varies
    from person to person.

9
  • For a start
  • QUESTION NO 1
  • HOW CAN WE EXPLAIN, FOR INSTANCE, THE DIFFERENCE
    IN HEIGHTS OF 2 PERSONS IN THE LIGHT OF WHAT HAS
    BEEN DISCUSSED IN THIS COURSE ON BIOINFORMATICS?

10
HUMAN TRAIT DIFFERENCE
  • The difference in height relates, in part, to the
    HMGA2 gene.
  • In taller persons, a C is written at one position
    in the DNA code instead of a T, i.e. they inherit
    the C-containing copy of the DNA code.
  • One copy can add about 0.5 cm in height, two
    copies about 1 cm.

11
  • QUESTION NO 2
  • WHAT HAPPENS WHEN A DNA SAMPLE OF A PERSON IS
    TAKEN FOR THE PURPOSE OF RESOLVING CRIMINAL
    PROBLEMS, PATERNITY PROBLEMS etc?

12
Q2: DNA ANALYSIS PROCEDURE
  • (i) DNA sample is extracted from a cheek swab OR
    blood OR semen OR hair
  • (ii) DNA is isolated from other cellular
    components in the laboratory by a technician or a
    robotic instrument.

13
  • Extracted DNA passes through a standard method of
    creating many additional copies of a selected
    segment of DNA.
  • This method or process is called the POLYMERASE
    CHAIN REACTION (PCR).
  • For this process, 10 relevant sites are used in
    the United Kingdom and 13 in the U.S.A.; these are
    the targets of the PCR method.
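
The growth in copy number during PCR can be illustrated with simple arithmetic. The sketch below assumes an idealized reaction in which every cycle duplicates each copy. The function name and the `efficiency` parameter are illustrative assumptions, not part of the lecture, and real reactions fall short of perfect doubling.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate the number of DNA copies after a number of PCR cycles.

    efficiency = 1.0 is the ideal case: every copy is duplicated in every cycle.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 30 ideal cycles: 2**30, about 1.07 billion copies.
print(pcr_copies(1, 30))
```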

14
  • The database system established in the U.S.A. to
    link existing local, state and federal systems is
    known as the COMBINED DNA INDEX SYSTEM (CODIS).

15
  • Then, a genetic analyzer separates the resulting
    10 or 13 (as the case may be) DNA fragments, and
    measures the number of repeats in each fragment.
  • For instance, the variations examined may be
    MICROSATELLITE REPEATS or SINGLE-NUCLEOTIDE
    POLYMORPHISMS.
16
EXAMPLE OF MICROSATELLITE REPEATS
  • GTC GAT CCA CAC ACA CAC ACA CAT CGA TTC (a)
  • TCG ATC CAC ACA CAC ACA CAC ACA TCG ATT (b)
  • ATC CAC ACA CAC ACA CAC ACA CAC ACA TCG (c)
  • NOTE
  • The dinucleotide repeat CA is reiterated 8
    times in (a), 9 times in (b) and 12 times in
    (c), at different positions in the genome.
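
The repeat counts in the note can be checked mechanically. The following is a minimal sketch, assuming that the quantity of interest is the longest uninterrupted run of the motif. The function name is illustrative and not part of the lecture.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    seq = sequence.replace(" ", "").upper()
    # Each match is one maximal run of the motif, e.g. "CACACA".
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)


samples = {
    "a": "GTC GAT CCA CAC ACA CAC ACA CAT CGA TTC",
    "b": "TCG ATC CAC ACA CAC ACA CAC ACA TCG ATT",
    "c": "ATC CAC ACA CAC ACA CAC ACA CAC ACA TCG",
}
counts = {label: longest_repeat_run(seq, "CA") for label, seq in samples.items()}
print(counts)  # {'a': 8, 'b': 9, 'c': 12}
```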

17
EXAMPLE OF SINGLE-NUCLEOTIDE POLYMORPHISMS
  • ACGGGTCGTCGATCCATCGATTCGTAGCTAT
  • ACGGGTTGTCGATCCATCGATTCGTAGCTAT
  • NOTE
  • A single-nucleotide polymorphism (SNP, pronounced
    "snip") occurs when the same stretch of DNA
    varies from one copy of a chromosome to another
    by a single nucleotide.
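
The position of the SNP in the two sequences above can be located by a direct comparison. This is a minimal sketch, assuming two already-aligned sequences of equal length. The function name is illustrative.

```python
def find_snps(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List the 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


copy_1 = "ACGGGTCGTCGATCCATCGATTCGTAGCTAT"
copy_2 = "ACGGGTTGTCGATCCATCGATTCGTAGCTAT"
print(find_snps(copy_1, copy_2))  # [(7, 'C', 'T')]
```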

18
  • For the purpose of forensic analysis, only
    repeats at several positions (called LOCI) on the
    genome are important.
  • The number of repeats at each locus is called an
    ALLELE
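
A forensic profile can be pictured as a table of loci, each holding the two alleles (repeat counts) inherited from the parents. The sketch below compares two such profiles. The locus names are used only as labels, and the repeat counts are invented purely for illustration; the function names are likewise assumptions.

```python
def normalize(profile: dict[str, tuple[int, int]]) -> dict[str, tuple[int, ...]]:
    """Sort the two alleles at each locus so that their order does not matter."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def matching_loci(profile_a: dict, profile_b: dict) -> list[str]:
    """Return the loci, present in both profiles, at which the allele pairs agree."""
    a, b = normalize(profile_a), normalize(profile_b)
    return sorted(locus for locus in a.keys() & b.keys() if a[locus] == b[locus])


# Hypothetical repeat counts, invented for this example only.
crime_scene = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)}
suspect = {"D3S1358": (17, 15), "vWA": (16, 18), "FGA": (22, 24)}

print(matching_loci(crime_scene, suspect))  # ['D3S1358', 'vWA']
```

In this toy case the profiles disagree at one locus, so they would not be declared a match.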

19
  • QUESTION 3
  • WHAT IS RESPONSIBLE FOR DIFFERENCES BETWEEN HUMAN
    GENOMES AROUND THE WORLD?

20
  • Research has shown that a lot of differences in
    the genomes of races of the world can be traced
    to single nucleotide polymorphisms (SNP)

21
(C) SOME FUNDAMENTAL PROBLEMS OF BIOINFORMATICS
  • 1. SEQUENCING AND COMPARISON OF THE GENOMES OF
    DIFFERENT SPECIES
  • - This activity involves copying several
    different human genomes many times and breaking
    them into millions of small fragments.
  • - The sequence of bases of genomes varies from
    individual to individual.
  • - Any 2 humans have the same complement of
    genes, but different alleles.
  • - Draft sequencing of the following organisms
    has been completed:
22
  • HUMAN GENOME
  • BACTERIAL GENOMES
  • YEAST
  • NEMATODE
  • DROSOPHILA MELANOGASTER, a fruit fly

23
  • 2. IDENTIFICATION OF THE GENES AND DETERMINATION
    OF THE PROTEINS THEY ENCODE
  • - Genes can be identified by methods confined
    to a single genome
  • OR
  • - by comparative methods which use information
    about one organism to understand another related
    one.

24
  • 3. UNDERSTANDING OF GENE EXPRESSION
  • i.e.
  • - How do genes and proteins act in concert to
    control cellular processes?
  • - Why do different cell types express
    different genes, and do so at different times?

25
  • 4. TRACING OF THE EVOLUTIONARY RELATIONSHIPS
    AMONG EXISTING SPECIES AND THEIR EVOLUTIONARY
    ANCESTORS

26
  • 5. FINDING A SOLUTION TO THE PROTEIN FOLDING
    PROBLEM
  • - i.e. Given the linear sequence of amino
    acids in a protein, what is the 3-dimensional
    structure into which it folds?
  • - This is still an unsolved problem.

27
  • 6. DISCOVERY OF THE ASSOCIATIONS BETWEEN GENE
    MUTATIONS AND DISEASE
  • - Diseases like Huntington's disease and cystic
    fibrosis are caused by a mutation in a single
    gene.
  • - Diseases like diabetes, cancer and heart
    disease are caused by both genetic and
    environmental factors.

28
(D) RELEVANT MATHEMATICAL AND COMPUTATIONAL TOOLS
  • (i) Storing of billions of bases of DNA, RNA and
    protein sequence data in public databases
  • - COMPUTER PROGRAMMING
  • - DATABASE DESIGN

29
  • (ii) Conversion of laser-scanned traces into
    sequences of As, Cs, Gs and Ts
  • - DYNAMIC PROGRAMMING (ALGORITHM)
  • - MACHINE LEARNING
  • - FOURIER ANALYSIS

30
  • (iii) Assembling of millions of fragments into
    large pieces
  • - COMBINATORIAL THEORY
  • - GRAPH THEORY
  • - PROBABILITY THEORY
  • - STATISTICS
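
The assembly step can be pictured as a graph whose nodes are fragments and whose edges are suffix-prefix overlaps. The sketch below uses a simple greedy strategy that repeatedly merges the pair with the largest overlap. This is a toy illustration under the assumption of error-free fragments from one strand, not the method of any particular assembler, and the function names and `min_len` parameter are assumptions.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # no remaining pair overlaps enough
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


fragments = ["ACGGGTCG", "GTCGTCGA", "TCGATCC"]
print(greedy_assemble(fragments))  # ['ACGGGTCGTCGATCC']
```

Greedy merging can be misled by repeated regions, which is one reason the combinatorial and statistical tools listed above are needed in practice.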

31
  • (iv) Identification of new genes from an
    assembled genome
  • - HIDDEN MARKOV MODEL
  • - STOCHASTIC CONTEXT-FREE GRAMMARS
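
To give a flavour of how a hidden Markov model can label a sequence, the sketch below runs the Viterbi algorithm on a two-state model. The states, and every probability in the tables, are invented for illustration (including the assumption that the "coding" state favours G and C); a real gene finder uses far richer models trained on data.

```python
import math

STATES = ("noncoding", "coding")
# All probabilities below are invented for illustration only.
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}


def viterbi(sequence: str) -> list[str]:
    """Return the most probable state path for `sequence`, computed in log space."""
    if not sequence:
        return []
    score = {s: math.log(START[s]) + math.log(EMIT[s][sequence[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for base in sequence[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


path = viterbi("ATATGCGCGCGCATAT")
print("".join("C" if s == "coding" else "n" for s in path))  # nnnnCCCCCCCCnnnn
```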

32
  • (v) Comparison of new DNA sequences to those in
    the databases
  • - FINITE STATE MACHINES AND OTHER STRING
    SEARCH ALGORITHMS
  • - SEQUENCE ALIGNMENT ALGORITHMS (mainly
    involves DYNAMIC PROGRAMMING)
  • - STATISTICAL SCORING
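
The role of dynamic programming in sequence alignment can be seen in a small global-alignment scorer in the style of Needleman-Wunsch. This is a sketch only: the scoring values (match +1, mismatch -1, gap -1) are assumed for illustration, and database search tools use more elaborate scoring and heuristics.

```python
def global_alignment_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> int:
    """Score of the best global alignment of `a` and `b` (Needleman-Wunsch)."""
    # previous[j] holds the best score for aligning a[:i-1] with b[:j].
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]


# Two 15-base sequences differing by a single nucleotide: 14 matches, 1 mismatch.
print(global_alignment_score("ACGGGTCGTCGATCC", "ACGGGTTGTCGATCC"))  # 13
```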

33
  • (vi) Knowing about the proteins which genes
    encode
  • - HIDDEN MARKOV MODEL
  • - DATA STRUCTURES (for clustering and
    phylogenetic tree analysis)

34
  • (vii) Studying the structure and function of
    proteins
  • - CODING THEORY
