Title: Medical Genomics
Provided by: NBIF

1
Medical Genomics
  • The Impact of Genome Data and New Technologies
    on Health Care

Stuart M. Brown, Research Computing, NYU School of
Medicine
2
A Genome Revolution in Biology and Medicine
  • We are in the midst of a "Golden Era" of biology
  • The Human Genome Project has produced a huge
    storehouse of data that will be used to change
    every aspect of biological research and medicine
  • The revolution is mostly about treating biology
    as an information science, not about specific
    biochemical technologies.

3
  • I. The Human Genome Project
  • II. Genomics
  • - microarrays
  • - SNP genotyping
  • III. The medical and business applications

4
The Human Genome Project
5
Bold Words from Francis Collins
  • The history of biology was forever altered a
    decade ago by the bold decision to launch a
    research program that would characterize in
    ultimate detail the complete set of genetic
    instructions of the human being.

Francis S. Collins, Director of the National
Human Genome Research Institute, N Engl J Med 1999
88242-65
6
A review of some basic genetics
7
(No Transcript)
8
DNA
  • 4 bases (G, C, T, A)
  • base pairs
  • G--C
  • T--A
  • genes
  • non-coding regions
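The pairing rules on this slide are enough to derive one DNA strand from the other. A minimal Python sketch, where the function names and the example sequence are illustrative and not from the original slides:

```python
# Watson-Crick pairing from the slide: G pairs with C, T pairs with A.
PAIR = {"G": "C", "C": "G", "T": "A", "A": "T"}

def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return "".join(PAIR[base] for base in seq.upper())

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return complement(seq)[::-1]

print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```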

9
Decoding Genes
10
  • The human genome is the complete DNA content
    of the 23 pairs of human chromosomes: 44
    autosomes plus two sex chromosomes
  • - approximately 3.2 billion base pairs.

11
Genome Projects
  • Complete genomic sequences
  • Dozens of microorganisms
  • Yeast, C. elegans, Drosophila
  • Mouse
  • Human
  • Comparative genomics
  • All this data is enabling new kinds of research -
    for those with the computational skills to take
    advantage of it.

12
How does genome sequencing technology work?
  • Molecular biology of the Sanger method
  • Manual Gels vs. ABI machines
  • Sub-cloning of fragments - BAC, PAC, cosmid,
    plasmid, phage
  • The need for computers to assemble the "reads"
    and manage the workflow
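To give a feel for why assembly needs a computer, here is a toy greedy overlap assembler in Python. It is only a sketch under strong assumptions (error-free reads, a single strand, no repeats); real assemblers are far more elaborate, and the reads and function names below are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left; the remaining reads stay as separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# Three hypothetical overlapping reads
reads = ["ATGGCGTAC", "GTACCTGAA", "TGAAGGCT"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGAAGGCT']
```

Every pair of reads is compared on every pass, so the work grows quickly with the number of reads, which is one reason the workflow has to be automated.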

13
(No Transcript)
14
(No Transcript)
15
  • Automated sequencing machines, particularly those
    made by PE Applied Biosystems, use 4 colors, so
    they can read all 4 bases at once.

16
(No Transcript)
17
Subcloning
  • DNA sequencers can only read small fragments of
    DNA, 500-1000 bases long
  • It is necessary to break the genome into small
    pieces.
  • Individual chromosomes are cut into 1 million
    base chunks that are cloned into large vectors
    called BACs, PACs, and YACs.
  • These pieces can then be further cut into
    sequenceable pieces (1000 bases) and cloned into
    plasmid or phage vectors.
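The numbers on these slides allow a rough back-of-the-envelope count of clones and reads. The Python sketch below uses the genome size, BAC size, and read length given in the slides; the 10x coverage figure is an assumption added for illustration only.

```python
import math

GENOME_BP = 3_200_000_000   # approximate human genome size (from the slides)
BAC_INSERT_BP = 1_000_000   # "1 million base chunks" (from the slides)
READ_BP = 1_000             # upper end of the 500-1000 base read length
COVERAGE = 10               # ASSUMPTION: each base read about 10 times

def clones_needed(target_bp: int, insert_bp: int, coverage: int = 1) -> int:
    """Minimum number of pieces of size insert_bp to cover target_bp."""
    return math.ceil(target_bp * coverage / insert_bp)

bacs = clones_needed(GENOME_BP, BAC_INSERT_BP)
reads_per_bac = clones_needed(BAC_INSERT_BP, READ_BP, COVERAGE)

print(bacs)                  # 3200
print(reads_per_bac)         # 10000
print(bacs * reads_per_bac)  # 32000000
```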

18
(No Transcript)
19
Raw Genome Data
20
Lots of Sequence Data
  • How to extract useful knowledge from all of this
    data?
  • Need sophisticated computer tools
  • Find the genes
  • Figure out what they do (function)
  • Diagnostic tests
  • Medical treatments

21
What is a Gene?
  • For every 2 biologists, you get 3 definitions
  • A DNA sequence that encodes a heritable
    trait.
  • The unit of heredity
  • Is it an abstract concept, or something you can
    isolate in a tube or print on your screen?
  • Classic vs. modern understanding of molecular
    biology

22
Classic Molecular Biology
  • A gene is a DNA sequence at a particular locus on
    a chromosome that encodes a protein.
  • The Central Dogma of Molecular Biology
  • DNA -> RNA -> Protein
  • A mutation changes the DNA sequence - leads to a
    change in protein sequence - or no protein.
  • Alleles are slightly different DNA sequences of
    the same gene.
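The flow DNA -> RNA -> Protein can be sketched in a few lines of Python. The codon table below is the standard genetic code written in the usual T, C, A, G order; the example gene and its two mutant alleles are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def transcribe(coding_dna: str) -> str:
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

normal   = "ATGGAAGTGCATTAA"  # hypothetical allele
missense = "ATGGTAGTGCATTAA"  # A -> T in codon 2: changed protein
nonsense = "ATGTAAGTGCATTAA"  # G -> T in codon 2: early stop

for allele in (normal, missense, nonsense):
    print(translate(transcribe(allele)))
# MEVH
# MVVH
# M
```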

23
Genome Confusion
  • The sequence of a gene in the genome includes
  • protein coding sequence
  • introns and exons
  • 5' and 3' untranslated regions on the mRNA
  • promoter and 5' transcription factor binding
    sites
  • enhancers??
  • What about alternative splicing?
  • Multiple cDNAs with different sequences (that
    produce different proteins) can be transcribed
    from the same genomic locus
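Alternative splicing can be pictured as choosing which exons from one locus end up in the mRNA. A small Python sketch with an invented three-exon locus (exons in upper case, introns in lower case; coordinates are 0-based, end-exclusive):

```python
def splice(genomic: str, exons) -> str:
    """Join the listed exon intervals, dropping everything in between."""
    return "".join(genomic[start:end] for start, end in exons)

# Hypothetical locus: exon1 + intron1 + exon2 + intron2 + exon3
genomic = "ATGGCC" + "gtaagtcag" + "TTTCCC" + "gtatgacag" + "GGGTAA"

full_length = [(0, 6), (15, 21), (30, 36)]  # all three exons
exon_skipped = [(0, 6), (30, 36)]           # exon 2 skipped

print(splice(genomic, full_length))   # ATGGCCTTTCCCGGGTAA
print(splice(genomic, exon_skipped))  # ATGGCCGGGTAA
```

Both transcripts come from the same genomic locus but encode different proteins.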

24
Finding genes in genome sequence is not easy
  • About 1% of human DNA encodes functional genes.
  • Genes are interspersed among long stretches of
    non-coding DNA.
  • Repeats, pseudo-genes, and introns confound
    matters
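The simplest possible gene finder just looks for open reading frames (a start codon followed in-frame by a stop codon). The Python sketch below does that on the forward strand only. It ignores introns, repeats, and pseudogenes, so it illustrates why this naive approach falls short on human DNA; the function name and test sequence are illustrative.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end) of ATG...stop stretches in the 3 forward frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    min_codons counts the codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

seq = "CCATGAAACCCGGGTAGTT"
print(find_orfs(seq))  # [(2, 17)]
print(seq[2:17])       # ATGAAACCCGGGTAG
```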

25
  • The next step is obviously to locate all of the
    genes and describe their functions. This will
    probably take another 15-20 years!

26
How Many Genes?
  • The current estimate is 34,000 human genes.
  • The same number as the mouse, only about 5 times
    more than yeast.
  • Yet two different versions of the human genome
    (Celera vs. Ensembl/UCSC) show only about 50%
    overlap between the genes that they have
    described.

27
All the Genes?
  • Any human cDNA can now be found in the genome by
    similarity searching with 99% certainty.
  • However, the sequence still has many gaps
  • unlikely to find a completely uninterrupted
    genomic segment for any gene
  • still can't identify pseudogenes with certainty
  • This will improve as more sequence data
    accumulates

28
Data Mining Tools
  • Scientists need to work with a lot of layers of
    information about the genome
  • coding sequence of known genes and cDNAs
  • genetic maps (known mutations and markers)
  • gene expression
  • cross species homology

29
(No Transcript)
30
UCSC
31
Ensembl at EBI/EMBL
32
(No Transcript)
33
(No Transcript)
34
II. Genomics
  • What is Genomics?
  • An operational definition
  • The application of high throughput automated
    technologies to biology.
  • A philosophical definition
  • A holistic or systems approach to the study of
    information flow within a cell.

35
Genome Sequencing created Genomics
  • All genomics technologies depend on the data
    produced by genome sequencing
  • Do molecular biology in a massively parallel
    fashion using robotics and automated data
    collection
  • Build databases rather than ask questions about
    single genes or a single process

36
Genomics Technologies
  • Automated DNA sequencing
  • Automated annotation of sequences
  • DNA microarrays
  • gene expression (measure RNA levels)
  • SNP Genotyping
  • Genome diagnostics (genetic testing)
  • Proteomics
  • Protein identification
  • Protein-protein interactions

37
DNA chip microarrays
  • Put a large number (100K) of cDNA sequences or
    synthetic DNA oligomers onto a glass slide (or
    other substrate) in known locations on a grid.
  • Label an RNA sample and hybridize
  • Measure amounts of RNA bound to each square in
    the grid
  • Make comparisons
  • Cancerous vs. normal tissue
  • Treated vs. untreated
  • Time course
  • Many applications in both basic and clinical
    research
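The "make comparisons" step usually comes down to a ratio of signal between two samples for each spot. A minimal Python sketch using log2 ratios; the gene names, intensity values, pseudocount, and 2-fold threshold are all assumptions for illustration, and real analyses also need normalization and replicates.

```python
import math

def log2_ratios(sample, reference, pseudocount=1.0):
    """log2(sample/reference) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((sample[gene] + pseudocount) / (reference[gene] + pseudocount))
        for gene in sample
        if gene in reference
    }

def changed_genes(ratios, threshold=1.0):
    """Genes whose signal changed at least 2-fold (|log2 ratio| >= 1)."""
    return {gene: r for gene, r in ratios.items() if abs(r) >= threshold}

# Hypothetical spot intensities
tumor  = {"gene_A": 799, "gene_B": 199, "gene_C": 49}
normal = {"gene_A": 99,  "gene_B": 199, "gene_C": 399}

ratios = log2_ratios(tumor, normal)
print(ratios)                 # {'gene_A': 3.0, 'gene_B': 0.0, 'gene_C': -3.0}
print(changed_genes(ratios))  # {'gene_A': 3.0, 'gene_C': -3.0}
```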

38
Goal of Microarray experiments
  • Microarrays are a very good way of identifying a
    bunch of genes involved in a disease process
  • Differences between cancer and normal tissue
  • Tuberculosis infected vs resistant lung cells
  • Mapping out a pathway
  • Co-regulated genes
  • Finding function for unknown genes
  • Involved in these processes
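One common way to look for co-regulated genes is to correlate their expression profiles across conditions or time points. A small Python sketch using the Pearson correlation; the profiles below are invented, and a real study would use many more measurements and a clustering method on top.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression levels at four time points
gene_x = [1.0, 2.0, 3.0, 4.0]
gene_y = [2.0, 4.0, 6.0, 8.0]  # rises together with gene_x
gene_z = [4.0, 3.0, 2.0, 1.0]  # falls as gene_x rises

print(round(pearson(gene_x, gene_y), 3))  # 1.0
print(round(pearson(gene_x, gene_z), 3))  # -1.0
```

An unknown gene whose profile tracks a set of known genes is a candidate for involvement in the same process.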

39
Competing Microarray Technologies
  • Affymetrix Gene chip system
  • Uses 25 base oligos synthesized in place on a
    chip
  • Can have as many as 20,000 genes on a chip
  • Arrays get smaller every year (more genes)
  • Chips are very expensive
  • Proprietary system: "black box" software, can
    only use their chips
  • cDNA spotting technology
  • Multiple vendors, or make your own
  • Can buy chips, complete systems, or contract
    services (Incyte)
  • Hundreds to a few thousands of genes per chip
  • More sensitive, but less specific than Affymetrix
    system
  • Oligonucleotides
  • Nylon Filters

40
(No Transcript)
41
(No Transcript)
42
cDNA spotted microarrays
43
(No Transcript)
44
Direct Medical Applications
  • Diagnosis
  • Type of cancer
  • Aggressive or benign?
  • Monitor treatment outcome
  • Is a treatment having the desired effect on the
    target tissue?

45
Human Genetic Variation
  • Every human has essentially the same set of genes
  • But there are different forms of each gene --
    known as alleles
  • blue vs. brown eyes
  • genetic diseases such as cystic fibrosis or
    Huntington's disease are caused by dysfunctional
    alleles

46
  • Alleles are created by mutations in the DNA
    sequence of one person - which are passed on to
    their descendants

47
Effects of Mutations
  • Mutations occur randomly throughout the DNA
  • Most have no phenotypic effect (non-coding
    regions, equivalent codons, similar AAs)
  • Some damage the function of a protein or
    regulatory element
  • A very few provide an evolutionary advantage
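Whether a single-base change in a coding region matters depends on what it does to the codon. A Python sketch that sorts point mutations into silent (equivalent codon), missense, and nonsense; the labels and the example codon are illustrative, and the sketch does not judge whether a missense change swaps in a similar amino acid.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_point_mutation(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base substitution inside one codon (position 0-2)."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"    # equivalent codon, same amino acid
    if after == "*":
        return "nonsense"  # new stop codon, truncated protein
    return "missense"      # different amino acid

print(classify_point_mutation("GAA", 2, "G"))  # silent   (GAA -> GAG, both Glu)
print(classify_point_mutation("GAA", 1, "T"))  # missense (GAA -> GTA, Glu -> Val)
print(classify_point_mutation("GAA", 0, "T"))  # nonsense (GAA -> TAA, stop)
```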

48
Human Alleles
  • The OMIM (Online Mendelian Inheritance in Man)
    database at the NCBI tracks all human mutations
    with known phenotypes.
  • It contains a total of about 2,000 genetic
    diseases and another 11,000 genetic loci with
    known phenotypes - but not necessarily known gene
    sequences
  • It is designed for use by physicians
  • can search by disease name
  • contains summaries from clinical studies

49
(No Transcript)
50
Clinical Manifestations of Genetic Variation
  • (All disease has a genetic component)
  • Susceptibility vs. resistance
  • Variations in disease severity or symptoms
  • Reaction to drugs (pharmacogenetics)
  • All of these traits can be traced back to
    particular genes (or sets of genes) but we don't
    know these associations yet.

51
So What's a SNP?
  • A mutation that causes a single base change is
    known as a Single Nucleotide Polymorphism (SNP)
  • SNPs are very common in the human population (one
    SNP every 1250 bases)
  • there are SNPs located near all genes
  • they can be used as markers
  • Most of these have no visible effect
  • in regions between genes
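Two quick illustrations in Python: the genome-wide SNP count implied by the figures in these slides, and how SNPs show up when two already-aligned sequences are compared base by base. The sequences are invented, and the comparison assumes equal-length sequences with no insertions or deletions.

```python
GENOME_BP = 3_200_000_000  # approximate genome size (from the slides)
SNP_SPACING_BP = 1250      # one SNP every 1250 bases (from the slides)

expected_snps = GENOME_BP // SNP_SPACING_BP
print(expected_snps)  # 2560000

def find_snps(reference: str, sample: str):
    """List (position, reference_base, sample_base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]

print(find_snps("ACGTTGCA", "ACGTCGCA"))  # [(4, 'T', 'C')]
```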

52
Genome Sequencing Finds SNPs
53
SNP Genotyping
  • SNPs are a form of mutation that can be used to
    measure genetic differences in a high-throughput
    fashion.
  • A genomics approach to genetic testing
  • Lots of room for bio-technology innovation
  • Allele-specific PCR
  • Site specific sequencing
  • Genotyping microarray chips

54
SNP Genotyping
  • It is possible to measure many thousands of SNPs
    simultaneously in a small blood sample from a
    patient
  • Can compare genotypes for SNP markers linked to
    virtually any trait
  • A human genome can be characterized with a few
    thousand common SNP markers
  • on a single chip
  • a personal genetic profile

55
Some Diseases Involve Many Genes
  • There are a number of classic genetic diseases
    caused by mutations of a single gene
  • Huntington's, Cystic Fibrosis, Tay-Sachs, PKU,
    etc.
  • There are also many diseases that are the result
    of the interactions of many genes
  • asthma, heart disease, cancer
  • Each of these genes may be considered to be a
    risk factor for the disease.
  • Groups of genetic markers (SNPs) may be
    associated with disease risk without determining
    a mechanism.

56
DNA Diagnostic Testing
  • Hereditary diseases - potential parents,
    pre-natal, late onset diseases
  • Genes that predispose to disease (risk factors)
  • Genotyping of infectious agents (bacterial,
    viral)
  • Measure the type and stage of cancer tumors
  • Forensics - using DNA testing to establish
    identity

57
III. The Medical and Business Applications of
Genomics
58
Implications for Biomedicine
  • Physicians will use genetic information to
    diagnose and treat disease.
  • Virtually all medical conditions have a genetic
    component.
  • Faster drug development research
  • Individualized drugs
  • Gene therapy
  • All Biologists will use gene sequence information
    in their daily work

59
Pharmacogenomics
  • The use of DNA sequence information to measure
    and predict the reaction of individuals to drugs
  • Personalized drugs
  • Faster clinical trials
  • Fewer drug side effects

60
People React Differently to Drugs
  • Side effects
  • Effectiveness
  • There are genes that control these reactions
  • SNP markers can be used to identify these genes

61

Make Genetic Profiles
  • Identify populations of people who show specific
    responses to a drug
  • Scan these populations with a large number of SNP
    markers.
  • Find markers linked to drug response phenotypes.
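Finding markers linked to a drug response is, at its simplest, a test of association between carrying an allele and responding to the drug. The Python sketch below uses a plain 2x2 chi-square statistic; the marker names and patient counts are invented, and a real scan over many markers would also need correction for multiple testing.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                    responders   non-responders
        allele          a              b
        no allele       c              d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical counts per marker: (a, b, c, d) as laid out above
markers = {
    "snp_1": (40, 15, 10, 35),
    "snp_2": (25, 24, 25, 26),
}

CRITICAL_VALUE = 3.84  # chi-square cutoff for p = 0.05 with 1 degree of freedom

scores = {name: chi_square_2x2(*counts) for name, counts in markers.items()}
hits = [name for name, score in scores.items() if score > CRITICAL_VALUE]

print({name: round(score, 2) for name, score in scores.items()})
# {'snp_1': 25.25, 'snp_2': 0.04}
print(hits)  # ['snp_1']
```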

62
Use the Profiles
  • Genetic profiles of new patients can then be used
    to prescribe drugs more effectively and avoid
    adverse reactions.
  • Sell a drug with a gene test
  • Can also speed clinical trials by testing on
    those who are likely to respond well.

63
Toxicogenomics
  • There are a number of common pathways for drug
    toxicity (or environmental tox.)
  • It is possible to compile genomic signatures
    (gene expression data) for these pathways.
  • Candidate drug molecules can be screened in cell
    culture or in animals for induction of these
    toxicity pathways.

64
Genomics supports Biotechnology
  • Biotechnology is based on developing new drugs
  • Some Biotech companies produce and sell these
    drugs (Amgen, Genentech), while others partner
    with big pharmaceutical companies (sell
    intellectual property)
  • Genomics is a way of using information to find
    new drugs faster and more cheaply.

65
Tools vs. Targets
  • Genomics/Bioinformatics companies can sell
    information and software (tools) or the results
    of genome analysis (targets)
  • Tools are bought by big Biotech or Pharma
    companies to aid their own research.
  • Targets are proteins that have already been
    identified as playing a role in disease
  • ready for drug development

66
Tools can be Software or Technology
  • Database, data analysis, data mining, and
    interface software is essential
  • Machines and reagents for genomics experiments
    (GeneChips, gene testing machines)
  • Some tools will have a mainstream application in
    medicine (diagnostic tests)
  • a much wider market

67
Impact on Bioinformatics
  • Genomics produces high-throughput, high-quality
    data, and bioinformatics provides the analysis
    and interpretation of these massive data sets.
  • It is impossible to separate genomics laboratory
    technologies from the computational tools
    required for data analysis.
  • Investment in genomics lab technology must
    include funding for bioinformatics support

68
Planning for a Genomics Revolution
  • Bioinformatics support must be integral in the
    planning process for the development of new
    genomics research facilities.
  • Genome Project sequencing centers devote more
    staff and more money to data analysis than to
    the sequencing itself.
  • Microarray facilities will be even more skewed
    toward data analysis
  • It is an information-intensive business!

69
Long Term Implications
  • A "periodic table for biology" will lead to an
    explosion of research and discoveries - we will
    finally have the tools to start making systematic
    analyses of biological processes (quantitative
    biology).
  • Understanding the genome will lead to the
    ability to change it - to modify the
    characteristics of organisms and people in a wide
    variety of ways

70
Genomics Education
  • Genomics scientists need basic training in both
    Molecular Biology and Computing
  • Specific training in the use of automated
    laboratory equipment, the analysis of large
    datasets, and bioinformatics algorithms
  • Particularly important for the training of
    medical doctors - at least a familiarity with the
    technology

71
Genomics in Medical Education
  • "The explosion of information about the new
    genetics will create a huge problem in health
    education. Most physicians in practice have had
    not a single hour of education in genetics and
    are going to be severely challenged to pick up
    this new technology and run with it."
  • Francis Collins

72
Stuart M. Brown, Ph.D.
stuart.brown_at_med.nyu.edu
www.med.nyu/rcr
Bioinformatics: A Biologist's Guide to
Biocomputing and the Internet