Bioinformatics: Definitions, Challenges and Impact on Health Care Systems

1
Bioinformatics: Definitions, Challenges and
Impact on Health Care Systems
  • Joyce Mitchell, PhD
  • Professor and Chair
  • Department of Biomedical Informatics
  • University of Utah School of Medicine
  • http://uuhsc.utah.edu/medinfo

2
Topics
  • What is Bioinformatics?
  • Scope of Bioinformatics
  • Genomics
  • Proteomics
  • Functional genomics
  • Genomics data and patient care
  • Impact of Bioinformatics on Health Information
    Systems

3
Central Dogma of Molecular Biology
[Diagram: DNA (replication) → transcription → RNA → translation → Protein → post-translational modification → Phenotype]

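As a small illustration of the flow pictured on this slide, the sketch below transcribes a DNA coding strand into mRNA and translates it into a short peptide. The function names are illustrative and not from the presentation, and the codon table is deliberately cut down to the few codons the example needs.

```python
# Partial codon table: only the codons used in the example below.
# Assumption: a real tool would carry all 64 codons of the standard code.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


dna = "ATGGCTTGGAAATAA"      # toy sequence, not real data
mrna = transcribe(dna)       # DNA -> RNA (transcription)
protein = translate(mrna)    # RNA -> protein (translation)
print(mrna, protein)
```

Post-translational modification and phenotype are not modeled here; they are the steps where a simple string transformation stops being adequate.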
4
What is Bioinformatics?
  • Definitions

5
NIH Working Definition
  • Bioinformatics: Research, development, or
    application of computational tools and approaches
    for expanding the use of biological, medical,
    behavioral or health data, including those to
    acquire, store, organize, archive, analyze, or
    visualize such data.
  • http://www.bisti.nih.gov/CompuBioDef.pdf

6
Another: NCBI (National Center for Biotechnology
Information)
  • Bioinformatics is the field of science in
    which biology, computer science, and information
    technology merge into a single discipline. The
    ultimate goal of the field is to enable the
    discovery of new biological insights and to
    create a global perspective from which unifying
    principles in biology can be discerned.
  • http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html

7
Bioinformatics and Health Informatics
  • Bioinformatics is the study of the flow of
    information in biological sciences.
  • Health Informatics is the study of the flow of
    information in patient care.
  • These two fields are on a collision course as
    genomics data becomes used in patient care.
  • Russ Altman, MD, PhD, Stanford Univ.

8
Scope of Bioinformatics
  • OMES and OMICS

9
Omes and Omics
  • Genomics
    • Primarily sequences (DNA and RNA)
    • Databanks and search algorithms
    • Supports studies of molecular evolution (Tree
      wars)
  • Proteomics
    • Sequences (Protein) and structures
    • Mass spectrometry, X-ray crystallography
    • Databanks, knowledge bases, visualization
  • Functional Genomics (transcriptomics)
    • Microarray data
    • Databanks, analysis tools, controlled
      terminologies
  • Systems Biology (metabolomics)
    • Metabolites and interacting systems
      (interactomics)
    • Graphs, visualization, modeling, networks of
      entities

10
Central Dogma of Molecular Biology
[Diagram: DNA → RNA → Protein → Phenotype, labeled with the fields Structural Genomics, Functional Genomics (Transcriptomics), Proteomics, Phenomics]

11
Human Genome Project
  • Human Genome Project - International research
    effort
  • Determine sequence of human genome and other
    model organisms
  • Began 1990, completed 2003
  • Next steps for 20,000 genes
  • Function and regulation of all genes
  • Significance of variations between people
  • Cures, therapies, genomic healthcare

12
Genome and Genomics
  • Genome: entire complement of DNA in a species
    • Both nuclear and mitochondrial/chloroplast
    • Variants among individuals
  • Genomics: study of the sequence, structure and
    function of the genome. Study relationships
    among sets of genes rather than single genes.
  • Comparative genomics: study of the differences
    among species. Usually covers evolutionary
    studies of differences and conservation over time.

13
Genome Databases (e.g., GenBank)
  • Consists of
    • long strings of DNA bases (ATCG...)
    • annotations of this database to attach meaning to
      the sequence data
  • Example entry from GenBank
    • http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=NM_000410&dopt=gb
      (hemochromatosis gene HFE)
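The "sequence plus annotations" idea can be sketched as a tiny data structure. This is not the GenBank flat-file format; the class names, the accession, and the sequence below are all hypothetical and only show how an annotation attaches meaning to a span of bases.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    kind: str      # e.g. "CDS" or "exon"
    start: int     # 0-based, inclusive
    end: int       # 0-based, exclusive
    note: str = ""


@dataclass
class SequenceRecord:
    accession: str
    sequence: str
    features: list = field(default_factory=list)

    def extract(self, feature: Feature) -> str:
        """Return the bases covered by an annotation."""
        return self.sequence[feature.start:feature.end]


record = SequenceRecord(
    accession="EXAMPLE_0001",          # hypothetical accession
    sequence="GGCATGGCTTGGAAATAAGC",   # toy sequence, not real data
)
record.features.append(Feature("CDS", 3, 18, note="toy coding region"))

cds = record.extract(record.features[0])
print(record.accession, cds)
```

Without the feature list the record is just a string of letters; the annotation is what says where the coding region starts and ends.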

14
(No Transcript)
15
The Genome Sequence is at hand... so?
The good news is that we have the human genome.
The bad news is it's just a parts list.
16
The Human Genome Project has catalyzed striking
paradigm changes in biology - biology is an
information science.
  • Leroy Hood, MD, PhD
  • Institute for Systems Biology
  • Seattle, Washington

17
Genomes In Public Databases
  • Published complete genomes
  • Ongoing prokaryotic genome projects
  • Ongoing eukaryotic genomes

2700
http://www.genomesonline.org/
18
Genomics activities
  • Sequence the genes and chromosomes: done by
    breaking the DNA into parts
  • Map the location of various gene entities to
    establish their order
  • Compare the sequences with other known sequences
    to determine similarity
  • Across species, conserved sequence motifs
  • Predict secondary structure of proteins
  • Create large databases: GenBank, EMBL, DDBJ
  • Develop algorithms and similarity measures
  • BLAST and its many forms
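To make "similarity measure" concrete, here is a minimal sketch of a local alignment score computed by dynamic programming (Smith-Waterman style, linear gap penalty). It is not BLAST, which uses a faster heuristic search; the scoring values and the function name are assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best local alignment score between a and b (score only, no traceback)."""
    prev = [0] * (len(b) + 1)   # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]              # first column is always 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# The shared motif "ACGT" is found even though the flanking bases differ.
shared = local_alignment_score("TTACGTTT", "GGACGTGG")
unrelated = local_alignment_score("AAAA", "TTTT")
print(shared, unrelated)
```

A conserved motif shared by two otherwise different sequences scores well, which is the property the "across species, conserved sequence motifs" bullet relies on.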

19
Central Dogma of Molecular Biology
[Diagram: DNA → RNA → Protein → Phenotype, labeled with the fields Genomics, Transcriptomics (Functional Genetics), Proteomics]

20
Proteome vs Transcriptome
  • Functional genomics (transcriptomics) looks at
    the timing and regulation of gene products (mRNA,
    primarily)
  • Proteome is final end-product (set of many or all
    proteins).
  • Relationship between transcriptome and proteome
    is complex, due to longevity of mRNA signal,
    subsequent control of translation to protein, and
    post translational modifications.

21
Functional Genomics Technologies: Gene Chips,
Microarrays, etc.
22
Functional Genomics: Microarrays
  • Transcriptome and transcriptomics
  • High throughput technique designed to measure the
    relative abundance of mRNA in a cell or tissue
    in response to an experiment.
  • Also called gene expression analysis
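"Relative abundance" is commonly summarized as a log2 ratio between two conditions. The sketch below assumes already-normalized signal intensities and uses a pseudocount of 1 to avoid dividing by zero; the gene names and intensity values are invented for illustration.

```python
import math


def log2_fold_change(treated: float, control: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of treated vs. control signal (positive = up-regulated)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical (treated, control) intensities for three genes.
signals = {
    "geneA": (799.0, 99.0),    # strongly up
    "geneB": (49.0, 199.0),    # down
    "geneC": (150.0, 150.0),   # unchanged
}

changes = {gene: log2_fold_change(t, c) for gene, (t, c) in signals.items()}
for gene, value in changes.items():
    print(f"{gene}: {value:+.2f}")
```

On the log2 scale a doubling and a halving are symmetric (+1 and -1), which is one reason this scale is preferred over raw ratios.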

23
GeneChip synthesis
24
  • Structure of a Gene Chip
  • Animation of Gene Chip experiment
  • http://www.affymetrix.com/corporate/outreach/lesson_plan/educator_resources.affx

25
Characteristics of Array Data
  • Voluminous: tens of thousands of variables with
    relatively few observations of each (upside down
    vs. classical biostatistics)
  • Noisy: error rates up to 8%
  • Methods designed to detect patterns and
    associations always find patterns and
    associations
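The last point can be demonstrated with pure noise. In the sketch below, 2,000 simulated "genes" and an outcome are drawn independently at random for only 6 samples, so no real association exists; the sample and gene counts are arbitrary choices for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)          # fixed seed so the run is repeatable
n_samples, n_genes = 6, 2000    # many variables, few observations

outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
expression = [[rng.gauss(0, 1) for _ in range(n_samples)]
              for _ in range(n_genes)]

# Search every gene for the strongest apparent association with the outcome.
best = max(abs(pearson(gene, outcome)) for gene in expression)
print(f"strongest |r| among {n_genes} random genes: {best:.2f}")
```

Because so many variables are searched with so few observations, some gene will correlate strongly with the outcome by chance alone, which is why pattern-finding methods need independent validation.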

26
Experimental Design
  • A fundamental challenge of microarray
    experiments: underdetermined systems

Kohane IS, Kho AT, Butte AJ. Microarrays for an
Integrative Genomics. (The MIT Press, Cambridge,
MA, 2003), p. 11.
27
(No Transcript)
28
Uses of Expression Profiling
  • Pharmaceutical research
  • ID drug targets by comparing expression profile
    of drug-treated cells with those of cells
    containing mutations in genes encoding known drug
    targets
  • Disease Dx and Tx
  • Distinguish morphologically similar cancers
  • DLBCL (Poulsen et al. (2005) Microarray-based
    classification of diffuse large B-cell lymphomas.
    European Journal of Haematology 74(6):453-65.)
  • Therapy potential
  • Rabson AB, Weissmann D. From microarray to
    bedside: targeting NF-kappaB for therapy of
    lymphomas. Clin Cancer Res. 2005 Jan 1;11(1):2-6.

29
Recent Applications
  • Diagnostic tool to screen for infective agents
  • Chip imprinted with set of pathogenic genomes
    used to identify bacterial, viral, or parasite
    genomic material in patients' body fluids
  • Diagnostic chip to check for mutations involved
    in drug-gene interactions.
  • Roche Amplichip

30
Public Microarray Data Repositories
  • Major public repositories
  • GEO (NCBI)
  • http://www.ncbi.nlm.nih.gov/geo/
  • ArrayExpress (EBI)
  • http://www.ebi.ac.uk/arrayexpress/

31
Standards and Repositories
  • Brazma, A, et al. Minimum information about a
    microarray experiment (MIAME)-toward standards
    for microarray data. Nature Genetics. 2001
    Dec;29(4):373.
  • http://www.nature.com/cgi-taf/DynaPage.taf?file=/ng/journal/v29/n4/full/ng1201-365.html
  • Ball, CA, et al. Submission of Microarray Data to
    Public Repositories. PLoS Biology. 2004
    September; 2 (9): e317
  • http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=15340489

32
Central Dogma of Molecular Biology
[Diagram: DNA → RNA → Protein → Phenotype (tissues, organs, organisms), labeled with the fields Genomics, Transcriptomics (Functional Genetics), Proteomics]

33
Proteome and Proteomics
  • Proteome: the entire set of proteins (and other
    gene products) made by the genome.
  • Proteomics: study of the interactions among
    proteins in the proteome, including networks of
    interacting proteins and metabolic
    considerations. Also includes differences in
    developmental stages, tissues and organs.

34
Protein Functions
  • Catalysis
  • Transport
  • Nutrition and storage
  • Contraction and mobility
  • Structural elements
  • Cytoskeleton
  • Basement membranes
  • Defense mechanisms
  • Regulation
  • Genetic
  • Hormonal
  • Buffering capacity

35
Protein Databases
  • SwissProt
  • PIR http://www.pir.uniprot.org/
  • GENE http://www.ncbi.nlm.nih.gov/gene
  • InterPro http://www.ebi.ac.uk/interpro/
  • Correspond to (and derived from) Genome data
    bases
  • All connected by Reference Sequences (NCBI)

UniProt
36
Gene/Protein Database entries
  • HFE record in Entrez GENE (NCBI)
  • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=retrieve&dopt=Graphics&list_uids=3077

37
Structure Function Determination
  • X-ray crystallography
  • Nuclear magnetic resonance spectroscopy and
    tandem MS/MS
  • Computational modeling
  • Sequence alignment from others
  • Homology modeling

38
Structure Databases
  • Contain experimentally determined and predicted
    structures of biological molecules
  • Most structures determined by X-ray
    crystallography, NMR
  • Example: MMDB (molecular modeling database)
    http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml
  • HFE Entry
  • http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?form=6&db=t&Dopt=s&uid=9816

39
Protein Interaction Databases
  • Record observations of protein-protein
    interactions in cells
  • Attempts to detail interactions observed in
    thousands of small-scale experiments described in
    published articles
  • Examples
  • BIND: Biomolecular Interaction Network Database
  • DIP: Database of Interacting Proteins
  • MIPS: Munich Information Center for Protein
    Sequences
  • PRONET: Protein interaction on the Web
  • Many others, both academic and commercial

40
Controlled Vocabularies in Bioinformatics
  • The Gene Ontology http://www.geneontology.org/
  • Knowledge about gene function (the ontology
    itself)
  • Annotation of gene products (for comparisons)
  • The MGED Ontology (arising from MIAME)
  • http://mged.sourceforge.net/
  • Annotation of microarray experiments for public
    repositories
  • Clinical Bioinformatics Ontology
  • Annotation of gene tests in electronic medical
    records
  • http://www.cerner.com/cbo
  • MIAPE from Proteomics Standards Initiative (PSI)
  • Annotation of proteomics experiments for public
    repositories
  • http://psidev.sourceforge.net/

41
Genomics Data and Patient Care
  • From genotype to phenotype

42
Human Disease Gene Specifics
  • Genes linked to human diseases (9-2004)
  • 425 in 2 yrs
  • 1700/20,000 = about 9% of loci

43
Informatics Issues related to Genomics Data and
Patient Care
  • Linking known data for genes causing human
    diseases to clinical decision support and EMR
    documentation
  • Representation of genetic data in electronic
    medical records

44
Common Questions
  • What genes cause the condition?
  • What is the normal function of the gene?
  • What mutations have been linked to diseases?
  • How does the mutation alter gene function?
  • What laboratories are performing DNA tests?
  • Are there gene therapies or clinical trials?
  • What names are used to refer to the genes and the
    diseases?
  • What other conditions are linked to these same
    genes?

45
Answers exist online
  • ...but it is not easy: answers are in many places
  • Can't navigate by gene names - must use hot
    links and numeric identifiers
  • The number and function of alternate forms of the
    protein are inconsistently reported
  • Synonymy (many names, same meaning) and polysemy
    (same name, different meanings) cause confusion
  • Upper and lower case are used for species
    distinctions
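The synonymy and polysemy problem can be sketched as a lookup from names to numeric identifiers. The HFE identifier 3077 is the one used in the Entrez GENE link earlier in this deck; every other name and identifier below is hypothetical. Matching is kept case-sensitive on purpose, because case carries the species distinction noted above.

```python
from collections import defaultdict

# (name, numeric gene identifier) pairs.
ALIASES = [
    ("HFE", 3077),                    # ID taken from the Entrez GENE link
    ("hemochromatosis gene", 3077),   # synonymy: second name, same gene
    ("GENE-X", 900001),               # hypothetical gene and ID
    ("GX", 900001),                   # hypothetical short name
    ("GX", 900002),                   # polysemy: same name, different gene
]


def build_index(aliases):
    """Map each exact name to the set of identifiers it may refer to."""
    index = defaultdict(set)
    for name, gene_id in aliases:
        index[name.strip()].add(gene_id)
    return index


def resolve(name, index):
    """Return sorted identifiers for a name; more than one means ambiguity."""
    return sorted(index.get(name.strip(), set()))


index = build_index(ALIASES)
print(resolve("HFE", index))                    # one gene
print(resolve("hemochromatosis gene", index))   # same gene via a synonym
print(resolve("GX", index))                     # ambiguous name
```

Returning a list rather than a single identifier forces the caller to deal with ambiguous names instead of silently picking one.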

46
Major Challenges of Navigation
  • Complexity of data
  • Dynamic nature of the data
  • Diverse foci and number of data/knowledge base
    systems
  • Data and knowledge representation lack standards
  • Can navigate if you know what you are looking for.

47
Genetics Home Reference
  • Consumer health resource to help the public
    navigate from phenotype to genotype.
  • Focus on health implications of the Human Genome
    Project.
  • http://ghr.nlm.nih.gov
  • Mitchell, Fun, McCray, JAMIA, 2004 Nov;
    11(6):439-447

48
Genetics is Impacting Medicine Today
  • 2000 genes linked to health conditions
  • > 1500 gene tests for diagnosis
  • Relate to diagnosis, therapy, drug dosage,
    occupational hazards, reproductive plans, health
    risks, ...
  • And direct-to-consumer genetic test marketing
    (23andMe, Navigenics, ...)

49
Well-known Examples (germ line)
  • Pharmacogenetics
  • CYP450 alleles: exaggerated, diminished or
    ultra-rapid drug responses. E.g., warfarin: 93%
    of patients are OK on standard doses; 7% of
    patients have severe hemorrhage. CYP2C9*2 and
    CYP2C9*3 are the most severe of 6 known mutations.
  • Environmental susceptibility
  • Sickle Cell trait carrier and malaria parasite
  • Nutrition
  • PKU and avoidance of phenylalanine

50
Example (somatic mutation): Iressa (gefitinib) and
erlotinib
  • Non-small cell lung CA: 140,000 pt/yr
  • Iressa (AstraZeneca) causes remission in 1 of 10
    patients. Newer drug is erlotinib.
  • Iressa and erlotinib efficacy correlates with EGFR
    mutation in the tumor. Now have gene testing for
    EGFR so can target appropriate people.
    http://www.sciencemag.org/cgi/content/full/305/5688/1222a

51
Implications for Health Care System
  • More gene tests will be ordered. Reports of a 300%
    increase in gene tests in 2003.
  • Arch Pathol Lab Med 2004, 128(12):1330-1333
  • The FDA will regulate panels of tests.
  • http://www.fda.gov/bbs/topics/news/2004/new01149.html
  • Non-discrimination laws for insurance and
    employment will open a floodgate (GINA).
  • Preventive healthcare will play a larger part.
  • Environmental risk factors dictate OSHA-type
    approach to worker empowerment and education
    about safe behavior

52
Unsolved Informatics Issues: What Should Be
Stored in the EMR?
  • Complete DNA sequence for specific genes into the
    EMR? Where?
  • Microarray and gene chip data?
  • Meta-data about the DNA sequence arrays?
  • If not the sequence (i.e., only the diff from the
    reference sequence), what to do when the reference
    sequence changes? Or the gene chip changes?
  • How to trigger alerts and reminders? And for
    what?
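One way to frame these questions is to store each result together with the version of the reference it was called against, and to keep alert rules in a separate table. The sketch below is a hypothetical design, not an existing EMR interface and not clinical guidance: the class, the version labels, and the rule table are all assumptions; only the CYP2C9*2 / CYP2C9*3 and warfarin pairing comes from the pharmacogenetics example in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantObservation:
    gene: str
    allele: str               # e.g. a star-allele name
    reference_version: str    # reference the difference was called against


# Hypothetical label for the reference currently in use for each gene.
CURRENT_REFERENCE = {"CYP2C9": "ref-v2"}

# Hypothetical rule table: (gene, allele) -> drug whose order triggers an alert.
ALERT_RULES = {
    ("CYP2C9", "*2"): "warfarin",
    ("CYP2C9", "*3"): "warfarin",
}


def needs_reinterpretation(obs: VariantObservation) -> bool:
    """True if the stored diff was made against an outdated reference."""
    return CURRENT_REFERENCE.get(obs.gene) != obs.reference_version


def alerts_for_order(drug: str, observations) -> list:
    """Return alert messages raised when a drug is ordered for a patient."""
    return [
        f"{obs.gene}{obs.allele} on file: review {drug} dose"
        for obs in observations
        if ALERT_RULES.get((obs.gene, obs.allele)) == drug
    ]


patient = [VariantObservation("CYP2C9", "*3", "ref-v1")]
print(alerts_for_order("warfarin", patient))
print(needs_reinterpretation(patient[0]))
```

Recording the reference version makes it possible to find stored results that need re-interpretation when the reference changes, which is one of the open questions listed above.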

53
Genetic data in electronic medical records
  • Implications for component systems
  • Laboratory
  • Pharmacy
  • Computerized order entry
  • Documentation and notes
  • Knowledge management
  • Alerts and reminders
  • Finding patients matching profiles
  • Practice guidelines and clinical trials
  • Appropriate therapies and medications

54
Genome Data and Other Information Systems
  • Genomic information will be pervasive in all
    healthcare information systems.
  • Also in public health systems
  • Newborn screening
  • Tissue and organ banks
  • DOD requires DNA samples
  • Bioterrorism and homeland security
  • Identification of World Trade Center victims
  • Privacy and security issues are important but not
    inherently different than other EMR data.

55
Summary
  • Informatics will be the key enabling technology
    for personalized, genomic medicine.
  • Current separation between bioinformatics and
    clinical informatics will diminish as the two
    subdisciplines merge

56
Optional Exercise: Hands-on with GHR
  • Scavenger hunt with hemochromatosis and the genes
    that influence it.
  • Explore the Genetics Home Reference by answering
    the following questions. Start at
    http://ghr.nlm.nih.gov.

57
GHR Scavenger Hunt
  • How common is hemochromatosis?
  • How many genes have been proven to be involved in
    hemochromatosis when the genes are mutated?
  • What are the symbols for these genes?
  • Can you find the link to MedlinePlus with health
    information on hemochromatosis?

58
GHR Scavenger Hunt
  • What are the names of the patient support
    associations for hemochromatosis?
  • One synonym for this condition is bronze
    diabetes. Can you find a reason for this?
  • What kind of damage is done to the liver of
    people with hemochromatosis?

59
GHR Scavenger Hunt
  • For the genes involved in hemochromatosis, how
    many of them are available as a DNA test?
  • Give one place where you would choose to send a
    tissue sample for DNA testing.
  • What sites are listed under Research Resources
    for the TFR2 gene?
  • How many alternately spliced proteins for TFR2?
  • In what tissues is this gene expressed?

60
GHR Scavenger Hunt
  • How do people inherit hemochromatosis?
  • Do the genes involved in hemochromatosis cause
    other health conditions when they are mutated?
  • Can you find a protein sequence for one of the
    genes?
  • What clinical trials are available for
    hemochromatosis patients close to where you live?

61
Questions to
  • Joyce Mitchell
  • Joyce.mitchell_at_hsc.utah.edu
  • http://uuhsc.utah.edu/medinfo