All Systems Go proteomics, microarrays and biomarkers - PowerPoint PPT Presentation

1
All Systems Go! proteomics, microarrays and
biomarkers
  • Howard Edenberg, Samiran Ghosh, Dan Li, Xiaoman Li,
    Lang Li, Yunlong Liu, Malika Mahoui, Jeanette
    McClintick, Predrag Radivojac, Pedro Romero,
    Changyu Shen, Haixu Tang
  • 2007 Indiana University Computational Biology and
    Bioinformatics Retreat

2
Our Group
  • Howard Edenberg: Chancellor Prof., Biochemistry,
    Ctr for Med Genomics, IUSM
  • Xiaoman Li: Assistant Prof., Biostatistics/CCBB,
    IUSM
  • Predrag Radivojac: Assistant Prof., School of
    Informatics, IUB
  • Samiran Ghosh: Assistant Prof., Math. Dept., IUPUI
  • Yunlong Liu: Assistant Prof., Biostatistics/CCBB,
    Ctr for Med Genomics, IUSM
  • Pedro Romero: Assistant Prof., School of
    Informatics, IUPUI
  • Shuyu Dan Li: Sr. Scientist, Informatics, Eli
    Lilly and Company
  • Malika Mahoui: Assistant Prof., School of
    Informatics, IUPUI
  • Changyu Shen: Assistant Prof., Biostatistics, IUSM
  • Lang Li: Associate Prof., Biostatistics, IUSM
  • Jeanette McClintick: Assistant Prof.,
    Biochemistry, Ctr for Med Genomics, IUSM
  • Haixu Tang: Assistant Prof., School of
    Informatics, IUB

3
Our group
[Figure: our group by institution and by title]
4
Outline
  • Overview of the field
  • Important topics
  • Highlights of the research contributions from our
    group
  • Potential collaboration opportunities
  • Discussion (Haixu Tang)

5
Fishing expeditions vs. hypothesis-driven
"It (the Human Genome Project) was no more than a
big fishing expedition, a mindless factory
project that no scientists in their right minds
would join."
"Data- and technology-driven studies are not
alternatives to hypothesis-driven studies, but
are complementary and iterative partners with
them."
6
Hypothesis/data-driven research
Kitano, Science 2002, Vol. 295, no. 5560, pp.
1662-1664
7
Key steps of high-throughput analysis
Biological question
  • Experimental design
  • Biological techniques
  • Sample preparation
  • High-throughput platform selection

Experimental process
  • Quantification and data transformation
  • Normalization
  • Data cleaning
  • Statistical analysis
  • Multiple hypothesis testing

Data extraction
  • Regulatory networks
  • Protein-protein/DNA interaction
  • Networks/pathway analysis
  • Sequence analysis
  • Functional analysis

Data interpretation and modeling
Biological application
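
The multiple hypothesis testing step listed above can be illustrated with a short sketch. The slide does not name a specific correction, so the Benjamini-Hochberg procedure is used here purely as an assumed example; the function name and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: p_values holds one raw p-value per gene or probe set.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values
    print(benjamini_hochberg(raw))
```

Genes whose adjusted value falls below a chosen false discovery rate would then be carried forward to data interpretation.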
8
Important topics in the field
  • Platform selection, evaluation, and analysis
    (McClintick, Li)
  • High dimensional data vs. underpowered experiment
    (Ghosh, McClintick)
  • Integration of experimental data and biological
    knowledge to improve detection accuracy (Shen,
    Li)
  • Integration of data from different technologies
    (Tang)
  • Blind vs. targeted biomarker discovery (Tang)
  • Huge amount of data vs. limited knowledge
    (McClintick)
  • Software (McClintick)
  • Data sharing (MIAME standard and GEO database)
    (McClintick)

9
Highlights of the research contributions from our
group

10
SAGE vs. microarray
  • Results
  • Significant discrepancies between the two
    platforms: only 30-40% of genes exhibited positive
    correlations
  • The discrepancies are not caused by heterogeneity
    of tissue sources, microarray probe designs, mRNA
    abundance, or gene function
  • Reason
  • Errors in SAGE tag annotation
  • Splice variants
  • SAGE tags and array probesets represent different
    regions of the same genes

Li S., Li Y. H., Wei T., Su E. W., Duffin K., and
Liao B. (2006). Biology Direct 1, 33
Shuyu Dan Li, Eli Lilly
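
A cross-platform comparison of this kind can be sketched as a per-gene correlation between the two platforms. This is only an illustration under stated assumptions, not the analysis from the cited paper: the Pearson coefficient, the function names, and the toy expression values are all hypothetical choices.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def fraction_positively_correlated(sage, array):
    """Fraction of shared genes whose SAGE and array profiles correlate positively.

    Assumption: each dict maps a gene name to expression values measured
    across the same ordered set of tissues.
    """
    shared = sorted(set(sage) & set(array))
    values = [pearson(sage[g], array[g]) for g in shared]
    valid = [r for r in values if not math.isnan(r)]
    if not valid:
        return float("nan")
    return sum(1 for r in valid if r > 0) / len(valid)


if __name__ == "__main__":
    sage = {"g1": [1, 2, 3], "g2": [3, 2, 1], "g3": [1, 5, 2]}    # toy tag counts
    array = {"g1": [10, 20, 35], "g2": [1, 2, 3], "g3": [2, 9, 4]}  # toy intensities
    print(fraction_positively_correlated(sage, array))
```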
11
Removing junk from valuables
Affymetrix platform: detection calls for each
probe set are Present, Marginal, or Absent.
Pre-filtering of microarray data to improve the
false discovery rate.
Use of a threshold fraction of Present detection
calls (derived by MAS5) provided a simple method
that effectively eliminated from analysis probe
sets that are unlikely to be reliable, while
preserving the most significant probe sets and
those turned on or off; it thereby increased the
ratio of true positives to false positives.
Howard Edenberg
McClintick JN, Edenberg HJ. BMC Bioinformatics
2006, 7:49.
Jeanette McClintick
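
The Present-call pre-filter described on this slide can be sketched as follows. The slide gives no threshold or data layout, so both are assumptions here: the 0.5 default, the per-group evaluation (intended to retain probe sets that are turned on in only one condition), and every name in the code are hypothetical.

```python
def filter_by_present_fraction(detection_calls, min_fraction=0.5):
    """Return probe set IDs that pass a Present-call fraction filter.

    Assumptions (not from the slide): detection_calls maps
    probe_set_id -> {group_name: [calls]}, where each call is "P"
    (Present), "M" (Marginal) or "A" (Absent). A probe set is kept when
    at least one group reaches min_fraction of Present calls.
    """
    kept = []
    for probe_set, groups in detection_calls.items():
        for calls in groups.values():
            if calls and calls.count("P") / len(calls) >= min_fraction:
                kept.append(probe_set)
                break  # one passing group is enough
    return kept


if __name__ == "__main__":
    calls = {
        "probe_a": {"control": ["P", "P", "A", "P"], "treated": ["A", "A", "A", "A"]},
        "probe_b": {"control": ["A", "M", "A", "A"], "treated": ["A", "A", "P", "A"]},
    }
    print(filter_by_present_fraction(calls))
```

In this toy input, probe_a is Present in three of four control arrays and is kept even though it is Absent in every treated array; probe_b never reaches the threshold in either group and is dropped.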
12
Measuring undetectables?
Go fishing!
Conclusion: Lake Monroe has twice as many yellow
fish as blue fish.
  • Peptide detectability
  • Probability of observing a peptide in a standard
    sample
  • An intrinsic property of the peptide sequences.

Predrag Radivojac
[Diagram labels: protein abundance, protein
detectability, protein measurement]
Tang et al. Bioinformatics. 2006 Jul
15;22(14):e481-8.
Haixu Tang
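
The fishing analogy can be made concrete with a toy calculation. This is only an illustration of why raw counts can mislead when detectability differs, not the model from the cited paper; the simple "observed = abundance x detectability" relation, the function name, and all numbers are assumptions.

```python
def correct_for_detectability(observed_count, detectability):
    """Estimate abundance from an observed count and a detection probability.

    Assumption: observations scale linearly with detectability, so
    abundance is approximated by observed_count / detectability.
    """
    if not 0 < detectability <= 1:
        raise ValueError("detectability must be in (0, 1]")
    return observed_count / detectability


if __name__ == "__main__":
    # Hypothetical catch: twice as many yellow fish as blue fish...
    yellow_caught, blue_caught = 20, 10
    # ...but yellow fish are assumed to be twice as easy to catch.
    yellow_estimate = correct_for_detectability(yellow_caught, 0.8)
    blue_estimate = correct_for_detectability(blue_caught, 0.4)
    print(yellow_estimate, blue_estimate)
```

Under these made-up detectabilities the two corrected estimates come out about equal, so the raw 2:1 catch would not support the conclusion that the lake holds twice as many yellow fish.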
13
Finding partners
  • Using an empirical Bayes model to analyze yeast
    two-hybridization data.
  • Around 1% of the protein pairs are interacting
    partners. Multi-protein pull-down experiments
    have high specificity but mediocre sensitivity
    (50-70%)
  • There should be an average of about 20 true
    associations per MPC (multiprotein complex),
    almost 10 times as high as was previously
    estimated.

Changyu Shen
Lang Li
Shen et al., Proteins: Structure, Function, and
Bioinformatics, 2006, 64, 436-443
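
To see what a low prior and mediocre sensitivity imply for a single observation, here is plain Bayes' rule applied to the slide's figures. It is not the empirical Bayes model of the cited paper. The numbers are read as percentages (1% prior, 60% as a midpoint of the 50-70 sensitivity range), and the specificity of 0.99 is a hypothetical stand-in for "high specificity".

```python
def posterior_interaction_probability(prior, sensitivity, specificity):
    """P(true interaction | pair observed together), via Bayes' rule.

    prior:       fraction of protein pairs that truly interact
    sensitivity: P(observed | interacting)
    specificity: P(not observed | not interacting)
    """
    for name, value in (("prior", prior), ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positive = prior * sensitivity
    false_positive = (1 - prior) * (1 - specificity)
    if true_positive + false_positive == 0:
        raise ValueError("observation has zero probability under these inputs")
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # 0.99 specificity is an assumed value, not taken from the slide.
    print(posterior_interaction_probability(prior=0.01, sensitivity=0.6,
                                            specificity=0.99))
```

With a prior this small, the posterior for a single observed pair stays well below one even when specificity is high, which is one reason pooling evidence across experiments matters.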
14
Sampling motifs from my root
  • Methods: finding motifs by using (1)
    overrepresentation and (2) evolutionary
    conservation properties of motifs
  • Contribution
  • Applicable to divergent species where alignment
    is unreliable
  • Greatly improved prediction accuracy.

Xiaoman Li, Biostat, IUSM
Li et al. (2005) PNAS 102:9481-6. Li et al.
(2005) PNAS 102:16945-50.
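
The overrepresentation property mentioned above can be illustrated with a simple k-mer enrichment count. This is a deliberately naive sketch, not the sampling method of the cited papers, and it ignores evolutionary conservation entirely; the function names, pseudocount, and toy sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_enrichment(foreground, background, k, pseudocount=1.0):
    """Ratio of smoothed k-mer frequency in foreground vs. background.

    Assumption: foreground holds promoter-like sequences expected to share
    a motif; background holds unrelated sequences.
    """
    fg, bg = kmer_counts(foreground, k), kmer_counts(background, k)
    fg_total, bg_total = sum(fg.values()), sum(bg.values())
    n_possible = 4 ** k  # number of distinct DNA k-mers, used for smoothing
    scores = {}
    for kmer, count in fg.items():
        fg_freq = (count + pseudocount) / (fg_total + pseudocount * n_possible)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_possible)
        scores[kmer] = fg_freq / bg_freq
    return scores


if __name__ == "__main__":
    fg = ["TTGACGTCAT", "GGTGACGTCA", "TGACGTCAGG"]  # toy sequences
    bg = ["ACACACACAC", "GTGTGTGTGT", "CCCCAAAATT"]
    scores = kmer_enrichment(fg, bg, k=6)
    print(max(scores, key=scores.get))
```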
15
Finding controllers
Understand how transcription factors work
cooperatively to produce global gene expression
patterns.
Quantitative relationship
Howard Edenberg
Liu et al. Genomics. 2006 Oct;88(4):452-61.
Yunlong Liu
16
Protein-DNA binding pattern matters
The first set of transcription factor binding
patterns that distinguish the ERα up- and
down-regulated targets in breast cancer cell lines.
(Li et al., Bioinformatics, 2006, 22: 2210-2216)
Lang Li, Biostat, IUSM
17
Collaboration with other areas
  • Databases and Datamining
  • Networks and Pathways
  • Proteomics, Microarrays, Biomarkers
  • Structure and Function
  • Machine Learning and Prediction
  • Mutations and Disease
  • Protein Ligand Interactions
  • Cheminformatics and Cyberinfrastructure
  • Academic Matters

18
Collaborations with other areas
Biological question
Experimental process
Data extraction
Machine Learning and Prediction, Datamining
Networks and Pathways, Structure and Function
Data interpretation and modeling
Mutations and Disease
Protein Ligand Interactions
Biological application
Databases