Title: How can you benefit from the Bioinformatics Resource
1How can you benefit from the Bioinformatics
Resource?
- Can (John) Bruce, Ph.D.
- Associate Director
- Bioinformatics Resource
- Keck Biotechnology Laboratory
2The Bioinformatics Core
- Created within Keck Lab upon request from Yale School of Medicine, July 2007.
- Director: Hongyu Zhao, Ph.D. Associate Directors: Can Bruce, Ph.D., and Yong Kong, Ph.D.
- The facility is located at Sterling Hall of Medicine.
- Commercial software packages provided free by the Core are available to Yale researchers 24/7.
3Services
- Access to a large number of widely used commercial and open source bioinformatics programs.
- Fee-based consultation services for well-defined bioinformatics analyses.
- Collaborative projects requiring longer-term commitment of time and effort.
4Available programs
- DNA/protein sequence analysis: Lasergene and Gene Construction Kit.
- Pathway analysis: Ingenuity Pathway Analysis and MetaCore.
- Protein structure modeling: Sybyl, a protein structure modeling and visualization program.
- Mass spectrometry data analysis: GPMAW.
- Pipelining programs: Pipeline Pilot and VIBE.
5Examples of Current Collaborations
- Pathway analysis on proteomics data (Yale/NIDA Proteomics Center Project and Yale/NHLBI Proteomics Center Project investigators)
- Development of an algorithm for identification of phosphorylation sites from tandem mass spectrometry data (E. Gulcicek in Keck Proteomics)
- Molecular modeling of MAP kinase-ligand interactions (B. Turk in Pharmacology)
- Sequence analysis for defining an invention claim for the Office of Collaborative Research
6Microarray analysis software
- GeneSpring GX provides visualization and advanced statistical analysis for gene expression data.
- Partek Genomics Suite provides advanced statistics and interactive data visualization designed for gene expression analysis, exon expression analysis, promoter tiling array analysis, chromosomal copy number analysis, and SNP analysis.
7Sequence Analysis Software
- DNASTAR Lasergene, a comprehensive suite of programs for analysis of DNA/RNA/protein sequences, including sequence editing, sequence assembly, sequence alignment, primer design, protein structure prediction, and gene detection and annotation.
- Gene Construction Kit 2.5, a tool for designing, drawing, and annotating DNA sequences, especially plasmid constructs.
8PIPELINING PROGRAMS
This pipeline from Pipeline Pilot takes a Swiss-Prot sequence from a Web portal, then generates a results page with four tabs: summary data, a sequence features map, chemical structures of substrates, and BLAST results.
9PATHWAY ANALYSIS
- MetaCore (from GeneGo).
- Ingenuity Pathways Analysis 3.1 (from Ingenuity Systems).
- Both are integrated software suites for functional analysis.
- Each is based on a proprietary, manually curated database of human protein-protein, protein-DNA, and protein-compound interactions, metabolic and signaling pathways, and the effects of bioactive molecules.
- MetaCore can be integrated with other software packages such as GeneSpring, Resolver, Expressionist, Pipeline Pilot, EndNote, and Cytoscape.
- Ingenuity can be integrated with GeneSpring, Partek Genomics Suite, SAS JMP Genomics, and Spotfire.
10Why Pathway Analysis?
11Pathway Creation Algorithms in MetaCore (1)
12Direct Interactions Algorithm
Draws direct interactions between selected objects. No additional objects are added to the network.
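The idea of keeping only the links among the selected objects can be sketched as below. This is a minimal illustration, not MetaCore's implementation: the edge list and the object names (`L1`, `R1`, `TF1`, `G1`, `X1`, `X2`) are hypothetical toy data, and `direct_interactions` is a name made up for this sketch.

```python
# Hypothetical toy interaction table: directed (source, target) edges.
INTERACTIONS = [
    ("L1", "R1"),
    ("R1", "TF1"),
    ("TF1", "G1"),
    ("X1", "X2"),
]


def direct_interactions(selected, interactions):
    """Keep only edges whose two endpoints are both selected objects."""
    chosen = set(selected)
    return [(a, b) for a, b in interactions if a in chosen and b in chosen]


network = direct_interactions(["L1", "R1", "G1"], INTERACTIONS)
print(network)  # [('L1', 'R1')]
```

Because `G1` is only reachable through `TF1`, which was not selected, it stays unconnected: no intermediate objects are pulled in.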
13Self regulatory Networks
Finds the shortest directed paths containing transcription factors between the genes in your gene list (best used for a small number of targets).
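A rough sketch of the "shortest directed paths containing transcription factors" idea follows. It assumes one plausible reading: find all shortest directed paths between each ordered pair of listed genes with a breadth-first search, then keep the paths that include a transcription factor. The graph, the names, and both helper functions are hypothetical; the real algorithm may differ.

```python
from collections import deque

# Hypothetical toy data: directed adjacency lists and a set of known TFs.
EDGES = {
    "L1": ["R1"],
    "R1": ["TF1", "K1"],
    "TF1": ["G1"],
    "K1": ["G1"],
    "G1": [],
}
TFS = {"TF1"}


def shortest_paths(edges, start, goal):
    """Return every shortest directed path from start to goal (BFS over paths)."""
    queue = deque([[start]])
    found = []
    while queue:
        path = queue.popleft()
        if found and len(path) > len(found[0]):
            break  # paths leave the queue in order of length; the rest are longer
        if path[-1] == goal:
            found.append(path)
            continue
        for nxt in edges.get(path[-1], []):
            if nxt not in path:  # do not walk in circles
                queue.append(path + [nxt])
    return found


def tf_paths(edges, genes, tfs):
    """Shortest paths between listed genes that pass through at least one TF."""
    kept = []
    for start in genes:
        for goal in genes:
            if start == goal:
                continue
            for path in shortest_paths(edges, start, goal):
                if any(node in tfs for node in path):
                    kept.append(path)
    return kept


print(tf_paths(EDGES, ["L1", "G1"], TFS))  # [['L1', 'R1', 'TF1', 'G1']]
```

Every ordered pair of genes is searched, so the work grows quickly with the length of the gene list, which is in line with the advice to use this on a small number of targets.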
14Expand by one (not suitable for large collections of targets)
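A one-step expansion can be sketched as adding every object that interacts directly with a selected object. The code below is an illustrative assumption about what "expand by one" means, using hypothetical toy edges and a made-up function name.

```python
# Hypothetical toy interaction table: directed (source, target) edges.
INTERACTIONS = [
    ("L1", "R1"),
    ("R1", "TF1"),
    ("TF1", "G1"),
    ("X1", "X2"),
]


def expand_by_one(selected, interactions):
    """Add every object one interaction step away from the selected objects."""
    chosen = set(selected)
    edges = [(a, b) for a, b in interactions if a in chosen or b in chosen]
    nodes = chosen | {node for edge in edges for node in edge}
    return nodes, edges


nodes, edges = expand_by_one(["R1"], INTERACTIONS)
print(sorted(nodes))  # ['L1', 'R1', 'TF1']
print(edges)          # [('L1', 'R1'), ('R1', 'TF1')]
```

Each selected object brings in all of its neighbors, so a long input list can produce a very large network.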
15Auto expand
Draws sub-networks around the selected objects,
stopping the expansion when the sub-networks
intersect
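The stop-when-they-meet behavior can be sketched as follows. This is only an assumed interpretation: each selected object grows its own sub-network one step at a time (ignoring edge direction), and growth stops as soon as any two sub-networks share a node. The chain of objects `N1` to `N5`, the `max_rounds` cap, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: a simple chain N1 - N2 - N3 - N4 - N5.
INTERACTIONS = [("N1", "N2"), ("N2", "N3"), ("N3", "N4"), ("N4", "N5")]


def subnetworks_intersect(subnets):
    """True if any two sub-networks share at least one node."""
    return any(a & b for a, b in combinations(subnets.values(), 2))


def auto_expand(seeds, interactions, max_rounds=10):
    """Grow a sub-network around each seed until two sub-networks intersect."""
    neighbors = {}
    for a, b in interactions:  # treat interactions as undirected for expansion
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    subnets = {seed: {seed} for seed in seeds}
    for _ in range(max_rounds):
        if subnetworks_intersect(subnets):
            break
        grown = {
            seed: nodes | {n for m in nodes for n in neighbors.get(m, ())}
            for seed, nodes in subnets.items()
        }
        if grown == subnets:  # nothing left to add
            break
        subnets = grown
    return subnets


result = auto_expand(["N1", "N5"], INTERACTIONS)
print(sorted(result["N1"]))  # ['N1', 'N2', 'N3']
print(sorted(result["N5"]))  # ['N3', 'N4', 'N5']
```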
16Pathway Creation Algorithms in MetaCore (2)
- Analyze Network: Creates a list of possible networks, ranked according to how many objects in the network correspond to the user's list of genes, how many nodes are in the network, and how many nodes are in each smaller network.
- Analyze Transcription Network: Similar to the above, but the sub-networks created are centered on TFs.
- Analyze Networks (Transcription Factors): Focuses on the presence of TFs at end nodes.
- Analyze Networks (Receptors): Focuses on the presence of receptors at the end point of a network.
17Analyze Network Algorithm
Generates sub-networks highly saturated with selected objects. Sub-networks are ranked by a P-value and G-Score and interpreted in terms of Gene Ontology.
Example: a proteomics experiment on the effect of drug infusion on plasma proteins (P < 1e-18).
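The slides do not say how the P-value is computed. A common choice for "how saturated is this sub-network with my objects" is a hypergeometric tail probability, and the sketch below assumes that choice; it is not taken from MetaCore documentation. All counts and network names are hypothetical, and the G-Score is not modeled.

```python
from math import comb


def hypergeom_pvalue(total, in_list, net_size, overlap):
    """P(X >= overlap) when net_size objects are drawn without replacement
    from `total` objects, of which `in_list` are in the user's list."""
    upper = min(in_list, net_size)
    hits = sum(
        comb(in_list, k) * comb(total - in_list, net_size - k)
        for k in range(overlap, upper + 1)
    )
    return hits / comb(total, net_size)


# Hypothetical sub-networks: name -> (nodes in network, nodes from user's list).
networks = {"net_a": (50, 12), "net_b": (50, 5)}
TOTAL_OBJECTS = 1000   # assumed size of the background database
LIST_SIZE = 100        # assumed size of the user's list

ranked = sorted(
    networks,
    key=lambda name: hypergeom_pvalue(TOTAL_OBJECTS, LIST_SIZE, *networks[name]),
)
print(ranked)  # ['net_a', 'net_b']
```

A smaller P-value means the overlap is less likely to arise by chance, so `net_a`, with 12 of 50 nodes from the list, ranks ahead of `net_b`.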
18Analyze Networks (Transcription Factors) Algorithm: an example
Favors network construction where the end-nodes of transcriptionally regulated pathways are present in the original gene list.
Example from an mRNA expression analysis data set comparing healthy and lesion skin (P = 7.2e-46).
19Analyze Network (Receptors) Algorithm: an example
Favors network construction where the end-point of a pathway leads to a receptor (through receptor binding) and the starting point of the pathway (a transcription factor, ligand, etc.) is present in the original gene list, regardless of whether the end-point receptor is in the list.
20Transcription Regulation Algorithm
Generates sub-networks centered on transcription factors. Sub-networks are ranked by a P-value and interpreted in terms of Gene Ontology.
Example: 13 targets/14 nodes, P = 7.3e-31
21Immune response: Histamine H1 receptor signaling in immune response (p = 1e-4)
22GeneGo process networks
23WNT signaling (p = 1e-5)
24Disease biomarker enrichment
25Network-disease associations
1) Carcinoma (72% coverage, p = 3.3e-10)
2) Neoplasms, connective and soft tissue (42% coverage, p = 8e-10)
26Use of Pathway Analysis in Candidate Gene
Identification
FGF2, WNT5A, Tenascin-C, EGF, IL1RN, BDNF, TGF-beta2, FGF2, OSF-2, CSPG4 (NG2), IL-8, ENA-78, GCP2, SLIT2, SLIT3, Activin beta A, Annexin I
1061 genes are located in the mapped region for the disease.
Other up- or down-regulated genes
360 genes are up- or down-regulated by >2x.
17 receptor ligand genes are important input nodes to pathways formed by genes with changed expression.
27Pathway analysis narrows down number of candidate
genes for disease
ErbB2, PECAM1, DDX5, BCAS3, microRNA1, RARalpha, MUL, VHR, WIP, ErbB2, NIK, Plakoglobin, HEXIM1, Prohibitin, STAT5A, STAT3, Clathrin, PSME3, PSMC5, ErbB2
FGF2, IL1RN, ErbB2
Other up- or down-regulated genes
360 genes are up- or down-regulated by >2x.
These genes, from the mapped region of interest, can form interaction pathways going through the receptor ligands identified by the first analysis.
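The narrowing step can be pictured as a filter on the genes in the mapped region. The sketch below assumes the simplest version, keeping region genes that interact directly with one of the key receptor ligands; the analysis on the slide may have followed longer pathways. Gene names, edges, and the function name are hypothetical placeholders.

```python
# Hypothetical toy data standing in for the real gene lists and database.
INTERACTIONS = [
    ("cand1", "lig1"),
    ("lig2", "cand2"),
    ("cand3", "other1"),
    ("other2", "lig1"),
]
REGION_GENES = ["cand1", "cand2", "cand3"]  # genes in the mapped region
KEY_LIGANDS = ["lig1", "lig2"]              # ligands found by the first analysis


def narrow_candidates(region_genes, key_ligands, interactions):
    """Keep region genes that interact directly with a key ligand."""
    ligands = set(key_ligands)
    linked = set()
    for a, b in interactions:  # direction is ignored for this filter
        if a in ligands:
            linked.add(b)
        if b in ligands:
            linked.add(a)
    return sorted(set(region_genes) & linked)


print(narrow_candidates(REGION_GENES, KEY_LIGANDS, INTERACTIONS))
# ['cand1', 'cand2']
```

`cand3` drops out because it has no link to a key ligand, and `other2` is excluded because it lies outside the mapped region.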
28A caveat
Not every gene belongs to a pathway in the
database
29Why Pathway Analysis Software?
- A learning tool
  - Study a group of gene products.
- A data analysis tool
  - Which pathways are particularly affected?
  - What disease has similar biomarkers?
- A hypothesis generation tool
  - Can provide insight into the mechanism of regulation of your genes. Which is the likely causative agent for the observed changes? What is likely to happen as a result of these changes?
  - Suggest effects of gene knock-ins or knock-outs.
  - Suggest side-effects of drugs.
  - Can highlight new phenomena that need further investigation. What does the program not explain?
30Thank you.