Title: AHM 2002
1. AHM 2002
- Tutorial on Scientific Data Mediation
- Example 1
2. SCENARIO: Genome-scale Modeling of Low-Dose
Irradiation Responses Using Microarray-Based Gene
Networks
- Hypothesis: Genes that show similar expression
patterns in response to low-dose irradiation are
components of coordinated expression groups
(called synexpression groups). Understanding the
differential regulation of these synexpression
groups will (a) provide a DNA-sequence-based
understanding of the complex biological processes
associated with low-dose radiation and (b) identify
determinants of radiation dose and genetic
susceptibility to radiation damage.
- Database search for common promoter elements to
link new candidate genes
- Statistical clustering of genes
- Microarray analysis
3. SPECIFIC AIMS
- 1. Develop a web-accessible database resource to
assemble microarray transcription profiles of
radiation responsive genes and to link these
genes to genomic and cDNA sequence information.
- 2. Apply statistical and bioinformatic tools to
identify novel synergistic gene expression groups
of radiation responsive genes.
- 3. Apply the model to the analysis of
gene/pathway responses to low-dose IR.
4. SCENARIO WORKFLOW
- Clusfavor (http://mbcr.bcm.tmc.edu/genepi/): wrap1
- Accession number: wrap2.xml
- NCBI GenBank (http://www.ncbi.nlm.nih.gov/): pwrap1, wrap2
- Sequence to search: wrap3.xml
- BLAST (http://www.ncbi.nlm.nih.gov/blast/): pwrap2, wrap3
- The top match: wrap4.xml
- MatInspector
(http://transfac.gbf.de/cgi-bin/matSearch/matsearch.pl):
pwrap3, wrap4
- Resulting sequences and similarity scores, which go to
- an external program to build a model, or
- back to BLAST to find additional matches, or
- to Clustal to determine a consensus sequence,
which is then sent to BLAST.
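The hand-off between stages in this workflow can be sketched in Python. This is only an illustrative sketch: the XML element names, the stub wrapper functions, and the canned accession and sequences are all hypothetical and are not taken from the actual wrap1 to wrap4 wrappers. It assumes each wrapper reads a small XML message and emits the next one.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Serialize one value as a tiny XML message (hypothetical format)."""
    root = ET.Element("message")
    ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text, tag):
    """Read one value back out of an XML message."""
    return ET.fromstring(xml_text).findtext(tag)


# Canned data standing in for the remote sources (all values made up).
FAKE_GENBANK = {"ACC0001": "ACGTACGTGA"}
FAKE_BLAST = {"ACGTACGTGA": "ACGTACGTGATTT"}


def genbank_wrapper(msg):
    """Stub: accession number in, cDNA sequence out."""
    accession = from_xml(msg, "accession")
    return to_xml("sequence", FAKE_GENBANK[accession])


def blast_wrapper(msg):
    """Stub: sequence to search in, top match out."""
    sequence = from_xml(msg, "sequence")
    return to_xml("top_match", FAKE_BLAST[sequence])


def run_pipeline(accession):
    """Pass one XML message through each stage in turn."""
    msg = to_xml("accession", accession)
    for stage in (genbank_wrapper, blast_wrapper):
        msg = stage(msg)
    return from_xml(msg, "top_match")


result = run_pipeline("ACC0001")
```

Keeping every stage to the same "XML in, XML out" shape is one way to let stages be reordered or looped, for example sending a result back to BLAST.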
5. Workflow figure
- Tools: CLUSFAVOR, NCBI GenBank, BLAST, TRANSFAC
- Steps: microarray analysis; cDNA cluster; database
search for promoter identification; promoter
sequences; common promoter alignment; promoter
model; database search; new candidate target genes
Adapted from Thomas Werner, Biomolecular
Engineering 17: 87-94 (2001)
6. CLUSFAVOR
- CLUSFAVOR: CLUSter and Factor Analysis with
Varimax Orthogonal Rotation
- A standalone program whose output consists of
several clusters of named sequences that have
similar expression characteristics in the current
experiment.
- GOAL: Given gene expression data, obtain a set
of related sequences from which to build a model.
- INPUT: gene expression data
- OUTPUT: collection of clustered cDNA fragments
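To make the input and output concrete, here is a small Python sketch that groups genes with similar expression profiles. It is not CLUSFAVOR's method (factor analysis with varimax rotation): it substitutes a much simpler greedy grouping by Pearson correlation, and the gene names, expression values, and the 0.9 threshold are all made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_by_correlation(profiles, threshold=0.9):
    """Greedy grouping: a gene joins the first cluster whose seed
    (first member) it correlates with at or above the threshold."""
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profiles[cluster[0]], profile) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])  # no match: start a new cluster
    return clusters


# Hypothetical expression values across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.2, 1.9],
}
clusters = cluster_by_correlation(profiles)
```

With these values the rising profiles (geneA, geneB) and the falling profiles (geneC, geneD) end up in separate groups.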
7. NCBI GenBank
- GOAL: Given the name (or, better, the accession
number) of a cDNA string from the CLUSFAVOR
results, do a name lookup in GenBank to obtain
the cDNA sequence.
- INPUT: The accession number or the name of a cDNA
string
- OUTPUT: cDNA sequence for the input cDNA string
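The lookup step can be sketched in Python as follows. The assumption here is that the sequence is retrieved through NCBI's E-utilities `efetch` endpoint, which is not necessarily the interface used in the demo. The sketch only builds the request URL and parses a FASTA response, without making a network call; the accession `ACC0001` and the sample record are made up.

```python
from urllib.parse import urlencode

EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build an efetch URL asking for a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EUTILS_EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    seq_lines = []
    for ln in lines[1:]:
        if ln.startswith(">"):  # start of a second record: stop
            break
        seq_lines.append(ln)
    return lines[0][1:], "".join(seq_lines)


url = efetch_url("ACC0001")  # made-up accession
header, sequence = parse_fasta(">ACC0001 example cDNA\nACGTAC\nGTGA\n")
```

In a real wrapper the text returned from the URL would be passed to `parse_fasta`, and the resulting sequence would become the input to the BLAST step.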
8. BLAST
- BLAST: Basic Local Alignment Search Tool
- A set of similarity search programs designed to
explore all of the available sequence databases
regardless of whether the query is protein or
DNA.
- INPUT: Output cDNA sequence from GenBank.
- OUTPUT: A set of similar sequences.
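The idea of ranking database sequences by local similarity to a query can be illustrated with a short Python sketch. BLAST itself uses a fast heuristic; the sketch below instead computes exact Smith-Waterman local alignment scores, which is far too slow for real databases but shows what "similar" means here. The scoring values and the sequences are arbitrary choices for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + step,  # extend with a match or mismatch
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query, database):
    """Return (score, name) pairs, highest score first."""
    scored = [
        (local_alignment_score(query, seq), name)
        for name, seq in database.items()
    ]
    return sorted(scored, reverse=True)


# Made-up database entries.
database = {"s1": "GGACGTACGG", "s2": "TTTTTT"}
hits = rank_hits("ACGTAC", database)
```

The first element of `hits` plays the role of "the top match" that the workflow passes on to the next stage.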
9. MatInspector V2.2, based on TRANSFAC
- MatInspector: Matrix Inspector
- TRANSFAC: The Transcription Factor Database
- Search for potential transcription factor binding
sites in your own sequences and detect consensus
matches in nucleotide sequence data using the
TRANSFAC 4.0 matrices.
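Matrix-based binding-site search can be sketched in Python as a sliding-window scan. This is a simplified illustration, not MatInspector's actual scoring scheme: each window is scored as the sum of the matrix counts for its bases, divided by the best possible sum. The 4-position count matrix, the sequence, and the 0.8 threshold are hypothetical and do not come from TRANSFAC.

```python
# Hypothetical count matrix: one dict of base counts per site position.
MATRIX = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def window_score(window, matrix):
    """Score in [0, 1]: observed counts over the best possible counts.

    Expects upper-case A, C, G, T only.
    """
    observed = sum(col[base] for col, base in zip(matrix, window))
    best = sum(max(col.values()) for col in matrix)
    return observed / best


def scan(sequence, matrix, threshold=0.8):
    """Return (start, site, score) for every window at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        score = window_score(site, matrix)
        if score >= threshold:
            hits.append((start, site, round(score, 3)))
    return hits


hits = scan("TTACGTGGACGAAC", MATRIX)
```

Lowering the threshold admits weaker candidate sites such as `ACGA`, which scores about 0.74 against this matrix.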
10. GENBANK MEDIATION DEMO