Analysis: Discovery of possible regulatory motifs - PowerPoint PPT Presentation

Provided by: wanlin8
1
Scenario 5
Analysis: Discovery of possible regulatory motifs
What follows is a simulation of the proposed
graphical interface. As you go through the
simulation please consider what capabilities you
would want to serve your research and annotation
interests. A narrative to help you go through the
simulation appears in a red-bordered box, such as
the one below.
To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button
Continue
2
Scenario 5
Analysis: Discovery of possible regulatory motifs
You've decided you want to know what regulates
the expression of nif genes, encoding the
machinery for nitrogen fixation. Here's your
strategy:
  • Collect nif genes from Anabaena PCC 7120 into set
  • Include in set orthologs of the Anabaena genes
  • Extract 5' sequences from all genes in set
  • Analyze set of 5' sequences for motifs
  • (Search for other genes with same motifs)

Continue
3
Build set
Display set
Modify set
Set operation
Click on Build Set to begin finding orfs with the
desired specifications
4
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
All items in
All open reading frames of
All amino acid sequences of
All intergenic regions of
Human-annotated orfs of
Private set
Public set
All open reading frames of
The first goal is to find all open reading frames
within Anabaena PCC 7120 annotated as nif genes, so
click on All open reading frames of.
5
Display set
Modify set
Set operation
Cancel
Build set
Choose set type
Choose database
All items in
All open reading frames of
Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous
All
Anabaena PCC 7120
Click on Anabaena PCC 7120
6
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
You want to compare the description of each orf
with "nif". To get a tool to extract the
description, click on Function.
7
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
Choose function
(item
Closest ortholog of
Protein product of
Upstream region of
Downstream region of
Description of
Category of
Annotation level of
Description of
Click on Description of.
8
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
Choose function
Op
(item)
? includes excludes
Description of
includes
You want to find orfs whose description includes
the word nif. Click on includes.
9
Display set
Modify set
Set operation
Cancel
Build set
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
Type description term(s)
Op
Choose function
(item)
includes
nif
Description of
You can type in any characters to search for. For
this simulation, the term "nif" is provided.
Press the Enter key.
10
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
Type description term(s)
Op
Choose function
(item)
includes
nif
Description of
No more specifications. Press the Done button.
11
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Done
Save results and script
Save only results
Choose database
Choose set type
Save only results
such that
All items in
Anabaena PCC 7120
All open reading frames of
Type description term(s)
Op
Choose function
(item)
includes
nif
Description of
If this were a complicated search, you might want
to save the specifications as a script. In this
case, just save the results by clicking on Save
only results.
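What "saving the specifications as a script" could mean can be sketched in Python. This is only an illustration: the proposal does not define a script format, so the dictionary layout, the `run_script` helper, and the second sample record below are all hypothetical. The field values are the choices made in the simulation so far.

```python
import json

# Hypothetical saved form of the query built so far; the proposal does not
# specify a script format, so these field names are assumptions.
SCRIPT = {
    "set_type": "All open reading frames of",
    "database": "Anabaena PCC 7120",
    "function": "Description of",
    "operation": "includes",
    "data": "nif",
}

# Assumed behavior of the two operations shown in the Op menu.
OPERATIONS = {
    "includes": lambda value, term: term.lower() in value.lower(),
    "excludes": lambda value, term: term.lower() not in value.lower(),
}

# Functions pull one property out of an item; only one is sketched here.
FUNCTIONS = {"Description of": lambda item: item["description"]}

def run_script(script, items):
    """Return the items that satisfy the saved specification."""
    extract = FUNCTIONS[script["function"]]
    test = OPERATIONS[script["operation"]]
    return [item for item in items if test(extract(item), script["data"])]

# Round-trip through JSON to show that the script can be stored and re-run.
reloaded = json.loads(json.dumps(SCRIPT))

items = [
    {"id": "alr0874", "description": "nifH2 dinitrogenase reductase"},
    {"id": "orf0001", "description": "hypothetical protein"},  # made-up record
]
hits = run_script(reloaded, items)
```

Saving only the results would correspond to keeping `hits`; saving the script as well keeps the recipe, so the same search can be repeated against another database.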
12
Display set
Modify set
Set operation
Cancel
Build set
Variable
Data
Operation
Function
Done
Choose database
Choose set type
such that
All items in
Anabaena PCC 7120
All open reading frames of
Type description term(s)
Op
Choose function
(item)
includes
nif
Description of
Type name of set
7120 nif genes
All orfs of Anabaena whose descriptions include
nif will be collected into a set. You can name
the set anything you want. For this simulation, a
name is provided. Press the Enter key.
13
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, C terminus
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, N terminus
Anab7120 all0688 hupS NiFe uptake hydrogenase small subunit
Anab7120 alr0692 similar to nifU
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 asr1309 similar to nifU
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120 asr1409 nifT
<< more items >>
This is the result of the search. The set is
displayed both as a list of orfs and a graphical
representation of the genetic neighborhood of
each orf. You can find out more about an orf by
clicking its name or its arrow. For now, just
press Continue.
Continue
14
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, C terminus
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, N terminus
Anab7120 all0688 hupS NiFe uptake hydrogenase small subunit
Anab7120 alr0692 similar to nifU
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 asr1309 similar to nifU
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120 asr1409 nifT
<< more items >>
This search, like most, is only a beginning. It
brought up some unintended hits ("nif" matched
"NiFe"). More seriously, it brought up many genes
probably in the middle of operons and unlikely to
be preceded by regulatory motifs. The genetic
neighborhood gives clues as to operon structure.
Select the two orfs most likely to begin operons
by clicking on the circles next to alr0874 and
alr1407.
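The unintended hits are easy to reproduce in a short Python sketch. It assumes that "includes" is a case-insensitive substring test, which is consistent with "NiFe" being returned but is not stated in the proposal. The stricter pattern is a hypothetical refinement, not a feature of the proposed interface. The records are copied from the result list.

```python
import re

# Records copied from the result list: (orf id, description).
ORFS = [
    ("all0687", "hupL NiFe uptake hydrogenase large subunit, C terminus"),
    ("all0688", "hupS NiFe uptake hydrogenase small subunit"),
    ("alr0692", "similar to nifU"),
    ("alr0874", "nifH2 dinitrogenase reductase"),
    ("alr1407", "nifV1 homocitrate synthase"),
    ("asr1409", "nifT"),
]

def includes(description, term):
    """Case-insensitive substring test (assumed behavior of 'includes')."""
    return term.lower() in description.lower()

# Hypothetical stricter rule: lowercase 'nif' at the start of a word,
# followed by one capital letter and optional digits (nifH2, nifU, ...).
NIF_GENE = re.compile(r"\bnif[A-Z]\d*\b")

def names_nif_gene(description):
    """True if the description contains a nif-style gene name."""
    return NIF_GENE.search(description) is not None

loose = [orf for orf, desc in ORFS if includes(desc, "nif")]
strict = [orf for orf, desc in ORFS if names_nif_gene(desc)]
```

Here `loose` keeps the two hydrogenase orfs because "NiFe" contains "nif" once case is ignored, while `strict` drops them. Neither rule addresses the second problem the narrative raises, namely genes in the middle of operons; that still needs the genetic neighborhood.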
15
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, C terminus
Anab7120 all0687 hupL NiFe uptake hydrogenase large subunit, N terminus
Anab7120 all0688 hupS NiFe uptake hydrogenase small subunit
Anab7120 alr0692 similar to nifU
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 asr1309 similar to nifU
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120 asr1409 nifT
<< more items >>
Let's suppose you proceed in like fashion
through the rest of the list. Press Done.
16
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120 all1455 nifH dinitrogenase reductase
Anab7120 all1517 nifB nitrogen fixation protein
Anab7120 alr2968 nifV2 homocitrate synthase
The set now consists of the six Anabaena nif
genes that you judged most likely to be preceded
by transcriptional signals. It might be
interesting to see where this set is located on
the genome. To do this, click Display set, then
make some room by clicking on Show graphic.
17
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120 all1455 nifH dinitrogenase reductase
Anab7120 all1517 nifB nitrogen fixation protein
Anab7120 alr2968 nifV2 homocitrate synthase
Replace the space-consuming description with
coordinates by clicking on Show description, and
then click Show coordinates and finally Show map.
18
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 alr0874 nifH2
Anab7120 alr1407 nifV1
Anab7120 all1438 nifE
Anab7120 all1455 nifH
Anab7120 all1517 nifB
Anab7120 alr2968 nifV2
Replace the space-consuming description with
coordinates by clicking on Show description, and
then click Show coordinates and finally Show map.
19
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 alr0874 nifH2 1008496 -> 1009389
Anab7120 alr1407 nifV1 1671878 -> 1673011
Anab7120 all1438 nifE 1696389 <- 1697831
Anab7120 all1455 nifH 1713396 <- 1714283
Anab7120 all1517 nifB 1776670 <- 1778097
Anab7120 alr2968 nifV2 3609625 -> 3611012
Replace the space-consuming description with
coordinates by clicking on Show description and
then Show coordinates, and finally, click on Show
map.
20
Build set
Display set
Modify set
Set operation
Done
Set: 7120 nif genes
Anab7120 alr0874 nifH2 1008496 -> 1009389
Anab7120 alr1407 nifV1 1671878 -> 1673011
Anab7120 all1438 nifE 1696389 <- 1697831
Anab7120 all1455 nifH 1713396 <- 1714283
Anab7120 all1517 nifB 1776670 <- 1778097
Anab7120 alr2968 nifV2 3609625 -> 3611012
Anabaena chromosome 6413771 bp
Four of the six putative nif operons are
clustered near 1.7 Mb... but back to business.
Our idea was to extend the set to include
orthologs in other nitrogen-fixing cyanobacteria.
To do this, click Set operation, then
Transformations, then Orthologs of.
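The clustering remark can be checked directly from the displayed coordinates. The Python sketch below parses the lines as shown, reading "->" as the forward strand and "<-" as the reverse strand. The 100 kb window used to define "near 1.7 Mb" is an assumption made for illustration; the slides give no threshold.

```python
# Coordinate lines as displayed: orf id, gene, start, arrow, end.
LINES = """\
alr0874 nifH2 1008496 -> 1009389
alr1407 nifV1 1671878 -> 1673011
all1438 nifE 1696389 <- 1697831
all1455 nifH 1713396 <- 1714283
all1517 nifB 1776670 <- 1778097
alr2968 nifV2 3609625 -> 3611012
""".splitlines()

def parse(line):
    """Turn one display line into a record; '->' is read as '+' strand."""
    orf, gene, start, arrow, end = line.split()
    return {
        "orf": orf,
        "gene": gene,
        "start": int(start),
        "end": int(end),
        "strand": "+" if arrow == "->" else "-",
    }

def near(genes, center, window):
    """Names of genes whose left coordinate lies within window of center."""
    return [g["gene"] for g in genes if abs(g["start"] - center) <= window]

GENES = [parse(line) for line in LINES]

# Assumed definition of "near 1.7 Mb": within 100 kb of position 1,700,000.
clustered = near(GENES, center=1_700_000, window=100_000)
```

Under that assumption `clustered` holds nifV1, nifE, nifH and nifB, which are the four operons the narrative refers to; nifH2 and nifV2 lie elsewhere on the 6413771 bp chromosome.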
21
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Orthologs of (
All open reading frames of
All amino acid sequences of
All intergenic regions of
Human-annotated orfs of
Public set
Private set
Private set
You want the orthologs of the orfs in the set you
just made. This set is yours: a private set, as
opposed to certain sets that are available to all
users. Click Private set.
22
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Choose set
Orthologs of (
Private set
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
Light-specific genes
Npun STTR7 regions
7120 nif genes
The list of choices will consist of whatever sets
you may have created. Choose the one you just
made: 7120 nif genes.
23
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Choose set
Choose database
Orthologs of (
in
)
Private set
7120 nif genes
Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Anabaena PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium erythreum
Unicellular
Filamentous
All
At present, the set of filamentous cyanobacteria
includes just the nitrogen-fixing strains Nostoc
punctiforme, Trichodesmium erythreum, and Anabaena.
Click on Filamentous.
24
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Choose set
Choose database
Orthologs of (
in
)
Private set
7120 nif genes
Filamentous
Type name of set
all nif genes
All orthologs of the selected nif genes will be
combined and saved in a set of your choice. For
this simulation, a name is provided. Press the
Enter key.
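The "Orthologs of ( set in database )" transformation can be pictured as a table lookup followed by a union. The Python sketch below is hypothetical throughout: the proposal does not say how orthologs are stored or computed, the `orthologs_of` helper is invented for illustration, and the four Anabaena-to-Nostoc pairings are inferred only from matching gene names in the simulation's result list.

```python
# Hypothetical ortholog table: Anabaena orf -> {genome: ortholog id}.
# The pairings are inferred from matching gene names (nifH2, nifV1, nifE,
# nifB) in the simulation's result list, not from a real ortholog database.
ORTHOLOGS = {
    "alr0874": {"Nostoc punctiforme": "NostPunc 637.025"},
    "alr1407": {"Nostoc punctiforme": "NostPunc 510.011"},
    "all1438": {"Nostoc punctiforme": "NostPunc 651.072"},
    "all1517": {"Nostoc punctiforme": "NostPunc 510.021"},
}

# Members of the Filamentous group, as named in the narrative.
FILAMENTOUS = ["Anabaena PCC 7120", "Nostoc punctiforme", "Trichodesmium erythreum"]

def orthologs_of(orfs, genomes, table, home="Anabaena PCC 7120"):
    """Combine the orthologs of each orf in each genome into one ordered set.

    An orf counts as its own ortholog in its home genome, so the original
    members stay in the combined set.
    """
    combined = []
    for genome in genomes:
        for orf in orfs:
            hit = orf if genome == home else table.get(orf, {}).get(genome)
            if hit is not None and hit not in combined:
                combined.append(hit)
    return combined

nif_7120 = ["alr0874", "alr1407", "all1438", "all1455", "all1517", "alr2968"]
all_nif = orthologs_of(nif_7120, FILAMENTOUS, ORTHOLOGS)
```

With this toy table `all_nif` contains the six Anabaena orfs followed by four Nostoc punctiforme entries. Orfs with no entry simply contribute nothing, which here reflects the incomplete table rather than any biological claim.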
25
Build set
Display set
Modify set
Set operation
Done
Set: all nif genes
Anab7120 alr0874 nifH2 dinitrogenase reductase
Anab7120 alr1407 nifV1 homocitrate synthase
Anab7120 all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120 all1455 nifH dinitrogenase reductase
Anab7120 all1517 nifB nitrogen fixation protein
Anab7120 alr2968 nifV2 homocitrate synthase
NostPunc 637.025 nifH2 dinitrogenase reductase
NostPunc 510.011 nifV1 homocitrate synthase
NostPunc 651.072 nifE nitrogenase Fe/Mo cofactor
NostPunc 510.021 nifB nitrogen fixation protein
<< more items >>
The set now consists of nif genes from all
filamentous cyanobacteria. From this set we want
to extract the upstream sequences. Click on
Set operation, then click on Transformations
and Upstream region of.
26
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Upstream region of (
All open reading frames of
Human-annotated orfs of
Public set
Private set
Private set
Again you want the orfs from a set you made
yourself, so click on Private set.
27
Build set
Display set
Modify set
Set operation
Cancel
Choose set
Choose set type
Upstream region of (
)
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
all nif genes
Light-specific genes
Npun STTR7 regions
Private set
all nif genes
The set you just defined magically appears on the
list (no chance for misspelling). Click on it.
28
Build set
Display set
Modify set
Set operation
Cancel
Choose set
Choose set type
Upstream region of (
)
all nif genes
Private set
Type name of set
all nif genes 5'
Give this new set of 5' regions a descriptive
name (done here for you). Press the Enter key.
29
Build set
Display set
Modify set
Set operation
Done
Set: all nif genes 5'
Anab7120.C 1006982-1008496d
Anab7120.C 1671462-1671878d
Anab7120.C 1697832-1698138c
Anab7120.C 1713264-1713395c
Anab7120.C 1778098-1779034c
Anab7120.C 3609273-3609624d
NostPunc.637 37288-37376d
NostPunc.510 15955-16325d
NostPunc.651 60311-60584c
NostPunc.510 5239-6338c
<< more items >>
The resulting set consists of sequences, not orfs,
and so the elements are defined by coordinates.
Clicking on a coordinate brings up the sequence
display (see Scenario 6). Clicking on a graph of
an orf brings up the orf's annotation page. Click
Continue.
Continue
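How an upstream (5') region follows from an orf's coordinates and strand can be sketched in Python. This is a simplified illustration, not the interface's method: it takes a fixed-length window, whereas the regions listed in the simulation vary in length and are presumably bounded in some other way, for example by the neighboring gene. Coordinates are assumed to be 1-based and inclusive, with start less than end on the chromosome.

```python
# Translation table for complementing DNA (uppercase A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def upstream_interval(start, end, strand, length, chrom_len):
    """1-based inclusive interval lying 5' of a gene.

    On the '+' strand the region ends just before the gene's start; on the
    '-' strand it begins just after the gene's right-hand coordinate.
    """
    if strand == "+":
        return max(1, start - length), start - 1
    return end + 1, min(chrom_len, end + length)

def upstream_sequence(genome, start, end, strand, length):
    """Upstream region read 5'->3' relative to the gene."""
    lo, hi = upstream_interval(start, end, strand, length, len(genome))
    region = genome[lo - 1:hi]
    if strand == "-":
        # Reverse-complement so the sequence reads toward the gene start.
        region = region.translate(COMPLEMENT)[::-1]
    return region

CHROMOSOME_LENGTH = 6413771  # Anabaena chromosome length given on the map

# nifV2 (alr2968, '+' strand) and nifE (all1438, '-' strand); the window
# lengths are chosen to match the spans of the corresponding listed regions.
nifV2_region = upstream_interval(3609625, 3611012, "+", 352, CHROMOSOME_LENGTH)
nifE_region = upstream_interval(1696389, 1697831, "-", 307, CHROMOSOME_LENGTH)
```

With those window lengths the two intervals coincide with the listed regions 3609273-3609624d and 1697832-1698138c. `upstream_sequence` would then be applied to the actual chromosome sequence, which is not part of this transcript.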
30
Build set
Display set
Modify set
Set operation
Done
Set: all nif genes 5'
Anab7120.C 1006982-1008496d
Anab7120.C 1671462-1671878d
Anab7120.C 1697832-1698138c
Anab7120.C 1713264-1713395c
Anab7120.C 1778098-1779034c
Anab7120.C 3609273-3609624d
NostPunc.637 37288-37376d
NostPunc.510 15955-16325d
NostPunc.651 60311-60584c
NostPunc.510 5239-6338c
<< more items >>
The final step in this procedure is to analyze
the set of upstream sequences of nif genes, hoping
to find a common motif. Click on Set operation,
then Analysis tools. Tools based on
Position-Specific Scoring Matrices (PSSMs) are
most often used for the task. Click on one of
these: Meme.
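For readers unfamiliar with the term, a position-specific scoring matrix can be illustrated in a few lines of Python. This sketch is not Meme, which has to discover the motif instances itself. It only shows what a PSSM is once aligned instances are in hand, and how the matrix scores a sequence. The four motif instances, the uniform background and the pseudocount are all made up for illustration.

```python
import math

BASES = "ACGT"

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds matrix (one dict per column) from equal-length aligned sites."""
    total = len(sites) + len(BASES) * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pssm

def score(pssm, window):
    """Sum of per-position log-odds scores for a window of matching length."""
    return sum(column[base] for column, base in zip(pssm, window))

def best_hit(pssm, sequence):
    """(score, 0-based offset) of the best-scoring window in the sequence."""
    width = len(pssm)
    return max(
        (score(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# Made-up motif instances, standing in for sites found upstream of nif genes.
SITES = ["TGTACA", "TGTACA", "TGTTCA", "TGAACA"]
PSSM = build_pssm(SITES)

# Scan a made-up upstream sequence for the best match to the motif.
hit_score, hit_offset = best_hit(PSSM, "CCCTGTACAGGG")
```

Scanning other genomes with such a matrix corresponds to the optional last step of the strategy: searching for other genes with the same motifs.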
31
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
PSSM Meme of (
Public set Private set
Private set
Click Private set and then all nif genes 5' to
give Meme the set of 5' sequences.
32
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Choose set
PSSM Meme of (
)
Private set
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
all nif genes
all nif genes 5'
Npun STTR7 regions
all nif genes 5'
Click Private set and then all nif genes 5' to
give Meme the set of 5' sequences.
33
Build set
Display set
Modify set
Set operation
Cancel
Choose set type
Choose set
PSSM Meme of (
)
Private set
all nif genes 5'
Type name of results
PSSMall nif 5'
Give the results a name, press Enter, and the
task is accomplished.
34
Scenario 5
Analysis: Discovery of possible regulatory motifs
Summary
  • The interface facilitates operations on sets of
    genes and sequences
  • The interface puts at your disposal powerful
    tools (that already exist), without the need
    to figure out a different computer environment
  • Taken together, these capabilities let those
    not particularly adept at computer programming
    focus on the function of noncoding sequences

But don't be fooled: the interface does not yet
exist. That's the point of the proposal!