Exploring 3D Molecular Structures Using NCBI Tools - PowerPoint PPT Presentation

Provided by: ericwhitn
1
Exploring 3D Molecular Structures Using NCBI Tools
Lecture 2: Alignments in Cn3D
June 10, 2008
2
The Cn3D Alignment Model
  • Each sequence is aligned pairwise to the master
    PDB sequence
  • Aligned blocks represent secondary structure
    elements
  • Aligned blocks have no internal gaps
  • Aligned sequences have a residue at each column
    in the block
  • Residues in the same column occupy the same
    position in space

Block 1
Block 2
Block 3
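The block model on this slide can be pictured as a small data structure. The following is a minimal sketch, not Cn3D's internal representation: the `Block` class, the `block_columns` helper, and both sequences are hypothetical and exist only to show that a gap-free block is fully described by a start position and a length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A gap-free aligned block, positioned on the master sequence."""
    master_start: int  # 0-based start of the block in the master
    length: int        # number of aligned columns in the block


def block_columns(sequence: str, seq_start: int, block: Block) -> str:
    """Return the residues a sequence contributes to one block.

    Because blocks have no internal gaps, a start offset is enough:
    every column of the block must be filled by exactly one residue.
    """
    residues = sequence[seq_start:seq_start + block.length]
    if len(residues) != block.length:
        raise ValueError("sequence too short: every column needs a residue")
    return residues


# Hypothetical master and aligned sequence (not real PDB data).
master = "MKTAYIAKQRQISFVKSHFSRQ"
other = "GSMKSAYLAKQKISFIRSHF"

blocks = [Block(master_start=2, length=5), Block(master_start=9, length=4)]
starts_in_other = [4, 10]  # where each block begins in the aligned sequence

for block, start in zip(blocks, starts_in_other):
    master_part = master[block.master_start:block.master_start + block.length]
    other_part = block_columns(other, start, block)
    print(master_part, other_part)
```

Residues between blocks stay unaligned, which is why only the per-block start offsets need to be stored for each sequence.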
3
What Cn3D Can Do
  • Render, rotate, and annotate multiple structure
    models
  • Create and edit multiple sequence alignments
  • Import and align sequences and structures based
    on an existing alignment

What Cn3D Can't Do
  • Alter the structural coordinate data
  • Create a theoretical structure or run MD
    simulations
  • Read or write PDB files directly

4
Finding Structures by Homology
  • Simple sequence homology (BLASTp): finds a
    pairwise alignment based on sequence similarity
  • Sequence and functional homology (CD-Search):
    finds a multiple sequence alignment based on
    sequence similarity
  • Structural homology (VAST): finds a multiple
    sequence alignment based on structural
    similarity
  • Sequence, structure, and function (curated
    CDs): finds a multiple sequence alignment based
    on sequence and structural similarity

5
VAST: Creating Structural Alignments
  • Why search for similar structures?
  • To superimpose structures
  • To find homologs that sequence searches cannot:
    distant protein homologs often conserve structure
    more strongly than sequence
  • To explore protein evolution: similar protein
    folds can be used to support different functions
  • To identify conserved core elements of a protein
    fold that can be used to model related proteins
    of unknown structure

6
VAST: Structural Neighbors
Vector Alignment Search Tool
For each 3D domain, locate SSEs (secondary
structure elements) and represent them as
individual vectors.
[Figure: Human IL-4, with SSEs numbered 1 to 6]
7
VAST: Calculate φij
Vector position about the z axis
For both the query and target structures,
calculate the midpoint of each SSE. For each SSE
k, align k along z and project the midpoints onto
the xy plane. Then calculate φijk for i ≠ k, j ≠ k.
[Figure: SSEs 1 to 6 projected onto the xy plane,
with the angle φ14 marked]
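The angle about the z axis described on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not VAST's code: each SSE is a hypothetical (start, end) pair of 3D points, the axis is taken to pass through the midpoint of SSE k, and the function name `phi` is invented here. Instead of rotating the structure so that k lies along z, the sketch removes the component of each midpoint along k's axis, which gives the same projection.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(a, s):
    return tuple(x * s for x in a)


def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))


def midpoint(sse):
    start, end = sse
    return tuple((s + e) / 2.0 for s, e in zip(start, end))


def phi(sses, i, j, k):
    """Signed angle (degrees) from midpoint i to midpoint j about SSE k."""
    axis = unit(sub(sses[k][1], sses[k][0]))  # direction of SSE k ("z")
    origin = midpoint(sses[k])                # assumption: axis runs through k's midpoint

    def project(index):
        v = sub(midpoint(sses[index]), origin)
        return sub(v, scale(axis, dot(v, axis)))  # drop the z component

    u, w = project(i), project(j)
    return math.degrees(math.atan2(dot(axis, cross(u, w)), dot(u, w)))


# Hypothetical SSE vectors: index 0 plays the role of k.
sses = [
    ((0.0, 0.0, -5.0), (0.0, 0.0, 5.0)),
    ((3.0, 0.0, -2.0), (3.0, 0.0, 6.0)),
    ((0.0, 4.0, 1.0), (0.0, 4.0, 3.0)),
]
angle = phi(sses, 1, 2, 0)
print(angle)
```

The sign of the angle follows the right-hand rule about k's direction, so swapping i and j flips it.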
8
VAST: Calculate (rik, zik)
Vector position relative to the xy plane
For both the query and target structures, for
each SSE k, set the origin at the midpoint of k.
Then calculate rik and zik for the endpoints
of SSEs i ≠ k.
[Figure: SSEs 1 and 3 with r13 and z13 marked
relative to the z axis and the xy plane]
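The (r, z) pair for an endpoint is its cylindrical position in the frame of SSE k. A minimal sketch follows, assuming each SSE is a hypothetical (start, end) pair of 3D points; the function names `cylindrical` and `endpoints_rz` are invented for illustration and are not part of VAST.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def cylindrical(point, sse_k):
    """Return (r, z) of a point, with SSE k along z and its midpoint at the origin."""
    start, end = sse_k
    direction = sub(end, start)
    axis = scale(direction, 1.0 / math.sqrt(dot(direction, direction)))
    origin = tuple((s + e) / 2.0 for s, e in zip(start, end))
    v = sub(point, origin)
    z = dot(v, axis)                   # height above the xy plane
    radial = sub(v, scale(axis, z))    # component perpendicular to the axis
    r = math.sqrt(dot(radial, radial)) # distance from the z axis
    return r, z


def endpoints_rz(sses, i, k):
    """(r, z) for both endpoints of SSE i in the frame of SSE k."""
    return [cylindrical(p, sses[k]) for p in sses[i]]


# Hypothetical SSE vectors: index 0 plays the role of k.
sses = [
    ((0.0, 0.0, -5.0), (0.0, 0.0, 5.0)),
    ((3.0, 4.0, -2.0), (3.0, 4.0, 6.0)),
]
print(endpoints_rz(sses, 1, 0))
```

Together with the angle about the z axis, these two values place each endpoint in a frame that does not depend on the original coordinate system, so the frames of two different structures can be compared directly.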
9
VAST: Create Comparison Graph
Nodes: r13 <> r12, z13 <> z12
Arcs: φ16 <> φ15; must follow sequence order
Select the path with the highest weights
[Figure: IL-4 (SSEs 1 to 6) and IL-6 (SSEs 1 to 5),
each drawn from N to C terminus, with the
comparison graph between them]
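Selecting the highest-weight path that follows sequence order can be illustrated with a small dynamic program. This is a simplified stand-in, not VAST's actual graph search: nodes are hypothetical (query SSE, target SSE) pairs with made-up weights, and one node may follow another only if both SSE indices increase.

```python
def best_path(nodes):
    """Return (total weight, path) of the heaviest order-preserving path.

    nodes maps (query_sse, target_sse) -> weight. A node may follow
    another only when both indices are strictly larger, which is how
    this sketch encodes "must follow sequence order".
    """
    ordered = sorted(nodes)
    best = {}  # node -> (best total weight ending here, path)
    for node in ordered:
        q, t = node
        score, path = nodes[node], [node]
        for prev in ordered:
            pq, pt = prev
            if pq < q and pt < t:
                candidate = best[prev][0] + nodes[node]
                if candidate > score:
                    score, path = candidate, best[prev][1] + [node]
        best[node] = (score, path)
    if not best:
        return 0.0, []
    return max(best.values(), key=lambda item: item[0])


# Hypothetical node weights (not taken from IL-4 or IL-6).
nodes = {
    (1, 1): 2.0,
    (2, 2): 0.5,
    (2, 3): 1.5,
    (3, 2): 2.5,
    (4, 4): 1.0,
}
total, path = best_path(nodes)
print(total, path)
```

Because the path can never step backward in either structure, two folds with the same SSEs in a different chain order score poorly, which is one way to read the later statement that VAST is sensitive to topology.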
10
VAST: Refinement
  • Cα atoms are added to the aligned SSEs
  • Alignments are allowed to extend beyond SSE
    boundaries
  • All atoms are added to the models, and the
    detailed backbone and sidechain positions are
    refined
[Figure: aligned residues are red; alignment
extended to the end of this strand]
11
VAST: Alignment of Sequence
  • Aligned blocks represent structural core
    elements
  • Aligned blocks have no internal gaps
  • Aligned residues occupy the same position in
    space
  • Aligned residues are shown in CAPITAL letters

Helix 1
Helix 2
Helix 3
Helix 4
12
VAST: Summary
  • Secondary structure elements are represented as
    vectors and are aligned based on their relative
    orientations
  • VAST ignores loops and tolerates variation in
    SSE length
  • The initial alignment is wholly ignorant of
    atomic coordinates
  • Pathways through aligned SSEs respect sequence
    order
  • VAST is sensitive to topology
  • Alignments are extended and optimized using
    all-atom models
  • Aligned blocks may extend across or into loops
    or other SSEs
[Figure labels: N and C termini]

13
VAST: Scoring
p = d × P(s > s0, n) × c(n, P1, P2)
p: the probability that the VAST alignment occurred
by chance
d: number of structures searched (set to 500)
P(s > s0, n): probability of observing an alignment
of n SSEs with a score greater than s0 by chance
c(n, P1, P2): search space, the number of possible
alignments of n SSEs between vector sets P1 and P2
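Reading the scoring line as a product of its three terms (an assumption, since the operators did not survive in the transcript), the arithmetic can be sketched as follows. The function name and the input values are hypothetical; only d = 500 comes from the slide.

```python
def vast_p_value(p_score: float, search_space: float, d: int = 500) -> float:
    """Combine the three terms from the slide into one p-value.

    p_score      : P(s > s0, n), chance of a better score for n SSEs
    search_space : c(n, P1, P2), number of possible n-SSE alignments
    d            : number of structures searched (500 on the slide)
    """
    return d * p_score * search_space


# Hypothetical inputs, chosen only to show the arithmetic.
whole_chain = vast_p_value(p_score=1e-9, search_space=1200)
single_domain = vast_p_value(p_score=1e-9, search_space=120)
print(whole_chain, single_domain)
```

With the same raw score, a smaller search space gives a smaller p-value, so an alignment can pass a significance cutoff for a compact query and fail it for a larger one.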
14
Accessing VAST Neighbors
links to VAST neighbors
15
VAST Neighbor View
links to structure-based sequence alignments
16
Query by Chain vs 3D Domain
c(n, P1, P2) is smaller for a 3D domain!
Query by whole chain
Not found using whole chain query!
Query by domain 5
17
VAST Multiple Alignments
Cn3D
18
Entrez Links to VAST Neighbors
Limiting VAST results by an Entrez query in 3D
Domains:
3 AND human[orgn] AND 4[helixcount] AND
0[domainno]
173 VAST neighbors
19
Submitting a PDB File to VAST
  • Redesigned interface!
  • This is the best way to convert PDB into MMDB
    format!

20
Structure + Function
  • VAST finds proteins that have similar 3D folds
  • CD-Search finds proteins that have similar
    sequences and similar functions
  • Curated CDs = VAST + CD-Search: proteins that
    have similar 3D folds, similar sequences, and
    similar functions

21
Curating CDs with VAST
Cn3D
Cn3D
smart00235
VAST
cd00203
22

CD Curation: Effect on Model Alignment Accuracy
A. Marchler-Bauer
23
cd00659: A Curated CD
Functional features
parent CD
child CD
cd00659
24
CD Family Values
Residues aligned in the parent must be aligned in
the child
Parent: cd00397, C-term catalytic domain of DNA
breaking-rejoining enzymes (164 columns)
Child: cd00659, C-term catalytic domain of DNA
Topo IB (218 columns)
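The parent and child rule amounts to a subset check. The sketch below assumes aligned columns are identified by residue positions on a shared reference sequence; that representation, the function name, and the example positions are all hypothetical and are not how the CD database stores alignments.

```python
def missing_parent_columns(parent_columns, child_columns):
    """Return parent columns that the child fails to align, in sorted order.

    An empty result means the child respects the rule that every residue
    aligned in the parent is also aligned in the child.
    """
    return sorted(set(parent_columns) - set(child_columns))


# Hypothetical column positions on a shared reference sequence.
parent = range(10, 20)              # the parent aligns positions 10..19
good_child = range(5, 25)           # aligns more, and covers the parent
bad_child = list(range(5, 15)) + list(range(17, 25))  # skips 15 and 16

print(missing_parent_columns(parent, good_child))
print(missing_parent_columns(parent, bad_child))
```

A child may align more columns than its parent, since its members are more closely related, but never fewer of the parent's own columns.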
25
Curated CD Summary
Cn3D
catalytic residue
Aligned query
catalytic residues
26
A Path to a Structural Template
  • Look for a curated CD
  • CD-Search: You're done if you find one; otherwise
    continue.
  • Construct a structural alignment
  • BLASTp (Related Structure): Find the most
    sequence-similar structure to the query
  • VAST: Find the structural neighbors of the most
    sequence-similar structure
  • Cn3D: Import and align the sequence to the VAST
    alignment using algorithms in Cn3D

27
Importing into Cn3D
Master
Sequences are initially unaligned, with red
regions indicating blocks
Import
28
Cn3D Alignment Algorithms
1
2
3
29
Identifying Alignment Problems
Block errors: indicated by red shading. These
result when the extent of an aligned block in the
import window differs from that in the template.
Geometry violations: indicated by green shading.
These result when a loop between aligned blocks in
the import window is too short to span the
distance between the block ends based on the
master structure in the template.
Fix these problems by adjusting the block lengths!
[Figure labels: block error, geometry violation]
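The geometry check can be approximated with a simple distance test. This sketch is not Cn3D's actual criterion: it assumes a fully extended loop covers at most about 3.8 Å per Cα-Cα step, and the function name and coordinates are hypothetical.

```python
import math

CA_CA_DISTANCE = 3.8  # approximate distance in angstroms between consecutive C-alpha atoms


def is_geometry_violation(block_end, next_block_start, loop_length):
    """Flag a loop that is too short to bridge two block ends.

    block_end, next_block_start : C-alpha coordinates taken from the
        master structure at the residues flanking the loop
    loop_length : number of residues the imported sequence has
        between the two blocks
    """
    gap = math.dist(block_end, next_block_start)
    # A loop of n residues adds n + 1 C-alpha steps between the block ends.
    max_span = CA_CA_DISTANCE * (loop_length + 1)
    return gap > max_span


# Hypothetical coordinates: the block ends are 20 angstroms apart.
end_of_block_1 = (0.0, 0.0, 0.0)
start_of_block_2 = (20.0, 0.0, 0.0)

print(is_geometry_violation(end_of_block_1, start_of_block_2, loop_length=2))
print(is_geometry_violation(end_of_block_1, start_of_block_2, loop_length=5))
```

Shortening one of the flanking blocks lengthens the loop and moves the block end, which is why adjusting block lengths can clear a violation.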
30
Trial with NP_001058
Human topoisomerase IIα
pfam02518
curated CDs
31
Step 1: Related Structures
pfam02518
1ZXM the most sequence-similar structure
32
Step 2: VAST Neighbors of 1ZXM
Cn3D
?
?
?
pfam02518
33
For more information
  • Course web pages
  • info@ncbi.nlm.nih.gov
  • NCBI Handbook, Ch. 3
  • NCBI Bookshelf: Bioinformatics in Tropical
    Disease Research