BLAST - PowerPoint PPT Presentation

Slides: 19
Provided by: pv78
Learn more at: https://web.njit.edu

Transcript and Presenter's Notes

Title: BLAST


1
BLAST: A heuristic algorithm
Anjali Tiwari, Pannaben Patel, Pushkala Venkataraman
2
(No Transcript)
3
Basic Local Alignment Search Tool (BLAST)
Rapid searching of protein & nucleotide DBs
Seeking similar sequences
Databases: GenBank, nr, SwissProt, PDB, PRF, PIR
nr = non-redundant database
4
Program   Query        Database     Search Level
Blastp    Amino acid   Amino acid   Amino acid
Blastn    Nucleotide   Nucleotide   Nucleotide
Blastx    Nucleotide   Amino acid   Amino acid
Tblastn   Amino acid   Nucleotide   Amino acid
Tblastx   Nucleotide   Nucleotide   Amino acid
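The table can be read as a lookup: whenever an input's type differs from the search level, that input has to be translated before comparison. The sketch below only transcribes the table; the names `BLAST_PROGRAMS` and `translated_inputs` are hypothetical and not part of any BLAST tool.

```python
# Program -> (query type, database type, level at which sequences are compared),
# transcribed from the table above.
BLAST_PROGRAMS = {
    "blastp":  ("amino acid", "amino acid", "amino acid"),
    "blastn":  ("nucleotide", "nucleotide", "nucleotide"),
    "blastx":  ("nucleotide", "amino acid", "amino acid"),
    "tblastn": ("amino acid", "nucleotide", "amino acid"),
    "tblastx": ("nucleotide", "nucleotide", "amino acid"),
}


def translated_inputs(program):
    """Return which inputs differ from the search level and so need translating."""
    query, database, level = BLAST_PROGRAMS[program.lower()]
    needs = []
    if query != level:
        needs.append("query")
    if database != level:
        needs.append("database")
    return needs


for name in BLAST_PROGRAMS:
    print(name, translated_inputs(name))
```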
BLAST 3-STEP ALGORITHM
Compile Words -> Scan DB -> Extend
5
Some definitions
Alignment: Process of lining up 2 or more sequences to assess similarity
BLOSUM62: A 20×20 substitution matrix for amino acids
Gap: Space introduced into an alignment to compensate for insertions/deletions in 1 sequence relative to another
6
Local Search Algorithms
Similarity Measures
Similarity Matrix - BLOSUM
Identities, Conservative Replacements: +ve
Unlikely Replacements: -ve
7
General Concept of How BLAST Works
Query input (searched against 1000s of sequences)
Calculate HSP
Calculate MSP
Display output
HSP = High Scoring Pair; MSP = Maximal Segment Pair
8
Key Idea: BLAST1
Step 1: Compile a list of high scoring words of length w from the query (w = 3 for proteins, 11 for nucleic acids)
Step 2: Scan for word hits in the database with a score greater than threshold T
Step 3: Extend each word hit in both directions to find High Scoring Pairs with scores greater than S
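Step 1's word list can be sketched as follows, using the query from the example that follows. This is a minimal illustration; `compile_words` is a hypothetical helper name, and it only enumerates the overlapping words without scoring them.

```python
def compile_words(query, w=3):
    """Return every overlapping word of length w in the query, in order."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


query = "QQGPHUIQEGQQGKEEDPP"  # query sequence from the slides' example
words = compile_words(query)
print(words[:5])
```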
9
Example
Step 1
Query: QQGPHUIQEGQQGKEEDPP
Words of length 3 (w): QQG, QGP, GPH, PHU, HUI, ...
Take the first triple: QQG
Make neighborhood words w': QQG, QEG, GQG
Find high scoring triples: Blosum(w, w') > T, where T = threshold parameter
Suppose Blosum(QQG, QEG) = 18, Blosum(QQG, GQG) = 12, Blosum(QQG, QQG) = 16, and T = 13
Choose QQG and QEG since their Blosum values are > T
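The threshold filter in this step can be sketched as below. The scores are the slide's "Suppose" values, taken as given rather than computed from a real substitution matrix; `assumed_scores` and `high_scoring_words` are hypothetical names.

```python
# Illustrative scores from the slide ("Suppose ..."), not computed from a real matrix.
assumed_scores = {"QEG": 18, "GQG": 12, "QQG": 16}
T = 13  # threshold parameter from the slide


def high_scoring_words(scores, threshold):
    """Keep neighborhood words whose score against the query word exceeds the threshold."""
    return sorted(word for word, score in scores.items() if score > threshold)


print(high_scoring_words(assumed_scores, T))
```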
10
Step 2
Suppose database sequence: PKLMMQQGKQEGM
Matching word pairs in DB sequence: QQG and QEG
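A minimal sketch of the scan, assuming exact string matching of the selected words against the database sequence (positions are 0-based). `scan_database` is a hypothetical helper; real BLAST uses an indexed lookup rather than repeated string searches.

```python
def scan_database(db_sequence, words):
    """Return (word, position) for every exact occurrence of a word, ordered by position."""
    hits = []
    for word in words:
        start = db_sequence.find(word)
        while start != -1:
            hits.append((word, start))
            start = db_sequence.find(word, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


db_sequence = "PKLMMQQGKQEGM"
hits = scan_database(db_sequence, ["QQG", "QEG"])
print(hits)
```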
11
Step 3
Query: QQGPHUIQEGQQGKEEDPP
DB sequence: PKLMMQQGKQEGM
Blosum(QQG, QQG) = 16
Blosum(QQGK, QQGK) = 21
Blosum(QQGKE, QQGKQ) = 23
Blosum(QQGKEE, QQGKQE) = 28
Blosum(QQGKEED, QQGKQEG) = 27
12
Extension to the right stops here because BLOSUM
value is beginning to decrease
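The rightward extension can be sketched as follows, under the slide's stopping rule (stop at the first decrease). The pair scores are an assumption: only the pairs this example needs, chosen to be consistent with the running totals 16, 21, 23, 28, 27. Production BLAST uses a score drop-off threshold rather than stopping at the first decrease, so this is an illustration of the slide's rule only.

```python
# Pair scores needed for this example only. They are consistent with the running
# totals on the slides (16, 21, 23, 28, 27); a real search would use the full
# BLOSUM62 matrix.
PAIR_SCORES = {
    ("Q", "Q"): 5, ("G", "G"): 6, ("K", "K"): 5,
    ("E", "Q"): 2, ("E", "E"): 5, ("D", "G"): -1,
}


def pair_score(a, b):
    """Look up a substitution score, treating the table as symmetric."""
    return PAIR_SCORES[(a, b)] if (a, b) in PAIR_SCORES else PAIR_SCORES[(b, a)]


def extend_right(query, db, q_start, d_start, w=3):
    """Extend a word hit to the right until the running score first decreases."""
    score = sum(pair_score(query[q_start + i], db[d_start + i]) for i in range(w))
    length = w
    history = [score]  # running totals, including the final rejected step
    while q_start + length < len(query) and d_start + length < len(db):
        step = pair_score(query[q_start + length], db[d_start + length])
        history.append(score + step)
        if step < 0:  # score begins to decrease: stop and keep the previous best
            break
        score += step
        length += 1
    return query[q_start:q_start + length], db[d_start:d_start + length], score, history


query = "QQGPHUIQEGQQGKEEDPP"
db = "PKLMMQQGKQEGM"
# The word hit QQG starts at query position 10 and DB position 5 (0-based).
print(extend_right(query, db, q_start=10, d_start=5))
```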
  • ADVANTAGES
  • Faster than Dynamic Programming
  • Removes low complexity regions
  • Spends less time on uninteresting searches
  • Statistical significance of results can be obtained; these are very good
  • DISADVANTAGES
  • Finds and reports only local alignments
  • Finds too many word hits per sequence, thus reducing speed
  • Does not allow for gaps in sequence

New models to combat disadvantages: BLAST2, PSI-BLAST
13
BLAST2: Combination of 2-Hit and Gapped
2-Hit Method - 3 step method
Steps 1 and 2 are as in BLAST1; Step 3 is where they differ.
BLAST now looks for 2 words in a sequence instead of 1 while aligning. The 2 words are at a distance < A and are not overlapping. Typically A = 40.
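The two-hit condition can be sketched as below. One assumption goes beyond the slide: the two hits are required to lie on the same diagonal (same offset between DB and query position), as in the published two-hit method. Only neighbouring hits on a diagonal are compared, and the hit positions are hypothetical, chosen to exercise each case.

```python
from collections import defaultdict


def two_hit_pairs(hits, w=3, A=40):
    """Return neighbouring, non-overlapping hits on one diagonal that lie within distance A.

    hits: iterable of (query_pos, db_pos) word-hit start positions.
    """
    by_diagonal = defaultdict(list)
    for q_pos, d_pos in hits:
        by_diagonal[d_pos - q_pos].append(q_pos)

    pairs = []
    for diagonal, positions in by_diagonal.items():
        positions.sort()
        for first, second in zip(positions, positions[1:]):
            distance = second - first
            if w <= distance < A:  # non-overlapping and closer than A
                pairs.append(((first, first + diagonal), (second, second + diagonal)))
    return pairs


# Hypothetical hit positions: one valid pair, one overlapping pair, one pair too far apart.
hits = [(10, 30), (25, 45), (26, 46), (5, 90), (80, 165)]
print(two_hit_pairs(hits))
```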
14
Gapped BLAST
  • Gapped alignment is introduced to get an optimal
    alignment
  • Two sequences
  • Seq A: ACGTA
  • Seq B: ACATA
  • Normal (ungapped) alignment is
  • ACGTA
  • ACATA

But if the penalty for a mismatch is larger than the combined
penalty for the two gaps that replace it, then the optimal alignment is
one of the two below.
AC-GTA    ACG-TA
ACA-TA    AC-ATA
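To make the comparison concrete, the sketch below scores each alignment column by column. The penalty values (match +1, mismatch -3, gap -1) are assumptions picked so that one mismatch costs more than two gaps; they are not BLAST defaults.

```python
def score_alignment(top, bottom, match=1, mismatch=-3, gap=-1):
    """Score two equal-length aligned strings column by column ('-' marks a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap
        elif a == b:
            total += match
        else:
            total += mismatch
    return total


ungapped = score_alignment("ACGTA", "ACATA")   # 4 matches, 1 mismatch
gapped = score_alignment("AC-GTA", "ACA-TA")   # 4 matches, 2 gaps
print(ungapped, gapped)
```

With a milder mismatch penalty (for example -1), the ungapped alignment scores higher again, which is why the relative size of the two penalties decides the outcome.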
15
Gapped BLAST - allows gaps to be introduced while aligning
Query: ATTGTCAAAGACTTGAGCTGATGCAT
DB:    GGCAGACATGACTGACAAGGGTATCG
Alignment (the figure labels a mismatch and a gap):
ATTGTCAAAGACTTGAGCTGATGCAT
GGCAGACATGA-CTGACAAGGGTATCG
16
PSI-BLAST - Position-Specific Iterated BLAST.
Used for multiple alignments
Query sequence
BLAST search of DB
Sequences with high scores collected
Multiple alignment profile made
DB searched with profile
New sequences added; process iterated
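The "multiple alignment profile" step can be illustrated with a deliberately simplified sketch: per-column residue frequencies built from equal-length aligned hits, then used to score a candidate segment. This is an assumption-laden simplification. PSI-BLAST itself derives a position-specific scoring matrix with log-odds scores and pseudocounts, which is not reproduced here, and the hit sequences are hypothetical.

```python
from collections import Counter


def build_profile(aligned_sequences):
    """Return per-column residue frequencies for equal-length aligned sequences."""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        profile.append({res: n / len(column) for res, n in counts.items()})
    return profile


def profile_score(profile, segment):
    """Sum the profile frequency of each residue in the segment (0 if unseen)."""
    return sum(col.get(res, 0.0) for col, res in zip(profile, segment))


# Hypothetical high-scoring hits collected from a first search.
collected = ["QQGKEE", "QQGKQE", "QEGKEE"]
profile = build_profile(collected)
print(profile_score(profile, "QQGKEE"))
```

In an iterated search, segments that score well against the profile would be added to the collected set and the profile rebuilt before the next pass.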