Title: Algorithms for Biological Sequence Analysis
1Algorithms for Biological Sequence Analysis
- Kun-Mao Chao
- Department of Computer Science and Information Engineering
- National Taiwan University, Taiwan
- Date: September 19, 2006
- E-mail: kmchao_at_csie.ntu.edu.tw
- WWW: http://www.csie.ntu.edu.tw/kmchao
2About this course
- Course: Algorithms for biological sequence analysis
- We will focus on sequence-related algorithmic problems. Genomic sequences are our main target.
- The oldest language
- The largest program
- The largest program
- Fall semester, 2006
- Tuesday 10:20-13:10, 111 CSIE Building.
- 3 credits
- Web site: http://www.csie.ntu.edu.tw/kmchao/seq06fall
3Coursework
- Homework assignments and class participation (15%)
- Two midterm exams (30% each)
- November 7, 2006 (tentative)
- December 19, 2006 (tentative)
- Oral presentation of selected papers (25%)
4Outlines
- Part I Sequence Homology
- Introduction to genomes
- Dynamic programming strategy revisited
- Pairwise sequence alignment
- Multiple sequence alignment
- Chaining algorithms for genomic sequence analysis
- Suboptimal alignment
- Comparative genomics
- Hidden Markov models (the Viterbi algorithm et al.)
- Part II Sequence Composition
- Maximum-sum and maximum-density segments
- SNP and haplotype data analysis
- Genome annotation
- Other advanced topics
5A Brief History of Genetics
- 1859 Darwin publishes The Origin of Species
- 1865 Genes are particulate factors
- 1871 Discovery of nucleic acid
- 1903 Chromosomes are hereditary units
- 1910 Genes lie on chromosomes
- 1913 Chromosomes are linear arrays of genes
- 1931 Recombination occurs by crossing over
6A Brief History of Genetics (contd)
- 1944 DNA is the genetic material
- 1945 A gene codes for protein
- 1951 First protein sequence
- 1953 DNA is a double helix
- 1961 Genetic code is triplet
- 1977 Eukaryotic genes are interrupted
- 1977 DNA can be sequenced
- 21st century Many genomes completely sequenced
7Milestones of Bioinformatics
- 1962 Pauling's theory of molecular evolution
- 1965 Margaret Dayhoff's Atlas of Protein Sequences
- 1970 Needleman-Wunsch algorithm
- 1977 DNA sequencing and software to analyze it (Staden)
- 1981 Smith-Waterman algorithm developed
- 1981 The concept of a sequence motif (Doolittle)
- 1982 GenBank Release 3 made public
- 1982 Phage lambda genome sequenced
8Milestones of Bioinformatics (contd)
- 1983 Sequence database searching algorithm (Wilbur-Lipman)
- 1985 FASTP/FASTN fast sequence similarity searching
- 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
- 1988 EMBnet network for database distribution
- 1990 BLAST fast sequence similarity searching
- 1991 EST expressed sequence tag sequencing
- 1993 Sanger Centre, Hinxton, UK
- 1994 EMBL European Bioinformatics Institute,
Hinxton, UK
9Milestones of Bioinformatics (contd)
- 1995 First bacterial genomes completely sequenced
- 1996 Yeast genome completely sequenced
- 1997 PSI-BLAST
- 1998 Worm (multicellular) genome completely sequenced
- 1999 Fly genome completely sequenced
10Milestones of Bioinformatics (contd)
- Human Genome Project (1990-2003)
- Mouse 2002
- Rat 2004
- Chimpanzee 2005
- Completed Genomes
11Chimpanzee Genome
12The Primate Family Tree
Source: Nature
13Source: My niece's email
14Source: My niece's email
15Source: My niece's email
16Count every "F" in the following text
- FINISHED FILES ARE THE RESULT OF YEARS OF
SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE
OF YEARS...
Source: My niece's email
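The count asked for on this slide can also be checked mechanically, much as one would count a symbol in a biological sequence. The following is a minimal Python sketch; the names `TEXT` and `count_letter` are illustrative assumptions and not part of the original slides.

```python
# The sentence from the slide, copied as a single string.
TEXT = (
    "FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY "
    "COMBINED WITH THE EXPERIENCE OF YEARS..."
)


def count_letter(text: str, letter: str = "F") -> int:
    """Count occurrences of `letter` in `text`, ignoring case."""
    return text.upper().count(letter.upper())


f_total = count_letter(TEXT)

# How many of those occurrences sit inside the short word "OF".
f_in_of = sum(1 for word in TEXT.split() if word == "OF")

print(f_total, f_in_of)  # 6 3
```

Half of the six occurrences are in the word "OF", which readers tend to skip over when counting by eye.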
17Olny srmat poelpe can raed tihs.
- cdnuolt blveiee taht I cluod aulaclty  uesdnatnrd
waht I was rdanieg. The phaonmneal pweor of the
hmuan mnid, aoccdrnig  to a rscheearch at
Cmabrigde Uinervtisy, it deosn't mttaer in waht
oredr the  ltteers in a wrod are, the olny
iprmoatnt tihng is taht the frist and lsat
 ltteer be in the rghit pclae. The rset can be a
taotl mses and you can sitll  raed it wouthit a
porbelm.
Source: My niece's email
18Discovery is to see what everyone else has seen,
but think what no one else has thought.
Albert Szent-Györgyi (The Nobel Prize in Physiology or Medicine, 1937)
By inventing elegant software tools, we can help
biologists see and think.
Invention ? Discovery
Kun-Mao Chao
19Source: My niece's email