CECS 694-02 Introduction to Bioinformatics University ...


1

Eric C. Rouchka, D.Sc.
Vogt Building, Room 205
(502) 852-0467
eric.rouchka_at_uofl.edu
http://kbrin.a-bldg.louisville.edu/CECS694/

2
Course Overview

Syllabus
Structure of Class
Expectations

3
Contact Information

INSTRUCTOR
Dr. Eric Rouchka
Phone: 852-0467 or 852-3835
Email: eric.rouchka_at_louisville.edu
OFFICE HOURS
Vogt Building
Room 205
T, Th 1:30-3:00 pm
or by appointment
http://kbrin.a-bldg.louisville.edu/CECS694/

4
Required Texts

Bioinformatics: Sequence and Genome Analysis.
David Mount. 2001. ISBN 0-87969-608-7.
Biological Sequence Analysis: Probabilistic
Models of Proteins and Nucleic Acids. R. Durbin,
S. Eddy, A. Krogh and G. Mitchison. 1998. ISBN
0-521-62971-3.
In addition, a number of journal articles will be
handed out in class.

5
Required Texts

Image Source: http://www.amazon.com/
6
Other Bioinformatics Books
Image Source: http://www.amazon.com/
7
Other Reference Books
Image Source: http://www.amazon.com/
8
Tentative Schedule of Topics

Overview of molecular biology
Pairwise sequence alignment
Multiple sequence alignment
Sequence Databases
Database searching
Construction of phylogenetic trees
RNA secondary structure prediction
Microarray image analysis
Sequence assembly techniques
Gene Prediction
Protein Folding Prediction

9
Course Assignments

Coursework consists of 4-5 written homework assignments,
3-4 programming assignments, a midterm test, a final
project, and bioinformatics seminars.
Homework assignments must be turned in at the
beginning of class on the date they are due.
Late homework assignments will not be accepted,
since the solutions will be posted to the course
website.
Programming assignments are also due at the beginning
of class on their due date. The programs
may be written in the language of your choice.
Late programming assignments will be accepted,
with a 10% per day deduction for a maximum of two
days.
Reading assignments from the two selected texts
and journal articles will be assigned.

10
Grading

Programming Projects (3-4): 25% of final grade
Homework (4-5): 15% of final grade
Midterm Test: 25% of final grade
Final Project: 25% of final grade
One-page seminar reports (3): 10% of final grade
Final grades will be given using a plus/minus
scale. The cutoffs for grades will be roughly as
follows:
90-100 A
80-89 B
70-79 C
60-69 D
0-59 F
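
As a rough illustration of how the weights and cutoffs on this slide combine, the following Python sketch computes a weighted final score and maps it to a letter. The component names, the example scores, and the 0-100 scale for each component are assumptions made for the example; the slide gives only approximate cutoffs and no plus/minus boundaries, so none are modeled here.

```python
# Weights from the grading slide, in percent of the final grade.
WEIGHTS = {
    "programming": 25,
    "homework": 15,
    "midterm": 25,
    "final_project": 25,
    "seminar_reports": 10,
}

# Rough letter cutoffs from the slide (plus/minus boundaries are not given).
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]


def final_score(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter using the rough cutoffs."""
    for minimum, letter in CUTOFFS:
        if score >= minimum:
            return letter
    return "F"


# Hypothetical student scores, for illustration only.
example = {
    "programming": 92,
    "homework": 85,
    "midterm": 78,
    "final_project": 88,
    "seminar_reports": 100,
}

score = final_score(example)
print(score, letter_grade(score))  # 87.25 B
```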

11
Class Structure

Introduction of a Topic
Description of algorithms
Available tools
Make sure to ask questions!

12
What is Bioinformatics/ Computational Biology?

Bioinformatics: collection and storage of
biological information
Computational biology: development of algorithms
and statistical models to analyze biological data
In this course, the terms bioinformatics and
computational biology will be used interchangeably

13
What is Bioinformatics?
Source: http://ccb.wustl.edu/
14
Why should I care?

SmartMoney ranks bioinformatics #1 among the next
hot jobs
Business Week 50 Masters of Innovation
Jobs available, exciting research potential
Important information waiting to be decoded!

http://smartmoney.com/consumer/index.cfm?story=working-june02
15
Why is bioinformatics hot?

Supply/demand: few people adequately trained in
both biology and computer science
Genome sequencing, microarrays, etc. lead to large
amounts of data to be analyzed
Leads to important discoveries
Saves time and money

16
What skills are needed?

Well-grounded in one of the following areas:
Computer science
Molecular biology
Statistics
Working knowledge and appreciation of the others!

17
Where Can I Learn More?

ISCB: http://www.iscb.org/
NCBI: http://ncbi.nlm.nih.gov/
http://www.bioinformatics.org/
Journals
Conferences (ISMB, RECOMB, PSB)

18
Overview of Molecular Biology

Cells
Chromosomes
DNA
RNA
Amino Acids
Proteins
Genome/Transcriptome/Proteome

19
Cells

Complex system enclosed in a membrane
Organisms are unicellular (bacteria, baker's
yeast) or multicellular
Humans:
60 trillion cells
320 cell types

Example Animal Cell: www.ebi.ac.uk/microarray/biology_intro.htm
20
Organisms

Classified into two types:
Eukaryotes: contain a membrane-bound nucleus and
organelles (plants, animals, fungi)
Prokaryotes: lack a true membrane-bound nucleus
and organelles (single-celled, includes bacteria)
Not all single-celled organisms are prokaryotes!

21
Chromosomes

In eukaryotes, nucleus contains one or several
double stranded DNA molecules organized as
chromosomes
Humans:
22 pairs of autosomes
1 pair of sex chromosomes

Human Karyotype: http://avery.rutgers.edu/WSSP/StudentScholars/Session8/Session8.html
22
Image source: www.biotec.or.th/Genome/whatGenome.html
23
What is DNA?

DNA: Deoxyribonucleic Acid
Single-stranded molecule (oligomer,
polynucleotide): chain of nucleotides
4 different nucleotides, named for their bases:
Adenine (A)
Cytosine (C)
Guanine (G)
Thymine (T)

24
Nucleotide Bases

Purines (A and G)
Pyrimidines (C and T)
Difference is in base structure

Image Source: www.ebi.ac.uk/microarray/biology_intro.htm
25
DNA

Can be thought of as an alphabet with 4
characters
A 4-letter alphabet with sufficiently long words
can hold the information needed to create complex organisms
Not unlike a computer, which also uses a small alphabet (0 and 1)
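
To make the alphabet analogy concrete, here is a minimal Python sketch. It counts how many distinct DNA "words" of a given length exist (4 choices per position) and spells a DNA string in a binary alphabet using two bits per base. The particular bit assignment in `BASE_TO_BITS` is an arbitrary choice made for this illustration, not a standard from the slides.

```python
# Two bits are enough to distinguish four bases.
# This particular assignment is arbitrary and chosen only for illustration.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def count_words(length):
    """Number of distinct DNA words of the given length: 4 ** length."""
    return 4 ** length


def to_binary(dna):
    """Spell a DNA string in a two-letter (0/1) alphabet, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in dna.upper())


print(count_words(3))     # 64
print(count_words(10))    # 1048576
print(to_binary("GATC"))  # 10001101
```

The number of possible words grows exponentially with length, which is why a four-letter alphabet is enough to encode very large amounts of information.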

26
DNA polynucleotides (oligomers)

Different nucleotides are strung together to form
polynucleotides
Ends of the polynucleotide are different
A directionality is present
Convention is to label the coding strand from 5'
to 3'

http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookDNAMOLGEN.html
27
Single Strand Polynucleotide

Example polynucleotide
5' G-T-A-A-A-G-T-C-C-C-G-T-T-A-G-C 3'

28
Double Stranded DNA

DNA can be single-stranded or double-stranded
Double-stranded DNA: second strand is the
reverse complement strand
Reverse complement runs in the opposite direction and
its bases are complementary
Complementary bases:
A, T
C, G

29
Double Stranded Sequence

Example double stranded polynucleotide
5' G-T-A-A-A-G-T-C-C-C-G-T-T-A-G-C 3'
3' C-A-T-T-T-C-A-G-G-G-C-A-A-T-C-G 5'

http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookDNAMOLGEN.html
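
The pairing rules can be expressed in a few lines of Python. This is a minimal sketch that assumes the input contains only the upper-case letters A, C, G and T; the function names are chosen for this example. It reproduces the second strand of the example duplex, first read 3' to 5' (the complement) and then in the conventional 5' to 3' direction (the reverse complement).

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base complement, in the same left-to-right order as the input."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Complementary strand written 5' to 3' (complement, then reverse)."""
    return complement(strand)[::-1]


top = "GTAAAGTCCCGTTAGC"  # 5' to 3', from the slide

print(complement(top))          # CATTTCAGGGCAATCG  (second strand, 3' to 5')
print(reverse_complement(top))  # GCTAACGGGACTTTAC  (second strand, 5' to 3')
```

Taking the reverse complement twice returns the original strand, which is a quick sanity check for any implementation.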
30
Double Stranded DNA
Source: unknown
31
Double Helix

Two complementary DNA strands form a stable DNA
double helix
Spring 2003 marked the 50th anniversary of its
discovery

Image source: www.ebi.ac.uk/microarray/biology_intro.htm
32
RNA

Ribonucleic Acid
Similar to DNA
Thymine (T) is replaced by uracil (U)
RNA can be
Single stranded
Double stranded
Hybridized with DNA

33
RNA

RNA is generally single stranded
Forms secondary or tertiary structures
RNA folding will be discussed later
Important in a variety of ways, including protein
synthesis

34
RNA secondary structure

E. coli RNase P RNA secondary structure

Image source: www.mbio.ncsu.edu/JWB/MB409/lecture/lecture05/lecture05.htm
35
mRNA

Messenger RNA
Linear molecule encoding genetic information
copied from DNA molecules
Transcription: process in which DNA is copied
into an RNA molecule
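
At the sequence level, transcription can be sketched in Python as follows. The example assumes upper-case DNA letters only, and the function names are illustrative. Given the coding strand, the mRNA has the same sequence with U in place of T; given the template strand (written 5' to 3'), the mRNA is its reverse complement in the RNA alphabet.

```python
# DNA template base -> RNA base that pairs with it.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_coding(coding_strand):
    """mRNA for a coding strand written 5' to 3': same sequence, T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand):
    """mRNA for a template strand written 5' to 3': reverse complement in RNA letters."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(template_strand.upper()))


coding = "GTAAAGTCCCGTTAGC"    # 5' to 3'
template = "GCTAACGGGACTTTAC"  # reverse complement of the coding strand, 5' to 3'

print(transcribe_coding(coding))      # GUAAAGUCCCGUUAGC
print(transcribe_template(template))  # GUAAAGUCCCGUUAGC
```

Both routes give the same mRNA, since the two DNA strands carry the same information.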

36
mRNA processing

Eukaryotic genes can be pieced together
Exons: coding regions
Introns: non-coding regions
mRNA processing removes introns, splices exons
together
Processed mRNA can be translated into a protein
sequence
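
A minimal Python sketch of the splicing step: given a pre-mRNA string and the positions of its exons, the processed mRNA is the exons joined in order. The toy sequence and the exon coordinates below are made up for illustration (exons in upper case, the intron in lower case), and the coordinates are assumed to be known in advance; locating exon boundaries in real sequence is a separate problem.

```python
def splice(pre_mrna, exons):
    """Join exon slices in order.

    exons is a list of (start, end) pairs using 0-based, half-open indices.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: exon 1, a 12-base intron, exon 2.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "GAUUAA"
exons = [(0, 6), (18, 24)]  # hypothetical exon coordinates

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCGAUUAA
```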

37
mRNA Processing
Image source: http://departments.oxy.edu/biology/Stillman/bi221/111300/processing_of_hnrnas.htm
38
ESTs

Expressed Sequence Tags
Basically, the sequence of a processed mRNA

39
tRNA

Transfer RNA
Well-defined three-dimensional structure
Critical for creation of proteins

40
tRNA structure
Source: http://www.tulane.edu/biochem/nolan/lectures/rna/frames/trnabtx2.htm
41
tRNA

Amino acid attached to each tRNA
Determined by 3-base anticodon sequence
(complementary to mRNA)
Translation: process in which the nucleotide
sequence of the processed mRNA is used
to join amino acids together into a protein with
the help of ribosomes and tRNA
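
The codon/anticodon relationship can be sketched in Python. This example assumes strict complementary pairing at all three positions, which is a simplification, and the function name is illustrative. Because the two RNA strands pair in opposite directions, the anticodon written 5' to 3' is the reverse complement of the codon.

```python
# RNA base pairing: A with U, C with G.
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def anticodon(codon):
    """Anticodon (5' to 3') that pairs with an mRNA codon (5' to 3').

    Assumes exact complementary pairing at every position.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


print(anticodon("AUG"))  # CAU
```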

42
Genetic Code

4 possible bases (A, C, G, U)
3 bases in the codon
4 × 4 × 4 = 64 possible codon sequences
Start codon: AUG
Stop codons: UAA, UAG, UGA
61 codons code for amino acids (including AUG)
20 amino acids: redundancy in the genetic code
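
The counts on this slide can be checked with a short Python sketch. It enumerates all codons, builds the standard genetic code table from a compact 64-letter string (one-letter amino acid codes in U, C, A, G order, with `*` marking a stop), and translates a small mRNA. The `translate` function is a simplified illustration: it assumes upper-case RNA letters, starts at the first AUG it finds, and stops at the first in-frame stop codon. The example mRNA is made up.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All 4 * 4 * 4 = 64 codons, in the same order as AMINO_ACIDS.
CODONS = ["".join(triple) for triple in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
CODING_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODONS))                   # 64
print(STOP_CODONS)                   # ['UAA', 'UAG', 'UGA']
print(len(CODING_CODONS))            # 61
print(len(set(AMINO_ACIDS) - {"*"}))  # 20
print(translate("AUGGCCGAUUAA"))     # MAD
```

With 61 coding codons and only 20 amino acids, most amino acids are specified by more than one codon, which is the redundancy the slide refers to.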

43
20 Amino Acids

Glycine (G, GLY)
Alanine (A, ALA)
Valine (V, VAL)
Leucine (L, LEU)
Isoleucine (I, ILE)
Phenylalanine (F, PHE)
Proline (P, PRO)
Serine (S, SER)
Threonine (T, THR)
Cysteine (C, CYS)
Methionine (M, MET)
Tryptophan (W, TRP)
Tyrosine (Y, TYR)
Asparagine (N, ASN)
Glutamine (Q, GLN)
Aspartic acid (D, ASP)
Glutamic Acid (E, GLU)
Lysine (K, LYS)
Arginine (R, ARG)
Histidine (H, HIS)

44
Amino Acids

Building blocks for proteins (20 different)
Vary by side chain groups
Hydrophilic amino acids are water soluble
Hydrophobic amino acids are not
Linked via a single chemical bond (peptide bond)
Peptide: short linear chain of amino acids (< 30)
Polypeptide: long chain of amino acids (which
can be upwards of 4000 residues long)

45
Proteins

Polypeptides having a three dimensional
structure.
Primary: sequence of amino acids constituting the
polypeptide chain
Secondary: local organization into secondary
structures such as α helices and β sheets
Tertiary: three-dimensional arrangement of the
amino acids as they react to one another due to
the polarity and resulting interactions between
their side chains
Quaternary: number and relative positions of the
protein subunits

46
Protein Structure
Image source: www.ebi.ac.uk/microarray/biology_intro.html
47
Central Dogma

DNA
↓
RNA
↓
PROTEIN

Image source: unknown
48
Central Dogma
49
What is a Gene?

the physical and functional unit of heredity that
carries information from one generation to the
next
DNA sequence necessary for the synthesis of a
functional protein or RNA molecule

50
Genome

chromosomal DNA of an organism
number of chromosomes and genome size vary
quite significantly from one organism to another
Genome size and number of genes do not
necessarily determine organism complexity

51
Genome Comparison
52
Transcriptome

complete collection of all possible mRNAs
(including splice variants) of an organism.
regions of an organism's genome that get
transcribed into messenger RNA.
transcriptome can be extended to include all
transcribed elements, including non-coding RNAs
used for structural and regulatory purposes.

53
Proteome

the complete collection of proteins that can be
produced by an organism.
can be studied as either a static (sum of all
proteins possible) or a dynamic (all proteins found
at a specific time point) entity

54
Brief History of Sequencing