Title: BCB 444/544
1BCB 444/544
Lecture 4: Biology Background and Biological Databases
#4 - Sep 3
- Thanks to Drena Dobbs (ISU) for many borrowed and modified PPTs
2Required Reading (before lecture)
- Thurs Sep 4 - Lab 2
- Databases, ISU Resources, Pairwise Sequence Alignment
- Fri Sep 5 - for Lecture 5
- Pairwise Sequence Alignment
- Chp 3 - pp 31-41 Xiong Textbook
3Last Friday in a nutshell
- Main molecules we will deal with are DNA, RNA, and proteins
- DNA carries information, made up of 4 nucleotides
- RNA involved in all aspects of gene expression, also made up of 4 nucleotides
- Proteins perform most functions and form most structures, made up of 20 amino acids
- Genes have a complex structure
4RNA and DNA Structures
5Protein Structure
6Sequence-Structure-Function
- Amino acid sequence determines protein structure
- Protein structure determines function
7The Central Dogma
Gene expression: the whole process of going from
DNA to RNA to protein
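The DNA to RNA to protein flow can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are made up for this sketch, and the codon table holds just a handful of the 64 real codons.

```python
# Minimal sketch of the central dogma: DNA -> RNA -> protein.
# CODON_TABLE is deliberately incomplete (a real table has 64 entries).
CODON_TABLE = {
    "AUG": "M",                          # start codon (methionine)
    "UUU": "F", "GGC": "G", "GCU": "A", "AAA": "K",
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}

def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" = codon not in this toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGTTTGGCGCTAAATAA"      # hypothetical example gene
mrna = transcribe(dna)          # "AUGUUUGGCGCUAAAUAA"
protein = translate(mrna)       # "MFGAK"
```

The sketch ignores everything that makes real gene expression complicated (template vs. coding strand, splicing, regulation); it only shows the direction of information flow.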
8Gene structure
- Genes are fragmented, containing
non-protein-coding introns between the functional
exons
9Gene splicing
Introns are removed before mRNA leaves the nucleus
DNA -> transcribed RNA -> introns removed by splicing -> mRNA
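As a rough sketch, splicing can be modeled as keeping the exon segments of the transcribed RNA and discarding everything between them. The sequence and the exon coordinates below are invented for illustration; in a real genome they would come from an annotation.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments given as 0-based, half-open (start, end) pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical transcript: exon 1 + intron + exon 2.
# The intron here starts with GU and ends with AG, as most real introns do.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AAAUAA"
exons = [(0, 6), (18, 24)]

mature_mrna = splice(pre_mrna, exons)   # "AUGGCCAAAUAA"
```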
10Regulation of gene expression
- Genes are regulated transcriptionally by proteins that interact with DNA elements around the gene
- DNA level
- Promoters
- Enhancers and repressors
- Chromatin level (X-inactivation)
- Genes are also regulated
- Post-transcriptionally
- Post-translationally
11Transcription factor binding sites
- Promoters, enhancers, and repressors are all
binding sites for transcription factors (proteins
that bind DNA and affect transcription)
12Mutation
- DNA mutation is any change in DNA sequence
13Role of mutation in evolution
Mutation is not all bad!
- Occurs at a high rate in all our cells and does
NOT always have negative effects
14Beneficial mutations
Changes in DNA sequence can be beneficial
15Eukaryotic Cell
Lots of compartments called organelles
16Prokaryotic Cell
No separate compartments
17Prokaryotes vs. Eukaryotes
- All of our gene examples so far have been showing eukaryotic genes
- Prokaryotic genes are not as complex
18Prokaryotes vs. Eukaryotes
Eukaryotic Gene
Prokaryotic Gene
Important difference: no introns!
19Protein localization
20Types of RNA
- http://en.wikipedia.org/wiki/List_of_RNAs
- I counted 25 different types of RNAs listed on this page
21Web resource
- Online textbooks: NCBI Bookshelf
22 2- Biological Databases
- Xiong Chp 2
- 2 Introduction to Biological Databases
- What Is a Database?
- Types of Databases
- Biological Databases
- Pitfalls of Biological Databases
- Information Retrieval from Biological Databases
- Summary
- Further Reading
23Types of Databases
- 3 Major types of electronic databases
- 1- Flat files - simple text files
- no organization to facilitate retrieval
- 2- Relational - data organized as tables ("relations")
- shared features among tables allow rapid search
- 3- Object-oriented - data organized as "objects"
- objects associated hierarchically
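To make the contrast between the first two types concrete, here is a small sketch using Python's built-in sqlite3 module. The records, table names, and column names are all invented for illustration. The flat file has to be scanned line by line, while the relational version links two tables through a shared `org_id` column and lets the database engine do the search.

```python
import sqlite3

# A tiny FASTA-style flat file (made-up records).
FLAT_FILE = """\
>seq1 Homo sapiens
ATGGCC
>seq2 Escherichia coli
ATGAAA
"""

def parse_flat_file(text):
    """Parse '>id organism' header lines followed by sequence lines."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            seq_id, _, organism = line[1:].partition(" ")
            records.append({"id": seq_id, "organism": organism, "seq": ""})
        elif line.strip():
            records[-1]["seq"] += line.strip()
    return records

# Flat file: retrieval means reading every record.
records = parse_flat_file(FLAT_FILE)
flat_hits = [r["id"] for r in records if r["organism"] == "Escherichia coli"]

# Relational: two tables that share the org_id column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE organism (org_id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE seq_record ("
    "seq_id TEXT PRIMARY KEY, "
    "org_id INTEGER REFERENCES organism(org_id), "
    "residues TEXT)"
)
for r in records:
    conn.execute("INSERT OR IGNORE INTO organism (name) VALUES (?)", (r["organism"],))
    org_id = conn.execute(
        "SELECT org_id FROM organism WHERE name = ?", (r["organism"],)
    ).fetchone()[0]
    conn.execute("INSERT INTO seq_record VALUES (?, ?, ?)", (r["id"], org_id, r["seq"]))

sql_hits = [
    row[0]
    for row in conn.execute(
        "SELECT s.seq_id FROM seq_record s "
        "JOIN organism o ON s.org_id = o.org_id WHERE o.name = ?",
        ("Escherichia coli",),
    )
]
```

With two records the difference is invisible, but with millions of entries the relational layout lets the engine use indexes instead of a full scan.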
24Biological Databases
Currently - all 3 types, but MANY flat files
What are goals of biological databases?
1- Information retrieval
2- Knowledge discovery
Important issue: Interconnectivity
25Types of Biological Databases
- 1- Primary
- "simple" archives of sequences, structures, images, etc.
- raw data, minimal annotations, not always well curated!
- 2- Secondary
- enhanced with more complete annotation of sequences, structures, images, etc.
- usually curated!
- 3- Specialized
- focused on a particular research interest or organism
- usually - not always - highly curated
26Examples of Biological Databases
- 1- Primary
- DNA sequences
- GenBank - US
- European Molecular Biology Lab - EMBL
- DNA Data Bank of Japan - DDBJ
- Structures (Protein, DNA, RNA)
- PDB - Protein Data Bank
- NDB - Nucleic Acid Database
27Examples of Biological Databases
- 2- Secondary
- Protein sequences
- Swiss-Prot, TrEMBL, PIR
- these recently combined into UniProt
- 3- Specialized
- Species-specific (or "taxonomic" specific)
- FlyBase, WormBase, AceDB, PlantDB
- Molecule-specific, disease-specific
28 Pitfalls of Biological Databases
- Errors!
- Lack of documentation about quality or reliability of data
- Limited mechanisms for "data checking" or preventing propagation of errors (esp. annotation errors!!)
- Redundancy
- Inconsistency
- Incompatibility (format, terminology, data types, etc.)
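Redundancy is one pitfall that is easy to check for yourself. The sketch below groups entries by sequence content so that the same sequence deposited under different accession numbers shows up as one group. The accession numbers and sequences are invented, and exact matching is an assumption of this sketch: it will miss near-duplicates that differ by a base or two.

```python
from collections import defaultdict

def find_redundant(entries):
    """Return groups of accessions that share an identical sequence.

    `entries` is an iterable of (accession, sequence) pairs.
    Comparison is case-insensitive but otherwise exact.
    """
    by_sequence = defaultdict(list)
    for accession, sequence in entries:
        by_sequence[sequence.upper()].append(accession)
    return [accs for accs in by_sequence.values() if len(accs) > 1]

# Hypothetical entries: A1 and B7 are the same sequence submitted twice.
entries = [("A1", "ATGGCC"), ("B7", "atggcc"), ("C3", "ATGAAA")]
duplicates = find_redundant(entries)   # [["A1", "B7"]]
```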
29Information Retrieval from Biological Databases
- 2 most popular retrieval systems
- ENTREZ - NCBI
- will use a LOT
- Introduced in Lab 2
- SRS - Sequence Retrieval System - EBI
- will use less, similar to ENTREZ
- Both
- Provide access to multiple databases
- Allow complex queries
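Entrez queries can also be built programmatically. The sketch below assumes NCBI's E-utilities web interface (the `esearch.fcgi` endpoint, which is not covered in these slides) and only constructs the query URL; it does not contact NCBI. The helper name and the example query are made up for illustration. The query combines two field-restricted terms with AND, which is the kind of complex query both retrieval systems allow.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL of the NCBI E-utilities search endpoint (assumed here, not from the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an Entrez search URL; urlencode escapes spaces and brackets."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)

# Hypothetical query: two field-restricted terms joined with a boolean AND.
query = "Escherichia coli[Organism] AND ncRNA[Title]"
url = build_esearch_url("nucleotide", query, retmax=5)

# Decode the URL again to confirm the query survived the escaping.
decoded = parse_qs(urlsplit(url).query)
```

Pasting the resulting URL into a browser should return an XML list of matching record IDs, the same search you would otherwise type into the Entrez web form.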
30WebCT Question
- Do you know any program or database to find or predict the small non-coding RNAs in a completely sequenced genome? Specifically, in bacterial genomes?
- Noncoding RNA database
- http://biobases.ibch.poznan.pl/ncRNA/
31SUMMARY 2- Biological Databases
BEWARE!