Genes - PowerPoint PPT Presentation

Slides: 68
Provided by: prpe
1
Genes
  • Introduction to Bioinformatics
  • BM131/BM511
  • Gary J. Schoenhals
  • BMB
  • University of Southern Denmark

2
Definition of a gene
A segment of DNA found on a chromosome that codes for a particular protein; a unit of heredity.
Genes were formerly called factors.
Source: HyperDictionary (http://www.hyperdictionary.com)
3
Structure of DNA
4
Sugar backbone - Deoxyribose
5
Purines
Base pairing
6
Pyrimidines
Base pairing
in RNA
in DNA
7
Base-pairing overview
8
Base-pairing of nucleotides
G-C pair: 3 hydrogen bonds
A-T pair: 2 hydrogen bonds
Color key: oxygen (red), nitrogen (blue), carbon (gray)
9
DNA vs. RNA structure
10
Sugar backbone - Ribose
DNA
RNA
11
Uracil (RNA) vs. Thymine (DNA)
In RNA: uracil
In DNA: thymine
12
[Genetic code table. Axes: first position, second position, third position (each U, C, A, G). Three codons are marked STOP.]
13
Genetic code - observations
  • Redundancy: most amino acids are encoded by more
    than one codon
  • Mutations in the third position are often
    silent (result in the same amino acid)
  • Mutations in the first or second position, though
    not usually silent, often result in substitution
    of an amino acid with similar characteristics
  • The genetic code is robust and resistant to
    lethal changes
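
The observations on this slide can be checked against the standard genetic code. The sketch below is an illustration, not part of the original slides. It builds the standard codon table (written with DNA letters, so T stands in for U) and measures how often a single-base change at each codon position leaves the amino acid unchanged. The names `CODON_TABLE` and `silent_fraction` are made up for this example, and stop codons are skipped as a simplifying assumption.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering
# (first position varies slowest, third position fastest); "*" marks STOP.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Redundancy: how many codons encode each amino acid (or STOP).
codons_per_amino_acid = Counter(CODON_TABLE.values())


def silent_fraction(position):
    """Fraction of single-base changes at `position` (0, 1 or 2) that keep
    the same amino acid, taken over all non-stop codons."""
    silent = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # assumption: only sense codons are considered
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            silent += CODON_TABLE[mutant] == aa
    return silent / total


for pos, name in enumerate(["first", "second", "third"]):
    print(f"{name} position: {silent_fraction(pos):.2f} of changes are silent")
```

Under these assumptions, the third position gives by far the largest silent fraction, which matches the second bullet above.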

14
Mutations
15
tRNA
Amino acid binding
mRNA/ribosome interaction
16
Prokaryotic
Eukaryotic
17
Gene expression overview
18
Prokaryote vs. Eukaryote
Note: Localization of specific molecules can be
extremely important, especially in predictive
bioinformatics (e.g. the GO database)
19
Eukaryotes
20
Eukaryotic Gene Structure
21
Gene Regulation in Eukaryotes
  • Altering rate of transcription
  • Altering RNA processing while still in nucleus
  • Altering stability of mRNA molecules (degradation
    rate), e.g. RNA interference
  • Altering translation of mRNA by ribosomes
  • Riboswitches: some metabolites bind mRNA,
    affecting translation of the transcript or causing
    enhanced transcription termination

22
RNA Processing
  • Transcription forms pre-mRNA
  • Capping with modified guanine at 5' end:
    protects against degradation
  • Intron removal
  • RNA editing in some organisms
  • Synthesis of poly-A tail (stretch of adenine
    residues)
  • Export to cytoplasm

23
Introns/Exons Overview
24
RNA Processing
25
Alternative splicing of RNA
Different proteins from the same gene!
26
Characteristics
  • Chicken collagen: 52 exons
  • Human dystrophin: 79 exons
  • Average exon is 140 nucleotides long, but introns
    can be quite large (e.g. 480 kbp!)
  • Splicing done with spliceosome
  • snRNA (small nuclear RNA) molecules plus approx.
    145 proteins
  • approx. 12 different snRNA
  • Disorders: retinitis pigmentosa, spinal muscular
    atrophy
  • http://www.neuro.wustl.edu/neuromuscular/pathol/spliceosome.htm

27
Spliceosome
U1 binds to 5' splice site
U2 binds to branchpoint recognition sequence
U4-U5-U6 complex binds to 5' splice site
U5 binds exon sequences; intron is removed
U1 is displaced; U6 binds to U2; 5' splice site near branchpoint
http://www.neuro.wustl.edu/neuromuscular/pathol/spliceosome.htm
28
Spliceosome contd.
  • Exonic splicing enhancers (ESEs) and exonic
    splicing silencers (ESSs) are contained in exons
    - these help regulate how splicing is done (e.g.
    specificity)
  • Intronic splicing enhancers (ISEs) and intronic
    splicing silencers (ISSs) are contained in
    introns and function in a similar fashion to
    their exonic counterparts

29
Enhancers
30
Transcription factors modulate gene expression
Silencers are control regions of DNA that may
be far away from the gene; when transcription
factors bind to them, gene expression is repressed.
31
Insulators
T-cell receptor for antigen gamma/delta encoding
region
Stretch of DNA that separates genes from one
another, shielding them from the effects of
activation or repression of neighboring genes
32
Prokaryotes
33
Prokaryotic Genes are Arranged in Operons
  • Genes arranged in operons
  • Polycistronic mRNA
  • One promoter but separate ribosome binding sites
  • Used in predictive bioinformatics

34
Lac Operon Control
Figure panels: cAMP low, cAMP high, cAMP high, cAMP low
CAP: catabolite activator protein (binds cAMP)
cAMP: cyclic adenosine monophosphate
lac repressor: inactivated when bound to lactose
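
The control logic on this slide can be written as a small truth table. This is a minimal sketch and a textbook simplification, not something taken from the slides. The function name `lac_operon_expression` and the three output levels ("off", "low", "high") are assumptions made for illustration.

```python
def lac_operon_expression(lactose_present: bool, camp_high: bool) -> str:
    """Qualitative lac operon output for one combination of signals."""
    # The lac repressor is inactivated when bound to lactose; without
    # lactose it stays on the DNA and blocks transcription.
    repressor_bound = not lactose_present
    # CAP activates transcription only as a CAP-cAMP complex.
    cap_active = camp_high

    if repressor_bound:
        return "off"
    return "high" if cap_active else "low"


# Print all four combinations, mirroring the four figure panels.
for lactose in (False, True):
    for camp in (False, True):
        level = lac_operon_expression(lactose, camp)
        print(f"lactose={lactose!s:5} cAMP high={camp!s:5} -> {level}")
```

Only the combination of lactose present and cAMP high gives strong expression in this model.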
35
CAP is a dimer that binds DNA and is the size of
one turn of the helix
Structural bioinformatics
36
Inverted repeats important for CAP binding
37
Transcription Factor Domains
  • Only a few major types
  • Bind DNA
  • 3D conformation of binding domain recognizes DNA
    structure
  • Interact with RNA polymerase
  • Modulate transcription of genes: understanding
    how to control a cell's activities is an
    important part of drug discovery and development

38
DNA Recognition Domains
  • Helix-turn-helix motif
  • Zinc finger motif
  • Leucine zipper motif

39
Restriction Enzymes
  • Restriction enzymes recognize specific DNA
    sequences
  • Bind to DNA
  • Introduce a cut; can be used for cloning
  • E.g. BamHI (GGATCC) and HincII (GTCGAC)

BamHI (a space marks the cut):
5'-G GATCC-3'
3'-CCTAG G-5'
overhang generated: sticky end

HincII (a space marks the cut):
5'-GTC GAC-3'
3'-CAG CTG-5'
no overhang: blunt end
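
The two cuts shown above can be reproduced with a few lines of string handling. The sketch below is illustrative only. It uses the recognition sites and cut positions given on the slide, works on the top strand alone, and the test sequence and the names `ENZYMES`, `digest` and `end_type` are made up for this example.

```python
# Recognition site and cut offset on the top strand (index of the first base
# after the cut), as drawn on the slide: G^GATCC and GTC^GAC.
ENZYMES = {"BamHI": ("GGATCC", 1), "HincII": ("GTCGAC", 3)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def end_type(enzyme):
    """Classify the ends produced by a palindromic site."""
    site, cut = ENZYMES[enzyme]
    # For a palindromic site the bottom strand is cut at the mirror position,
    # so the ends are blunt only when the cut falls exactly in the middle.
    return "blunt" if 2 * cut == len(site) else "sticky"


def digest(seq, enzyme):
    """Cut the top strand at every occurrence of the enzyme's site."""
    site, cut = ENZYMES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut])
        start = pos + cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


example = "AAGGATCCTTGTCGACAA"  # hypothetical sequence with one site of each
for name in ENZYMES:
    print(name, end_type(name), digest(example, name))
```

Both sites read the same on either strand, which `reverse_complement(site) == site` confirms.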
40
DNA Methylases
  • Enzymes that modify DNA via methylation of bases
  • Protects DNA from nucleases
  • If methylation occurs at a restriction enzyme
    site, cutting could be inhibited
  • E.g. TaqI methylase methylates TCGA

CH3 (methyl group on each strand)
5'-GTCGAC-3'
3'-CAGCTG-5'
Enzyme HincII inhibited (GTCGAC)
41
Bioinformatics Strategies
  • Sequence alignment of DNA or proteins
  • Used to find homologs
  • Orthologs vs. paralogs
  • Similarity/identity can imply conserved function
  • Depending on the context, it may be better to use
    protein sequence rather than DNA
  • Codon usage
  • Gene prediction
  • Motif searches
  • Consensus sequences
  • Secondary structure, e.g. hairpin loops
  • Presence of protein domains imparting
    functionality
  • Phylogenetic analysis

42
Alignments - Protein vs. DNA
Consider the following two DNA sequences:
ATG CTT CCC TTG CAT TTT AAA   Seq 1
ATG CTG CCG CTC CAC TTC AAG   Seq 2
Translation yields the following protein sequences:
Met Leu Pro Leu His Phe Lys   Seq 1 translated
Met Leu Pro Leu His Phe Lys   Seq 2 translated
Both DNAs encode identical protein sequences, but
Seq 1 shares only 14/21 bases with Seq 2: 66.7%
identity
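
The numbers on this slide can be reproduced with a short script. This is a minimal sketch that assumes the standard genetic code and an ungapped, position-by-position comparison. The function names are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; "*" marks STOP.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

seq1 = "ATGCTTCCCTTGCATTTTAAA"
seq2 = "ATGCTGCCGCTCCACTTCAAG"


def count_matches(a, b):
    """Number of identical positions in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def translate(dna):
    """Translate a DNA string codon by codon into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


matches = count_matches(seq1, seq2)
dna_identity = 100 * matches / len(seq1)
protein_identity = 100 * count_matches(translate(seq1), translate(seq2)) / 7

print(f"DNA: {matches}/{len(seq1)} = {dna_identity:.1f}% identity")
print(f"Protein: {translate(seq1)} vs {translate(seq2)}, "
      f"{protein_identity:.0f}% identity")
```

In one-letter code both translations read MLPLHFK, the same residues as Met Leu Pro Leu His Phe Lys on the slide.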
43
Codon Usage
  • Use of certain codons to encode amino acids is
    non-random
  • Highly expressed genes use a restricted set of
    codons for optimal translational efficiency
  • Can be used to predict highly expressed genes
  • Atypical codon usage can suggest horizontal gene
    transfer
  • CodonW software can calculate Codon Adaptation
    Index (CAI), Codon Bias Index (CBI), etc.
  • Some tools here:
    http://bioweb.pasteur.fr/seqanal/dna/intro-uk.html
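
To make "non-random codon use" concrete, the sketch below counts, for each amino acid in a coding sequence, what share of its occurrences is encoded by each synonymous codon. It is a simple illustration only, not the CAI or CBI calculation performed by CodonW. The example sequence and the name `codon_fractions` are made up.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; "*" marks STOP.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_fractions(cds):
    """Map each codon seen in `cds` to its share among the codons that
    encode the same amino acid in that sequence."""
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {
        codon: n / per_amino_acid[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


toy_cds = "ATGCTGCTGCTTAAA"  # hypothetical: Met Leu Leu Leu Lys
fractions = codon_fractions(toy_cds)
for codon in sorted(fractions):
    print(codon, CODON_TABLE[codon], round(fractions[codon], 2))
```

In the toy sequence, leucine is encoded by CTG twice and CTT once, so CTG gets two thirds of the leucine usage.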

44
Gene Prediction
  • Prediction of open reading frames (ORFs) which
    represent the possibly expressed genes
  • Can then obtain a list of theoretical proteins
    encoded by the genome via translation
  • Some examples of tools for gene prediction
    include GlimmerHMM (eukaryotic genes) and Glimmer
    (prokaryotic genes)
  • See The Institute for Genomic Research (TIGR) on
    the web at http://www.tigr.org/
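
The idea of an open reading frame can be shown with a naive scan. The sketch below is not how Glimmer or GlimmerHMM work. It simply looks for an ATG followed by an in-frame stop codon in the three forward frames. The function name, the `min_codons` parameter and the example sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ATG...stop stretches on the
    forward strand. `end` is exclusive and includes the stop codon;
    `min_codons` also counts the stop codon."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


example = "CCATGAAATTTTAGGG"  # hypothetical sequence with one short ORF
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

A real gene finder would also scan the reverse strand and score candidates statistically, which this sketch leaves out.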

45
Motif Searches
  • Searching for patterns with biological
    significance
  • Examples include promoter sequences, enhancers,
    terminators
  • Hidden Markov models (HMMs) are quite often
    employed in these types of searches
  • Software examples: ELPH (motifs), RBSfinder
    (ribosome binding sites)
  • Prosite: search engine for protein families and
    domains

46
E. coli Promoter Consensus Sequences
σ Factor Promoter Consensus Sequence

σ factor   -35 region       -10 region
σ70        TTGACA           TATAAT
σ32        TCTCNCCCTTGAA    CCCCATNTA
σ28        CTAAA            CCGATAT

σ factor   -24 region       -12 region
σ54        CTGGNA           TTGCA

-10 region is also called Pribnow box, after its
discoverer
N: any (A, T, C, or G)
E. coli has 5 different sigma factors, including
σ38
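
A consensus sequence can be turned into a search pattern directly. The sketch below looks for exact matches to the σ70 -35 and -10 consensus hexamers separated by a spacer. The spacer range of 15 to 19 bp is an assumption chosen for illustration, and real promoters rarely match the consensus exactly, so this is a toy search and not a promoter predictor. The function names and the example sequence are made up.

```python
import re

# N stands for any base, as defined on the slide.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate a consensus string (A, C, G, T, N) into a regex fragment."""
    return "".join(SYMBOLS[base] for base in consensus)


def find_sigma70_promoters(seq, min_spacer=15, max_spacer=19):
    """Return (start, matched_text) for exact -35/-10 consensus pairs."""
    minus35 = consensus_to_regex("TTGACA")
    minus10 = consensus_to_regex("TATAAT")
    # The lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(
        f"(?=({minus35}[ACGT]{{{min_spacer},{max_spacer}}}{minus10}))"
    )
    return [(m.start(1), m.group(1)) for m in pattern.finditer(seq)]


# Hypothetical sequence: both hexamers separated by a 17 bp spacer.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
for start, text in find_sigma70_promoters(example):
    print(start, text)
```

The same helper handles the σ54 -24 consensus: `consensus_to_regex("CTGGNA")` yields `CTGG[ACGT]A`.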
47
Transcription Factor Consensus Sequences
48
Phylogenetic Analysis
  • Use of conserved sequences to aid in
    classification of organisms
  • Must choose sequences encoding molecules that
    have conserved function across species
  • Evolutionary chronometer concept
  • The difference between two sequences can be
    proportional to the evolutionary distance between
    those organisms
  • Prokaryotes: 16S rRNA; eukaryotes: 18S rRNA
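
The chronometer idea rests on measuring how different two aligned sequences are. A minimal sketch of one simple measure, the proportion of differing sites (often called the p-distance), is given below. The three sequences are invented toy data, not real rRNA, and the function names are made up for this example.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences
    of equal length (gaps are not handled in this sketch)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_table(sequences):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    return {
        (n1, n2): p_distance(sequences[n1], sequences[n2])
        for n1, n2 in combinations(sequences, 2)
    }


toy_alignment = {  # hypothetical aligned fragments
    "organism_A": "ACGTACGTAC",
    "organism_B": "ACGTACGTAT",
    "organism_C": "ACGTTCGAAT",
}
for (n1, n2), d in distance_table(toy_alignment).items():
    print(f"{n1} vs {n2}: {d:.1f}")
```

In this toy data, A and B differ at one site in ten, so they would be placed closer together than either is to C.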

49
Nucleotide Databases
  • DNA DataBank of Japan (DDBJ)
  • Ensembl: joint between EBI and Wellcome Trust
    Sanger Institute
  • European Bioinformatics Institute (EBI)
  • European Molecular Biology Laboratory (EMBL)
  • Japan Biological Information Research Center
    (JBIRC)
  • National Center for Biotechnology Information
    (NCBI)
Additional proteomics sites or databases:
  • ExPASy
  • Swiss-Prot
  • Many others!
52
Page 78 in text
Page 73 in text (2nd Ed.)
54
Page 79 in text
Page 75 in text (2nd Ed.)
55
Cross-reference
Protein sequence
Page 81 in text
Page 77 in text (2nd Ed.)
56
DNA sequence
57
Database problems
  • Incomplete annotation
  • Missing information such as function, keywords,
    etc.
  • Consequence: a given search will likely not
    return all relevant database entries
  • Redundancy
  • Smaller DNA segments often included in larger
    ones (such as chromosome)
  • ESTs (Expressed sequence tags)
  • A database is only as good as the person(s)
    maintaining it: garbage in, garbage out!

64
Scroll to bottom to see what is on next slide.
66
Bit score (BLAST)
Colored bars represent matching portion of gene;
color indicates functional group
67
BLAST results