Title: Introduction to Genetics
1 Introduction to Genetics as relevant to this course
- (Ack: Roche Genetics CD-ROM, Mishra's notes at NYU)
2 Background (1/18)
- Genome, chromosomes, genes: all made up of DNA.
- Genetics research (largely over the last 100 years, accelerated in the last 30 years)
  - Has led to important advances in medical science.
- The nucleus of a cell contains chromosomes (made up of DNA) and proteins.
- DNA (deoxyribonucleic acid)
  - Is the genetic material that is inherited.
  - Contains the information needed by living cells to specify their structure, function, activity and interaction with other cells and the environment.
  - A DNA molecule can be thought of as a very long sequence of nucleotides or bases.
3 DNA structure (2/18)
- The Nobel Prize in Physiology or Medicine 1962: Crick, Watson and Wilkins
  - "for their discoveries concerning the molecular structure of nucleic acids and its significance for information transfer in living material"
- DNA is made up of 4 different building blocks (so-called nucleotide bases), each an almost planar nitrogenous organic compound
  - Adenine (A)
  - Thymine (T)
  - Guanine (G)
  - Cytosine (C)
- Base pairs (A-T, C-G)
4 DNA Structure cont. (3/18)
- Bases (A, T, C, G) are attached to a sugar-phosphate backbone to form one of the 2 strands of a DNA molecule.
  - Phosphate ((PO4)3-)
  - Deoxyribose
- The two strands are bonded together by the base pairs (A-T, C-G).
  - This results in mirror-image or complementary strands; each is twisted (helical), and when bonded they form a double helix.
- Direction of each strand (5' meaning the beginning and 3' meaning the end of the strand)
  - 5' and 3' refer to the numbered carbon atoms of the sugar molecule in the DNA backbone.
  - They are important reference points for navigating the genome.
  - The 2 complementary strands are oriented in opposite directions to each other.
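The pairing and orientation rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `reverse_complement` is chosen here for illustration. Because the two strands run in opposite directions, the partner of a strand written 5' to 3' is obtained by complementing every base and then reversing the result.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3', of a strand given 5'->3'."""
    # Walk the input backwards (the partner is antiparallel) and swap each base.
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

print(reverse_complement("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.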
5 DNA Structure
6 Genome Size
Species                  Genome size (base pairs)   No. of chromosomes
E. coli                  4.64 x 10^6                1
S. cerevisiae (yeast)    1.205 x 10^7               16
C. elegans (nematode)    10^8                       11/12
D. melanogaster          1.7 x 10^8                 4
M. musculus              3 x 10^9                   20
H. sapiens               3 x 10^9                   23
A. cepa (onion)          1.5 x 10^10                8
- The H. sapiens genome is about 6 feet long when completely stretched out.
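The "6 feet" figure for the human genome can be checked with a rough calculation. The sketch below makes two assumptions that are not in the slides: a spacing of about 0.34 nm between successive base pairs in the double helix, and that the figure refers to the two genome copies carried by a diploid cell.

```python
RISE_NM_PER_BP = 0.34     # assumed spacing between base pairs (not from the slides)
METERS_PER_FOOT = 0.3048

def stretched_length_m(base_pairs: float, copies: int = 1) -> float:
    """Length in meters of `copies` genomes laid end to end."""
    return base_pairs * copies * RISE_NM_PER_BP * 1e-9

diploid_m = stretched_length_m(3e9, copies=2)
print(f"{diploid_m:.2f} m, {diploid_m / METERS_PER_FOOT:.1f} ft")
```

Under these assumptions the result is about 2 m, which is in line with the figure quoted in the table.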
7 DNA hybridization (DL 3/18)
- Hybridization between complementary DNA sequences to form a double-stranded DNA molecule.
- One of the most important DNA technologies.
- Applications of hybridization
  - PCR (polymerase chain reaction)
    - Enzymatically generating millions of copies of a tiny amount of a particular nucleic acid sequence.
  - Northern blot analysis
    - Makes it possible to study (in a semi-quantitative manner) the level of transcription of a particular gene.
  - DNA microarrays
    - Can interrogate the level of transcription of several thousand different genes in one sample in one experiment.
8 PCR (Polymerase Chain Reaction) (DL 3/18)
- PCR allows selective amplification of a DNA sequence.
- Only a tiny amount of DNA is necessary to obtain a PCR product (a drop of blood or less is enough).
- Complementary DNA primers need to be designed.
  - For this, the DNA sequence flanking the target sequence needs to be known in advance.
  - Primers are short synthetic DNA sequences of about 20 bases (so-called oligonucleotides) that can specifically hybridize to a unique complementary DNA sequence.
- The approach
  - Genomic DNA (the template), primers (the starters), deoxynucleotides (the building blocks) and a special heat-resistant DNA polymerase (the motor of the reaction) are mixed together in one reaction tube.
  - The reaction takes place in a thermocycler (an apparatus that allows one to precisely heat and cool the reaction).
  - The DNA is heated to almost boiling temperature, which separates the 2 strands (this step is called heat denaturation).
  - Cooling the mixture allows the primers to bind to their complementary sequences on the genomic DNA.
  - Once the primers bind, the DNA polymerase uses them as the start site to generate a copy of each strand of the targeted gene fragment, building 2 new double-stranded molecules.
  - Repeating the cycle (denaturation followed by cooling) 30 times results in 2^30, about 10^9 (1 billion), copies.
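The copy count after 30 cycles follows from doubling the number of molecules in every cycle. A minimal sketch of that arithmetic is below; the `efficiency` parameter is a hypothetical addition (1.0 means every molecule is copied in every cycle, which is the idealized case the slide describes).

```python
def pcr_copies(cycles: int, templates: int = 1, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds, starting from `templates` molecules."""
    # Each cycle multiplies the count by (1 + efficiency); 2x when perfect.
    return templates * (1 + efficiency) ** cycles

print(pcr_copies(30))  # 1073741824.0, i.e. 2**30, roughly 10**9
```

With an efficiency below 1.0 the same formula gives noticeably fewer copies, which is one reason real yields fall short of the idealized billion.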
9 Northern Blot analysis
- The complete RNA content from a sample is separated according to size by electrophoresis.
  - Usually done in a sheet of agarose (similar to gelatine).
  - In response to an electric current, larger molecules move slower and smaller ones move faster, thus separating different RNA molecules by size.
- The RNAs are then transferred from the gel to a filter membrane (the blot).
- The blot is then exposed to a solution containing a nucleic acid (the probe) complementary to the sequence whose presence in the blot one wants to interrogate.
  - The probe may be cRNA or cDNA with a detectable marking (a radioactive isotope or a fluorescent tag).
  - If the targeted sequence is present in the blot, the probe hybridizes and sticks to the blot at the location where the targeted sequence is located.
  - After washing off excess probe, a signal is detectable, and its specificity can be checked against the expected size of the RNA, which correlates with how far it has migrated during electrophoresis.
- With this method it is possible to study, in a semi-quantitative manner, the level of transcription of a particular gene.
- Comparison of the results from different samples (e.g. different organs) provides information about the transcriptional regulation of the gene.
10 DNA Structure cont. (4/18)
- The order of nucleotide bases along a DNA strand is known as the sequence.
- The genetic information is encoded in the precise order of the base pairs.
- DL
  - GenBank database: http://www.ncbi.nlm.nih.gov/Entrez/
  - Human Genome Project: http://www.genome.gov/page.cfm?pageID=10001694
- DNA sequencing
  - Is the process designed to precisely determine the sequence of bases in the DNA.
11 Cells, DNA and genome (5,6,8/18)
- During cell division (mitosis) the entire DNA of the cell is copied.
  - The 2 strands separate, and complementary strands are generated.
  - Two duplicate DNA sequences are produced.
- Genome: an organism's total DNA content.
- Diploid cells: cells that carry 2 genome copies.
- Haploid cells: cells that have a single copy of the genome.
  - Reproductive germ cells (gametes), i.e., egg and sperm cells.
- The human genome consists of
  - 22 autosomal chromosomes (same in males and females)
  - 2 sex chromosomes, X and Y (males XY, females XX)
12 Structure of Chromosomes (7/18)
- The center is called the centromere.
- The two ends are called telomeres.
- The centromere separates the two arms
  - Short arm: p
  - Long arm: q
13 Structure of genes 9-11/18
- Genes are those parts of the genome that contain the information necessary for building proteins (size: 100 to several million base pairs).
- Exons (coding sequence), introns (non-coding sequence) and regulatory regions (at the two ends, regulating how actively protein is synthesized from the gene).
- Eukaryotes (organisms whose cells have a nucleus) have genes segmented into exons and introns.
  - Introns can occur between individual codons or within a single codon.
- Promoter (a regulatory element at the 5' end)
  - Consists of several short sequences which are consensus binding sites for a number of proteins called transcription factors.
14 (DL 10/18)
- Prokaryotes (which do not have a nucleus): genes are not segmented into exons and introns.
- Eukaryotes: genes are normally segmented into exons and introns.
  - Except mitochondrial genes and a few nuclear genes.
- During gene expression, exons and introns are transcribed to form a pre-mRNA.
- RNA splicing removes the introns, joins the exons and produces a mature mRNA molecule that codes for a polypeptide.
- Exons: sequences that are represented in the mature mRNA.
  - May or may not code for a protein.
  - E.g. exons at the 3' or 5' end of the mRNA may not be translated into protein.
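Splicing can be pictured as cutting the exon segments out of the pre-mRNA and joining them in order. The sketch below assumes the exon coordinates are already known (0-based, end-exclusive); the function name `splice` and the toy sequence are made up for illustration, and real splice-site recognition is not modeled.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a pre-mRNA in 5'->3' order, dropping introns."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))

# Toy pre-mRNA: exon 1 (6 bases) + intron (12 bases) + exon 2 (6 bases).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
mature = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature)  # AUGGCCUUUUAA
```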
15 Some Genes (from Mishra's slides)
Gene product          Organism      Exon length   Introns   Intron length
Adenosine deaminase   Human         1,500         11        30,000
Apolipoprotein B      Human         14,000        28        29,000
Erythropoietin        Human         582           4         1,562
Thyroglobulin         Human         8,500         40        100,000
α-interferon          Human         600           0         0
Fibroin               Silkworm      18,000        1         970
Phaseolin             French bean   1,263         5         515
16 Some human gene locations (from Mishra's slides)
Gene                            Chromosome
Insulin                         11
Galactokinase                   11
Viral oncogene homologues
  c-sis                         22
  c-mos                         8
  c-Ha-ras-1                    11
  c-myb                         6
Interferons
  α/β cluster                   9
  γ                             12
α-globin cluster                16
β-globin cluster                11
Immunoglobulin
  κ (light chain)               2
  λ (light chain)               22
  Heavy chain                   14
  Pseudogenes                   9,32,15,18
Growth hormone gene cluster     17
Thymidine kinase                17
17 Gene expression (12/18)
- Gene expression (transcription and translation): the 2-step process from genes to making proteins.
  - Transcription: genetic information in DNA is copied into messenger RNA (mRNA).
  - Translation: mRNA is used as a template to synthesize a protein.
18 Central Dogma
- Due to Francis Crick (1958); states that these information flows are all unidirectional.
- The central dogma states that once 'information' has passed into protein it cannot get out again. The transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein, may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.
19 Transcription (13/18)
- RNA (ribonucleic acid)
  - Similar to DNA (except for a chemical modification of the sugar backbone).
  - Instead of T it contains U (uracil), which binds with A.
  - Is not double-stranded but single-stranded.
  - RNA molecules tend to fold back on themselves to make helical, twisted and rigid segments.
- RNA is synthesized
  - By unwinding the DNA double helix, separating the 2 strands.
  - Using one of the strands as a template along which to build the RNA molecule.
  - Accomplished by the enzyme RNA polymerase (binds to the promoter and copies, or transcribes, the gene in its full length).
  - The resulting molecule is called pre-mRNA.
- The single-stranded pre-mRNA is then processed.
  - Splicing (mediated by the spliceosome, consisting of RNA and proteins) removes the introns.
  - The ends are modified (capping modifies the 5' end and polyadenylation adds adenines at the 3' end) to enhance stability.
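The template-copying step can be sketched as a base-by-base lookup in which A on the template is paired with U rather than T. This is an illustration only: the function name `transcribe` is hypothetical, the template is assumed to be written 3' to 5' so that the RNA comes out 5' to 3', and promoter recognition, capping and polyadenylation are left out.

```python
# RNA base placed opposite each template DNA base (U takes the place of T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (5'->3') along a DNA template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

print(transcribe("TACCGGAAA"))  # AUGGCCUUU
```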
20 Translation (14/18)
- mRNA is used as a template to synthesize a protein.
- Translation takes place outside the nucleus, in the cytoplasm, on ribosomes that are either free or attached to the endoplasmic reticulum.
- Except for the 5' and 3' ends of the mRNA (which are non-coding), the rest of the molecule codes for 1 protein.
- Proteins are made up of amino acids.
  - 20 different amino acids are used to build proteins in humans.
  - Each is encoded by one or more sets of 3 nucleotides (called triplets or codons).
  - The initial codon is always AUG (coding for methionine).
  - Translation is terminated by one of 3 stop codons.
- The translation process is carried out by ribosomes, which scan the mRNA and build the polypeptide chain from amino acids supplied by transfer RNAs (tRNAs).
  - Starts at a particular location on the mRNA called the translation start site (usually AUG).
  - tRNAs are a group of small RNA molecules, each with specificity for a particular amino acid.
  - tRNAs carry the amino acids to the ribosomes, the site of protein synthesis, where they are attached to a growing polypeptide.
  - Translation stops when one of UAA, UAG or UGA is encountered.
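The scan, start and stop rules above can be sketched as a small function. This is a simplified illustration, not how a ribosome actually selects its start site: it takes the first AUG in the string as the start, reads codon by codon, and stops at UAA, UAG or UGA. The names `CODON_TABLE` and `translate` are chosen here; the table is the standard genetic code written with one-letter amino acid symbols, with `*` marking stop codons.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG or UGA ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

In the example the leading GG and the trailing GG play the role of the non-coding 5' and 3' ends.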
21 Post-translational modification (DL)
- The polypeptide chain that results from mRNA translation is often subject to chemical modifications, e.g.
  - Glycosylation, phosphorylation, hydroxylation
  - Addition of lipid groups (e.g. fatty acyl or prenyl groups)
  - Addition of co-factors (e.g. a heme molecule)
  - Or proteolytic cleavage
- The type of modification a protein undergoes depends on its function and sub-cellular location.
22 Genetic Code (15/18)
- The combination of nucleotides that build the different codons represents the genetic code.
- A codon has 3 nucleotides and there are 4 kinds of nucleotides, so there are 4 x 4 x 4 = 64 possible codons.
- But only 20 amino acids + start + stop need to be encoded.
- So several codons can specify the same amino acid (the genetic code is degenerate).
- Start codon (AUG) and stop codons (UAA, UAG, UGA).
- Open reading frame (ORF): the sequence of nucleotides between and including the start and stop codons.
- The Nobel Prize in Physiology or Medicine 1968: Holley, Khorana and Nirenberg
  - "for their interpretation of the genetic code and its function in protein synthesis"
  - http://www.nobel.se/medicine/laureates/1968/
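The counting argument (64 codons for 20 amino acids plus stop) can be verified by enumerating every triplet and tallying how many codons map to each amino acid. A sketch is below, using the standard genetic code in one-letter symbols with `*` for stop; the variable names are chosen for illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
table = dict(zip(codons, AMINO))
degeneracy = Counter(table.values())   # number of codons per amino acid

print(len(codons))                     # 64
print(len(degeneracy) - 1)             # 20 amino acids (excluding stop)
print(degeneracy["L"], degeneracy["M"], degeneracy["*"])  # 6 1 3
```

Only methionine (M) and tryptophan (W) have a single codon; every other amino acid is specified by two or more, which is what "degenerate" means here.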
23 Amino Acids with Codes (from Mishra's slide)
- A Ala alanine GC(UACG)
- C Cys cysteine UG(UC)
- D Asp aspartic acid GA(UC)
- E Glu glutamic acid GA(GA)
- F Phe phenylalanine UU(UC)
- G Gly glycine GG(UACG)
- H His histidine CA(UC)
- I Ile isoleucine AU(UAC)
- K Lys lysine AA(AG)
- L Leu leucine (CU)U(AG), CU(UC)
- M Met methionine AUG
- N Asn asparagine AA(UC)
- P Pro proline CC(UACG)
- Q Gln glutamine CA(AG)
- R Arg arginine (AC)G(AG), CG(UC)
- S Ser serine AG(UC), UC(UACG)
- T Thr threonine AC(UACG)
- V Val valine GU(UACG)
- W Trp tryptophan UGG
- Y Tyr tyrosine UA(UC)
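The compact notation in this list, such as GC(UACG), means "G, then C, then any one of U, A, C or G". A small sketch that expands such a pattern into the individual codons is below; the function name `expand` is hypothetical and the parser only handles single bases and parenthesized groups.

```python
import re
from itertools import product

def expand(pattern: str) -> list[str]:
    """Expand a pattern like 'GC(UACG)' into the codons it stands for."""
    # Each position is either a parenthesized set of alternatives or one base.
    positions = re.findall(r"\(([ACGU]+)\)|([ACGU])", pattern)
    choices = [group or single for group, single in positions]
    return ["".join(combo) for combo in product(*choices)]

print(expand("GC(UACG)"))    # ['GCU', 'GCA', 'GCC', 'GCG']
print(expand("(CU)U(AG)"))   # ['CUA', 'CUG', 'UUA', 'UUG']
```

Expanding both leucine patterns, (CU)U(AG) and CU(UC), gives its six codons in total.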
24 Biological Function of Proteins
- Enzyme catalysis: DNA polymerases, lactate dehydrogenase, trypsin
- Transport: hemoglobin, membrane transporters, serum albumin
- Storage: ovalbumin, egg-white protein, ferritin
- Motion: myosin, actin, tubulin, flagellar proteins
- Structural and mechanical support: collagen, elastin, keratin, viral coat proteins
- Defense: antibodies, complement factors, blood clotting factors, protease inhibitors
- Signal transduction: receptors, ion channels, rhodopsin, G proteins, signalling cascade proteins
- Control of growth, differentiation and metabolism: repressor proteins, growth factors, cytokines, bone morphogenic proteins, peptide hormones, cell adhesion proteins
- Toxins: snake venoms, cholera toxin
25 Differential Gene Expression 17/18
- All cells in the body (that contain a nucleus) carry the full set of genetic information, but only express about 20% of the genes at any particular time.
- Gene expression is selective.
  - Different proteins are expressed in different cells according to the function of the cell.
- Gene expression is tightly controlled and regulated.
- The differential expression of genes ensures that cells develop correctly and can differentiate into and function as specialized cell types, e.g. neurons, muscle cells or fibroblasts.
26 cDNA and gene expression (DL)
- Goal: identify all possible genes expressed in one tissue or cell line (use cDNA libraries).
- cDNA libraries are prepared from mRNA isolated from the cells or tissue being studied.
  - cDNAs are DNA molecules that are complementary to the mRNA sequences in a sample.
  - cDNA is synthesized by the enzyme reverse transcriptase (RT), which uses the mRNA as a template.
  - RT is a viral enzyme used by viruses whose genome is made of RNA, not DNA.
- A cDNA library represents the collection of all genes expressed in a particular cell or tissue type.
- DNA sequence → mRNA sequence → cDNA sequence (much smaller, since the introns are eliminated while generating mRNAs).
- Hence very useful when trying to isolate a particular gene to study the protein it codes for.
27 NEXT SEC.
28 Gene Cloning 1/11
- The first step in identifying a gene and its function is to isolate it from the rest of the genome and produce a large quantity of it (called cloning a gene).
- Cloning a DNA fragment using bacteria
  - The DNA fragment is isolated from the entire genome using restriction enzymes.
    - These enzymes can cut the DNA (in a staggered fashion or straight through) at specific sites defined by a short sequence.
    - Typically they recognize specific DNA sequences of 4, 6 or 8 bases.
    - These enzymes are found in bacteria, where their role is to protect the bacteria from foreign DNA by digesting it into smaller pieces.
  - The fragment is inserted into a vector (like a mini-chromosome) using DNA ligase, and the recombinant product is introduced into bacteria (this process is called transformation).
    - Cloning vectors are DNA fragments that are able to replicate within a cell and allow the addition of exogenous DNA.
    - They are derived from plasmids, viruses, phages or chromosomes.
    - Vectors are classified according to the type of host cell they can replicate in, or the size of the exogenous DNA they are able to carry.
  - The bacteria now make new copies with every cell division.
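Cutting at a short recognition sequence can be sketched as a string search. The example below uses the 6-base site GAATTC, cut after the first G, which is the pattern of the well-known enzyme EcoRI; the enzyme is not named in the slides and is used here only as an example. The function name `digest` is hypothetical, and only one strand is modeled, so the staggered (sticky) ends on the opposite strand are not shown.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut `dna` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments, start = [], 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])      # remainder after the last cut
    return fragments

print(digest("AAGAATTCCCGAATTCTT"))  # ['AAG', 'AATTCCCG', 'AATTCTT']
```

Joining the fragments back together reproduces the input, which is roughly what DNA ligase does when the fragment is inserted into a vector.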
29 DNA Sequencing (DL 1/11)
- It is the process designed to precisely determine the sequence of bases in the DNA.
- Involves enzymatically copying the DNA in the presence of compounds that terminate the copying process in a base-specific manner, resulting in a mixture of DNA copies that differ in size by one base.
- Different technologies are used to resolve the mixture and detect the different fragments.
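The idea of reading a sequence from a mixture of terminated copies can be sketched as follows. This is a toy model under simplifying assumptions: exactly one terminated copy exists for every length, and each fragment is recorded only by its length and its last (terminating) base. The function names are hypothetical.

```python
import random

def terminated_fragments(copy: str) -> list[tuple[int, str]]:
    """One chain-terminated copy per position: (fragment length, last base)."""
    return [(length, copy[length - 1]) for length in range(1, len(copy) + 1)]

def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Resolve fragments by size, shortest first, and read off the last bases."""
    return "".join(base for _, base in sorted(fragments))

mixture = terminated_fragments("GATTACA")
random.Random(0).shuffle(mixture)      # the reaction tube holds an unordered mixture
print(read_sequence(mixture))          # GATTACA
```

Sorting by length stands in for the size-resolving step; identifying the last base stands in for the base-specific terminators.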
30 Cloning issues (2-3/11)
- Clones from genomic DNA contain introns (non-coding sequence), are very large and are difficult to analyze for function.
- Alternative: start from mRNA, convert it to cDNA and clone the cDNA.
31 Gene function characterization (4/11)
- To characterize the function of a gene it is important to know its sequence and compare it to other sequences in the databases, and to identify where and under what conditions it is expressed and what function, if known, it has in other organisms.
- Also do gene expression studies.
32 Gene expression studies (5/11)
- Allow you to understand how a gene is regulated in a tissue or a cell type.
- The most useful way of studying gene expression is by measuring the levels of mRNA produced from a particular gene in a particular tissue.
- Application: to understand a biological process it is useful to study the differences in gene expression which occur during that process. E.g.
  - It is of interest to know which genes are induced or repressed, say in the liver, after a particular drug is taken.
  - Or which genes are expressed in a tumor but not in the surrounding normal tissue.
- Some techniques for analyzing the mRNA level of a single gene or for quantifying gene expression
  - Northern blots
  - Quantitative reverse transcriptase PCR (QT-RT-PCR)
  - DNA microarrays
  - Proteomics (analysis of the protein synthesis that results from gene expression)
33 DNA microarrays (6/11)
- Consist of thousands of DNA probes, corresponding to different genes, arranged as an array.
- Each probe (sometimes consisting of a short sequence of synthetic DNA) is complementary to a different mRNA (or cDNA).
- mRNA isolated from a tissue or cell type is converted to fluorescently labeled mRNA or cDNA and is used to hybridize the array.
- Each expressed gene in the sample binds to its probe on the array and generates a fluorescent signal.
- A DNA microarray can interrogate the level of transcription of several thousand different genes from one sample in one experiment. (One DNA microarray experiment reveals the mRNA levels of 1000s of genes from one tissue or cell type at one time point.)
- Particularly useful when studying the effect of environmental factors on gene expression.
- A fingernail-size chip can interrogate 10,000 different transcripts. For each transcript the chip has 30-40 different probes: half of them are designed to perfectly match 20-nucleotide stretches of the gene, and the other half contain a mismatch as a control to test the specificity of the hybridization signal.
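One simple way to combine perfect-match and mismatch probes into a single expression value is to average the difference between each perfect-match signal and its paired mismatch signal. The sketch below illustrates that idea only; it is an assumption, not the method used by any particular chip or analysis package, and the function name and intensity values are made up.

```python
def expression_signal(perfect_match: list[float], mismatch: list[float]) -> float:
    """Average of (perfect-match minus mismatch) over all probe pairs."""
    differences = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return sum(differences) / len(differences)

# Hypothetical fluorescence intensities for 4 probe pairs of one transcript.
pm = [1200.0, 980.0, 1500.0, 1100.0]
mm = [200.0, 180.0, 300.0, 100.0]
print(expression_signal(pm, mm))  # 1000.0
```

If the mismatch probes light up as strongly as the perfect-match probes, the value falls toward zero, flagging the signal as non-specific hybridization.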
34 Pharmacogenomics 7/11
- Refers to the study of differential gene expression applied to drug discovery and optimization.
- Applications: differential gene expression studies in special tissues or cell types may
  - Find new disease mechanisms
  - Discover new drug targets
  - Confirm the expected mechanism of action of a drug
  - Choose the best candidate compound based on an optimal expression profile
  - Figure out a priori who will benefit from a drug and who won't
35 Model organisms 9/11
- An indispensable tool for studying the function of a gene.
- Range from bacteria and yeast to animals amenable to genetic modification.
  - Worms, insect cells, frog eggs, flies, zebrafish, mice, mammalian (human) cell lines.
- In general, the more complex the organism, the more difficult it is to do genetic modification, but the more relevant the model becomes to humans.