Title: Proteins and Protein Function
1Proteins and Protein Function
2Amino Acids
- General structure of an amino acid
- 20 standard amino acids, each with a different R group
3Amino Acids
Table 1. 20 standard amino acids
Amino Acid      3-letter code   1-letter code
Alanine         Ala             A
Arginine        Arg             R
Asparagine      Asn             N
Aspartate       Asp             D
Cysteine        Cys             C
Glutamine       Gln             Q
Glutamate       Glu             E
Glycine         Gly             G
Histidine       His             H
Isoleucine      Ile             I
4Amino Acids
Table 1. 20 standard amino acids (Cont.)
Amino Acid      3-letter code   1-letter code
Leucine         Leu             L
Lysine          Lys             K
Methionine      Met             M
Phenylalanine   Phe             F
Proline         Pro             P
Serine          Ser             S
Threonine       Thr             T
Tryptophan      Trp             W
Tyrosine        Tyr             Y
Valine          Val             V
5Amino Acids
Amino Acid                        3-letter code   1-letter code
Asparagine (N) or aspartate (D)   Asx             B
Glutamine (Q) or glutamate (E)    Glx             Z
Any amino acid                    Xaa             X
Amino Acid Abbreviations (IUPAC)
Authority: IUPAC-IUB Joint Commission on Biochemical Nomenclature.
Reference: IUPAC-IUB Joint Commission on Biochemical Nomenclature. Nomenclature and Symbolism for Amino Acids and Peptides. Eur. J. Biochem. 138:9-37 (1984).
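The abbreviation tables above map naturally onto a lookup table. The following is a minimal sketch in Python, not part of the original slides; the names `ONE_TO_THREE` and `to_three_letter` are illustrative choices.

```python
# One-letter to three-letter codes for the 20 standard amino acids (Table 1),
# plus the IUPAC ambiguity codes B (Asx), Z (Glx) and X (Xaa).
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "B": "Asx", "Z": "Glx", "X": "Xaa",
}


def to_three_letter(sequence):
    """Convert a one-letter sequence to hyphen-separated three-letter codes."""
    try:
        return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())
    except KeyError as err:
        # Report the first character that is not in the tables above.
        raise ValueError(f"unknown amino acid code: {err.args[0]!r}") from None


print(to_three_letter("VSQLLK"))  # Val-Ser-Gln-Leu-Leu-Lys
```

Raising an error on unknown letters, rather than silently mapping them to Xaa, is a design choice of this sketch; either behaviour could be reasonable depending on the data.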
6Proteins
- Two separate amino acids can be linked together by a peptide bond
- A chain of amino acids linked by peptide bonds is called a polypeptide.
- A protein is made up of one or more polypeptide chains
- For simplicity, in this course, a protein is a chain of amino acids linked by peptide bonds, e.g.
  - VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI
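Treating a protein as a single chain means it can be handled as a plain string of one-letter codes. The sketch below is illustrative Python, not from the slides, and its function names are assumptions made for this example. It checks that a sequence uses only the 20 standard codes, counts the peptide bonds in a linear chain, and tallies the residue composition.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ARNDCQEGHILKMFPSTWYV")

# Example sequence from the slide above.
SEQUENCE = "VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI"


def is_standard_protein(sequence):
    """True if the sequence is non-empty and uses only the 20 standard codes."""
    return bool(sequence) and set(sequence) <= STANDARD_AMINO_ACIDS


def peptide_bond_count(sequence):
    """A linear chain of n residues is joined by n - 1 peptide bonds."""
    return max(len(sequence) - 1, 0)


def composition(sequence):
    """Count how often each residue occurs in the sequence."""
    return Counter(sequence)


print(is_standard_protein(SEQUENCE))
print(len(SEQUENCE), "residues,", peptide_bond_count(SEQUENCE), "peptide bonds")
print(composition(SEQUENCE).most_common(3))
```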
7Protein Database
- UniProt (Universal Protein Resource) (http://www.pir.uniprot.org/) is the world's most comprehensive catalog of information on proteins.
- It is a collaboration between
  - Swiss Institute of Bioinformatics (SIB)
  - Department of Bioinformatics and Structural Biology of the Geneva University
  - European Bioinformatics Institute (EBI)
  - Georgetown University Medical Center's Protein Information Resource (PIR)
- It includes three components:
8Protein Database
- UniProt Knowledgebase (UniProtKB): the central access point for extensive curated protein information.
  - UniProtKB/Swiss-Prot: a manually annotated protein sequence database which provides a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. UniProtKB/Swiss-Prot Release 48.7 of 20-Dec-2005: 204,086 entries.
  - UniProtKB/TrEMBL: a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot. UniProtKB/TrEMBL Release 31.7 of 20-Dec-2005: 2,506,886 entries.
- UniProt Reference Clusters (UniRef): databases that combine closely related sequences into a single record to speed searches.
- UniProt Archive (UniParc): a comprehensive repository, reflecting the history of all protein sequences.
15Gene Ontology
Goal: find all the proteins that are involved in protein synthesis
Translation
Protein synthesis
16Gene Ontology
Golf
I like golf.
Me too!
17Gene Ontology
- Ontology
  - n. the branch of metaphysics dealing with the nature of being.
  - (The New Oxford American Dictionary, Edited by Elizabeth J. Jewell, Frank Abate, Oxford University Press, 2001, p. 1197.)
- Metaphysics
  - n. the branch of philosophy that deals with the first principles of things, including abstract concepts such as being, knowing, substance, cause, identity, time, and space.
  - (The New Oxford American Dictionary, Edited by Elizabeth J. Jewell, Frank Abate, Oxford University Press, 2001, p. 1074.)
18Gene Ontology
- The Gene Ontology (GO) project (http://www.geneontology.org/) is a collaborative effort to address the need for consistent descriptions of gene products in different databases.
- The project began in 1998 as a collaboration between three model organism databases: FlyBase (Drosophila), the Saccharomyces Genome Database (SGD) and the Mouse Genome Database (MGD).
- Since then, the GO Consortium has grown to include many databases, including several of the world's major repositories for plant, animal and microbial genomes.
19Gene Ontology
- Develop structured, controlled vocabularies (ontologies) that describe gene products
- Make associations between the ontologies and the genes and gene products in the collaborating databases
- Develop tools that facilitate the creation, maintenance and use of ontologies
- The use of GO terms facilitates uniform queries across databases
20Gene Ontology
- The three components of GO are molecular function, biological process and cellular component
- GO terms are organized in structures called directed acyclic graphs (DAGs), which differ from hierarchies in that a child, or more specialized, term can have many parent, or less specialized, terms
- Example: hexose biosynthesis has two parents, monosaccharide biosynthesis and hexose metabolism
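The DAG idea can be sketched as a mapping from each term to its parents. This is illustrative Python, not the GO project's own data model. The two parents of hexose biosynthesis come from the slide; the shared root term "monosaccharide metabolism" is an assumption added so the example has more than one level.

```python
# Child term -> list of parent terms.
PARENTS = {
    # From the slide: one child with two parents.
    "hexose biosynthesis": ["hexose metabolism", "monosaccharide biosynthesis"],
    # Assumed for illustration: both parents share a more general term.
    "hexose metabolism": ["monosaccharide metabolism"],
    "monosaccharide biosynthesis": ["monosaccharide metabolism"],
    "monosaccharide metabolism": [],
}


def ancestors(term, parents=PARENTS):
    """Return every less specialized term reachable from `term`."""
    found = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:  # a term may be reachable by several paths
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


print(sorted(ancestors("hexose biosynthesis")))
```

In a strict hierarchy (a tree) each term would have at most one parent, so every list in `PARENTS` would hold at most one entry. The two-element list for hexose biosynthesis is what makes this a DAG.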
21Gene Ontology
- The controlled vocabularies are structured so that you can query them at different levels
- GO browser: AmiGO (http://www.godatabase.org/cgi-bin/amigo/go.cgi)
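Querying at different levels can be sketched as follows: a query for a general term also returns proteins annotated with any of its more specialized terms. This is illustrative Python and does not show how AmiGO is implemented. The protein identifiers are hypothetical, and the term "monosaccharide metabolism" is an assumed shared parent.

```python
# Child term -> parent terms (small illustrative DAG).
PARENTS = {
    "hexose biosynthesis": ["hexose metabolism", "monosaccharide biosynthesis"],
    "hexose metabolism": ["monosaccharide metabolism"],
    "monosaccharide biosynthesis": ["monosaccharide metabolism"],
    "monosaccharide metabolism": [],
}

# Hypothetical protein identifiers, each annotated with one GO term.
ANNOTATIONS = {
    "protein_A": {"hexose biosynthesis"},
    "protein_B": {"hexose metabolism"},
    "protein_C": {"monosaccharide biosynthesis"},
}


def matches(term, query, parents):
    """True if `term` is the query term or one of its descendants."""
    if term == query:
        return True
    return any(matches(parent, query, parents) for parent in parents.get(term, []))


def proteins_for(query, annotations=ANNOTATIONS, parents=PARENTS):
    """Proteins annotated with `query` or with any more specialized term."""
    return sorted(
        protein
        for protein, terms in annotations.items()
        if any(matches(term, query, parents) for term in terms)
    )


print(proteins_for("hexose biosynthesis"))        # ['protein_A']
print(proteins_for("hexose metabolism"))          # ['protein_A', 'protein_B']
print(proteins_for("monosaccharide metabolism"))  # all three proteins
```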
23Protein function
- Three steps to get a set of proteins that have a certain function
  - Search for the GO term (http://www.godatabase.org/cgi-bin/amigo/go.cgi)
  - Search for the proteins that belong to that GO term (http://www.pir.uniprot.org/search/textSearch.shtml)
  - Save the sequences in FASTA format
24Search for the GO term
25Search for the proteins that belong to a certain GO term
26Save sequences in FASTA format
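In FASTA format, each record is a header line starting with ">" followed by one or more lines of sequence. The last step can be sketched in Python as below. This is a minimal illustration, not the export code used by UniProt; the header text `example_protein` and the 60-character line width are assumptions.

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text."""
    lines = []
    for header, sequence in records:
        lines.append(f">{header}")
        # Split the sequence into fixed-width lines.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA text back into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical header; the sequence is the example from the Proteins slide.
records = [("example_protein", "VSQLLKQRVRYAPYLSKVRRAEELLPLFKHGQYIGWSGFTGVGAPKVI")]
fasta_text = to_fasta(records, width=20)
print(fasta_text, end="")
```

To save the result, the text can be written with `open("proteins.fasta", "w")`, where the file name is again only an example.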