UniRef Sequence clusters

Provided by: naf74

UniProt - The Universal Protein Resource
  • The mission of UniProt is to provide the
    scientific community with a comprehensive,
    high-quality and freely accessible resource of
    protein sequence and functional information.
  • UniProt provides four databases, each optimized
    for different uses: UniProtKB, UniRef, UniParc and
    UniMES.
  • UniProt is produced by the UniProt Consortium
    (SIB, EBI, PIR).

Contact: help@uniprot.org
Web site: www.uniprot.org
UniProtKB - Protein sequence knowledgebase
UniProtKB gives access to publicly available
protein sequences. UniProtKB is composed of 2
sections:

UniProtKB/Swiss-Prot: manually annotated (reviewed)
  • Merge of all sequences derived from the same
    gene and the same species: low redundancy and
    high accuracy of the protein sequence.
  • Integration of biological data from
    publications, external expertise, as well as
    high-performance bioinformatic tools, etc.:
    high-quality manual annotation.
  • Addition of cross-references to relevant
    databases: links to about 100 databases are
    available, making it a central hub for
    biological data.

UniProtKB/TrEMBL: computer annotated (unreviewed)
  • Merge of 100% identical sequences from the same
    organism.
  • Protein family and domain attribution
    (InterPro).
  • Automated annotation.

UniRef - Sequence clusters
  • UniRef supports comprehensive BLAST similarity
    searches by providing 3 collections of
    sequence clusters (UniRef100, UniRef90 and
    UniRef50) based on UniProtKB and selected UniParc
    records.
  • One UniRef90 entry groups sequences that have
    90% or more identity across species (database
    size reduced by 40%).
  • Using UniRef50 reduces the size of the
    database by 65%.
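
The UniRef identity thresholds (100%, 90%, 50%) can be illustrated with a toy example. The sketch below is not the procedure UniRef itself uses; it assumes sequences that are already aligned to equal length, and the names `percent_identity`, `greedy_cluster` and `size_reduction` are hypothetical helpers introduced only for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def greedy_cluster(seqs: dict, threshold: float) -> dict:
    """Assign each sequence to the first representative it matches at >= threshold.

    Returns a mapping: representative id -> list of member ids.
    """
    clusters = {}
    for seq_id, seq in seqs.items():
        for rep in clusters:
            if percent_identity(seqs[rep], seq) >= threshold:
                clusters[rep].append(seq_id)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[seq_id] = [seq_id]
    return clusters


def size_reduction(n_sequences: int, n_clusters: int) -> float:
    """Percent by which clustering shrinks the number of entries to search."""
    return 100.0 * (n_sequences - n_clusters) / n_sequences


# Made-up sequences with made-up identifiers, for illustration only.
toy = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQK",
    "C": "MKTAYIAKQR",
    "D": "MATAYLAKGR",
    "E": "GGGGGGGGGG",
}

for threshold in (100.0, 90.0, 50.0):
    clusters = greedy_cluster(toy, threshold)
    print(threshold, len(clusters), size_reduction(len(toy), len(clusters)))
```

Lowering the threshold merges more sequences into each cluster, so a similarity search has fewer representatives to scan; that is the effect behind the database size reductions quoted for UniRef90 and UniRef50.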

UniParc - Sequence archive
  • UniParc allows the tracking of a protein sequence
    and of its integration into various databases.
  • UniParc gives access to archived non-redundant
    protein sequences (records active or not) found
    in publicly accessible databases (UniProtKB, PIR,
    EMBL, Ensembl, IPI, PDB, RefSeq, FlyBase,
    WormBase, Patent Offices).
  • Each entry contains a protein sequence, taxonomy
    data and cross-references to source databases.
  • Use with caution: UniParc also contains
    pseudogenes, incorrect CDS predictions, etc.
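
The idea of a non-redundant archive that tracks where each sequence has appeared can be sketched as a mapping from sequence to cross-references. This is only an illustration under stated assumptions: the classes `ArchiveEntry` and `SequenceArchive`, their methods, and the accessions used in the usage note are all hypothetical and do not reflect how UniParc is actually implemented. Taxonomy data is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    """One unique sequence plus every source record that has carried it."""
    sequence: str
    # (database, accession) -> True while the source record is still active
    xrefs: dict = field(default_factory=dict)


class SequenceArchive:
    """Toy non-redundant archive: each distinct sequence is stored exactly once."""

    def __init__(self) -> None:
        self._entries = {}

    def add(self, sequence: str, database: str, accession: str) -> ArchiveEntry:
        # Reuse the existing entry if this exact sequence is already archived.
        entry = self._entries.setdefault(sequence, ArchiveEntry(sequence))
        entry.xrefs[(database, accession)] = True
        return entry

    def retire(self, sequence: str, database: str, accession: str) -> None:
        # The cross-reference is kept but flagged inactive, so history is preserved.
        self._entries[sequence].xrefs[(database, accession)] = False

    def sources(self, sequence: str, active_only: bool = False) -> list:
        xrefs = self._entries[sequence].xrefs
        return sorted(key for key, active in xrefs.items() if active or not active_only)

    def __len__(self) -> int:
        return len(self._entries)
```

For example, adding the same sequence from two databases with placeholder accessions, `archive.add(seq, "UniProtKB", "X00001")` and `archive.add(seq, "RefSeq", "X00002")`, yields a single archive entry with two cross-references; calling `retire` on one of them keeps the record visible but marks it inactive.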

EMBL/GenBank/DDBJ, Ensembl, VEGA, RefSeq, other
protein resources
UniMES - Metagenomic and Environmental Sequences
Currently the database contains only data from
the Global Ocean Sampling Expedition (GOS).
UniMES is released in FASTA format together with
a file of UniMES matches to InterPro methods.
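
Since UniMES is distributed in FASTA format, a minimal reader may be useful as a starting point. The sketch below handles plain FASTA only (a `>` header line followed by one or more sequence lines); the function name `read_fasta` is a hypothetical helper, and no assumption is made about the exact layout of UniMES header lines.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Any open text file can be passed directly, for instance `with open("unimes.fasta") as handle: records = list(read_fasta(handle))`, where the file name is a placeholder rather than the name of the released file.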
Swiss Institute of Bioinformatics (SIB)
European Bioinformatics Institute (EMBL-EBI)
Protein Information Resource (PIR)
UniProt is mainly supported by the National
Institutes of Health (NIH) grant 2 U01
HG02712-04. Additional support for the EBI's
involvement in UniProt comes from the European
Commission (EC)'s FELICS grant (021902RII3) and
from the NIH grant 1R01HG02273-01.
UniProtKB/Swiss-Prot activities at the SIB are
supported by the Swiss Federal Government through
the Federal Office of Education and Science. PIR
activities are also supported by the NIH grants
and contracts HHSN266200400061C, NCI-caBIG, and
1R01GM080646-01, and the National Science
Foundation (NSF) grant IIS-0430743.