1 Advanced Access
Mini module
2 Access to Genome Annotation
- Release web site: http://www.ensembl.org/
- Pre-Release: http://pre.ensembl.org/
- Archive: http://archive.ensembl.org/
- BioMart: http://www.ensembl.org/Multi/martview/
  - http://www.biomart.org/biomart/martview/
- Downloads: ftp://ftp.ensembl.org/
- MySQL interface: ensembldb.ensembl.org
- Perl API: http://www.ensembl.org/info/software/
3 Downloads
- ftp://ftp.ensembl.org/pub/
- http://www.ensembl.org/info/downloads/ftp_site.html
- FASTA files: plain sequence
  - DNA (assembly, masked and unmasked)
  - cDNA (Ensembl and ab initio predictions)
  - Peptides (Ensembl and ab initio predictions)
  - RNA (non-coding RNA predictions)
- Flatfiles: annotated 1 Mb slices
  - EMBL format
  - GenBank format
- MySQL: database table dumps
- GTF: gene sets in GTF format
- EMF: alignments of resequencing data in Ensembl Multi Format
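The FASTA downloads listed above are plain text: a header line starting with ">" followed by one or more sequence lines. A minimal sketch of reading such a file, assuming only that layout; the function name `read_fasta` and the two example records are made up for illustration and are not real Ensembl sequences:

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Invented records, standing in for a downloaded peptide file.
example = StringIO(">pep1 example peptide\nMKT\nAYI\n>pep2 another example\nMSS\n")
records = list(read_fasta(example))
for header, seq in records:
    print(header.split()[0], len(seq))
```

For a real download, the `StringIO` object would be replaced by an opened (and, if necessary, decompressed) file handle.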
4 Ensembl Databases
- Species-specific databases
  - Core: genomic sequences and most of the annotation
  - Variation: genetic variation
  - Funcgen: regulatory elements
  - Otherfeatures: EST genes
  - Vega: Vega genes
- Cross-species database
  - Compara: all comparative data
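The database types above show up in the names of the species-specific databases on the MySQL server, for example homo_sapiens_core_41_36c in the MySQL slides. A small sketch of splitting such a name into its parts, assuming the pattern species_type_release_assembly; that pattern is inferred from this one name, and the helper `parse_db_name` is hypothetical. The cross-species Compara database is not covered.

```python
import re

# Assumed pattern: <species>_<type>_<release>_<assembly version>,
# inferred from the single example name used in these slides.
DB_NAME = re.compile(
    r"(?P<species>[a-z_]+?)_"
    r"(?P<group>core|variation|funcgen|otherfeatures|vega)_"
    r"(?P<release>\d+)_"
    r"(?P<assembly>\w+)"
)


def parse_db_name(name):
    """Split a species-specific database name into its components."""
    match = DB_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised database name: {name}")
    parts = match.groupdict()
    parts["release"] = int(parts["release"])
    return parts


info = parse_db_name("homo_sapiens_core_41_36c")
print(info)
```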
5 MySQL
- SQL: Structured Query Language
- Needed:
  - MySQL client program
    - http://www.mysql.com
  - Ability to write MySQL queries
  - Knowledge of database schema
6 MySQL
7 MySQL
- Retrieve Ensembl Transcript and Peptide IDs for ENSG00000010704
- mysql -u anonymous -h ensembldb.ensembl.org
- Welcome to the MySQL monitor. Commands end with ; or \g.
- Your MySQL connection id is 1699364 to server version: 4.1.20-standard-log
- Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
- mysql> use homo_sapiens_core_41_36c
- Reading table information for completion of table and column names
- You can turn off this feature to get a quicker startup with -A
- Database changed
- mysql> SELECT gene_stable_id.stable_id AS gene,
                transcript_stable_id.stable_id AS transcript,
                translation_stable_id.stable_id AS peptide
         FROM gene, transcript, translation,
              gene_stable_id, transcript_stable_id, translation_stable_id
         WHERE gene.gene_id = transcript.gene_id
           AND transcript.transcript_id = translation.transcript_id
           AND gene_stable_id.gene_id = gene.gene_id
           AND transcript_stable_id.transcript_id = transcript.transcript_id
           AND translation_stable_id.translation_id = translation.translation_id
           AND gene_stable_id.stable_id = 'ENSG00000010704';
8 MySQL
Result:
+-----------------+-----------------+-----------------+
| gene            | transcript      | peptide         |
+-----------------+-----------------+-----------------+
| ENSG00000010704 | ENST00000309234 | ENSP00000311698 |
| ENSG00000010704 | ENST00000349999 | ENSP00000259699 |
| ENSG00000010704 | ENST00000317896 | ENSP00000313776 |
| ENSG00000010704 | ENST00000353147 | ENSP00000312342 |
| ENSG00000010704 | ENST00000352392 | ENSP00000315936 |
| ENSG00000010704 | ENST00000336625 | ENSP00000337819 |
| ENSG00000010704 | ENST00000345823 | ENSP00000344033 |
| ENSG00000010704 | ENST00000357618 | ENSP00000350238 |
| ENSG00000010704 | ENST00000317880 | ENSP00000313489 |
+-----------------+-----------------+-----------------+
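The query on slide 7 joins six tables: gene, transcript and translation hold the internal numeric keys, and each has a companion *_stable_id table that maps the key to the public ENSG/ENST/ENSP identifier. To try the join logic without a server connection, here is a sketch using Python's built-in sqlite3 module. The table definitions are cut down to the columns the query uses (the real core schema has many more), the internal integer IDs are invented, and only the first two transcripts from the result are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the core tables: only the join columns are kept.
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY);
CREATE TABLE transcript (transcript_id INTEGER PRIMARY KEY, gene_id INTEGER);
CREATE TABLE translation (translation_id INTEGER PRIMARY KEY, transcript_id INTEGER);
CREATE TABLE gene_stable_id (gene_id INTEGER, stable_id TEXT);
CREATE TABLE transcript_stable_id (transcript_id INTEGER, stable_id TEXT);
CREATE TABLE translation_stable_id (translation_id INTEGER, stable_id TEXT);

-- Internal integer IDs are invented; stable IDs are taken from the result slide.
INSERT INTO gene VALUES (1);
INSERT INTO gene_stable_id VALUES (1, 'ENSG00000010704');
INSERT INTO transcript VALUES (10, 1), (11, 1);
INSERT INTO transcript_stable_id VALUES (10, 'ENST00000309234'), (11, 'ENST00000349999');
INSERT INTO translation VALUES (100, 10), (101, 11);
INSERT INTO translation_stable_id VALUES (100, 'ENSP00000311698'), (101, 'ENSP00000259699');
""")

# Same join as on the slide, with the gene ID passed as a parameter and an
# ORDER BY added so the row order is deterministic.
QUERY = """
SELECT gene_stable_id.stable_id AS gene,
       transcript_stable_id.stable_id AS transcript,
       translation_stable_id.stable_id AS peptide
FROM gene, transcript, translation,
     gene_stable_id, transcript_stable_id, translation_stable_id
WHERE gene.gene_id = transcript.gene_id
  AND transcript.transcript_id = translation.transcript_id
  AND gene_stable_id.gene_id = gene.gene_id
  AND transcript_stable_id.transcript_id = transcript.transcript_id
  AND translation_stable_id.translation_id = translation.translation_id
  AND gene_stable_id.stable_id = ?
ORDER BY transcript_stable_id.stable_id
"""

rows = conn.execute(QUERY, ("ENSG00000010704",)).fetchall()
for row in rows:
    print("\t".join(row))
```

Because the joins are inner joins, a transcript without a row in translation would drop out of the result entirely.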
9 Perl API
- API: Application Programming Interface
- Needed:
  - BioPerl modules
  - Ensembl modules
  - Ability to code in Perl
- For more information (installation instructions, tutorials, documentation etc.):
  - http://www.ensembl.org/info/using/api/index.html
10 Perl API
Retrieve Ensembl Transcript and Peptide IDs for ENSG00000010704

#!/usr/local/ensembl/bin/perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Connect to the public server and load the available databases
my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);

# Fetch the gene and all of its transcripts
my $gene_adaptor = $reg->get_adaptor("human", "core", "Gene");
my $gene = $gene_adaptor->fetch_by_stable_id('ENSG00000010704');
my @transcripts = @{ $gene->get_all_Transcripts() };

# Print one tab-separated row per transcript
print "Gene\t\tTranscript\tPeptide\n";
foreach my $transcript (@transcripts) {
    print $gene->stable_id, "\t", $transcript->stable_id, "\t",
          $transcript->translation->stable_id, "\n";
}
11 Perl API
Result:
Gene            Transcript      Peptide
ENSG00000010704 ENST00000309234 ENSP00000311698
ENSG00000010704 ENST00000349999 ENSP00000259699
ENSG00000010704 ENST00000317896 ENSP00000313776
ENSG00000010704 ENST00000353147 ENSP00000312342
ENSG00000010704 ENST00000352392 ENSP00000315936
ENSG00000010704 ENST00000336625 ENSP00000337819
ENSG00000010704 ENST00000345823 ENSP00000344033
ENSG00000010704 ENST00000357618 ENSP00000350238
ENSG00000010704 ENST00000317880 ENSP00000313489
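One caveat about the script on slide 10: it calls stable_id on the translation of every transcript. That works for this gene because all nine transcripts have a peptide, but in the Ensembl API a transcript without a translation (a non-coding transcript) returns an undefined value there, and the call would fail. The guard can be sketched with plain Python stand-ins; the classes below are hypothetical and are not the real API, and the second transcript ID is invented to show the non-coding case.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Translation:
    stable_id: str


@dataclass
class Transcript:
    stable_id: str
    translation: Optional[Translation] = None  # None models a non-coding transcript


@dataclass
class Gene:
    stable_id: str
    transcripts: List[Transcript] = field(default_factory=list)


def id_rows(gene):
    """Return (gene, transcript, peptide) ID tuples, with '-' where no peptide exists."""
    rows = []
    for transcript in gene.transcripts:
        peptide = transcript.translation.stable_id if transcript.translation else "-"
        rows.append((gene.stable_id, transcript.stable_id, peptide))
    return rows


gene = Gene("ENSG00000010704", [
    Transcript("ENST00000309234", Translation("ENSP00000311698")),
    Transcript("ENST_HYPOTHETICAL_NONCODING"),  # invented ID, no translation
])
rows = id_rows(gene)
for row in rows:
    print("\t".join(row))
```

In the Perl script the equivalent check would be a test that the translation is defined before asking for its stable ID.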
12 Q & A
Questions & Answers