The Distributed Annotation System - PowerPoint PPT Presentation

Provided by: jeremysi

1
The Distributed Annotation System
  • Dowell, R.D., Jokerst, R.M., Day, A., Eddy, S.R.,
    and Stein, L. (2001)

2
DAS
  • Distributed Annotation System (DAS) is an effort
    to share annotation amongst multiple groups.
  • An XML specification is used to enforce
    compatibility.
  • A byproduct of the NIH/NHGRI C. elegans genome
    project.

3
Ideal Annotation System
  • Should give individual experts the ability to
    contribute to the collective annotation.
  • They should have complete control over their
    annotations.
  • These annotations should not need approval from a
    central authority.

4
Problems with Centralization
  • Centralized repositories such as GenBank act
    primarily as storage databases for sequence.
  • Adding annotation to these systems is difficult
    and in some cases nearly impossible.

5
Curation
  • Databases have emerged to act as curators for
    particular communities:
      • Swiss-Prot
      • RefSeq
      • WormPD
      • ACeDB

6
Curation (continued)
  • Individuals submit annotation to a central
    curation group.
  • Approved annotations are included in official
    releases.
  • The limited number of curators creates a
    bottleneck.

7
The DAS Solution
  • DAS allows experts to post annotations directly.
  • Rather than being submitted to a centralized group
    for consideration, an annotation is added to one
    of many third-party servers.
  • The client is then responsible for contacting the
    servers and retrieving the annotations.

8
Implementation
  • 1 Reference Server
      • DAS relies on a common reference sequence.
  • 1 or more Annotation Servers
      • Each is responsible for a list of annotations
        across a defined region of the genome.
  • 1 Client
      • Allows the user to request annotations from
        specific servers across a specific reference
        sequence.
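
The division of roles above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the DAS protocol itself: the class and function names (Annotation, AnnotationServer, client_request) are hypothetical, and the shared coordinate system stands in for the reference server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    source: str   # name of the annotation server that supplied it
    start: int    # 1-based, inclusive, in reference coordinates
    stop: int
    label: str


class AnnotationServer:
    """Hypothetical in-memory annotation server for one reference sequence."""

    def __init__(self, name, annotations):
        self.name = name
        self._annotations = list(annotations)

    def features(self, start, stop):
        # Return every annotation that overlaps the requested region.
        return [a for a in self._annotations
                if a.start <= stop and a.stop >= start]


def client_request(servers, start, stop):
    """Ask each chosen server for the same region and pool the answers."""
    pooled = []
    for server in servers:
        pooled.extend(server.features(start, stop))
    return pooled


genes = AnnotationServer("genes", [
    Annotation("genes", 100, 500, "gene-A"),
    Annotation("genes", 900, 1200, "gene-B"),
])
repeats = AnnotationServer("repeats", [
    Annotation("repeats", 450, 600, "repeat-1"),
])

hits = client_request([genes, repeats], start=400, stop=800)
print([a.label for a in hits])  # ['gene-A', 'repeat-1']
```

The servers never talk to each other. Agreement on reference coordinates is the only thing that lets the client pool their answers.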

9
Reference Server
  • Responsible for supplying the reference map and
    the reference sequence.
  • Reference Map
      • The entry point is the top-level structure.
      • Entry points may be entire chromosomes, a
        collection of contigs, or even individual
        clones.
      • Substructure is defined by start and stop
        points within the entry point.
      • Substructure is recursive.
  • Reference Sequence
      • The underlying DNA.
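
Because substructure is recursive and each level is placed by start and stop points within its parent, a position inside a nested segment can be translated to entry-point coordinates by walking back up the hierarchy. The sketch below assumes 1-based inclusive coordinates. The Segment class, the to_entry_point helper, and the segment names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    name: str
    start: int   # 1-based start within the parent segment
    stop: int    # inclusive stop within the parent segment
    children: List["Segment"] = field(default_factory=list)


def to_entry_point(path, position):
    """Translate a 1-based position inside the last segment of `path`
    into entry-point coordinates.

    `path` lists segments from the entry point down to the segment that
    `position` refers to; path[0] is the entry point itself.
    """
    for segment in reversed(path[1:]):
        # Shift by the segment's offset inside its parent.
        position = segment.start + position - 1
    return position


clone = Segment("clone-1", start=201, stop=700)
contig = Segment("contig-1", start=1001, stop=5000, children=[clone])
chromosome = Segment("chr-I", start=1, stop=100000, children=[contig])

# Base 50 of the clone is base 250 of the contig and base 1250 of the chromosome.
print(to_entry_point([chromosome, contig, clone], 50))  # 1250
```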

10
Annotation Server
  • Responsible for supplying annotations for a
    specified region of the reference sequence.
  • Annotations are tied to regions of the genome by
    start and stop positions relative to the
    reference sequence.

11
Annotations
  • The details of an annotation are kept deliberately
    generic, following the General Feature Format
    (GFF).
  • Annotators are encouraged to provide links to a
    more detailed description.
  • An annotation can be associated with a glyph
    which describes how it should be displayed to the
    user.

12
Annotations (continued)
  • DAS does not impose a controlled vocabulary for
    annotation descriptions.
  • Each annotation does, however, have three
    attributes:
      • Type: biological significance
        (e.g. tRNA, snoRNA, miscRNA)
      • Method: how the annotation was discovered
        (e.g. specific software or typing methods)
      • Category: functional category
        (e.g. homology, variation, transcribed)
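
To make the three attributes concrete, here is a sketch that reads type, method, and category from a small XML fragment using Python's standard library. The XML layout is an assumption made for illustration and is not copied from the DAS specification. The feature ids, coordinates, and the method name "example-predictor" are invented.

```python
import xml.etree.ElementTree as ET

# Illustrative XML only: element and attribute names are assumptions,
# not the official DAS response format.
SAMPLE = """
<FEATURES>
  <FEATURE id="f1" start="1200" stop="1272">
    <TYPE category="transcribed">tRNA</TYPE>
    <METHOD>example-predictor</METHOD>
  </FEATURE>
  <FEATURE id="f2" start="5400" stop="5531">
    <TYPE category="transcribed">snoRNA</TYPE>
    <METHOD>example-predictor</METHOD>
  </FEATURE>
</FEATURES>
"""


def parse_features(xml_text):
    """Return one dict per FEATURE with its position and three attributes."""
    root = ET.fromstring(xml_text.strip())
    features = []
    for node in root.findall("FEATURE"):
        type_node = node.find("TYPE")
        features.append({
            "id": node.get("id"),
            "start": int(node.get("start")),
            "stop": int(node.get("stop")),
            "type": type_node.text,                 # biological significance
            "category": type_node.get("category"),  # functional category
            "method": node.findtext("METHOD"),      # how it was discovered
        })
    return features


features = parse_features(SAMPLE)
for f in features:
    print(f["id"], f["type"], f["category"], f["method"])
```

Since no controlled vocabulary is imposed, the parser treats all three attributes as free text and does not validate them against a fixed list.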

13
Client
  • The client is responsible for allowing the user
    to examine a particular part of the genome for
    annotations located on multiple third party
    servers.
  • Annotations are combined by position only, so
    semantic contradictions between them are not
    detected or resolved.
  • The current tools are best suited to visualizing
    and indexing annotations.
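
A small sketch of what position-only merging means in practice: annotations from different servers are pooled and ordered by coordinates, and two servers that disagree about the same region both appear in the result. The tuple layout, the merge_by_position name, and the server names are hypothetical.

```python
def merge_by_position(*server_results):
    """Pool (start, stop, type, source) tuples and order them by position.

    No attempt is made to reconcile annotations that disagree.
    """
    pooled = [a for result in server_results for a in result]
    return sorted(pooled, key=lambda a: (a[0], a[1]))


server_a = [(100, 400, "tRNA", "server-a")]
server_b = [(100, 400, "snoRNA", "server-b"),
            (50, 80, "miscRNA", "server-b")]

view = merge_by_position(server_a, server_b)
for start, stop, kind, source in view:
    print(f"{start}-{stop}\t{kind}\t{source}")
```

Both the tRNA and the snoRNA call for positions 100 to 400 survive the merge. Deciding between them is left to the person looking at the display.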

14
Why XML?
  • XML is a strongly typed, extensible data exchange
    format.
  • This comes at the cost of non-trivial bandwidth
    demands.
  • A user can easily request more data than their
    connection can reasonably handle.
  • This is why very little information about each
    annotation is returned.
  • Data compression may resolve this in the future.
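
The bandwidth concern and the appeal of compression can be illustrated by building a synthetic XML payload and comparing its size before and after gzip. The XML layout and the "example-method" value are invented for the demonstration, and highly repetitive synthetic data compresses better than real annotation data would, so the printed sizes are only indicative.

```python
import gzip


def feature_xml(i):
    # Synthetic feature element; the layout is an assumption for illustration.
    return ('<FEATURE id="f%d" start="%d" stop="%d">'
            '<TYPE category="transcribed">tRNA</TYPE>'
            '<METHOD>example-method</METHOD></FEATURE>'
            % (i, i * 100, i * 100 + 72))


document = ("<FEATURES>"
            + "".join(feature_xml(i) for i in range(10000))
            + "</FEATURES>")

raw = document.encode("utf-8")
compressed = gzip.compress(raw)

print("raw bytes:       ", len(raw))
print("compressed bytes:", len(compressed))
print("round trip ok:   ", gzip.decompress(compressed) == raw)
```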

15
The Future of DAS
  • As of 2001, when this paper was published,
    Ensembl's database of human genome annotations
    supports DAS.
  • As of 2006, DAS servers are running at WormBase,
    FlyBase, Ensembl, TIGR, and UCSC.
  • Biodas.org