Image Bioinformatics - PowerPoint PPT Presentation

1
Image Bioinformatics
  • Image semantics in life sciences research

Graham Klyne
Image Bioinformatics Research Group of the Oxford e-Science Centre
Department of Zoology, University of Oxford, UK
2
Outline
  • Introduction - Image Bioinformatics Research
    Group
  • BioImage database - key technologies
  • Drosophila Testis Gene Expression Database
    Project
  • Future plans and aspirations
  • Round-up and discussion

3
Introduction: Image Bioinformatics Research Group
  • We are David Shotton, Chris Catton, Graham
    Klyne and Liz Mellings, based in the Zoology
    Department at Oxford
  • Backgrounds in cell biology, microscopy, animal
    behaviour, video, database design, ontology,
    Internet and Web standards, Semantic Web, data
    curation
  • Part of the Oxford e-Science Centre
  • Drawing on expertise and standards in biology,
    computing and ontologies, applied to
    life-sciences research

4
Image Bioinformatics
  • Images (and other media) are fundamental records
    in bioscience
  • Vast amounts of raw data, not readily amenable to
    automatic interpretation or indexing
  • Acquisition is often costly and time-consuming;
    metadata increases value
  • Only summaries can be published in traditional
    journals
  • Increasingly, bioinformatics research is in
    silico, mining data from diverse online sources
  • Alternative routes to publication are needed for
    research data

5
The current state of the art
  • "Concerning the image data you requested - this
    is a tough one. The image was recorded about ten
    years ago, and I never managed to write a paper
    about the work so it was never published. The
    original data (if they still exist) must be on
    some magneto-optical disk in one of many boxes in
    my flat - quite hopeless to find at short notice.
    All I can promise is that I'll look into this
    once I am back from my travels but that will
    take a few months. Whether anyone still has
    hardware capable of reading the disc is quite
    another matter! Sorry about this."
  • (anon)

6
Technical Goals
  • Storage technology is not a goal for us
  • Assembly of systems to capture and publish
    research images and metadata with associated
    high-level descriptions
  • Preserving the association between raw data and
    high-level descriptions
  • Provide access to data in terms of research
    domain concepts
  • Combine research image data with other online
    resources (gene databases, literature databases,
    etc.)
  • Web-style interoperability and evolvability

7
The BioImage Project
  • Images are semantic instruments for capturing
    aspects of the real world, and form a vital part
    of the scientific record, for which words are no
    substitute
  • In the post-genomic world, attention is now
    focused on the organization and integration of
    information within cells, for functional analyses
    of gene products
  • In a month a single active cell biology lab may
    generate between 10 and 100 Gbytes of
    multidimensional image data

8
So we built a database
9
Key Technologies
  • Ontologies
      • "An ontology is a formal, explicit
        specification of a shared conceptualisation"
        (Studer 1998, after Gruber)
      • Controlled vocabulary for expressing high-level
        biological concepts (BioImage uses this to
        construct a user interface)
      • Formal constraints (based on Description Logics)
        capture elements of biological domain knowledge
      • Inference can confirm existing knowledge and
        suggest new facts
  • Semantic Web
      • RDF provides a standard format and formal
        semantics for exchanging ground facts
      • OWL is a standard for ontology definitions
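
As a rough illustration of the two ideas on this slide (ground facts exchanged as RDF-style triples, and inference suggesting new facts), here is a minimal Python sketch. It uses plain tuples rather than a real RDF toolkit such as Jena. All `ex:` names and the `infer_types` helper are hypothetical and are not taken from the BioImage ontology.

```python
# Ground facts as (subject, predicate, object) triples.
# Every "ex:" identifier below is made up for illustration.
FACTS = {
    ("ex:image42", "rdf:type", "ex:InSituHybridizationImage"),
    ("ex:InSituHybridizationImage", "rdfs:subClassOf", "ex:MicroscopyImage"),
    ("ex:MicroscopyImage", "rdfs:subClassOf", "ex:Image"),
}


def infer_types(facts):
    """Return the facts plus every rdf:type triple implied by rdfs:subClassOf."""
    closed = set(facts)
    changed = True
    while changed:  # repeat until a pass produces no new triple
        changed = False
        for (s, p, o) in list(closed):
            if p != "rdf:type":
                continue
            for (cls, q, parent) in list(closed):
                if q == "rdfs:subClassOf" and cls == o:
                    new = (s, "rdf:type", parent)
                    if new not in closed:
                        closed.add(new)
                        changed = True
    return closed


# Triples that were not stated explicitly but follow from the class hierarchy.
INFERRED = infer_types(FACTS) - FACTS
```

Here `INFERRED` holds the two extra `rdf:type` triples for `ex:image42`. A Description Logic reasoner does far more than this, but the principle of deriving unstated facts from stated ones is the same.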

10
Other Technologies
  • Java, Jena Semantic Web toolkit
  • PostgreSQL
  • Apache Tomcat, Java Servlets, Struts
  • XML, XSLT, STXX, SiteMesh
  • Protégé ontology builder, inference systems
  • Agile development, JUnit, Cactus, etc.
  • Also applied to information design
  • etc.

11
BioImage overview
12
Drosophila Testis Gene Expression Database
(DTGED) Project
  • Research the function of genes whose expression
    is dependent on specific (aly-class) proteins
  • PIs: Dr Helen White-Cooper and Dr David Shotton
  • We are working closely with the DTGED research
    team based in the Zoology Department, University
    of Oxford
  • Genes code for the production of complex
    chemicals (enzymes, proteins, etc.) used in
    biological processes
  • But the expression of any gene is dependent on
    the cell environment, including the presence of
    other gene products
  • Observable biological consequences (phenotypes)
    may result from subtle interactions between many
    gene products and other factors
  • This project aims to document such interactions
    in Drosophila (fruit fly) spermatogenesis

13
(No Transcript)
14
(No Transcript)
15
Images of Expression Patterns
  • To an expert observer these images clearly show
    gene expression at different stages of
    spermatogenesis
  • Each image corresponds to a different combination
    of gene and strain of Drosophila
  • These in situ hybridization images are the end
    game - the final stage of a non-trivial process
    of screening and preparation
  • Reproducibility and interpretation require that
    the preparatory steps are recorded along with the
    images

Image panels: CG2247 wt, CG2247 topi, CG12907 aly, CG12907 topi
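
The slide's point that each image is one gene/strain combination, and that preparatory steps should travel with the image, can be sketched as a small record type. This is only an illustrative Python sketch: the `ExpressionImage` class, its field names and the placeholder preparation steps are assumptions, not the DTGED schema. The gene and strain labels are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ExpressionImage:
    gene: str                          # gene label as shown on the slide
    strain: str                        # strain label as shown on the slide
    preparation: Tuple[str, ...] = ()  # ordered preparatory steps (free text)


def strains_by_gene(images: List[ExpressionImage]) -> Dict[str, List[str]]:
    """Group strain labels under the gene they were imaged for."""
    grouped: Dict[str, List[str]] = {}
    for image in images:
        grouped.setdefault(image.gene, []).append(image.strain)
    return grouped


# The preparation steps here are placeholders, not real protocol steps.
IMAGES = [
    ExpressionImage("CG2247", "wt", ("placeholder step 1", "placeholder step 2")),
    ExpressionImage("CG2247", "topi"),
    ExpressionImage("CG12907", "aly"),
    ExpressionImage("CG12907", "topi"),
]
GROUPED = strains_by_gene(IMAGES)
```

Keeping the preparation steps inside the record means an image cannot be published without the context needed to reproduce or interpret it.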
16
So how is it done? (1)
17
So how is it done? (2)
18
DTGED Experimental Data Flows
19
DTGED Technologies Used
  • Minimalist approach to development: working with
    available web-based tools, etc.
  • Protégé and Racer (DL reasoner) for design and
    testing of ontologies for experimental data
      • Note that expert annotations are open-ended
  • BioImage for ontology-directed capture and
    staging of annotations and observations
      • Extends original purpose of BioImage
  • Open Microscopy Environment (OME) for capture and
    staging of images and image metadata
  • Haskell for conditioning Excel spreadsheet data
    and combining it with other data sources
  • BioImage for publication of images and metadata
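
To give a flavour of what "conditioning" spreadsheet data and combining it with another source might involve, here is a minimal sketch. The project used Haskell for this step; the sketch below is in Python and is not the project's code. The column names, the sample CSV export, the `OTHER_SOURCE` lookup table and both helper functions are all hypothetical.

```python
import csv
import io

# A made-up CSV export with typical spreadsheet untidiness:
# stray spaces, inconsistent case, a blank row and an empty cell.
RAW_EXPORT = "Gene , Strain,Note\n cg2247 ,wt, placeholder note\n\nCG12907,topi,\n"

# A made-up second data source keyed by gene identifier.
OTHER_SOURCE = {"CG2247": "ex:external-record-1"}


def condition(text):
    """Normalise headers and cells, drop blank rows, map empty notes to None."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip().lower() for name in next(reader)]
    rows = []
    for cells in reader:
        cells = [cell.strip() for cell in cells]
        if not any(cells):  # skip rows that are entirely empty
            continue
        row = dict(zip(header, cells))
        row["gene"] = row["gene"].upper()
        row["note"] = row["note"] or None
        rows.append(row)
    return rows


def combine(rows, other):
    """Attach the matching external record (or None) to each row."""
    return [dict(row, external=other.get(row["gene"])) for row in rows]


COMBINED = combine(condition(RAW_EXPORT), OTHER_SOURCE)
```

Rows with no match in the second source are kept, with `external` set to `None`, rather than dropped.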

20
Future Work and Aspirations
  • OntoImage
  • Kaleidoscope
  • V-Lab
  • The Ontogenesis Network - Evolving Community
    Ontologies
  • Standard Animal Behaviour Ontology (SABO)
  • Feedback to open standards communities

21
OntoImage
  • Extend BioImage with enhanced queries
  • Incorporate knowledge from external ontologies
  • Cross-reference data from multiple sources
  • Composite query planning
  • Multi-faceted query presentation (Kaleidoscope)
  • Additional data curation
      • Plants: A Prototype Arabidopsis Image Database
      • Animal behaviour: A Video Collection of Mouse
        Behaviours
      • Mammals: Genes of the Mammalian Secretory Pathway

22
Kaleidoscope
  • Interface and faceted presentation for queries
    over multi-source data
  • (Former working name: ImageBLAST)
  • Early proposed interface design illustrations
    follow
      • Front page
      • Hypersearch entry form
      • Search results - Drosophila gene database
        (FlyBase)
      • Search results - BioImage
      • Search results - Protein folding simulations
        (PDB)
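
One way to picture faceted presentation over multi-source results is to group a flat hit list by a chosen facet. The Python sketch below is only an assumption about how that could look, not the Kaleidoscope design. The hit records, their fields and the `facet` helper are hypothetical; only the source names come from the slide.

```python
from collections import defaultdict

# Made-up hits from several sources; ids and kinds are placeholders.
HITS = [
    {"source": "FlyBase", "kind": "gene", "id": "hit-1"},
    {"source": "BioImage", "kind": "image", "id": "hit-2"},
    {"source": "BioImage", "kind": "image", "id": "hit-3"},
    {"source": "PDB", "kind": "structure", "id": "hit-4"},
]


def facet(hits, key):
    """Group hit ids by the value of one facet, preserving input order."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[key]].append(hit["id"])
    return dict(groups)


BY_SOURCE = facet(HITS, "source")
BY_KIND = facet(HITS, "kind")
```

The same hit list can be regrouped by any field, which is what lets one query be shown from several viewpoints.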

23
Kaleidoscope - proposed front page
24
Kaleidoscope - hypersearch interface
25
Kaleidoscope - search results
26
Kaleidoscope - search results
27
Kaleidoscope - search results
28
V-Lab
  • Collaborative video annotation
  • Biological applications include animal behaviour,
    bacteriology, embryonic cell division; also,
    possible applications in medicine, sports,
    psychology, education, ergonomics...
  • Capture expert observations about video contents
  • Attach annotations to specific frames and/or
    regions of a video sequence
  • Software agents as assistants
      • e.g. motion tracking
  • Publication via BioImage
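
Attaching annotations to specific frames and/or regions suggests a record with a frame range and an optional rectangle. The following Python sketch is one possible shape for such a record, under stated assumptions. The `Annotation` class, its fields, the sample annotations and the `annotations_at` helper are hypothetical and do not describe the V-Lab data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Annotation:
    author: str       # a human observer or a software agent
    text: str         # the observation itself
    first_frame: int  # first frame the annotation applies to (inclusive)
    last_frame: int   # last frame the annotation applies to (inclusive)
    # Optional region as (x, y, width, height); None means the whole frame.
    region: Optional[Tuple[int, int, int, int]] = None


def annotations_at(annotations: List[Annotation], frame: int) -> List[Annotation]:
    """Return the annotations whose frame range covers the given frame."""
    return [a for a in annotations if a.first_frame <= frame <= a.last_frame]


# Placeholder annotations: one from a person, one from a tracking agent.
ANNOTATIONS = [
    Annotation("observer-1", "placeholder observation", 10, 20),
    Annotation("tracker-agent", "placeholder track", 15, 30, (5, 5, 40, 40)),
]
```

Treating human observers and software agents as the same kind of author keeps collaborative and automated annotations in one store.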

29
Evolving Ontologies: The Ontogenesis network
  • Currently with participants from Oxford,
    Manchester, Cambridge and FreshwaterLife
    (Cumbria)
  • Engineering usable ontologies
  • Dealing with advances in domain knowledge
  • Re-evaluating old data in light of new knowledge
  • Reasoning with provenance of information
  • Resolving semantic conflicts

30
Summary
  • Image Bioinformatics deploys a collection of
    tools for annotation and publication of
    multidimensional image data
  • We aim to assemble a diverse open-source toolkit,
    using existing components as much as possible
  • Semantic rigour is needed for interoperable
    capture of expert observations
  • Information requirements are open-ended;
    extensibility and evolvability are key goals
  • Information design as much as software design
  • We are a small, application-focused group seeking
    to work with appropriate technical expert
    collaborators

31
Questions, discussion
http://bioimage.ontonet.org/moin/IbrgPresentations?action=AttachFile&do=get&target=20050727-IBRG-presentation.ppt
32
Other notes
  • Why RDF?
      • "Missing isn't broken" (Brickley)
      • Aggregation is free
      • Evolvability
      • Formal semantic framework
          • basis for meaning- (or truth-) preserving
            transformations
      • "A little inference goes a long way" (Hendler)
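
Two of these points, "aggregation is free" and "missing isn't broken", can be illustrated with triples held in plain Python sets. This is a sketch under stated assumptions, not an RDF implementation. The `ex:` identifiers, both source graphs and the `objects` helper are hypothetical.

```python
# Two independently produced sets of triples about overlapping subjects.
# All identifiers are made up for illustration.
SOURCE_A = {
    ("ex:image42", "ex:depicts", "ex:geneCG2247"),
}
SOURCE_B = {
    ("ex:image42", "ex:capturedBy", "ex:microscope1"),
    ("ex:image43", "ex:depicts", "ex:geneCG12907"),
}

# Aggregation is free: merging two graphs is just set union.
MERGED = SOURCE_A | SOURCE_B


def objects(graph, subject, predicate):
    """Return all objects for a subject/predicate pair, sorted for stability."""
    return sorted(o for (s, p, o) in graph if s == subject and p == predicate)


# Missing isn't broken: image43 has no capturedBy statement, so the query
# returns an empty list instead of failing.
KNOWN = objects(MERGED, "ex:image42", "ex:capturedBy")
UNKNOWN = objects(MERGED, "ex:image43", "ex:capturedBy")
```

Neither source needed to know about the other's properties for the merge to work, which is also what makes the model easy to evolve.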

33
BioImage key features
  • Concepts vs keywords
      • abstract away from representation
  • Model-View-Controller architecture
      • separate data model from presentation and
        processing logic
  • Link to separately defined domain knowledge
      • Gene Ontology (GO), Microarray (MGED), NCBI
        taxonomy, etc.
  • Evolution
      • adopting new ontologies
      • re-purposing old data
  • Truth-preserving aggregation
  • Ontology-guided user interface
      • for submission and query
  • Image and non-image data
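
An ontology-guided user interface can be pictured as deriving form fields from class definitions instead of hard-coding them. The Python sketch below shows that idea only; it is not how BioImage builds its interface. The `ONTOLOGY` dictionary, its class and property names and the `form_fields` helper are hypothetical.

```python
# A toy class hierarchy: each class names its parent and its own properties.
# Class and property names are made up for illustration.
ONTOLOGY = {
    "ex:Image": {"parent": None, "properties": ["title", "creator"]},
    "ex:MicroscopyImage": {"parent": "ex:Image", "properties": ["magnification"]},
}


def form_fields(cls):
    """Collect the fields for a class, inherited properties first."""
    fields = []
    while cls is not None:
        fields = ONTOLOGY[cls]["properties"] + fields
        cls = ONTOLOGY[cls]["parent"]
    return fields


SUBMISSION_FIELDS = form_fields("ex:MicroscopyImage")
```

With this approach, adopting a new ontology class changes the submission and query forms without changing the interface code.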