Open Source Solutions for Tissue Banking Informatics

1
Open Source Solutions for Tissue Banking
Informatics
  • Jules J. Berman, Ph.D., M.D.
  • INFORMATICS FOR REPOSITORIES
  • Wednesday, May 21, 2008
  • 3:30 pm to 4:05 pm

2
Approaches to finding open source solutions
  • 1. Generalize (don't specialize). Wherever
    possible, don't think of your tissue repository
    problems as being unique. Try to think of your
    problems as instances of very general informatics
    problems.
  • In most cases, the same open source solutions
    that work for bioinformaticians, astronomers, and
    factory inventories will likely work for you.

3
Approaches to finding open source solutions
  • 2. Learn a popular open source programming
    language that is easy to learn and that is
    supplemented by an enthusiastic biomedical
    community
  • Perl
  • Python
  • Ruby

4
Approaches to finding open source solutions
  • 3. Use open source, unencumbered nomenclatures,
    codes, and syntactic formats. Otherwise, you can't
    share or post data through the web.
  • MeSH (standard, open source, free)
  • UMLS (standard, encumbered)
  • SNOMED (standard, encumbered)
  • Neoplasm Classification (non-standard, open
    source, free, standard syntax: XML, RDF)
  • http://www.julesberman.info/

5
Approaches to finding open source solutions
  • 4. Use an open source and general data syntax
  • HTML (formatting and linking)
  • XML (describing data)
  • RDF (getting meaning from described data)

7
All data can be specified using RDF, developed by
the W3C. RDF files are collections of
statements expressed as data triples:
<identified subject> <metadata> <data>
  • <Jules Berman> <blood glucose level> <85>
  • <Mary Smith> <eye color> <brown>
  • <Samuel Rice> <eye color> <blue>
  • <Jules Berman> <eye color> <brown>
When you bind a key/value pair to a specified
object, you're moving from the realm of data
structure (i.e., XML) into the realm of data
meaning.
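The triple idea can be sketched in a few lines of Python. This is only an illustration that uses plain tuples rather than real RDF syntax; the names `triples` and `describe` are hypothetical and do not come from the slides.

```python
# Each statement is a (subject, metadata, data) triple, as on the slide.
triples = [
    ("Jules Berman", "blood glucose level", "85"),
    ("Mary Smith", "eye color", "brown"),
    ("Samuel Rice", "eye color", "blue"),
    ("Jules Berman", "eye color", "brown"),
]


def describe(subject, statements):
    """Collect every metadata/data pair bound to one subject."""
    return {meta: data for subj, meta, data in statements if subj == subject}


print(describe("Jules Berman", triples))
```

Because every key/value pair carries its subject with it, the statements can be stored in any order and still be reassembled per subject.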
8
RDF permits data to be merged between different
files
Medical file
  • <Jules Berman> <blood glucose level> <85>
  • <Mary Smith> <eye color> <brown>
  • <Samuel Rice> <eye color> <blue>
  • <Jules Berman> <eye color> <brown>
Hat file
  • <Sally Frann> <hat size> <8>
  • <Jules Berman> <hat size> <9>
  • <Fred Garfield> <hat size> <9>
  • <Fred Garfield> <hat_type> <bowler>
Merged Jules Berman database
  • <Jules Berman> <blood glucose level> <85>
  • <Jules Berman> <eye color> <brown>
  • <Jules Berman> <hat size> <9>
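A minimal sketch of the merge, assuming each file has already been read into a list of (subject, metadata, data) tuples. The function name `merge_for_subject` is hypothetical, and plain tuples stand in for real RDF here.

```python
medical_file = [
    ("Jules Berman", "blood glucose level", "85"),
    ("Mary Smith", "eye color", "brown"),
    ("Samuel Rice", "eye color", "blue"),
    ("Jules Berman", "eye color", "brown"),
]

hat_file = [
    ("Sally Frann", "hat size", "8"),
    ("Jules Berman", "hat size", "9"),
    ("Fred Garfield", "hat size", "9"),
    ("Fred Garfield", "hat_type", "bowler"),
]


def merge_for_subject(subject, *files):
    """Gather every triple about one subject from any number of files."""
    merged = []
    for statements in files:
        for triple in statements:
            # Skip other subjects and any statement already collected.
            if triple[0] == subject and triple not in merged:
                merged.append(triple)
    return merged


for triple in merge_for_subject("Jules Berman", medical_file, hat_file):
    print(triple)
```

The merge works without any shared schema between the two files: the only thing they need to agree on is how the subject is identified.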
10
Approaches...
  • 5. Use open source utilities, not software
    applications (open source or otherwise).
  • Utilities are simple programs that do one type of
    job, very well. They often work from the
    command line (i.e., no GUI).
  • Once you've mastered a dozen or so utilities, you
    can handle most informatics tasks that you'll come
    across.
  • Applications are often complex and seldom provide
    the functionality you need (now or future).

11
Approaches ...
  • 6. Learn the algorithms for your discipline.
  • Algorithms are process descriptions that work
    every time.
  • Most informatics algorithms can be implemented in
    under ten lines of software code
  • You can think of software applications as many
    algorithms working under a GUI
  • If you really understand algorithms, you can make
    important contributions to your field.
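As one illustration of how short such an algorithm can be, here is a sketch of a frequency tally, a common building block for indexing and summarizing text. The choice of algorithm, the function name `tally`, and the sample terms are all assumptions made for this example; none of them come from the slides.

```python
def tally(terms):
    """Count how many times each term occurs."""
    counts = {}
    for term in terms:
        counts[term] = counts.get(term, 0) + 1
    return counts


# Made-up sample terms, for illustration only.
sample = ["lipoma", "adenoma", "lipoma", "fibroma", "lipoma"]
for term, count in sorted(tally(sample).items(), key=lambda item: -item[1]):
    print(count, term)
```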

12
Approaches...
  • 7. De-emphasize standards.
  • Most standards are difficult to understand, and
    there are many of them, often covering obscure
    domains. Many standards are just bad.
  • Data kept in a standard today may be non-standard
    legacy data tomorrow.
  • Unlike physical standards, data standards are
    transformable (so why fuss over any one
    standard?).
  • Standards can be encumbered

14
  • Specifications are often a better solution than
    standards
  • Specifications are just descriptions of your
    data.
  • A specification requires a common language for
    describing data (so that you and your computer
    can understand what it's trying to convey).
  • Specifications give you enormous freedom to
    create and describe new and unconventional data
    objects.
  • Usually done in RDF
  • If you've specified your data well, you can port
    between standards when you need to.
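A rough sketch of what porting well-specified data can look like, assuming the data is held as (subject, metadata, data) triples. JSON and CSV stand in for the target formats here, and the function names `to_json` and `to_csv` are hypothetical.

```python
import csv
import io
import json

triples = [
    ("Jules Berman", "blood glucose level", "85"),
    ("Jules Berman", "eye color", "brown"),
    ("Jules Berman", "hat size", "9"),
]


def to_json(statements):
    """Port the triples to nested JSON: subject -> {metadata: data}."""
    records = {}
    for subject, metadata, data in statements:
        records.setdefault(subject, {})[metadata] = data
    return json.dumps(records, indent=2, sort_keys=True)


def to_csv(statements):
    """Port the same triples to three-column CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["subject", "metadata", "data"])
    writer.writerows(statements)
    return buffer.getvalue()


print(to_json(triples))
print(to_csv(triples))
```

Both outputs are generated from the same statements, so adding a third target format means writing one more small function rather than re-entering the data.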

15
  • Example: Pathology image annotation

16
  • Important descriptors of an image might include
  • File information
  • Image capture information
  • Image format information
  • Specimen information
  • Patient information
  • Pathology information
  • Region of interest information

17
  • JPEG is an image format that is used by millions
    of people in all types of professions, including
    the medical profession
  • JPEG can now be used without worrying about IP
    issues
  • You can put any information you want into the
    header of a JPEG image (including an RDF
    document) so that specified clinical/pathological
    information can be conveyed with the image
  • Because images are non-physical, it is usually
    easy to interconvert image formats
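One way to carry text inside a JPEG file is a comment (COM) segment in the file header. The sketch below is an assumption about how this could be done with the Python standard library, not the method used in the presentation. The function names, the placeholder annotation text, and the stand-in byte stream (which is not a viewable image) are all hypothetical.

```python
import struct

SOI = b"\xff\xd8"   # start-of-image marker
APP0 = b"\xff\xe0"  # JFIF application segment marker
COM = b"\xff\xfe"   # comment segment marker


def add_comment(jpeg_bytes, text):
    """Return a copy of the JPEG data with `text` stored in a COM segment."""
    if jpeg_bytes[:2] != SOI:
        raise ValueError("not a JPEG stream")
    payload = text.encode("utf-8")
    if len(payload) > 65533:
        raise ValueError("comment too long for one COM segment")
    position = 2
    # Keep a JFIF APP0 segment, if present, directly after SOI.
    if jpeg_bytes[2:4] == APP0:
        (length,) = struct.unpack(">H", jpeg_bytes[4:6])
        position = 4 + length
    # The two-byte length field counts itself plus the payload.
    segment = COM + struct.pack(">H", len(payload) + 2) + payload
    return jpeg_bytes[:position] + segment + jpeg_bytes[position:]


def read_comments(jpeg_bytes):
    """List the text of every COM segment found before the scan data."""
    comments = []
    position = 2
    while position + 4 <= len(jpeg_bytes) and jpeg_bytes[position] == 0xFF:
        marker = jpeg_bytes[position + 1]
        if marker in (0xD9, 0xDA):  # end of image or start of scan
            break
        (length,) = struct.unpack(">H", jpeg_bytes[position + 2:position + 4])
        if marker == 0xFE:
            end = position + 2 + length
            comments.append(jpeg_bytes[position + 4:end].decode("utf-8"))
        position += 2 + length
    return comments


# Stand-in stream: SOI, a 16-byte APP0 segment, then end-of-image.
stand_in = SOI + APP0 + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9) + b"\xff\xd9"
annotated = add_comment(stand_in, "placeholder annotation text")
print(read_comments(annotated))
```

With a real image, the same functions would be applied to the bytes read from the file, and the text could be an RDF document describing the specimen.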

18
By annotating our images, we can ensure that the
image conveys meaning and value. By using RDF, we
can ensure that the individual triples can be
integrated with heterogeneous data sources beyond
those of images. By using pre-existing
international general standards for describing
any kind of data, we attain interoperability and
avoid the confusion and complexity that occurs
whenever a new standard is created. See
http://www.julesberman.info/spec2img.htm
23
  • Would you like to write a Tissue Repository/Tissue
    Informatics book?
  • jjberman_at_alum.mit.edu