Structured Terminologies and Ontologies in the Biological Domain: SDD

Provided by: PBryanH

1
Structured Terminologies and Ontologies in the
Biological Domain: SDD
  • Bob Morris
  • P. Bryan Heidorn

2
Structure of Descriptive Data (SDD)
  • Taxonomic Databases Working Group
  • International Union of Biological Sciences
  • Support from NSF, the Gordon and Betty Moore
    Foundation, and the Institute of Museum and
    Library Services

3
Structure of Descriptive Data
  • Started in 1998 at the TDWG meeting in Reading.
  • At Harvard in October 1999, it was agreed that the
    subgroup should attempt to analyze the
    requirements for a new interoperability standard
    for descriptive data.
  • Version 1.0 approved as a standard in 2005 after
    the St. Petersburg meeting.

4
SDD
  • The metaformat for the standard will be based on
    XML and XML Schema.
  • It is hoped that this standard will gain
    universal recognition and eventually become a
    successor to existing standards like DELTA,
    NEXUS, or XDF.

5
Unified Biosciences Information Framework (UBIF)
  • UBIF is an attempt to define a common foundation
    for several TDWG/GBIF standards like SDD (see SDD
    WIKI), ABCD (see ABCD content schema homepage) or
    TaxonConceptNames (see Taxonomic Concept Transfer
    Schema WIKI).

6
SDD Creation
  • Lucid Export/Import
  • Electronic Field Guide Project
  • OpenKey (converted by EFG)
  • Transformation of Legacy Text
  • AMNH Legacy Lit Project (PartsCompositionHierarchy)
  • UBIO - LIF format to SDD evaluation

7
SDD XML Sample Overview
8
Main Elements of SDD
  • TechnicalMetadata
  • Metadata
  • TaxonNames
  • ClassHierarchy
  • Specimens
  • Agents
  • Publications
  • Geography
  • MediaResources
  • MeasurementUnits
  • Audience
  • Others
  • Descriptions

9
Descriptions
  • DescriptiveTerminology
  • NaturalLanguageDescriptions
  • CodedDescriptions
  • IdentificationKeys
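
As a rough illustration of how the element lists on slides 8 and 9 fit together, the sketch below builds an empty XML skeleton with Python's standard library. The element names come from the slides (with spellings normalized); the root tag `Dataset` and the flat nesting are assumptions made for illustration and have not been checked against the SDD schema.

```python
import xml.etree.ElementTree as ET

# Section names as listed on slide 8. "Others" is a placeholder on the
# slide, not an element name, so it is left out here.
MAIN_ELEMENTS = [
    "TechnicalMetadata", "Metadata", "TaxonNames", "ClassHierarchy",
    "Specimens", "Agents", "Publications", "Geography",
    "MediaResources", "MeasurementUnits", "Audience", "Descriptions",
]

# Children of Descriptions as listed on slide 9.
DESCRIPTION_ELEMENTS = [
    "DescriptiveTerminology", "NaturalLanguageDescriptions",
    "CodedDescriptions", "IdentificationKeys",
]


def build_skeleton(root_tag="Dataset"):
    """Build an empty document skeleton.

    The root tag is a hypothetical name, not taken from the SDD schema.
    """
    root = ET.Element(root_tag)
    for name in MAIN_ELEMENTS:
        section = ET.SubElement(root, name)
        if name == "Descriptions":
            for child in DESCRIPTION_ELEMENTS:
                ET.SubElement(section, child)
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

A real SDD document would carry namespaces, attributes, and content inside each of these sections; the sketch only shows the containment implied by the two slides.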

10
Two Examples
  • EFG: Butterflies (designed by field naturalists)
  • OpenKey: Prairie plants and trees with shared
    terminologies (content by taxonomists)

11
Ithomid Butterflies: Godyris zygia (male dorsal)
http://www.cs.umb.edu/whaber/Monte/Ithomid/Imf/Gody-zava-md-550.jpg
12
Character matrix
13
Coded Descriptions
  • zavaleta"/
  • Male
  • CategoricalData.

debugref: for humans only; added programmatically
This is a reference to the taxon namespace
14
Flashback to TaxonName section
  • zavaleta
  • sp
  • .
  • Godyris zavaleta
  • sp

Identification
15
Character matrix
16
Categorical Data
Again from the Terminology
  • wing ventral with line of spots"
  • OrSet
  • along margin)"/
  • .

One of several possible STATES
17
Character Definition
  • margin of hind wing ventral with line of
    spots
  • debugkey="Yes"/
  • debugkey="No"/

Character ID
Legal States Reference
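
The relationship between a character definition, its legal states, and a coded description that refers to those states by ID can be sketched roughly as follows. This is a minimal illustration, not the SDD data model: the class names and the IDs `c1`, `s1`, and `s2` are hypothetical, and only the character label and the Yes/No states come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    """A categorical character with its legal states (state ID -> label)."""
    char_id: str
    label: str
    states: dict


@dataclass
class CodedDescription:
    """Scores for one taxon: character ID -> set of state IDs (an 'or' set)."""
    taxon: str
    scores: dict = field(default_factory=dict)

    def score(self, character, state_ids):
        # Reject any state that the character definition does not declare.
        unknown = set(state_ids) - set(character.states)
        if unknown:
            raise ValueError(
                f"illegal states for {character.char_id}: {sorted(unknown)}"
            )
        self.scores[character.char_id] = set(state_ids)


# Hypothetical IDs; the label and states are taken from slide 17.
spots = Character(
    "c1",
    "margin of hind wing ventral with line of spots",
    {"s1": "Yes", "s2": "No"},
)
zavaleta = CodedDescription("Godyris zavaleta")
zavaleta.score(spots, ["s1"])
print(zavaleta.scores)
```

Keeping the states in the terminology and storing only references in the description is what lets several description sets share one vocabulary.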
18
Categorical Data
Again from the Terminology
  • wing ventral with line of spots"
  • OrSet
  • along margin)"/
  • .

One of several possible STATES
Modifier (Qualification)
19
Character matrix
20
Quantitative Characteristics
  • length"
  • mmiation

21
OpenKey
  • One Terminology across multiple description sets
  • Trees of Chapel Hill, NC Area
  • Illinois Bioindicator Prairie Plants
  • The terminology: http://www.ibiblio.org/openkey/glossary/Character_and_Character_State_Definitions2.pdf

22
Several Hundred Characteristics and over 1000
states
  • Growth Habit
  • Aquatic-emergent: Growing in water with stem and
    leaves extending above the surface. (Compare with
    aquatic-floating and aquatic-submerged.)
  • Aquatic-floating: Growing in water with leaves
    floating on the surface. (Compare with
    aquatic-emergent and aquatic-submerged.)
  • Aquatic-submerged: Growing in water with stem
    and leaves beneath the surface. (Compare with
    aquatic-emergent and aquatic-floating.)
  • Broadleaf herbaceous: Herbaceous with relatively
    broad leaves, thus differing from the long,
    narrow leaves of grasses (Poaceae) and other
    grass-like plants. (Compare with grass-like
    herbaceous.)
  • Epiphytic: Physically supported in its entirety
    by another plant through all or the major part of
    its life, but not drawing direct nutrition from
    the host plant. (Compare with parasitic.) KP,
    p. 44, modified

25
Legacy Conversion through Machine Learning
26
Fig 6.1 MARTT System Architecture and Data Flow
(Cui, Dec 2004)
27
Knowledge Component
  • Domain knowledge is extracted from marked-up,
    semi-structured floras (FNA and FOC).
  • The knowledge component is queried by the markup
    system when marking up less structured
    collections (FNCT).
  • Queries:
  • What are the probable classes for this set of
    terms?
  • What is the probability that element A occurs n
    positions relative to element B?
  • What is the probability that elements A and B
    co-occur in one description?
  • Experiments show the knowledge component helps to
    improve markup performance.
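
The three queries above can be illustrated with simple frequency counts over already marked-up descriptions. The sketch below is not the MARTT implementation; it assumes a toy training set and hypothetical function names, and it estimates each probability as a plain relative frequency.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: each description is an ordered list of
# (element, text) pairs, as might be extracted from a marked-up flora.
TRAINING = [
    [("leaf", "leaves trifoliolate petioled"),
     ("flower", "corolla white"),
     ("stem", "stems erect glabrous")],
    [("leaf", "leaves elliptic"), ("flower", "corolla yellow")],
    [("stem", "stems erect"), ("leaf", "leaves glabrous")],
]


def build_term_index(descriptions):
    """Count how often each term appears under each element."""
    index = defaultdict(Counter)
    for description in descriptions:
        for element, text in description:
            for term in text.lower().split():
                index[term][element] += 1
    return index


def probable_classes(index, terms):
    """Query 1: relative frequency of each element for a set of terms."""
    votes = Counter()
    for term in terms:
        votes.update(index.get(term, {}))
    total = sum(votes.values())
    return {el: n / total for el, n in votes.items()} if total else {}


def relative_position_probability(descriptions, a, b, n):
    """Query 2: among descriptions containing both elements, the fraction
    in which the first a sits exactly n positions after the first b."""
    both = hits = 0
    for description in descriptions:
        order = [element for element, _ in description]
        if a in order and b in order:
            both += 1
            hits += (order.index(a) - order.index(b)) == n
    return hits / both if both else 0.0


def cooccurrence_probability(descriptions, a, b):
    """Query 3: fraction of descriptions containing both elements."""
    if not descriptions:
        return 0.0
    hits = sum(1 for d in descriptions if {a, b} <= {el for el, _ in d})
    return hits / len(descriptions)


index = build_term_index(TRAINING)
print(probable_classes(index, ["erect", "glabrous"]))
print(relative_position_probability(TRAINING, "flower", "leaf", 1))
print(cooccurrence_probability(TRAINING, "leaf", "flower"))
```

A markup system could use such estimates as priors when deciding which element a fragment of less structured text most likely belongs to.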

28
Baptisia HTML
29
Baptisia leucantha
30
Prairie Plant SDD XML
31
Shared Character and State Terminology
  • shape
  • debugkey="fan-shaped"/
  • debugkey="acicular"/
  • debugkey="awl-shaped"/
  • debugkey="clawed"/
  • debugkey="cordate"/
  • debugkey="spurred"/

32
Baptisia leucophaea
  • Coded description 2831
  • OrSet

33
Prairie Plant SDD XML
34
Taxonomy
  • xonomicTree"
  • Genera
  • true

35
Genus Baptisia
  • leucantha"/
  • leucophaea"/

36
Prairie Plant SDD XML
37
  • leucantha"/
  • Discussion: Species is tall and widely branched.
  • Leaves: smooth, shiny leaves, trifoliolate,
    petioled 1-2 cm except of uppermost leaflets
    petioluted ca. 0.5-1 mm, elliptic-obovate to
    oblanceolate, 2-6 (-8) cm long, (1.5-) 2-4.5 r
    (ratio: leaves are normally 2 to 4.5 times longer
    than they are wide); stipules small, caducous.
  • Flowers: Holds flowering stem upright, has bright
    white flowers. Raceme(s) elongate (-short), with
    numerous (-few) flowers; bracts caducous;
    bracteoles lacking; pedicels 3-10 mm long. Calyx
    7-8 mm long, glabrous, lobes shorter than tube;
    corolla white, 2-2.5 cm long; ovules (12-) 20.
  • Stems: 0.5-2.0 dm tall or long, erect or
    divaricate, glabrous, commonly glaucous.
  • Roots:
  • Seeds:
  • Special diagnostic characters: Its large white
    flowers, deciduous bracts and stipules, and
    apically truncate, abruptly beaked pods
    distinguish it from other species.
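
Legacy descriptions like the one above encode measurements as ranges with optional extreme values in parentheses, for example "2-6 (-8) cm" or "(12-) 20". As a minimal sketch of one step in transforming such text into quantitative data, the regular expression below extracts those parts. The function name, the returned keys, and the short unit list are assumptions made for illustration; they are not part of SDD or of the tools named in this presentation.

```python
import re

# Matches forms such as "2-6 (-8) cm", "(1.5-) 2-4.5", "(12-) 20", "3-10 mm".
RANGE = re.compile(
    r"""
    (?:\((?P<min>\d+(?:\.\d+)?)-\)\s*)?    # optional extreme minimum
    (?P<low>\d+(?:\.\d+)?)                 # usual lower bound (or single value)
    (?:-(?P<high>\d+(?:\.\d+)?))?          # optional usual upper bound
    (?:\s*\(-(?P<max>\d+(?:\.\d+)?)\))?    # optional extreme maximum
    (?:\s*(?P<unit>mm|cm|dm|m)\b)?         # optional unit (assumed short list)
    """,
    re.VERBOSE,
)


def parse_range(text):
    """Return the first measurement range found in text, or None."""
    match = RANGE.search(text)
    if match is None:
        return None
    low = float(match.group("low"))
    high = float(match.group("high")) if match.group("high") else low
    return {
        # Extremes default to the usual bounds when not given.
        "min": float(match.group("min")) if match.group("min") else low,
        "low": low,
        "high": high,
        "max": float(match.group("max")) if match.group("max") else high,
        "unit": match.group("unit"),
    }


for phrase in ["2-6 (-8) cm long", "(1.5-) 2-4.5", "ovules (12-) 20", "pedicels 3-10 mm long"]:
    print(phrase, "->", parse_range(phrase))
```

The parsed values would still need to be attached to the right character (leaf length, ovule count, and so on), which is where the markup step that identifies the surrounding element comes in.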

38
Credits
  • Bob Morris, U Mass
  • Jacob Asiedu, U Mass
  • OpenKey Team

39
Additional Information
  • Electronic Field Guide project: http://efg.cs.umb.edu/
  • OpenKey: http://www.isrl.uiuc.edu/openkey
  • SDD wiki: http://wiki.cs.umb.edu/twiki/bin/view/SDD/WebHome