Title: Ecological Metadata Language Overview
1Ecological Metadata Language Overview
- KNB Data Management Tools Workshop
- Christopher Jones
- Marine Science Institute
- University of California, Santa Barbara
2Agenda
- Goals of EML
- Module Organization
- Understanding the structure
- Practical Usage
- Development, Documentation and Communication
11Goals of EML
- Address lack of dataset documentation
- Provide structure to traditionally unstructured information
- Aid transition toward synthetic research
- Enable both human-readable and machine-readable metadata
- Enable long-term archives
- Simple transfer format: just a text document
- Conform to accepted Internet-based standards for the implementation of EML (XML, XML Schema)
- Community-driven development process
- Intended to be both ancillary and integral to processing data
12A simple EML example
13Can be created in a text editor
<?xml version="1.0"?>
<eml:eml
    packageId="sbclter.316.18" system="knb"
    xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="eml://ecoinformatics.org/eml-2.0.1 eml.xsd">
  <dataset>
    <title>Kelp Forest Community Dynamics Benthic Fish</title>
    <creator>
      <individualName>
        <surName>Reed</surName>
      </individualName>
    </creator>
    <contact>
      <individualName>
        <surName>Evans</surName>
      </individualName>
    </contact>
  </dataset>
</eml:eml>
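Because EML is plain XML, a document like the one above can be read with any standard XML parser. The following is a minimal sketch using Python's standard library, not a KNB tool. The names `EML_TEXT` and `summarize` are illustrative assumptions, and the sample omits the schema-location attributes for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example; the title text is kept as shown there.
EML_TEXT = """<?xml version="1.0"?>
<eml:eml packageId="sbclter.316.18" system="knb"
    xmlns:eml="eml://ecoinformatics.org/eml-2.0.1">
  <dataset>
    <title>Kelp Forest Community Dynamics Benthic Fish</title>
    <creator><individualName><surName>Reed</surName></individualName></creator>
    <contact><individualName><surName>Evans</surName></individualName></contact>
  </dataset>
</eml:eml>"""


def summarize(eml_text):
    """Pull a few identifying fields out of an EML document (hypothetical helper)."""
    root = ET.fromstring(eml_text)
    # Only the root element carries the eml: prefix in this example;
    # the child elements are unqualified, so plain names work in paths.
    dataset = root.find("dataset")
    return {
        "packageId": root.get("packageId"),
        "title": dataset.findtext("title").strip(),
        "creators": [e.text for e in dataset.findall("creator/individualName/surName")],
        "contacts": [e.text for e in dataset.findall("contact/individualName/surName")],
    }


info = summarize(EML_TEXT)
print(info["packageId"], "-", info["title"])
```

The same fields a person reads in a text editor are the ones a program retrieves, which is the sense in which the metadata is both human-readable and machine-readable.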
14Can be created in custom applications
16Module Organization
- Organized as discrete, logical units
- Modular design for re-use of sections
- Self-describing: field documentation is embedded in the EML files themselves
17Module Organization (contd)
- Top-level eml module is a container that relies on sub-modules and also provides identification information
- Is extensible in order to accommodate additional information
18Module Organization (contd)
- Four basic resource types are available: dataset, citation, software, and protocol
- Each type has fields unique to it
19Module Organization (contd)
- However, they share a set of common fields, since they may all be considered digital resources
- These align with Dublin Core metadata containers
20Module Organization (contd)
- Support modules augment each resource with broader information
- Provide details for data discovery, interpretation, and quality
21Module Organization (contd)
- Dataset organization modules provide details on logical structure critical to machine processing of data
- Provide schema information: objects and relationships
22Module Organization (contd)
- Entities are the detailed data descriptions and provide machine-readable details for dataset parsing
- Provide syntactic information unique to each data type
23Module Organization (contd)
- Entities are the detailed data descriptions, and provide machine-readable details for dataset processing
- Provide syntactic information unique to each data type
24Module Organization (contd)
- spatialReference defines fields for coordinate
systems for referencing spatial coordinates of a
dataset to the earth (e.g. NAD_1983_UTM_Zone_18N)
25Module Organization (contd)
- The EML unit dictionary provides pre-defined
names of units for dataset attributes that are
mapped back to SI units, with definitions for
conversion factors, etc.
attribute
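The idea of mapping named units back to SI through conversion factors can be sketched as a small lookup table. The table below is a hypothetical four-entry stand-in written for illustration, not the contents of the EML unit dictionary, and the names `UNIT_TABLE` and `to_si` are assumptions.

```python
# Hypothetical miniature unit table: unit name -> (SI unit it maps to, multiplier to SI).
# The real EML unit dictionary is far larger and is defined by the EML release itself.
UNIT_TABLE = {
    "meter": ("meter", 1.0),
    "centimeter": ("meter", 0.01),
    "kilometer": ("meter", 1000.0),
    "gram": ("kilogram", 0.001),
}


def to_si(value, unit_name):
    """Convert a value in a named unit to its SI equivalent (illustrative only)."""
    si_unit, multiplier = UNIT_TABLE[unit_name]
    return value * multiplier, si_unit


converted, si_unit = to_si(3, "kilometer")
print(converted, si_unit)
```

Because every attribute names its unit from a shared dictionary, software can compare or merge columns from different datasets without guessing what "cm" or "km" meant in each one.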
26Module Organization (contd)
- The text module is a utility module that provides
formatting markup for many fields within EML.
This allows for structuring paragraphs, bulleted
lists, etc.
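As a rough illustration of the formatting markup described above, the sketch below builds an abstract with one paragraph and one bulleted list. The element names `para`, `itemizedlist`, and `listitem` are given from memory of the EML text module and should be checked against the specification; `build_abstract` and the sample strings are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_abstract(intro, items):
    """Build an <abstract> holding a paragraph and a bulleted list (illustrative)."""
    abstract = ET.Element("abstract")
    ET.SubElement(abstract, "para").text = intro
    # Assumption: the list is nested inside its own paragraph element.
    list_para = ET.SubElement(abstract, "para")
    bullets = ET.SubElement(list_para, "itemizedlist")
    for item in items:
        listitem = ET.SubElement(bullets, "listitem")
        ET.SubElement(listitem, "para").text = item
    return abstract


abstract = build_abstract("Placeholder introduction.", ["First point", "Second point"])
xml_text = ET.tostring(abstract, encoding="unicode")
print(xml_text)
```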
27Module Organization (contd)
- eml module itself has an extensible section called additionalMetadata
- Allows for any XML fields to be added to the document
additionalMetadata
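One way to picture the extensible section is to append a block of site-specific XML to an existing document. This is a minimal sketch: the `projectNotes` element, the `example.1.1` package identifier, and the helper name `add_additional_metadata` are all hypothetical, and placing the custom XML directly inside additionalMetadata is an assumption about the 2.0.1 layout.

```python
import xml.etree.ElementTree as ET

EML_NS = "eml://ecoinformatics.org/eml-2.0.1"


def add_additional_metadata(root, custom):
    """Wrap arbitrary XML in a new additionalMetadata element under the root."""
    block = ET.SubElement(root, "additionalMetadata")
    block.append(custom)
    return block


# Build a skeletal document: a namespaced root with an empty dataset.
root = ET.Element("{%s}eml" % EML_NS, {"packageId": "example.1.1", "system": "knb"})
ET.SubElement(root, "dataset")

# Hypothetical site-specific content that has no home in the standard modules.
custom = ET.Element("projectNotes")
custom.text = "placeholder"
add_additional_metadata(root, custom)

print(ET.tostring(root, encoding="unicode"))
```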
29Top level EML schema structure
- The EML schema is a hierarchical organization of the previously discussed modules
- Referred to as a tree of information, with sub-trees
- Some elements are optional, some required, etc.
30Top level EML schema structure
31Understanding the diagrams
<eml>
</eml>
element
32Understanding the diagrams
<eml>
  <additionalMetadata></additionalMetadata>
</eml>
optional
33Understanding the diagrams
<eml>
  <dataset></dataset>
  <additionalMetadata></additionalMetadata>
</eml>
sequence
34Understanding the diagrams
<eml>
  <dataset></dataset>
  <additionalMetadata></additionalMetadata>
</eml>
choice
35Understanding the diagrams
<eml>
  <dataset></dataset>
  <additionalMetadata></additionalMetadata>
  <additionalMetadata></additionalMetadata>
</eml>
cardinality
36Understanding the diagrams
<eml>
  <dataset></dataset>
  <additionalMetadata></additionalMetadata>
  <additionalMetadata></additionalMetadata>
</eml>
complexType
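The diagram features above (choice, sequence, optional, cardinality) can be restated as a few checks on the children of the top-level element. This is a sketch under stated assumptions, not a substitute for validating against the EML XML Schema: the four resource element names reflect one reading of EML 2.0.1, and `check_top_level` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: the top-level choice is among these four resource elements.
RESOURCE_TYPES = {"dataset", "citation", "software", "protocol"}


def check_top_level(eml_text):
    """Return a list of problems; an empty list means these simple rules hold."""
    root = ET.fromstring(eml_text)
    children = [child.tag for child in root]
    problems = []

    # Choice: exactly one of the resource elements must be present.
    resources = [tag for tag in children if tag in RESOURCE_TYPES]
    if len(resources) != 1:
        problems.append("expected exactly one resource element, found %d" % len(resources))
    # Sequence: the resource element comes before any additionalMetadata.
    elif children[0] not in RESOURCE_TYPES:
        problems.append("resource element must come first")

    # Optional, unbounded cardinality: additionalMetadata may appear zero or
    # more times; anything else at this level is unexpected.
    extras = [t for t in children if t not in RESOURCE_TYPES and t != "additionalMetadata"]
    if extras:
        problems.append("unexpected elements: " + ", ".join(extras))
    return problems


print(check_top_level("<eml><dataset/><additionalMetadata/><additionalMetadata/></eml>"))
print(check_top_level("<eml><additionalMetadata/></eml>"))
```

In practice a schema-aware validator performs these checks, and many more, directly from the schema definitions.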
37Agenda
- Goals of EML
- Module Organization
- Practical Usage
- Specific Features
- Development, Documentation and Communication
38Practical Usage
- EML is both broad and deep, and mostly optional
- Choosing sections to utilize depends on the application
- LTER Network is internally looking at best practices for providing metadata levels across their sites
39Practical Usage
- EML is both broad and deep, with most fields optional
- Choosing sections to utilize depends on the application
- LTER Network is internally looking at best practices for providing metadata levels across their sites
[Diagram labels: L1, L2, L3, L4, L5, L6]
40Practical Usage
- EML is both broad and deep, with most fields optional
- Choosing sections to utilize depends on the application
- LTER Network is internally looking at best practices for providing metadata levels across their sites
[Diagram labels: T1, T2, T3, T4, T5, T6]
During the next session, we'll explore EML using the above organization
42EML Development and Communication
- Open Source project, welcomes contributions
- Developed by members of the community
- eml-dev_at_ecoinformatics.org
- irc.ecoinformatics.org, eml channel for discussion
- Source code managed in cvs.ecoinformatics.org repository
- Documented specification found on the KNB website
- An EML validating service is available at
- http://knb.ecoinformatics.org/emlparser
43EML Distribution
- Currently at Release 2.0.1
- Releases of EML are downloadable
- http://knb.ecoinformatics.org/software/eml
- Development version available at
- http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/eml